
lobsteR provides a tidy workflow for requesting,
downloading, and reading LOBSTER high-frequency order
book data directly from R. LOBSTER reconstructs full limit order books
from NASDAQ historical message data and delivers them as pairs of
message files and order book snapshot files. This package handles the
end-to-end pipeline — authentication, request submission, archive
retrieval, and file download — so you can focus on analysis. For
downstream high-frequency econometrics, see the highfrequency
package.
Prerequisites: an active account at lobsterdata.com is required to request and download data.
Install the released version from CRAN:
install.packages("lobsteR")
Or the development version from GitHub:
# install.packages("pak")
pak::pak("voigtstefan/lobsteR")
library(lobsteR)
Store your credentials in .Renviron (open it with
usethis::edit_r_environ()) to avoid hardcoding them in
scripts:
LOBSTER_USER=you@example.com
LOBSTER_PWD=your-password
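Before authenticating, it can help to confirm that both variables are visible to the current session. The following is a minimal sketch in base R only; `missing_env_vars()` is a hypothetical helper written for this example and is not part of lobsteR.

```r
# Hypothetical helper (not part of lobsteR): returns the names of
# environment variables that are unset or empty in this session.
missing_env_vars <- function(vars) {
  values <- Sys.getenv(vars, unset = "")
  vars[!nzchar(values)]
}

not_set <- missing_env_vars(c("LOBSTER_USER", "LOBSTER_PWD"))
if (length(not_set) > 0) {
  message("Not set: ", paste(not_set, collapse = ", "))
}
```

R reads .Renviron at startup, so restart the session after editing the file if a variable is reported as not set.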
Then authenticate:
lobster_login <- account_login(
  login = Sys.getenv("LOBSTER_USER"),
  pwd = Sys.getenv("LOBSTER_PWD")
)
request_query() expands a symbol and date range into one
row per trading day, automatically removing weekends and NYSE holidays.
level sets the number of order book price levels included
in the snapshot files (e.g. 10 returns the top 10 bid and
ask levels).
library(lobsteR)
# Store the request so it can be passed to request_submit() later
data_request <- request_query(
  symbol = "MSFT",
  start_date = "2023-01-02",
  end_date = "2023-01-13",
  level = 10
)
data_request
#>    symbol start_date   end_date level
#> 2    MSFT 2023-01-03 2023-01-03    10
#> 3    MSFT 2023-01-04 2023-01-04    10
#> 4    MSFT 2023-01-05 2023-01-05    10
#> 5    MSFT 2023-01-06 2023-01-06    10
#> 8    MSFT 2023-01-09 2023-01-09    10
#> 9    MSFT 2023-01-10 2023-01-10    10
#> 10   MSFT 2023-01-11 2023-01-11    10
#> 11   MSFT 2023-01-12 2023-01-12    10
#> 12   MSFT 2023-01-13 2023-01-13    10
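To make the expansion step concrete, here is a rough base R sketch of turning a date range into one row per weekday. It is an illustration only, not lobsteR's implementation: `expand_weekdays()` is a hypothetical function, and it removes weekends but not exchange holidays, so 2023-01-02 (a market holiday, absent from the output above) still appears in its result.

```r
# Hypothetical illustration (not lobsteR code): one row per weekday
# between start_date and end_date. Holidays are NOT removed here.
expand_weekdays <- function(symbol, start_date, end_date, level) {
  days <- seq(as.Date(start_date), as.Date(end_date), by = "day")
  # "%u" is the weekday number (1 = Monday, ..., 7 = Sunday)
  # and does not depend on the session locale
  is_weekday <- as.integer(format(days, "%u")) <= 5
  days <- days[is_weekday]
  data.frame(
    symbol = symbol,
    start_date = days,
    end_date = days,
    level = level
  )
}

weekdays_only <- expand_weekdays("MSFT", "2023-01-02", "2023-01-13", 10)
nrow(weekdays_only)
```

This sketch yields 10 rows for the range above, one more than the request_query() output, because the holiday on 2023-01-02 is kept.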
For large date ranges, use frequency = "1 month" to
submit one request per month rather than one per day, which reduces load
on the LOBSTER server.
# data_request is the output of request_query() shown above
request_submit(
  account_login = lobster_login,
  request = data_request
)
LOBSTER processes requests server-side. Depending on the volume of messages, this can take anywhere from a few minutes to several hours. You can safely close your R session while waiting, because processing continues on the LOBSTER servers.
Once processing is complete, the files appear in your account archive:
lobster_archive <- account_archive(account = lobster_login)
lobster_archive
The returned tibble has one row per available dataset with columns
id, symbol, start_date,
end_date, level, size, and
download.
dir.create("data-lobster", showWarnings = FALSE)
data_download(
  requested_data = dplyr::filter(lobster_archive, symbol == "MSFT"),
  account_login = lobster_login,
  path = "data-lobster"
)
Downloaded .7z archives are extracted automatically.
Pass unzip = FALSE to keep the raw archives. Note that
extraction runs in a background process, so the function returns before
the files are fully written to disk. Check that the path directory
contains the extracted files before proceeding with analysis.
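One way to perform that check from a script is to poll the directory until its contents stop changing. The sketch below uses base R only; `wait_for_files()` is a hypothetical helper, not a lobsteR function, and the `interval` and `timeout` defaults are arbitrary assumptions. Unchanged file sizes across two consecutive polls is a heuristic rather than a guarantee that extraction has finished.

```r
# Hypothetical helper (not part of lobsteR): waits until the files under
# `path` have the same names and sizes on two consecutive polls.
# Returns TRUE if the directory settled, FALSE if `timeout` seconds passed.
wait_for_files <- function(path, interval = 2, timeout = 600) {
  deadline <- Sys.time() + timeout
  previous <- NULL
  repeat {
    files <- list.files(path, full.names = TRUE, recursive = TRUE)
    current <- stats::setNames(file.size(files), files)
    # Settled: at least one file exists and nothing changed since last poll
    if (length(current) > 0 && identical(current, previous)) {
      return(invisible(TRUE))
    }
    if (Sys.time() > deadline) {
      return(invisible(FALSE))
    }
    previous <- current
    Sys.sleep(interval)
  }
}

# Example: block until "data-lobster" looks stable, then list its files
if (dir.exists("data-lobster") && wait_for_files("data-lobster")) {
  list.files("data-lobster")
}
```

A longer `interval` makes the heuristic more conservative for large archives, at the cost of a slower return.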
The archive package used for .7z extraction
requires system libraries. On a Debian/Ubuntu system:
sudo apt-get update
sudo apt-get install -y gpgv gnupg libarchive13t64 liblz4-dev libacl1-dev libext2fs-dev nettle-dev
sudo apt-get install -y libarchive-dev