Package Functions
The package provides the following functions for downloading and parsing historical DEEP data from IEX. The parse functions call a compiled C++ parser to extract data from the PCAP files. For a detailed explanation of the data format, refer to the Usage section.
Download historical data from IEX
The package provides a function to download historical data from IEX. The function downloads the data for a specific date and saves it to the specified directory. If the file already exists, the function skips the download.
- iex_cppparser.download.download_hist_file(date: str, download_dir: str) -> bool
Looks up the file for the given date in the hist_data JSON file and downloads it if it does not already exist in download_dir.
- Parameters:
date (str): The date in the format YYYYMMDD.
download_dir (str): The directory to download the file to.
- Returns:
bool: True if the file was downloaded or already existed, False otherwise.
- Output:
The file is saved in the specified directory under a name of the form data_feeds_{date}_{date}_IEXTP1_DEEP1.0.pcap.gz
Example Usage:
from iex_cppparser.download import download_hist_file
success = download_hist_file("20231010", "/path/to/download")
if success:
    print("File downloaded or already exists.")
else:
    print("File download failed.")
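Because the output name follows a fixed pattern, the expected path can be computed before calling the downloader, for example to check whether a date is already on disk. The following is a minimal sketch that assumes the naming pattern documented above; `expected_pcap_path` is a hypothetical helper and not part of the package.

```python
from pathlib import Path


def expected_pcap_path(date: str, download_dir: str) -> Path:
    """Build the path documented for a downloaded file, given a YYYYMMDD date.

    Hypothetical helper for illustration; not part of iex_cppparser.
    """
    if len(date) != 8 or not date.isdigit():
        raise ValueError(f"expected a date in YYYYMMDD format, got {date!r}")
    return Path(download_dir) / f"data_feeds_{date}_{date}_IEXTP1_DEEP1.0.pcap.gz"


path = expected_pcap_path("20231010", "/path/to/download")
print(path.name)      # data_feeds_20231010_20231010_IEXTP1_DEEP1.0.pcap.gz
print(path.exists())  # whether the file is already present locally
```

The resulting path can also be passed to parse_file as file_path once the download has succeeded.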
Parse historical data from IEX
The package offers several functions for parsing historical data from IEX. The parse_file function allows for parsing any IEX DEEP file. If you need to download and parse data, you can use the parse_date or parse_dates functions. Additionally, these functions support splitting the parsed data into separate files based on the symbols present in the data.
- iex_cppparser.parse_file(file_path: str, parsed_folder: str, symbol: str, split: bool = False)
Parses a single IEX DEEP file with the C++ parser and writes the output to the specified folder.
- Parameters:
file_path (str): The path to the file to be parsed.
parsed_folder (str): The path to the folder where the parsed output should be saved.
symbol (str): Path to a text file listing the symbols to parse, one symbol per line. Pass “ALL” to parse all symbols.
split (bool): Whether to split the output into multiple files, one per letter of the alphabet. Default is False.
- Returns:
None
- Output:
Two files are generated:
The file ending in _trd.csv contains the trades data.
The file ending in _prl.csv contains the price level updates.
Example Usage:
from iex_cppparser import parse_file
parse_file("data_feeds_20231010_20231010_IEXTP1_DEEP1.0.pcap.gz", "/path/to/parsed", "symbols.txt", split=True)
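The symbol argument expects a text file with one symbol per line. One way to produce such a file from a Python list is sketched below; `write_symbols_file` is a hypothetical helper, and the choice to drop blanks and duplicates and to end the file with a newline is an assumption rather than a documented requirement of the parser.

```python
import tempfile
from pathlib import Path


def write_symbols_file(symbols, path):
    """Write one symbol per line, skipping blanks and duplicates.

    Hypothetical helper for illustration; not part of iex_cppparser.
    """
    cleaned = []
    for symbol in symbols:
        symbol = symbol.strip()
        if symbol and symbol not in cleaned:
            cleaned.append(symbol)
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return cleaned


# Demonstration in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    symbols_path = Path(tmp) / "symbols.txt"
    write_symbols_file(["AAPL", "MSFT", " SPY ", "AAPL"], symbols_path)
    print(symbols_path.read_text(encoding="utf-8"))
    # str(symbols_path) could then be passed as the `symbol` argument of parse_file.
```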
- iex_cppparser.parse_date(date_str: str, download_dir: str, parsed_folder: str, symbol: str, download: bool = True, split: bool = False)
Parses the IEXTP1 DEEP1.0 pcap files for a given date, downloading them first when download is True.
- Parameters:
date_str (str): The date to parse, in the format YYYY-MM-DD.
download_dir (str): The directory where the files are downloaded.
parsed_folder (str): The directory where the parsed output should be saved.
symbol (str): Path to a text file listing the symbols to parse, one symbol per line. Pass “ALL” to parse all symbols.
download (bool): Whether to download the files. Default is True.
split (bool): Whether to split the output into multiple files, one per letter of the alphabet. Default is False.
- Returns:
None
- Output:
Two files are generated:
The file ending in _trd.csv contains the trades data.
The file ending in _prl.csv contains the price level updates.
Example Usage:
from iex_cppparser import parse_date
parse_date("2023-10-10", "/path/to/download", "/path/to/parsed", "symbols.txt", download=True, split=False)
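Note that parse_date takes dates as YYYY-MM-DD, while download_hist_file takes YYYYMMDD. When mixing the two functions, a small conversion step avoids format errors. The sketch below uses only the standard library; both helper names are hypothetical and not part of the package.

```python
from datetime import datetime


def to_download_format(date_str: str) -> str:
    """Convert YYYY-MM-DD (parse_date style) to YYYYMMDD (download_hist_file style)."""
    return datetime.strptime(date_str, "%Y-%m-%d").strftime("%Y%m%d")


def to_parse_format(date: str) -> str:
    """Convert YYYYMMDD (download_hist_file style) to YYYY-MM-DD (parse_date style)."""
    return datetime.strptime(date, "%Y%m%d").strftime("%Y-%m-%d")


print(to_download_format("2023-10-10"))  # 20231010
print(to_parse_format("20231010"))       # 2023-10-10
```

Going through datetime.strptime also rejects impossible dates such as 2023-02-30 with a ValueError before any download is attempted.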
- iex_cppparser.parse_dates(start_date: str, end_date: str, download_dir: str, parsed_folder: str, symbol: str, download: bool = False, split: bool = False)
Parses the IEXTP1 DEEP1.0 pcap files for each date in a range, downloading them first when download is True.
- Parameters:
start_date (str): The start date string in the format YYYY-MM-DD.
end_date (str): The end date string in the format YYYY-MM-DD.
download_dir (str): The directory where the files are downloaded.
parsed_folder (str): The directory where the parsed output should be saved.
symbol (str): Path to a text file listing the symbols to parse, one symbol per line. Pass “ALL” to parse all symbols.
download (bool): Whether to download the files. Default is False.
split (bool): Whether to split the output into multiple files, one per letter of the alphabet. Default is False.
- Returns:
None
- Output:
For each date, two files are generated:
The file ending in _trd.csv contains the trades data.
The file ending in _prl.csv contains the price level updates.
Example Usage:
from iex_cppparser import parse_dates
parse_dates("2023-10-10", "2023-10-12", "/path/to/download", "/path/to/parsed", "symbols.txt", download=False, split=True)
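After parsing a range of dates, the outputs can be collected by their documented suffixes. The sketch below assumes the CSV files are written directly into parsed_folder; the exact file name prefixes, and the layout when split=True, are not described here, so the sample names in the demonstration are made up. `find_parsed_outputs` is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def find_parsed_outputs(parsed_folder):
    """Group parsed CSV files by the documented _trd.csv and _prl.csv suffixes.

    Hypothetical helper; assumes the files sit directly in parsed_folder.
    """
    folder = Path(parsed_folder)
    return {
        "trades": sorted(folder.glob("*_trd.csv")),
        "price_levels": sorted(folder.glob("*_prl.csv")),
    }


# Demonstration with placeholder files (names are illustrative only).
with tempfile.TemporaryDirectory() as tmp:
    for name in ("20231010_trd.csv", "20231010_prl.csv", "20231011_trd.csv", "notes.txt"):
        (Path(tmp) / name).touch()
    outputs = find_parsed_outputs(tmp)
    print([p.name for p in outputs["trades"]])
    print([p.name for p in outputs["price_levels"]])
```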