volgpt

Code repo for post: Using a GPT for volatility prediction in risk management

  • In this post, I explore the use of LLMs for tasks typically performed by models specific to asset pricing and risk management. I train Karpathy's nanoGPT on high-frequency (tick-by-tick) data for AAPL and JPM to see how it performs as a volatility predictor. Volatility prediction is used in risk management to estimate the potential fluctuation in the value of an asset or portfolio over a given period. Volatility, the second moment of returns, measures how much the return fluctuates around its mean, and it is a key input in the risk models used by financial institutions and required by regulators.

  • The established model classes for vol prediction include stochastic volatility models such as the MSM of Calvet & Fisher, ARCH and GARCH, and jump-diffusion models. Deep learning is also used for vol prediction, and this post provides some colour on that. However, the application of LLMs to this problem is quite novel, and nanoGPT provides a great basis for an under-the-hood examination of how text-to-text LLMs handle numeric problems.

  • I begin with my earlier implementation of Karpathy's nanoGPT, which I train on a cleansed dataset comprising some 1.079 million rows in total. To negate or reduce the impact of microstructure noise, and in particular bid-ask bounce, I compute a weighted mid-price (WMP) using the CloseBid and CloseAsk prices and sizes. I discuss microstructure noise, bid-ask bounce, the weighted mid-price, and my motivation for using it in detail in my post on high-frequency data. I obtain raw returns from the WMP and, for tractability, log returns (see the weighted mid-price sketch after this list). I present the analysis for AAPL in this post, but I process the data for both securities; I don't write up JPM here because it gets repetitive, and the results are broadly in line with AAPL.

  • As always when working with high-frequency data, I have to do a lot of work up-front to prepare the data. I also have to devote considerable effort to processing the text generated by nanoGPT so that it can be used for a numeric purpose.

  • Roughly speaking, my code is organized as follows. The volgpt_import function checks whether a GPU is available and prints the device and device name, then calls the high_frequency_data function, which reads AAPL and JPM data from 500+ daily NYSE TAQ files, combines them into a single DataFrame, and sets appropriate column names and data types. A DateTimeIndex is created by combining Date and OpenBarTime and set as the index; the DataFrame is then trimmed to the relevant columns and the WeightedMidPrice is computed (a sketch of this stage appears after this list). At some stage it may be worth reintroducing a larger set of columns, as I suspect that may improve model performance, but doing so will require more data-preparation steps on both the input side and for the generated text.

  • The high_frequency_data function splits the data into separate DataFrames for AAPL and JPM, computes raw returns and log returns for both stocks, and merges them into their respective DataFrames. Descriptive statistics are computed from the returns. Specified columns are then formatted with a fixed number of decimal places (this helps greatly with the tractability of the text generated by the model), missing values are replaced with "UNK", and all columns are converted to strings (see the formatting sketch after this list). The function returns two DataFrames (df_data_AAPL, df_data_JPM), raw returns (AAPL_rr, JPM_rr), log returns (AAPL_lr, JPM_lr), and descriptive statistics (AAPL_stats, JPM_stats) for both stocks.

  • The volgpt_import function also identifies and prints any missing values in the AAPL and JPM data, then saves the AAPL and JPM data to text files with a comma delimiter. Finally, it checks that the text files were saved correctly by reading them back into DataFrames and comparing their shapes with the original data (a sketch of this check appears after this list), and returns the AAPL and JPM data, their respective raw returns, log returns, and statistics.

  • The nanoGPT model is written in the volgpt_model.py file. The train_and_generate function trains the model on a given text file and generates new text. The function accepts several arguments: text_file_path, the path to the input text file; max_iters, the maximum number of iterations of the training loop (I set a default of 5000); learning_rate, for the optimizer (default 1e-3); device, so that operations can be passed to the GPU; and max_new_tokens, the maximum number of new tokens to generate (default 5000). It then tokenizes the input text at the character level, splits it into training, validation, and test sets (see the tokenization sketch after this list), and defines a simple bigram language model using Karpathy's transformer architecture with multi-head self-attention. It trains the model for max_iters iterations, periodically evaluating the loss on the train, validation, and test sets. After training, the function generates new text with the trained model and returns a tuple containing the test data tensor, the generated text, and a mapping from indices to characters (itos).

  • NanoGPT appears to perform well under this setup. The MSEs (0.05078798 for raw returns and 0.00000192 for log returns) and MAEs (0.17065891 and 0.00099668) are low, and paired t-tests (raw returns: t-stat = 0.69149665, p-value = 0.50499000; log returns: t-stat = 0.71337283, p-value = 0.49192750) indicate no significant difference between predicted and true values (the evaluation sketch after this list shows how these metrics are computed). This suggests good predictive performance, although it is difficult to judge the overall quality of the predictions without comparing them to other models or benchmarks in the same context.

  • I think these results are very interesting because they show that it is possible to convert numbers to text and train an LLM to learn patterns in the data that enable forward prediction of volatility with a level of accuracy comparable to asset pricing models built specifically for the purpose. This suggests that LLMs can be used in a variety of ways for asset pricing and risk management.
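
The weighted mid-price referred to above can be sketched as follows. This is a minimal illustration, assuming TAQ-style column names (CloseBidPrice, CloseAskPrice, CloseBidSize, CloseAskSize) that may differ from those in the repo; the idea is to weight each quoted price by the size on the opposite side of the book, so the mid is pulled toward the deeper side and bid-ask bounce is damped.

```python
import numpy as np
import pandas as pd

def weighted_mid_price(df: pd.DataFrame) -> pd.Series:
    # Weight each quoted price by the size on the opposite side, so the
    # mid is pulled toward the side of the book with more depth.
    return (df["CloseBidPrice"] * df["CloseAskSize"]
            + df["CloseAskPrice"] * df["CloseBidSize"]) / (
        df["CloseBidSize"] + df["CloseAskSize"])

# Toy usage: returns are then taken on the WMP series.
quotes = pd.DataFrame({
    "CloseBidPrice": [189.98, 190.00, 190.01],
    "CloseAskPrice": [190.02, 190.03, 190.04],
    "CloseBidSize":  [300, 100, 200],
    "CloseAskSize":  [100, 400, 200],
})
quotes["WeightedMidPrice"] = weighted_mid_price(quotes)
quotes["RawReturn"] = quotes["WeightedMidPrice"].pct_change()
quotes["LogReturn"] = np.log(quotes["WeightedMidPrice"]).diff()
```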
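
A condensed sketch of the import stage described above. It assumes the daily TAQ files are CSVs matched by a glob pattern; the actual file layout, path pattern, and column set in the repo may differ.

```python
import glob
import pandas as pd
import torch

def detect_device() -> torch.device:
    # Report whether a GPU is available, as volgpt_import does up-front.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    name = torch.cuda.get_device_name(0) if device.type == "cuda" else "CPU"
    print(device, name)
    return device

def load_taq(pattern: str = "data/taq_*.csv") -> pd.DataFrame:
    # Combine the daily NYSE TAQ files into a single DataFrame.
    frames = [pd.read_csv(path) for path in sorted(glob.glob(pattern))]
    df = pd.concat(frames, ignore_index=True)
    # Build the DateTimeIndex from the Date and OpenBarTime columns.
    df.index = pd.to_datetime(
        df["Date"].astype(str) + " " + df["OpenBarTime"].astype(str))
    df.index.name = "DateTimeIndex"
    # Trim to the columns needed to compute the WeightedMidPrice.
    keep = ["Ticker", "CloseBidPrice", "CloseAskPrice",
            "CloseBidSize", "CloseAskSize"]
    return df[keep]
```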
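
The formatting step might look like the following; the number of decimal places and the column list are illustrative assumptions, not the repo's exact settings.

```python
import pandas as pd

def format_for_gpt(df: pd.DataFrame, cols, decimals: int = 4) -> pd.DataFrame:
    # Fixed-width decimals keep the generated text parseable; missing values
    # become the "UNK" token; everything is cast to string so the frame can
    # be serialized as plain text for a character-level model.
    out = df.copy()
    for col in cols:
        out[col] = out[col].map(
            lambda x: f"{x:.{decimals}f}" if pd.notna(x) else "UNK")
    return out.astype(str)
```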
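
The save-and-verify round trip can be sketched like this; comparing shapes after re-reading is the cheap integrity check described above.

```python
import pandas as pd

def save_and_verify(df: pd.DataFrame, path: str) -> bool:
    # Write the formatted frame as comma-delimited text, then read it back
    # and compare shapes with the original data.
    df.to_csv(path, sep=",", header=False)
    reread = pd.read_csv(path, sep=",", header=None, index_col=0, dtype=str)
    ok = reread.shape == df.shape
    print(f"{path}: wrote {df.shape}, reread {reread.shape}, match={ok}")
    return ok
```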
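
The character-level tokenization and splitting inside train_and_generate can be sketched as below, following nanoGPT's approach of building stoi/itos maps over the corpus vocabulary; the 80/10/10 split fractions are an assumption.

```python
import torch

def tokenize_and_split(text_file_path: str, device: torch.device):
    # Character-level tokenization as in nanoGPT: build stoi/itos maps over
    # the corpus vocabulary, encode to a long tensor, then split.
    with open(text_file_path, encoding="utf-8") as f:
        text = f.read()
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}
    itos = {i: ch for ch, i in stoi.items()}
    data = torch.tensor([stoi[c] for c in text], dtype=torch.long, device=device)
    n1, n2 = int(0.8 * len(data)), int(0.9 * len(data))  # assumed split points
    return data[:n1], data[n1:n2], data[n2:], stoi, itos

# Generated index tensors decode back to text via itos, e.g.:
# generated_text = "".join(itos[int(i)] for i in generated_indices)
```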
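
Finally, the evaluation metrics quoted above can be computed with standard NumPy/SciPy calls once the generated text has been parsed back into numeric returns; y_true and y_pred below are hypothetical aligned arrays, used only to show the calculation.

```python
import numpy as np
from scipy import stats

def score_predictions(y_true: np.ndarray, y_pred: np.ndarray):
    # MSE and MAE measure the size of the errors; the paired t-test asks
    # whether the mean of (y_pred - y_true) differs significantly from zero.
    mse = float(np.mean((y_pred - y_true) ** 2))
    mae = float(np.mean(np.abs(y_pred - y_true)))
    t_stat, p_value = stats.ttest_rel(y_pred, y_true)
    return mse, mae, t_stat, p_value

# Hypothetical aligned series for illustration only.
rng = np.random.default_rng(0)
y_true = rng.normal(0.0, 1e-3, size=1_000)
y_pred = y_true + rng.normal(0.0, 1e-4, size=1_000)
print(score_predictions(y_true, y_pred))
```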

About

Explores the use of text-to-text LLMs for vol prediction, something normally done with number-to-number stochastic volatility models such as the MSM or Heston, using high-frequency data. Implements nanoGPT, trained on high-frequency tick data for JPM and AAPL.
