
The Transformer architecture, introduced by Vaswani et al. [4] in 2017, has become
a cornerstone in the field of Natural Language Processing (NLP). Its effectiveness
at modeling sequential data, its scalability, and its ability to capture long-range
dependencies in text have established it as a foundational framework. Large Language
Models (LLMs), such as GPT, utilize the Transformer’s multi-headed self-attention
mechanism to process and generate text. These models are trained on extensive datasets
to comprehend and produce natural language. The term "zero-shot forecasting" refers
to the capability of these models to predict future events or generate insights in
various domains, including finance, without prior specific training on those tasks.
Remarkably, certain LLMs, such as GPT-3 and LLaMA-2, have demonstrated the ability
to extrapolate time series, provided the series is suitably converted into textual
format; time series forecasting then becomes a problem of predicting the next token
in text. Gruver et al. [2] introduced LLMTime,
a methodology for employing pretrained LLMs to forecast continuous time series. Examples
of converting time series data into textual inputs for pretrained LLMs are available
in the LLMTime GitHub repository. In this project, participants are tasked with applying
selected pretrained LLMs to the NYSE Daily TAQ (Trade and Quote) client dataset to
perform zero-shot forecasting of Trade Price and Trade Direction (upward vs. downward
movement of Trade Price) for subsequent trades. Participants will have access to trade
data for 94 masked stocks over 3 days, and model performance will be evaluated on a
holdout dataset.
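As a rough illustration of the LLMTime-style text encoding, here is a minimal sketch based on Gruver et al. [2]: values are rescaled, written with fixed precision, and digit-separated so that older tokenizers see one token per digit, with time steps separated by commas. The exact scaling and tokenizer handling in the LLMTime repository differ, and this sketch assumes non-negative values (as with trade prices).

```python
import numpy as np

def encode_series(values, precision=2, scale=None):
    """Encode a numeric series as text, LLMTime-style (sketch).

    Values are divided by a scale (by default the 99th percentile of
    their absolute values), written with a fixed number of decimal
    digits, the decimal point is dropped, digits are space-separated,
    and time steps are joined with ' , '.
    """
    values = np.asarray(values, dtype=float)
    if scale is None:
        scale = float(np.quantile(np.abs(values), 0.99)) or 1.0
    steps = []
    for v in values / scale:
        digits = f"{v:.{precision}f}".replace(".", "")  # e.g. 12.30 -> "1230"
        steps.append(" ".join(digits))                  # "1 2 3 0"
    return " , ".join(steps), scale

def decode_series(text, precision=2, scale=1.0):
    """Invert encode_series: parse digits back into scaled floats."""
    vals = []
    for step in text.split(","):
        num = int("".join(step.split())) / 10 ** precision
        vals.append(num * scale)
    return vals
```

For example, `encode_series([1.0, 12.3], scale=1.0)` yields the string `"1 0 0 , 1 2 3 0"`, which the LLM continues token by token; the continuation is then parsed back into prices with `decode_series`.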
Participants are expected to use the intraday data up until 3:40pm to train or fine-tune
their models, and then forecast trade prices (point predictions) at horizons of 5,
10, 15, and 20 minutes. This is also called "closing price prediction". It is important
to note that the predictability of high-frequency returns may diminish within a few
minutes [1]. Participants may develop either univariate or multivariate models depending
on the chosen LLMs, taking into account the differences among their tokenizers.
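For concreteness, the split at 3:40pm and the four forecast targets could be framed as below. This is a sketch only: the column names `timestamp` and `price` are hypothetical placeholders, not the actual TAQ schema, and the target is taken as the last trade printed within each horizon.

```python
import pandas as pd

def split_and_targets(trades, cutoff="15:40:00", horizons=(5, 10, 15, 20)):
    """Split one stock-day of trades at the cutoff and extract, for each
    horizon in minutes, the last trade price within that window and its
    direction relative to the last pre-cutoff price.

    `trades` is assumed to have a datetime 'timestamp' column and a
    float 'price' column (hypothetical names, not the TAQ schema).
    """
    trades = trades.sort_values("timestamp")
    day = trades["timestamp"].dt.normalize().iloc[0]
    cut = day + pd.Timedelta(cutoff)           # 3:40pm on this trading day
    history = trades[trades["timestamp"] <= cut]
    last_price = history["price"].iloc[-1]     # anchor for trade direction

    targets = {}
    for h in horizons:
        window = trades[(trades["timestamp"] > cut)
                        & (trades["timestamp"] <= cut + pd.Timedelta(minutes=h))]
        if window.empty:
            targets[h] = (None, None)          # no trade printed in this horizon
        else:
            p = window["price"].iloc[-1]
            targets[h] = (p, "up" if p > last_price else "down")
    return history, targets
```

The `history` frame is what would be serialized into text for the LLM, while `targets` supplies the ground-truth price and direction labels for evaluating the point predictions at each horizon.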
View the Data Here