
Can AI Predict the Stock Market? How StockGPT is Changing the Investing Game

Mira Elwood in Tech & Innovation February 2026 • 4 min read.

"Discover how a new AI model, StockGPT, is revolutionizing stock prediction and trading, offering insights beyond traditional methods."

The world of finance is constantly evolving, with new technologies emerging to help investors make smarter decisions. Generative artificial intelligence (GenAI) is one such technology, rapidly transforming industries by creating text, images, videos, and code from simple instructions. Now it is making waves in the investment sector. Since the introduction of ChatGPT in November 2022, large language models have found their way into investing, offering trading recommendations based on news and corporate communications.

One approach involves using pre-trained language models like BERT and OPT to generate sentiment scores for companies, which then inform trading decisions. For instance, studies have shown that stock portfolios built on these sentiment scores can yield impressive results. However, a new method is emerging that applies GenAI directly to numeric stock data.

This is where StockGPT comes in. StockGPT is a generative pre-trained transformer (GPT) model trained from scratch on numeric stock return data, unlike previous finance-specific language models, which are trained on financial texts. This approach offers several advantages for trading: the model learns price patterns directly from price data, provides predictions for each stock at any time, and predicts the entire distribution of future returns rather than just a single estimate.

How Does StockGPT Predict Stock Returns?

StockGPT operates by predicting the most likely future token (or return) given previous returns. This is similar to how language models predict the next word in a sentence. To achieve this, stock return data is discretized into intervals, transforming continuous numbers into discrete tokens that the model can process. The model is built using a lightweight version of the GPT architecture, consisting of four attention blocks with approximately one million parameters.
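The discretization step can be sketched in a few lines of Python. This is a minimal illustration rather than the paper's exact scheme: the 50-basis-point bin width, the range of plus or minus 1,000 basis points, and the names `EDGES`, `VOCAB_SIZE`, and `returns_to_tokens` are assumptions chosen for the example.

```python
import numpy as np

# Hypothetical settings for illustration; the article does not state the bin layout.
BIN_WIDTH_BPS = 50     # width of each return interval, in basis points
MAX_ABS_BPS = 1000     # returns beyond +/- 10% fall into the two outer bins

# Interior bin edges in basis points: -1000, -950, ..., 950, 1000
EDGES = np.arange(-MAX_ABS_BPS, MAX_ABS_BPS + BIN_WIDTH_BPS, BIN_WIDTH_BPS)

# One token per interval between edges, plus one open-ended bin on each side
VOCAB_SIZE = len(EDGES) + 1


def returns_to_tokens(returns):
    """Map decimal daily returns (0.01 means 1%) to integer token ids."""
    bps = np.asarray(returns, dtype=float) * 10_000
    # right=True assigns x to bin i when EDGES[i-1] < x <= EDGES[i]
    return np.digitize(bps, EDGES, right=True)


tokens = returns_to_tokens([-0.5, -0.012, 0.0, 0.004, 0.25])
print(tokens)
```

Extreme moves such as the -50% and +25% returns in the example land in the two open-ended outer bins, so every return maps to some token regardless of size.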

The model ingests a sequence of 256 daily returns for each stock, approximating the number of trading days in a year. The training objective is to predict the next return value using the transformer architecture. This architecture uses indexes (or positions) of the tokens, retrieves their vector representations, and models dependencies through an attention mechanism. The training data consists of about 50 million daily U.S. stock returns from 1926 to 2000, encompassing almost all stocks listed during the 20th century. The model is then tested on a separate set of approximately 20 million daily U.S. stock returns from 2001 to 2023.
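As a rough illustration of the index, embedding, and attention steps, the sketch below looks up a vector for each token and each position and applies a single causal self-attention head with NumPy. The dimensions, the random weights, and the function names are hypothetical; the article describes four attention blocks with learned weights, which this single-head sketch does not reproduce.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along one axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def causal_self_attention(token_ids, tok_emb, pos_emb, w_q, w_k, w_v):
    """Single attention head in which each day attends only to itself and earlier days."""
    t = len(token_ids)
    # Look up token vectors by token id and add position vectors by index
    x = tok_emb[token_ids] + pos_emb[:t]              # shape (t, d)
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])           # shape (t, t)
    # Mask out future positions so no day can see later returns
    future = np.triu(np.ones((t, t), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    weights = softmax(scores, axis=-1)
    return weights @ v, weights


# Hypothetical sizes and random weights, purely for illustration
rng = np.random.default_rng(0)
vocab_size, block_size, d = 42, 256, 8
tok_emb = rng.normal(size=(vocab_size, d))
pos_emb = rng.normal(size=(block_size, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))

token_ids = np.array([20, 18, 21, 20])
out, weights = causal_self_attention(token_ids, tok_emb, pos_emb, w_q, w_k, w_v)
```

The causal mask is what makes next-token prediction honest: the representation of day t is built only from days up to and including t.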

Direct Learning: Learns price patterns directly from price data.
Continuous Availability: Provides predictions for each stock at any time.
Distribution Prediction: Predicts the whole distribution of future returns.

The model is trained only once and applied off-the-shelf to the out-of-sample period to test its effectiveness and reduce computational costs. During testing, StockGPT uses the 256 daily returns leading up to each trading day to forecast the return for the following day. This forecast is then evaluated through cross-sectional regressions and real-time trading portfolios.
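One way to turn a predicted distribution into a tradable forecast is to take the probability-weighted average of the bin midpoints and then rank stocks on that number. The sketch below assumes that approach; the bin midpoints, the 25% cutoff, and the helper names `expected_returns` and `long_short_weights` are illustrative, and the article does not specify how its portfolios are formed.

```python
import numpy as np


def expected_returns(probs, bin_midpoints):
    """Probability-weighted average return for each stock.

    probs: array of shape (n_stocks, n_bins), one predicted distribution per stock.
    bin_midpoints: array of shape (n_bins,), a representative return for each bin.
    """
    probs = np.asarray(probs, dtype=float)
    probs = probs / probs.sum(axis=1, keepdims=True)   # make each row sum to 1
    return probs @ np.asarray(bin_midpoints, dtype=float)


def long_short_weights(forecasts, fraction=0.1):
    """Equal-weight long the top fraction of stocks and short the bottom fraction."""
    forecasts = np.asarray(forecasts, dtype=float)
    k = max(1, int(len(forecasts) * fraction))
    order = np.argsort(forecasts)          # ascending by forecast
    weights = np.zeros(len(forecasts))
    weights[order[-k:]] = 1.0 / k          # highest forecasts: long
    weights[order[:k]] = -1.0 / k          # lowest forecasts: short
    return weights


# Toy example: three return bins and four stocks (all numbers are made up)
midpoints = [-0.01, 0.0, 0.01]
probs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
    [0.3, 0.4, 0.3],
]
forecasts = expected_returns(probs, midpoints)
weights = long_short_weights(forecasts, fraction=0.25)
```

Because the full distribution is available, the same `probs` array could also feed other summaries, such as the most likely bin or a tail probability, without rerunning the model.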

The Future of AI in Stock Market Prediction

StockGPT represents a significant advancement in applying generative AI to stock market prediction. Its ability to learn directly from numeric data and generate strong investment performance highlights the potential for AI to enhance trading strategies. By encompassing common price-based strategies and leading stock factors, StockGPT demonstrates a new AI-driven approach to asset pricing.

About this Article

This article was crafted using a human-AI hybrid and collaborative approach. AI assisted our team with initial drafting, research insights, identifying key questions, and image generation. Our human editors guided topic selection, defined the angle, structured the content, ensured factual accuracy and relevance, refined the tone, and conducted thorough editing to deliver helpful, high-quality information. See our About page for more information.

This article is based on research published under:

DOI-LINK: https://doi.org/10.48550/arXiv.2404.05101

Title: StockGPT: A GenAI Model for Stock Prediction and Trading

Subject: q-fin.CP cs.AI q-fin.PM q-fin.PR q-fin.ST

Authors: Dat Mai

Published: 07-04-2024

Everything You Need To Know

What is StockGPT and how does it differ from other AI models used in finance?

StockGPT is a generative pre-trained transformer (GPT) model designed for stock market prediction. Unlike many finance-specific AI models that are trained on financial texts, StockGPT is pre-trained directly on numeric stock return data. This direct approach allows it to learn price patterns from the raw data, potentially offering a more nuanced understanding of market dynamics compared to models reliant on textual analysis.

How does StockGPT predict future stock returns?

StockGPT predicts future stock returns by analyzing sequences of past returns. The model discretizes stock return data into intervals, transforming continuous numbers into discrete tokens. By ingesting a sequence of 256 daily returns for each stock, StockGPT predicts the most likely future return, similar to how language models predict the next word in a sentence. It uses a lightweight version of the GPT architecture, including attention blocks, to model dependencies and forecast the return for the following day.
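The pairing of a 256-day history with the following day's return can be sketched as a sliding window over one stock's token series. The window length comes from the article; the function name `make_windows` and the toy token series are assumptions for illustration.

```python
import numpy as np

BLOCK_SIZE = 256  # context length stated in the article


def make_windows(tokens, block_size=BLOCK_SIZE):
    """Pair each run of `block_size` tokens with the token that follows it."""
    tokens = np.asarray(tokens)
    if len(tokens) <= block_size:
        # Not enough history to form even one (context, target) pair
        return (np.empty((0, block_size), dtype=tokens.dtype),
                np.empty((0,), dtype=tokens.dtype))
    # Each row holds block_size context tokens followed by one target token
    windows = np.lib.stride_tricks.sliding_window_view(tokens, block_size + 1)
    contexts = windows[:, :-1]
    targets = windows[:, -1]
    return contexts, targets


# Toy series of 300 token ids for one hypothetical stock
history = np.arange(300) % 42
contexts, targets = make_windows(history)
print(contexts.shape, targets.shape)
```

A stock with fewer than 257 observations yields no pairs under this scheme, which is one reason a fixed context length matters when deciding which stocks can be forecast on a given day.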

What are the advantages of using StockGPT for stock market predictions compared to traditional methods?

StockGPT offers several advantages: First, it learns price patterns directly from price data rather than relying on textual analysis or human-defined features. Second, it provides predictions for each stock at any time, offering continuous availability of insights. Lastly, it predicts the entire distribution of future returns rather than just a single estimate, giving a more complete picture of potential outcomes. These features allow for a more data-driven and comprehensive approach to investment decisions.

What kind of data was used to train and test StockGPT, and why is this significant?

StockGPT was trained on approximately 50 million daily U.S. stock returns from 1926 to 2000, encompassing nearly all stocks listed during the 20th century. It was then tested on a separate set of roughly 20 million daily U.S. stock returns from 2001 to 2023. A dataset of this size exposes the model to a wide range of market conditions, which gives it a broad base for learning. Testing on a later, out-of-sample period means performance is measured on data the model never saw during training, so any overfitting to the training period would show up as weak test results. This makes the assessment of its predictive ability more realistic.
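A chronological split of this kind can be sketched as follows. The article gives only the years, so the exact cutoff dates, the record layout, and the function name `split_by_date` are assumptions for illustration.

```python
from datetime import date

# Assumed cutoffs; the article states only "1926 to 2000" and "2001 to 2023"
TRAIN_END = date(2000, 12, 31)
TEST_END = date(2023, 12, 31)


def split_by_date(records):
    """Split (stock_id, day, daily_return) records into train and test by calendar date."""
    train, test = [], []
    for record in records:
        day = record[1]
        if day <= TRAIN_END:
            train.append(record)    # seen during training
        elif day <= TEST_END:
            test.append(record)     # held out for out-of-sample evaluation
    return train, test


# Made-up records for a hypothetical stock
records = [
    ("AAA", date(1999, 6, 1), 0.012),
    ("AAA", date(2000, 12, 29), -0.004),
    ("AAA", date(2001, 1, 2), 0.007),
    ("AAA", date(2023, 12, 29), 0.001),
]
train, test = split_by_date(records)
```

Splitting by date rather than at random keeps every test observation later than every training observation, so the evaluation cannot borrow information from the future.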

What implications does StockGPT have for the future of AI in stock market prediction and investment strategies?

StockGPT signifies a substantial shift toward leveraging generative AI directly on numeric data for stock market prediction. Its ability to learn from price data and achieve high investment performance highlights the potential for AI to enhance trading strategies. StockGPT's success in encompassing common price-based strategies and leading stock factors suggests a new AI-driven approach to asset pricing, potentially reshaping how investment decisions are made and how financial models are developed in the future.