How to Backtest Stocks with AI

Test any stock strategy against 60+ years of data by asking in plain English. No code, no TradingView, no Python.


You can backtest stock strategies in Claude by describing your criteria in plain English. Shibui Finance gives Claude access to 60+ years of US market data, 31M+ daily prices, and 56 technical indicators for nearly 10,000 companies. No code required.

Connect Shibui once, then ask backtesting questions in any conversation. Claude finds every historical instance of your signal, measures what happened afterward, and returns the statistics: win rate, average return, median return, signal count. Real data, computed on the fly, no framework to install.
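For readers who want to see what that computation amounts to, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not Shibui's actual query: the function names, the toy prices, and the signal days are all made up for the example.

```python
from statistics import mean, median

def forward_returns(closes, signal_days, horizon=20):
    """Forward return over `horizon` trading days for each signal day that has one."""
    out = []
    for i in signal_days:
        j = i + horizon
        if j < len(closes):  # skip signals too close to the end of the data
            out.append(closes[j] / closes[i] - 1.0)
    return out

def summarize(returns):
    """Signal count, win rate, average and median for a list of forward returns."""
    if not returns:
        raise ValueError("no measurable signals")
    wins = sum(1 for r in returns if r > 0)
    return {
        "signals": len(returns),
        "win_rate": wins / len(returns),
        "avg_return": mean(returns),
        "median_return": median(returns),
    }

# Toy closing prices (made-up numbers, not market data).
closes = [100, 98, 95, 97, 101, 99, 96, 100, 104, 102]
signal_days = [2, 4, 6]  # hypothetical days on which the signal fired
stats = summarize(forward_returns(closes, signal_days, horizon=2))
```

Changing `horizon` or the list of signal days changes the test; the summary step stays the same.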

The code path vs the conversation

The standard approach to backtesting a stock strategy involves choosing a framework (Backtrader, Zipline, QuantConnect), writing Python, connecting to a data feed, handling lookback windows, debugging off-by-one errors in date alignment, and running the simulation. The result is a script that tests one specific strategy against one specific dataset. Changing the signal means editing code. Changing the universe means editing code. Comparing two strategies means writing two scripts.

Shibui Finance is what that project looks like after the data and infrastructure work is done. The database is pre-loaded with 60+ years of daily prices, quarterly financials, daily valuations, and 56 technical indicators. Claude writes the query at runtime based on what you ask. Changing the signal means asking a different question.

A backtest in Claude is a question, not a program. You describe the signal, Claude tests it against real data, and the statistics come back in seconds. The backtesting logic lives in the prompt, not in code.

Example: RSI oversold bounce

On Shibui, you ask

"What was the average 20-day forward return after RSI dropped below 30 for large-cap stocks since 2021, and how does it compare to all large caps over the same period?"

Claude found 30,214 instances where RSI dropped below 30 for stocks with market cap above $10B since 2021, and measured the 20-day forward return after each. The crucial step is the comparison: it ran the same measurement on every large-cap day in the period, signal or not, to get a baseline. The oversold signal had a 57.7% win rate and a +1.97% average return, with a +1.58% median. All large caps over the same window had a 52.3% win rate and a +0.99% average, with a +0.45% median. So buying the dip did beat the large-cap market here: it won about five points more often and roughly doubled the average return. (Averages are capped at the 1st and 99th percentiles so a few extreme movers do not distort them; about 1% of signals had no forward price because the stock delisted inside the window.)

That comparison is the part most backtests skip, and it is the whole game. A 58% win rate looks strong on its own, but it means nothing until you know what the universe did: if every large cap won 58% of the time too, the signal added zero. Here it cleared the baseline, which is the honest way to read a backtest.
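The baseline comparison can be sketched in a few lines of Python. This is a rough illustration, not Shibui's implementation: the helper names are hypothetical, the toy returns are invented, and linear-interpolation percentiles are an assumption about how the 1st/99th percentile cap is applied.

```python
from statistics import mean

def percentile(sorted_vals, p):
    """Linear-interpolation percentile of a sorted list, p in [0, 100]."""
    k = (len(sorted_vals) - 1) * p / 100
    lo = int(k)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)

def capped_mean(returns, lower=1, upper=99):
    """Mean after clamping every value into the [lower, upper] percentile range."""
    s = sorted(returns)
    lo, hi = percentile(s, lower), percentile(s, upper)
    return mean(min(max(r, lo), hi) for r in returns)

def win_rate(returns):
    """Share of returns that are strictly positive."""
    return sum(r > 0 for r in returns) / len(returns)

def edge_vs_baseline(signal_returns, baseline_returns):
    """How far the signal's statistics sit above the baseline's."""
    return {
        "win_rate_edge": win_rate(signal_returns) - win_rate(baseline_returns),
        "avg_return_edge": capped_mean(signal_returns) - capped_mean(baseline_returns),
    }

# Toy forward returns (invented): signal days vs. every day in the universe.
signal_returns = [0.02, 0.01, -0.01, 0.03]
baseline_returns = [0.01, -0.01, -0.02, 0.02]
edge = edge_vs_baseline(signal_returns, baseline_returns)
```

A signal only counts as an edge when both differences are meaningfully above zero; a high win rate with a near-zero `win_rate_edge` is just the market.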

One limit on where this holds. The edge is a large-cap effect. Run the same test across the entire US market, small caps included, and it disappears: oversold small caps bounce no more often than the market, because many are oversold for a reason and keep sliding. The backtest page has that broad-market result. The signal is real, but specific to liquid, large-cap names.
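For reference, the signal in this example, RSI dropping below 30, can be sketched as follows. The 14-day period and Wilder smoothing are conventional defaults assumed here; the source does not state which RSI variant Shibui pre-calculates, and the function names and toy prices are hypothetical.

```python
def rsi(closes, period=14):
    """Wilder-smoothed RSI; entries are None until `period` changes are available."""
    out = [None] * len(closes)
    if len(closes) <= period:
        return out
    changes = [b - a for a, b in zip(closes, closes[1:])]
    # Seed with simple averages of the first `period` gains and losses.
    avg_gain = sum(max(c, 0) for c in changes[:period]) / period
    avg_loss = sum(max(-c, 0) for c in changes[:period]) / period
    for t in range(period, len(closes)):
        if t > period:
            c = changes[t - 1]  # change from day t-1 to day t
            avg_gain = (avg_gain * (period - 1) + max(c, 0)) / period
            avg_loss = (avg_loss * (period - 1) + max(-c, 0)) / period
        out[t] = 100.0 if avg_loss == 0 else 100 - 100 / (1 + avg_gain / avg_loss)
    return out

def drops_below(series, threshold=30):
    """Days where the series moves from at-or-above the threshold to below it."""
    return [
        t for t in range(1, len(series))
        if series[t - 1] is not None
        and series[t - 1] >= threshold and series[t] < threshold
    ]

# Toy prices: a steady rise followed by a sharp slide (made-up numbers).
closes = [100 + i for i in range(20)] + [119 - 3 * i for i in range(1, 15)]
signal_days = drops_below(rsi(closes), 30)
```

Counting only the day RSI first crosses below 30, instead of every day it sits below 30, keeps one prolonged slide from registering as many separate signals. Whether Shibui counts signals this way is not stated in the source.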

Example: MACD crossover

On Shibui, you ask

"What was the average 20-day return after MACD crossed above the signal line for stocks over $10B market cap since 2021, and how does it compare to all large caps?"

Claude identified 36,582 bullish MACD crossovers across large caps since 2021 and measured the same 20-day forward return against the same large-cap baseline. This time the comparison is the whole story: the crossover won 52.8% of the time with a +0.94% average return, while all large caps won 52.2% of the time with a +0.99% average. The signal landed right on top of the baseline, no better than simply being in the market. On its own, a 52.8% win rate looks like an edge; next to the baseline, it is noise.

That is the opposite of the RSI result, where the oversold signal cleared the baseline by about five points. Same universe, same period, same test, two different verdicts, and the difference only shows up because both are measured against the market. Comparing strategies this way takes two questions and about 30 seconds.
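A bullish MACD crossover is simple to state in code. The sketch below assumes the conventional 12/26/9 spans and seeds each EMA with its first value; both are assumptions, since the source does not say which parameters Shibui uses, and the function names and toy prices are hypothetical.

```python
def ema(values, span):
    """Exponential moving average with smoothing factor 2 / (span + 1)."""
    alpha = 2 / (span + 1)
    out = [values[0]]  # seed with the first observation
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out

def bullish_macd_crossovers(closes, fast=12, slow=26, signal=9):
    """Days where the MACD line moves from at-or-below its signal line to above it."""
    macd = [f - s for f, s in zip(ema(closes, fast), ema(closes, slow))]
    sig = ema(macd, signal)
    return [
        t for t in range(1, len(closes))
        if macd[t - 1] <= sig[t - 1] and macd[t] > sig[t]
    ]

# Toy prices: 40 days of decline, then a steady recovery (made-up numbers).
closes = [100 - i for i in range(40)] + [61 + 2 * i for i in range(1, 41)]
crosses = bullish_macd_crossovers(closes)
```

On this toy series the crossover fires once, shortly after the price turns up. The resulting day indices would then feed the same forward-return and baseline measurement used for the RSI test.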

Example: fundamental screen backtest

On Shibui, you ask

"What was the 1-year forward return for stocks with a Piotroski F-Score of 8 or 9 and market cap above $2B, sampled annually since 2015, and how does it compare to all stocks over $2B?"

Claude found 1,287 annual observations of a stock with a Piotroski F-Score of 8 or 9 (strong financial quality) and a market cap above $2B, then measured the one-year forward return against every $2B-plus stock sampled the same way. The high-quality names won 61.2% of the time with a +8.3% median return; the broad $2B-plus universe won 57.1% of the time with a +5.8% median. Both did well because 2015 to 2026 was a rising market, but the quality signal cleared the baseline by about four points of win rate and two and a half points of median return. (About 10% of observations had no one-year forward price and were excluded.)

This is the test that actually validates a factor: not "did F-Score stocks go up," since most stocks did, but "did they beat the market they were drawn from." Here they did. And it combines a fundamental screen with forward-return measurement in one question. Finviz can filter on Piotroski score but cannot tell you what happened next; a coding framework can measure forward returns but makes you build the fundamental filter yourself.

What backtesting in Claude cannot do

This is directional backtesting for validating ideas. It answers "does this signal correlate with positive forward returns?" It does not model what would happen if you traded on it. The distinction matters.

No transaction costs or slippage. Real-world returns are lower than what the backtest shows, especially for strategies with high turnover or small-cap stocks with wide bid-ask spreads.

Survivorship bias. Stocks that delisted before the forward-return window closed show as excluded signals, not as -100% returns. The backtest reports how many signals were excluded so you can judge the impact, but it does not assign terminal values to delisted stocks.

Not a portfolio simulator. There is no position sizing, no rebalancing, no drawdown management, no portfolio-level risk. If you need that level of simulation, use QuantConnect, Backtrader, or Portfolio123.

End-of-day data only. Entry and exit prices are closing prices. There is no intraday timing, no opening-price entry, no stop-loss simulation within the day.
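Two of these limits are easy to make concrete. The sketch below is an illustration under stated assumptions, not Shibui's code: it counts signals that have no forward price as excluded and does not score them as losses, and it shows how a proportional trading cost would shrink a gross return. The function names, the toy prices, and the 0.10% cost per side are all hypothetical.

```python
def forward_returns_with_exclusions(closes_by_ticker, signals, horizon=20):
    """Measure forward returns and count signals with no price at the horizon.

    closes_by_ticker: dict of ticker -> list of daily closes (the list simply
    ends on the last trading day, for example at delisting).
    signals: list of (ticker, day_index) pairs.
    """
    returns, excluded = [], 0
    for ticker, i in signals:
        closes = closes_by_ticker[ticker]
        j = i + horizon
        if j >= len(closes):  # no forward price: excluded, not scored as -100%
            excluded += 1
            continue
        returns.append(closes[j] / closes[i] - 1.0)
    return returns, excluded

def net_of_costs(gross_return, cost_per_side):
    """Return after paying a proportional cost on entry and again on exit."""
    return (1 + gross_return) * (1 - cost_per_side) / (1 + cost_per_side) - 1

# Toy data (made up): BBB stops trading before the horizon is reached.
prices = {"AAA": [10, 11, 12, 13], "BBB": [5, 4]}
signals = [("AAA", 0), ("AAA", 2), ("BBB", 0)]
returns, excluded = forward_returns_with_exclusions(prices, signals, horizon=2)
excluded_share = excluded / len(signals)

# Hypothetical 0.10% cost per side applied to a +1.97% gross return.
net = net_of_costs(0.0197, 0.001)
```

A large excluded share is a warning that the reported averages may flatter the signal, and the cost adjustment matters most where the gross edge over the baseline is thin.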

The backtest tells you whether an idea has historical support. It does not tell you whether to trade on it. For full details on the data behind these results, see the data sources page.

Frequently asked questions

Can I backtest a stock strategy without writing code?

Yes. Connect Shibui Finance to Claude and describe your strategy in plain English, for example: "What happens after RSI drops below 30 for large caps?" Claude tests it against 60+ years of real market data and returns win rates, average returns, and signal counts. No Python, no Pine Script, no framework to install.

Can Claude backtest a trading strategy?

With Shibui Finance connected, Claude can backtest both technical and fundamental strategies against real historical data. Describe your entry signal and Claude computes forward returns across thousands of historical instances. It is directional backtesting for validating ideas, not a full portfolio simulation with position sizing and risk management.

How do I backtest an RSI strategy with Claude?

Ask Claude something like: "What was the average 20-day return after RSI dropped below 30 for stocks over $10B market cap?" Shibui has 56 pre-calculated technical indicators including RSI, MACD, Bollinger Bands, and multiple moving averages. Claude finds every historical instance of your signal and computes the statistics across all of them.

Is AI backtesting accurate?

The data is real and the statistics are computed from verified market data. But these backtests have inherent limitations: they ignore transaction costs, they carry survivorship bias because delisted stocks are excluded, not scored as losses, and past patterns are not guaranteed to repeat. Shibui reports excluded signals and sample sizes so you can judge reliability yourself.

Can I backtest MACD and moving average strategies with AI?

Yes. Shibui includes 56 pre-calculated daily indicators: MACD, RSI, Bollinger Bands, SMA (20, 50, 200-day), EMA (9, 21, 50, 200-day), and more. Ask Claude to test any crossover, threshold, or combination signal. You can combine technical signals with fundamental criteria like P/E or Piotroski F-Score in the same backtest.

Are there free backtesting tools for stocks?

Shibui Finance is free and connects to Claude (free or paid plan). It provides 60+ years of US stock data with 56 technical indicators. Other free options like QuantConnect and Backtrader require writing Python code. Shibui is the only free option that works in plain English without any programming.

Connect Shibui to Claude in 2 minutes

Shibui is free. No code, no API keys. Connect it to Claude and test any stock strategy against 60+ years of real market data.
