Document Arena

View overall rankings across AI models in document analysis and long-content reasoning.

Apr 22, 2026
109,664 votes
21 models
| Rank | Rank Spread | Organization · License | Score | Votes | Price (input / output) | Context |
|---|---|---|---|---|---|---|
| 1 | 1-4 | Anthropic · Proprietary | 1528±10 | 7,962 | $5 / $25 | 1M |
| 2 | 1-5 | Anthropic · Proprietary | 1524±12 | 2,426 | $5 / $25 | 1M |
| 3 | 1-4 | Anthropic · Proprietary | 1520±8 | 14,621 | $5 / $25 | 1M |
| 4 | 1-5 | Anthropic · Proprietary | 1515±13 | 2,315 | $5 / $25 | 1M |
| 5 | 3-5 | Anthropic · Proprietary | 1503±8 | 20,969 | $3 / $15 | 1M |
| 6 | 6-9 | OpenAI · Proprietary | 1482±9 | 9,665 | $2.50 / $15 | 1.1M |
| 7 | 6-9 | Anthropic · Proprietary | 1471±10 | 8,022 | $5 / $25 | 200K |
| 8 | 6-15 | Moonshot · Modified MIT | 1459±19 | 828 | $0.75 / $3.50 | 262.1K |
| 9 | 6-15 | Meta · Proprietary | 1457±19 | 844 | N/A | N/A |
| 10 | 8-13 | Google · Proprietary | 1451±7 | 17,686 | $2 / $12 | 1M |
| 11 | 8-14 | Anthropic · Proprietary | 1450±8 | 12,940 | $3 / $15 | 200K |
| 12 | 8-17 | Moonshot · Modified MIT | 1445±9 | 7,233 | $0.60 / $3 | N/A |
| 13 | 8-17 | Google · Proprietary | 1443±9 | 10,795 | $2 / $12 | 1M |
| 14 | 10-18 | Google · Proprietary | 1433±8 | 15,606 | $1.25 / $10 | 1M |
| 15 | 9-20 | Google · Apache 2.0 | 1431±13 | 2,398 | N/A | N/A |
| 16 | 12-19 | Anthropic · Proprietary | 1430±8 | 13,849 | $1 / $5 | 200K |
| 17 | 12-20 | | 1426±11 | 3,117 | $2 / $6 | 2M |
| 18 | 14-21 | Google · Proprietary | 1423±9 | 7,214 | $0.50 / $3 | 1M |
| 19 | 15-21 | OpenAI · Proprietary | 1416±9 | 7,116 | $1.75 / $14 | 400K |
| 20 | 16-21 | OpenAI · Proprietary | 1412±9 | 8,299 | $1.25 / $10 | 400K |
| 21 | 18-21 | OpenAI · Proprietary | 1407±7 | 17,804 | $1.75 / $14 | 400K |

Leaderboard Plots

Confidence Intervals on Model Strength (via Bootstrapping)

Average Win Rate Against All Other Models (Uniform Sampling and No Ties)

Battle Count for Each Combination of Models (without Ties)

Fraction of Model A Wins for All Non-tied A vs. B Battles
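
The plot titles above mention two quantities: the fraction of non-tied battles that model A wins against model B, and confidence intervals obtained by bootstrapping. The following is a minimal Python sketch of both ideas on a made-up battle log. The function names (`win_fraction`, `bootstrap_ci`), the outcome labels, and the sample data are all hypothetical, and this is not the leaderboard's actual pipeline, which reports intervals on model strength scores rather than on raw win fractions.

```python
import random


def win_fraction(outcomes):
    """Fraction of non-tied battles won by model A.

    `outcomes` is a list of "A", "B", or "tie" (hypothetical labels).
    """
    decided = [o for o in outcomes if o != "tie"]
    if not decided:
        raise ValueError("no non-tied battles")
    return sum(o == "A" for o in decided) / len(decided)


def bootstrap_ci(outcomes, n_resamples=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for the non-tied win fraction."""
    rng = random.Random(seed)
    decided = [o for o in outcomes if o != "tie"]
    # Resample the decided battles with replacement and recompute the statistic.
    stats = sorted(
        win_fraction(rng.choices(decided, k=len(decided)))
        for _ in range(n_resamples)
    )
    lo_idx = int((1 - level) / 2 * n_resamples)
    hi_idx = int((1 + level) / 2 * n_resamples) - 1
    return stats[lo_idx], stats[hi_idx]


# Hypothetical battle log: 60 wins for A, 40 for B, 20 ties (ties are ignored).
battles = ["A"] * 60 + ["B"] * 40 + ["tie"] * 20
point = win_fraction(battles)
low, high = bootstrap_ci(battles)
print(f"A wins {point:.2f} of non-tied battles, 95% interval [{low:.2f}, {high:.2f}]")
```

Fewer battles give a wider interval, which is consistent with the table: the entries with the fewest votes (828 and 844) carry the largest ± values (±19).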