Model              Score
GPT 5.1            64.3%
GPT 5              64.2%
Grok 4             61.3%
Gemini 2.5 Flash   60.4%
Gemini 2.5 Pro     60.1%
The APEX Leaderboard assesses how well frontier AI models perform the professional work that drives the economy. Each reported score is a model's average performance across four roles: Investment Banking Analyst, Management Consulting Associate, Big Law Associate, and General Practitioner (MD).
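As a rough illustration of how a single leaderboard score could be derived from the four roles, the sketch below takes an unweighted mean of per-role scores. The equal weighting is an assumption, since the leaderboard says only that scores are averaged. The function name `overall_score` and the per-role values are hypothetical placeholders, not published APEX results.

```python
from statistics import mean

# The four roles named in the leaderboard description.
ROLES = (
    "Investment Banking Analyst",
    "Management Consulting Associate",
    "Big Law Associate",
    "General Practitioner (MD)",
)


def overall_score(role_scores: dict[str, float]) -> float:
    """Return the unweighted mean of the four per-role scores (in percent).

    Assumption: every role counts equally toward the overall score.
    """
    missing = [role for role in ROLES if role not in role_scores]
    if missing:
        raise ValueError(f"missing scores for: {', '.join(missing)}")
    return mean(role_scores[role] for role in ROLES)


# Hypothetical per-role scores for illustration only; not real APEX data.
example_scores = {
    "Investment Banking Analyst": 60.0,
    "Management Consulting Associate": 62.0,
    "Big Law Associate": 64.0,
    "General Practitioner (MD)": 66.0,
}

print(f"{overall_score(example_scores):.1f}%")  # prints "63.0%"
```

Under this assumption, a model that does well in one role and poorly in another can end up with the same overall score as a model that is uniformly average, so the single number hides per-role differences.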