Pending · benchmarks

Within five years (by early 2029), AI does well on every single test the computer-science industry can put in front of it

by Jensen Huang (Nvidia CEO) · called 2024-03

Pending · due Mar 2029

Coverage of standardized human tests at strong-pass level: unknown, pending a structured read; frontier models already pass many professional exams and olympiad-level math

How it's graded

Met if, by March 2029, frontier AI systems achieve strong performance on essentially every standardized human test the field proposes (bar exams, medical boards, olympiads, etc.); failed if significant test categories remain unconquered.

Receipts · 1

"If I gave an AI… every single test that you can possibly imagine, you make that list of tests and put it in front of the computer science industry, and I'm guessing in five years time, we'll do well on every single one," Huang said.
Fox Business (Stanford SIEPR Economic Summit remarks) · 2024-03-03

Every verdict on the ledger is graded against dated, archived third-party evidence and blind-verified by two independent models.