OpenAI announces o3 and o3-mini, its next simulated reasoning models

December 21, 2024:

On Friday, during Day 12 of “12 days of OpenAI,” CEO Sam Altman announced the company’s latest AI “reasoning” models, o3 and o3-mini, which build upon the o1 models launched earlier this year. The company is not releasing them yet but will make these models available for public safety testing and research access today.

The models use what OpenAI calls “private chain of thought,” where the model pauses to examine its internal dialog and plan ahead before responding, which you might call “simulated reasoning” (SR)—a form of AI that goes beyond basic large language models (LLMs).

The company named the model family “o3” instead of “o2” to avoid potential trademark conflicts with British telecom provider O2, according to The Information. During Friday’s livestream, Altman acknowledged his company’s naming foibles, saying, “In the grand tradition of OpenAI being really, truly bad at names, it’ll be called o3.”

According to OpenAI, the o3 model earned a record-breaking score on the ARC-AGI benchmark, a visual reasoning benchmark that has gone unbeaten since its creation in 2019. In low-compute scenarios, o3 scored 75.7 percent, while in high-compute testing, it reached 87.5 percent—comparable to human performance at an 85 percent threshold.

OpenAI also reported that o3 scored 96.7 percent on the 2024 American Invitational Mathematics Examination, missing just one question. The model also reached 87.7 percent on GPQA Diamond, which contains graduate-level biology, physics, and chemistry questions. On the FrontierMath benchmark by Epoch AI, o3 solved 25.2 percent of problems, while no other model has exceeded 2 percent.
