OpenAI Announces a New AI Model, Code-Named Strawberry, That Solves Difficult Problems Step by Step

September 12, 2024:

OpenAI made the last big breakthrough in artificial intelligence by increasing the size of its models to dizzying proportions, when it introduced GPT-4 last year. The company today announced a new advance that signals a shift in approach—a model that can “reason” logically through many difficult problems and is significantly smarter than existing AI without a major scale-up.

The new model, dubbed OpenAI-o1, can solve problems that stump existing AI models, including OpenAI’s most powerful existing model, GPT-4o. Rather than summon up an answer in one step, as a large language model normally does, it reasons through the problem, effectively thinking out loud as a person might, before arriving at the right result.

“This is what we consider the new paradigm in these models,” Mira Murati, OpenAI’s chief technology officer, tells WIRED. “It is much better at tackling very complex reasoning tasks.”

The new model was code-named Strawberry within OpenAI, and it is not a successor to GPT-4o but rather a complement to it, the company says.

Murati says that OpenAI is currently building its next master model, GPT-5, which will be considerably larger than its predecessor. But while the company still believes that scale will help wring new abilities out of AI, GPT-5 is likely to also include the reasoning technology introduced today. “There are two paradigms,” Murati says. “The scaling paradigm and this new paradigm. We expect that we will bring them together.”

LLMs typically conjure their answers from huge neural networks fed vast quantities of training data. They can exhibit remarkable linguistic and logical abilities, but traditionally struggle with surprisingly simple problems such as rudimentary math questions that involve reasoning.

Murati says OpenAI-o1 uses reinforcement learning, which involves giving a model positive feedback when it gets answers right and negative feedback when it does not, in order to improve its reasoning process. “The model sharpens its thinking and fine tunes the strategies that it uses to get to the answer,” she says. Reinforcement learning has enabled computers to play games with superhuman skill and do useful tasks like designing computer chips. The technique is also a key ingredient for turning an LLM into a useful and well-behaved chatbot.

Mark Chen, vice president of research at OpenAI, demonstrated the new model to WIRED, using it to solve several problems that its prior model, GPT-4o, cannot. These included an advanced chemistry question and the following mind-bending mathematical puzzle: “A princess is as old as the prince will be when the princess is twice as old as the prince was when the princess’s age was half the sum of their present age. What is the age of the prince and princess?” (The correct answer is that the prince is 30, and the princess is 40).

“The [new] model is learning to think for itself, rather than kind of trying to imitate the way humans would think,” as a conventional LLM does, Chen says.

OpenAI says its new model performs markedly better on a number of problem sets, including ones focused on coding, math, physics, biology, and chemistry. On the American Invitational Mathematics Examination (AIME), a test for math students, GPT-4o solved on average 12 percent of the problems while o1 got 83 percent right, according to the company.

Source link

tags

Abkhazia Afghanistan Africa Albania Algeria Andorra Angola Antigua and Barbuda Argentina Armenia Australia Austria Azerbaijan Bahamas Bahrain Bangladesh Barbados Belarus Belgium Belize Benin Bolivia Bosnia and Herzegovina Botswana Brazil Bulgaria Burkina Faso Burundi Cambodia Cameroon Canada Cape Verde Central African Republic Chad Chile China Colombia Comoros Costa Rica Croatia Cuba Curacao Cyprus Czech Republic Denmark Djibouti Dominica Dominican Republic East Timor Ecuador Egypt El Salvador Equatorial Guinea Eritrea Estonia Ethiopia Fiji Finland France French Polynesia Gabon Gambia Georgia Germany Ghana Greece Grenada Guam Guatemala Guinea Guyana Haiti Honduras Hong Kong Hungary Iceland India Indonesia Iran Iraq Ireland Israel Italy Jamaica Japan Jordan Kazakhstan Kenya Kiribati Kosovo Kuwait Kyrgyzstan Laos Latvia Lebanon Lesotho Liberia Libya Liechtenstein Lithuania Luxembourg Macau Madagascar Malawi Malaysia Maldives Mali Malta Mauritania Mauritius Mexico Moldova Monaco Mongolia Montenegro Morocco Mozambique Myanmar Namibia Nepal Netherlands New Zealand Nicaragua Niger Nigeria North Korea North Macedonia Northern Cyprus Norway Oman Pakistan Palestine Panama Papua New Guinea Paraguay Peru Philippines Poland Portugal Puerto Rico Qatar Republic of the Congo Romania Russia Rwanda Saint Lucia Samoa San Marino Saudi Arabia Senegal Serbia Sierra Leone Singapore Slovakia Slovenia Solomon Islands Somalia South Korea South Sudan Spain Sri Lanka Sudan Sweden Switzerland Syria Taiwan Tajikistan Tanzania Thailand Togo Tonga Trinidad and Tobago Tunisia Turkey Turkmenistan Uganda UK Ukraine United Arab Emirates Uruguay USA Uzbekistan Vatican City Venezuela Vietnam Western Sahara Yemen Zambia Zimbabwe