Deep Cogito emerges from stealth with hybrid AI ‘reasoning’ models

A new company, Deep Cogito, has emerged from stealth with a family of openly available AI models that can be switched between “reasoning” and non-reasoning modes. Reasoning models like OpenAI’s o1 have shown great promise in domains like math and physics, thanks to their ability to effectively fact-check themselves by working through complex problems step…

These researchers used NPR Sunday Puzzle questions to benchmark AI ‘reasoning’ models

Every Sunday, NPR host Will Shortz, The New York Times’ crossword puzzle guru, gets to quiz thousands of listeners in a long-running segment called the Sunday Puzzle. While written to be solvable without too much foreknowledge, the brainteasers are usually challenging even for skilled contestants. That’s why some experts think they’re a promising way to…

Researchers open source Sky-T1, a ‘reasoning’ AI model that can be trained for less than $450

So-called reasoning AI models are becoming easier — and cheaper — to develop. On Friday, NovaSky, a team of researchers based out of UC Berkeley’s Sky Computing Lab, released Sky-T1-32B-Preview, a reasoning model that’s competitive with an earlier version of OpenAI’s o1 on a number of key benchmarks. Sky-T1 appears to be the first truly…

Alibaba releases an ‘open’ challenger to OpenAI’s o1 reasoning model

A new so-called “reasoning” AI model, QwQ-32B-Preview, has arrived on the scene. It’s one of the few to rival OpenAI’s o1, and it’s the first available to download under a permissive license. Developed by Alibaba’s Qwen team, QwQ-32B-Preview contains 32.5 billion parameters and can consider prompts up to ~32,000 words in length; it performs better on…

OpenAI’s new model is better at reasoning and, occasionally, deceiving

In the weeks leading up to the release of OpenAI’s newest “reasoning” model, o1, independent AI safety research firm Apollo found a notable issue. Apollo realized the model produced incorrect outputs in a new way. Or, to put things more colloquially, it lied. Sometimes the deceptions seemed innocuous. In one example, OpenAI researchers asked o1-preview…
