Google is shipping Gemini models faster than its AI safety reports

More than two years after Google was caught flat-footed by the release of OpenAI’s ChatGPT, the company has dramatically picked up the pace. In late March, Google launched an AI reasoning model, Gemini 2.5 Pro, that leads the industry on several benchmarks measuring coding and math capabilities. That launch came just three months after the…

Researchers suggest OpenAI trained AI models on paywalled O’Reilly books

OpenAI has been accused by many parties of training its AI on copyrighted content sans permission. Now a new paper by an AI watchdog organization makes the serious accusation that the company increasingly relied on non-public books it didn’t license to train more sophisticated AI models. AI models are essentially complex prediction engines. Trained on…

A new, challenging AGI test stumps most AI models

The Arc Prize Foundation, a nonprofit co-founded by prominent AI researcher François Chollet, announced in a blog post on Monday that it has created a new, challenging test to measure the general intelligence of leading AI models. So far, the new test, called ARC-AGI-2, has stumped most models. “Reasoning” AI models like OpenAI’s o1-pro and…

Google DeepMind’s new AI models help robots perform physical tasks, even without training

Google DeepMind is launching two new AI models designed to help robots “perform a wider range of real-world tasks than ever before.” The first, called Gemini Robotics, is a vision-language-action model capable of understanding new situations, even if it hasn’t been trained on them. Gemini Robotics is built on Gemini 2.0, the latest version of…

Google Gemini: Everything you need to know about the generative AI models

Google’s trying to make waves with Gemini, its flagship suite of generative AI models, apps, and services. But what’s Gemini? How can you use it? And how does it stack up to other generative AI tools such as OpenAI’s ChatGPT, Meta’s Llama, and Microsoft’s Copilot? To make it easier to keep up with the latest Gemini…

The hottest AI models, what they do, and how to use them

AI models are being cranked out at a dizzying pace, by everyone from Big Tech companies like Google to startups like OpenAI and Anthropic. Keeping track of the latest ones can be overwhelming. Adding to the confusion is that AI models are often promoted based on industry benchmarks. But these technical metrics often reveal little…

These researchers used NPR Sunday Puzzle questions to benchmark AI ‘reasoning’ models

Every Sunday, NPR host Will Shortz, The New York Times’ crossword puzzle guru, gets to quiz thousands of listeners in a long-running segment called the Sunday Puzzle. While written to be solvable without too much foreknowledge, the brainteasers are usually challenging even for skilled contestants. That’s why some experts think they’re a promising way to…

OpenAI’s o3 suggests AI models are scaling in new ways — but so are the costs

Last month, AI founders and investors told TechCrunch that we’re now in the “second era of scaling laws,” noting how established methods of improving AI models were showing diminishing returns. One promising new method they suggested could keep gains was “test-time scaling,” which seems to be what’s behind the performance of OpenAI’s o3 model —…
