R1 Reinforcement Learning

Open-source DeepSeek-R1 uses pure reinforcement learning to match OpenAI o1 — at 95% less cost

Chinese AI startup DeepSeek, known for challenging leading AI vendors ...

Geeky Gadgets

How DeepSeek R1 was Designed and Created

Imagine tackling a complex math problem, debugging a tricky piece of code, or navigating a challenging scientific question. Frustrating, right? We’ve all been there—staring at the issue, wishing for a ...

Morning Overview on MSN

How DeepSeek’s new training method could disrupt advanced AI again

DeepSeek’s latest training research arrives at a moment when the cost of building frontier models is starting to choke off ...

NextBigFuture

Open Source DeepSeek R1 Matches OpenAI O1 Math, Code and Reasoning

DeepSeek R1 is an open sourced model. DeepSeek is a Chinese AI research company backed by High-Flyer Capital Management, a quant hedge fund focused on AI applications for trading decisions. They have ...

CoinDesk

The DeepSeek-R1 Effect and Web3-AI

The artificial intelligence (AI) world was taken by storm a few days ago with the release of DeepSeek-R1, an open-source reasoning model that matches the performance of top foundation models while ...

Hosted on MSN

DeepSeek-R1-beating perf in a 32B package? El Reg digs its claws into Alibaba's QwQ

Hands on: How much can reinforcement learning - and a bit of extra verification - improve large language models, aka LLMs? Alibaba's Qwen team aims to find out with its latest release, QwQ. Despite ...

Nasdaq

GPTBots.ai Integrates DeepSeek R1 to Enhance Enterprise AI Capabilities

GPTBots.ai has announced the integration of the DeepSeek R1 large language model (LLM) into its enterprise AI agent platform, enhancing its range of advanced AI capabilities alongside models like ...

True agentic AI is years away - here's why and how we get there

Today's AI agents are a primitive approximation of what agents are meant to be. True agentic AI requires serious advances in reinforcement learning and complex memory.

GIGAZINE

Qwen releases open source AI model 'QwQ-32B' that achieves performance almost on par with DeepSeek, and a demo page that anyone can run for free is also available

DeepSeek-R1 utilizes reinforcement learning (RL) to achieve higher performance than conventional pre-training and post-training methods. Its performance was so high that when DeepSeek-R1 was released ...

VentureBeat

Beyond math and coding: New RL framework helps train LLM agents for complex, real-world tasks

Researchers at the University of Science and Technology of China have developed a new reinforcement learning (RL) framework that helps train large language models (LLMs) for complex agentic tasks ...
