reinforcement learning

News

Why Reinforcement Learning Could Be AI’s Biggest Flaw Yet

Explore the hidden trade-offs of reinforcement learning in AI and why base models might hold the key to true intelligence.

Devdiscourse1d

Deep reinforcement learning could redefine insulin delivery for diabetes patients

Read more about Deep reinforcement learning could redefine insulin delivery for diabetes patients on Devdiscourse ...

Is ‘The Era of Experience’ Upon Us? Researchers Propose AI Agents Learn From the World

Computer scientist David Silver was a key developer behind AlphaGo, the pivotal Go-playing program that defeated world ...

Communications of the ACM2d

Developing the Foundations of Reinforcment Learning

Let’s move on to temporal difference learning (TD learning), which is a subset of reinforcement learning that was the focus ...

Communications of the ACM2d

A Rewarding Line of Work

Turing Award recipients Richard Sutton and Andrew Barto believe reinforcement learning will play a role in artificial general ...

SWiRL: The business case for AI that thinks like your best problem-solvers

Researchers from Stanford University and Google DeepMind have unveiled Step-Wise Reinforcement Learning (SWiRL), a technique designed to enhance the ability of large language models (LLMs) to tackle ...

SecurityWeek1d

All Major Gen-AI Models Vulnerable to ‘Policy Puppetry’ Prompt Injection Attack

A new attack technique named Policy Puppetry can break the protections of major gen-AI models to produce harmful outputs.

11d

How Auto-Classifying Feedback Can Improve Reinforcement Learning

By categorizing and filtering user input, you can better focus on driving AI improvement. This iterative process—blending automation with human review—ensures AI learns from high-quality data, leading ...

Department of Computer Science - University of Texas at Austin2d

Jiaheng Hu Earns 2025 Two Sigma Ph.D. Fellowship

Third-year doctoral student, Jiaheng Hu is one of two recipients selected for a Ph.D. fellowship with Two Sigma, a New ...

Some results have been hidden because they may be inaccessible to you

Show inaccessible results