Reinforcement Learning

A community dedicated to discussions on reinforcement learning, a subdiscipline of machine learning that tackles sequential decision-making problems.

founded 6 months ago

MODERATORS

howrar@lemmy.ca

OpenAI: Learning to Reason with LLMs (openai.com)

submitted 3 hours ago* (last edited 3 hours ago) by howrar@lemmy.ca to c/reinforcement_learning@lemmy.ca

OpenAI just published a blog post about a new model trained via RL (I'm assuming this isn't the usual RLHF) to perform chain-of-thought reasoning before giving the user its answer. As usual, there's very little detail about how this was accomplished, so it's hard for me to get excited about it, but the rest of you might find it interesting.

Introducing SIMA, a Scalable Instructable Multiworld Agent (deepmind.google)

submitted 6 months ago by howrar@lemmy.ca to c/reinforcement_learning@lemmy.ca
