Exploration Hacking: Can LLMs Learn to Resist RL Training?

(alignmentforum.org)

2 points | by Prof_Sigmund 4 days ago

No comments yet.