3 comments

  • nithisha2201 4 days ago

    Interesting. How do you handle the observability side during training? One thing I ran into with multi-agent RL is that reward signals alone don't tell you much about why an agent is failing. Curious whether you've built any tooling around that.

  • Remi_Etien 4 days ago

    [dead]

  • georaa 4 days ago

    [flagged]