Code World Model

(github.com)

11 points | by tosh 6 hours ago

2 comments

2001zhaozhao 32 minutes ago

It would be interesting to see whether they have an updated model that uses this training technique. According to the paper, it scored well at release (65.8% on SWE-bench), but it is no longer competitive with the latest generation of open coding models (e.g. Devstral Small 2).
I wonder whether other labs have implemented something similar to this approach. Perhaps code world modeling isn't actually necessary (relative to other simpler techniques) to achieve the kind of deep environment understanding that the paper touts as being important to improve agentic coding performance.

chid an hour ago

Given the high barrier to entry (160 GB of GPU VRAM), is there anything practical one can use this for?