2 comments

  • teamcubitflow 7 hours ago

    I'm surprised that kind of captioning came from a 2B model. Glad the fine-tuning process shows a deliberate approach to turning Qwen 3.5 into essentially a new model of its kind.

    • HappyPablo 7 hours ago

      Hey, this is Shubham. Yeah, Qwen3.5VL is awesome and its training vocab is quite strong, so with the right data curation you can probably take it into a bunch of other narrow tasks. E.g., we're trying to fine-tune it to use SAM3 in a loop for segmentation tasks in videos.