Very similar to Chain of Draft, but seems more thorough
It matches R1 on the given benchmarks. R1 has 671B params (37B activated) while this only has 32B
insane, absolutely insane
good luck trying to run a video model locally
Unless you have top tier hardware
What is the license? The link on HF just 404s