I’ll try QVQ-Max when I get home. It’s kinda fascinating how these models can interpret an image, actually understand what’s in it, and reason about it. So cool.
> We are a group of people with diverse talents and interests.
Experimental research model with enhanced visual reasoning capabilities.
Supports context length of 128k.
Currently, the model only supports single-round dialogues and image inputs; it does not support video inputs.
It should be capable of handling images up to 12 MP.
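For anyone who would rather hit it over an API than the chat UI, here is a minimal sketch of a single-round, single-image request, assuming the model is served through an OpenAI-compatible endpoint. The base URL, the `qvq-max` model name, the `DASHSCOPE_API_KEY` variable, and the image URL below are my assumptions for illustration, not something confirmed in the post:

```python
import os
from openai import OpenAI

# Assumed: an OpenAI-compatible endpoint (the DashScope compatible-mode URL
# below is a guess) serving the model under the identifier "qvq-max".
client = OpenAI(
    api_key=os.environ["DASHSCOPE_API_KEY"],  # assumed env var name
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

# Single-round dialogue with one image input, matching the stated limitations
# (no multi-turn history, no video input).
stream = client.chat.completions.create(
    model="qvq-max",  # assumed model name
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url",
             "image_url": {"url": "https://example.com/photo.jpg"}},  # placeholder
            {"type": "text",
             "text": "Describe what is in this image and reason about it."},
        ],
    }],
    stream=True,  # reasoning-style models on such endpoints often stream
)

# Print the streamed answer as it arrives.
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```

Streaming is used here because reasoning models on hosted endpoints frequently emit their output incrementally; if the endpoint allows it, a plain non-streaming call works the same way without `stream=True`.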
That's an earlier version released some months ago. They even acknowledge it.
The version they present in the blog post, which you can run on their chat platform, is not open and not available for download.
I'm not complaining; it is useful to get an aggregated signal. In a sense, I like the downvotes, because they mean there are people I might be able to persuade.
So how do I make the case? Remember, I'm not even making an argument for one side or the other! My argument is simply: be curious. If appropriate, admit to yourself, e.g., "You know, I haven't actually studied all sides of the issue yet; let me research and write down my thinking..."
Here's my claim: when it comes to AI and society, you gotta get out of your own head. You have to get out of the building. You might even have to get out of Silicon Valley. Go learn about arguments for and against open-weights models. You don't have to agree with them.