Cursor 2.0 is their big shot to regain relevance in coding agents, and there are some genuinely interesting AI product patterns in here. I have a soft spot for Cursor: they're the scrappy $10B startup fighting much bigger players. They'll probably lose overall, but they might carve out a niche for themselves in the process. Unfortunately, this release doesn't look like it does that. Some nice things, but I don't quite see it:

- They built their own model, "Composer" (see attached chart). This had to happen. They were stuck paying token costs plus profit margin to Anthropic and OpenAI, their direct competitors. There's no way to survive at scale in this market without vertical integration, because your costs will keep rising relentlessly. But the challenge with Composer is the same as for any of us building or fine-tuning a model: will it actually outperform the state-of-the-art models? (And even if it does now, will it keep doing so in the future?) It doesn't, even on their own benchmark. But it is apparently fast, which is an interesting edge in this space, because it could fundamentally change the user workflow. It goes from "prompt model, alt-tab, break flow, watch YouTube, eventually get back to see the results" to "prompt model, see it progressing, stay in flow, course-correct the model, work together". So we'll see how that feels for people.

- Multi-model queries. This is a weird feature. Basically, you can give the same prompt to multiple models and then pick the result you like best. If you think this sounds (a) super expensive and (b) a huge amount of effort for probably minimal return, my gut feel is that you're right. I don't know how you effectively choose between multiple competing implementations without spending a great deal of time and effort assessing the nuance of each, which isn't currently the way people work. This might turn out to be a good pattern, but I don't really see it.

- Built-in browser.
We know that for all types of reasoning models and agentic features, the feedback loop is crucial, especially in coding. It's far better if the model can tell itself when it's done a bad job than to rely on you taking screenshots, pasting them in, and saying "fix". So they've got this built-in browser, which at least saves that bit of effort. It's been in testing for a while; I haven't tried it, but people say it's a nice incremental improvement. The problem is that LLMs still aren't good at understanding visual content, so we'll see how much this _actually_ helps.

- They changed the UI. But everyone does that all the time these days, so _shrug_?

I just fundamentally don't think they have the capital to keep up. The fact that their specially trained coding model didn't even reach state-of-the-art on their own benchmark is a tacit acknowledgement of that: they clearly didn't have the time/GPUs/money to keep training it. But someone try it please and let me know how it feels!
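For what it's worth, the multi-model query pattern reduces to a simple fan-out. Here's a minimal sketch of the idea; the `query_model_*` callables are stand-ins I made up, not Cursor's actual API, and the final "pick the best" step is a trivial placeholder for exactly the human judgment I'm skeptical about above:

```python
# Sketch of the "multi-model queries" pattern: send the same prompt
# to several models concurrently, collect the candidate answers, and
# pick one. All model callables here are hypothetical stand-ins.
from concurrent.futures import ThreadPoolExecutor

def query_model_a(prompt: str) -> str:
    # Stand-in for a call to one provider's model.
    return f"[model-a] {prompt}"

def query_model_b(prompt: str) -> str:
    # Stand-in for a call to a second provider's model.
    return f"[model-b] {prompt}"

def fan_out(prompt: str, models) -> list[str]:
    """Run the same prompt against every model concurrently.

    ThreadPoolExecutor.map preserves input order, so candidates
    come back in the same order as `models`.
    """
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        return list(pool.map(lambda m: m(prompt), models))

candidates = fan_out("refactor this function", [query_model_a, query_model_b])

# The part the feature glosses over: someone (or something) still has
# to judge the candidates. A length heuristic is obviously not that.
best = max(candidates, key=len)
```

Note the cost structure: every extra model multiplies your token spend, while the selection step stays a manual bottleneck.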