Cohere has launched North Mini Code, an open-source model built specifically for agentic software engineering. It runs on a single Nvidia H100 GPU and provides teams with a concrete alternative to managed agents like Claude Fable 5.
North Mini Code is a 30-billion-parameter mixture-of-experts model with only 3 billion parameters active per token, designed for sub-agent orchestration, code review, and terminal work. It supports a 256,000-token context window and can generate up to 64,000 tokens per session. The model is available on Hugging Face under an Apache 2.0 license.
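A rough calculation shows why a model of this size is plausible on one GPU. The sketch below estimates weight memory only, and several inputs are assumptions rather than figures from Cohere: the 80 GB H100 variant, and the precisions listed (the announcement does not say which precision the single-GPU claim refers to). It ignores the KV cache and activations, which grow with context length and will consume part of the remaining headroom.

```python
# Back-of-envelope weight-memory estimate for a 30B-parameter model.
# Assumptions (not from the source): 80 GB H100 variant, listed precisions.
TOTAL_PARAMS = 30e9      # total parameters (from the article)
ACTIVE_PARAMS = 3e9      # parameters active per token (from the article)
H100_MEMORY_GB = 80      # assumed: 80 GB H100 variant

def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Memory needed to hold the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9

# All experts must be resident in memory even though only a few run per token,
# so the footprint is driven by TOTAL_PARAMS, not ACTIVE_PARAMS.
for name, bytes_per_param in [("bf16", 2), ("int8", 1), ("int4", 0.5)]:
    needed = weight_memory_gb(TOTAL_PARAMS, bytes_per_param)
    headroom = H100_MEMORY_GB - needed
    print(f"{name}: {needed:.0f} GB weights, {headroom:.0f} GB left for KV cache and activations")

# Fraction of the weights that participate in computing each token.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
print(f"active per token: {active_fraction:.0%}")
```

Under these assumptions, 16-bit weights take about 60 GB, so the mixture-of-experts design saves per-token compute rather than memory: every expert still has to be loaded.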
In independent testing, the model generated three times as many output tokens as comparable models, a verbosity cost that compounds in high-volume production workloads. This tradeoff makes it less suited to latency-sensitive or repetitive tasks where concise output is critical.
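To see how that multiplier compounds, consider a simple volume calculation. Only the threefold multiplier comes from the testing described above. The task count and baseline tokens per task are hypothetical values chosen for illustration, not measurements.

```python
# Illustrative output-volume arithmetic for a 3x verbosity multiplier.
# Hypothetical inputs: TASKS_PER_DAY and BASELINE_TOKENS_PER_TASK.
VERBOSITY_MULTIPLIER = 3          # from the article: 3x the output tokens
TASKS_PER_DAY = 10_000            # hypothetical workload
BASELINE_TOKENS_PER_TASK = 2_000  # hypothetical output of a comparable model

def daily_output_tokens(tasks: int, tokens_per_task: int, multiplier: int = 1) -> int:
    """Total output tokens generated per day for a given workload."""
    return tasks * tokens_per_task * multiplier

baseline = daily_output_tokens(TASKS_PER_DAY, BASELINE_TOKENS_PER_TASK)
verbose = daily_output_tokens(TASKS_PER_DAY, BASELINE_TOKENS_PER_TASK, VERBOSITY_MULTIPLIER)
extra = verbose - baseline

print(f"baseline: {baseline:,} tokens/day")
print(f"at 3x:    {verbose:,} tokens/day")
print(f"extra:    {extra:,} tokens/day")
```

Because decoding time scales roughly with the number of tokens generated, the same multiplier applies to per-task latency as well as to total throughput, which is why the penalty matters most for repetitive, high-volume jobs.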
By open-sourcing North Mini Code, Cohere positions itself as a key player in the growing agentic coding market, competing with providers such as Anthropic. The move signals a shift toward specialized, smaller models that can run on local hardware, reducing reliance on cloud APIs and lowering operational costs for engineering teams.
The launch reinforces Cohere's strategy of offering customizable, enterprise-friendly AI tools. While open-weight models offer flexibility, the verbosity penalty and the need for in-house optimization may limit adoption in high-throughput environments.