Cohere Releases Open Model North Mini Code for Development on a Single H100 GPU

Quick answer
Cohere has released North Mini Code, an open model for agentic software development that runs on a single H100 GPU.
Cohere has introduced North Mini Code, an open model designed to automate software development tasks. The model is optimized to run on a single NVIDIA H100 GPU and supports a context window of up to 256,000 tokens, which allows large projects to be analyzed in a single pass. It is available under the Apache 2.0 license and has already been published on Hugging Face.
North Mini Code is a mixture-of-experts (MoE) model with 30 billion parameters, of which only 3 billion are active for any given token. This reduces the compute required per token during inference. The model was trained on more than 70,000 verifiable tasks drawn from 5,000 repositories and targets agentic development scenarios such as code review, architecture analysis, and terminal interaction.
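The parameter figures above can be turned into a rough sizing estimate. The sketch below is a back-of-the-envelope calculation, not a measurement: the 16-bit weight precision and the 80 GB H100 variant are assumptions that the article does not state, and the estimate ignores the KV cache and activations, which grow with context length.

```python
# Back-of-the-envelope sizing sketch for a mixture-of-experts model.
TOTAL_PARAMS = 30e9     # total parameters (from the article)
ACTIVE_PARAMS = 3e9     # parameters active per token (from the article)
BYTES_PER_PARAM = 2     # ASSUMPTION: 16-bit weights (bf16/fp16)
GPU_MEMORY_GB = 80      # ASSUMPTION: 80 GB H100 variant


def active_fraction(total: float, active: float) -> float:
    """Share of the parameters that take part in computing one token."""
    return active / total


def weight_memory_gb(params: float, bytes_per_param: int) -> float:
    """Memory needed to hold the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


fraction = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
weights_gb = weight_memory_gb(TOTAL_PARAMS, BYTES_PER_PARAM)

print(f"Active share per token: {fraction:.0%}")
print(f"Weights at 16-bit precision: {weights_gb:.0f} GB of {GPU_MEMORY_GB} GB")
```

Under these assumptions, per-token compute scales with the 3 billion active parameters, while all 30 billion parameters generally still need to be resident in memory, because different tokens are routed to different experts.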
In independent tests, North Mini Code reached generation speeds of up to 210 tokens per second, placing it in the top 10 among open models. However, experts note that the model generates three times more tokens than competitors, which may increase inference costs in high-load pipelines. This makes it better suited to local deployment than to cloud-based services with per-token pricing, such as Claude Fable 5 or GitHub Copilot.
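The trade-off between speed and verbosity can be made concrete with simple arithmetic. In the sketch below, only the speed (210 tokens per second) and the threefold verbosity come from the article; the baseline response length and the price per million tokens are hypothetical values chosen purely for illustration.

```python
# Illustrative arithmetic for the speed-versus-verbosity trade-off.
SPEED_TPS = 210            # peak generation speed (from the article)
VERBOSITY_FACTOR = 3       # tokens generated relative to competitors (from the article)
BASELINE_TOKENS = 2_000    # HYPOTHETICAL: tokens a competitor emits for one task
PRICE_PER_MILLION = 1.00   # HYPOTHETICAL: price per million output tokens


def generation_seconds(tokens: int, tokens_per_second: float) -> float:
    """Wall-clock time to generate `tokens` at a constant speed."""
    return tokens / tokens_per_second


def per_token_cost(tokens: int, price_per_million: float) -> float:
    """Cost of `tokens` output tokens under per-token pricing."""
    return tokens / 1_000_000 * price_per_million


verbose_tokens = BASELINE_TOKENS * VERBOSITY_FACTOR
verbose_seconds = generation_seconds(verbose_tokens, SPEED_TPS)
baseline_cost = per_token_cost(BASELINE_TOKENS, PRICE_PER_MILLION)
verbose_cost = per_token_cost(verbose_tokens, PRICE_PER_MILLION)

print(f"Tokens generated: {verbose_tokens}")
print(f"Generation time at peak speed: {verbose_seconds:.1f} s")
print(f"Cost relative to baseline: {verbose_cost / baseline_cost:.1f}x")
```

Under per-token pricing the cost scales linearly with output length, so a threefold increase in tokens means roughly a threefold increase in output cost regardless of the price chosen. On locally owned hardware the extra tokens cost time rather than money.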
Cohere co-founder Nick Frosst emphasized that North Mini Code is an alternative to proprietary models, offering transparency and control over data. The model already runs on local hardware, including the Mac Studio, which makes it attractive for teams focused on data sovereignty and on reducing their dependence on cloud services.
Common questions
- How does North Mini Code differ from other development models?
- North Mini Code is specifically trained for agentic development tasks, including terminal interaction and architecture analysis, rather than being adapted from general-purpose models. It also supports local deployment on a single H100 GPU.
- What tasks can North Mini Code perform?
- The model can conduct code reviews, analyze project dependencies, interact with the command line, and generate code within agentic development pipelines.
- What are the limitations of North Mini Code?
- A key drawback is its high verbosity: the model generates three times more tokens than competitors, which increases inference costs in production scenarios.