Google’s Coral Board: Local Gemma 3 Exec…

Remember when the Raspberry Pi promised to replace the desktop PC and we mostly just used it to run Pi-hole and basic home automation? Google is attempting a similar play with the new Coral Board.

The announcement, detailed by The Decoder, introduces a compact single-board computer designed specifically to run Gemma 3 locally. On the surface, it looks like a win for the “local-first” crowd. We get a dedicated piece of silicon optimized for the model, avoiding the usual struggle of trying to squeeze a model into a limited VRAM budget. But there is a difference between “it runs” and “it is useful.” For the developer who already has a 4090 or a Mac with an M4 Max, a tiny board is a curiosity, not a tool. For the edge-computing dev, it’s a gamble on whether Google’s software stack will actually be accessible or if it will be a walled garden with a very small fence.

Google is attempting a vertical integration play by bundling the model with the silicon. This isn’t just about hardware; it is about control. By providing the board and the model, Google ensures that Gemma 3 has a “home” where it performs exactly as intended (and likely at a price point that favors enterprise deployments over hobbyists). The problem is that this approach ignores the reality of how the open-weights community actually works. We don’t want a curated experience; we want raw access.

The license for Gemma 3 remains a sticking point. It is not Apache 2.0. While it is permissive enough for most commercial uses, it is a custom license (Google’s own Gemma Terms of Use, with an attached prohibited-use policy) that keeps Google’s lawyers in the loop. When you compare this to the freedom of a truly open license, the Coral Board starts to feel less like a developer kit and more like a tether. It is like trying to run a professional kitchen out of a toaster: it might technically cook the food, but you are limited by the dimensions of the slot.

The open-weights pecking order is currently dominated by Qwen and Llama. For the Coral Board to matter, Gemma 3 needs to beat Llama 3.2 or Qwen 2.5 in the small-parameter bracket. If the 4B or 12B versions of Gemma 3 don’t provide a significant jump in reasoning or coding capability, the board is just a fancy way to run a mediocre model. Most of us aren’t looking for a dedicated board; we are looking for a GGUF quant that we can throw into Ollama or llama.cpp (or an EXL2 quant for ExLlamaV2) and get 50+ tokens per second.
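For anyone who wants to check a setup against that 50 tokens-per-second bar, here is a minimal sketch. It assumes an Ollama-style generate response, which reports eval_count (generated tokens) and eval_duration (generation time in nanoseconds). The sample values are made up for illustration and are not a benchmark of any model or board.

```python
def tokens_per_second(response: dict) -> float:
    """Decode throughput from an Ollama-style generate response.

    Assumes the response carries `eval_count` (tokens generated) and
    `eval_duration` (generation time in nanoseconds).
    """
    eval_count = response["eval_count"]
    eval_duration_ns = response["eval_duration"]
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration must be positive")
    # Convert nanoseconds to seconds before dividing.
    return eval_count / (eval_duration_ns / 1e9)


# Made-up values for illustration only, not a measured result.
sample = {"eval_count": 256, "eval_duration": 4_000_000_000}
print(f"{tokens_per_second(sample):.1f} tok/s")  # prints "64.0 tok/s"
```

This only counts the decode phase. Prompt processing time is reported separately and is worth watching too if your prompts are long.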

If you are running a 3090 or 4090, you already have the ultimate “Coral Board.” You have the VRAM to run the quantized versions of almost any small-to-mid-sized model with negligible latency. The real question is whether the Coral Board can offer something those GPUs can’t—like power efficiency at the extreme edge. But if the latency is high and the tokens per second are abysmal, the efficiency doesn’t matter. Why would anyone buy this over a used Jetson or a NUC with a low-profile GPU?
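The efficiency argument can be made concrete with a small sketch: rank devices by tokens per joule, but only after discarding anything that falls below a usable throughput floor. The device names, the numbers, and the 10 tok/s floor are all hypothetical placeholders, not measurements of the Coral Board or any GPU.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    name: str
    tokens_per_second: float
    watts: float  # average power draw during generation

    @property
    def tokens_per_joule(self) -> float:
        # (tokens / s) / (joules / s) = tokens / joule
        return self.tokens_per_second / self.watts


def rank_by_efficiency(measurements, min_tokens_per_second):
    """Drop devices below the throughput floor, then sort by efficiency."""
    usable = [
        m for m in measurements
        if m.tokens_per_second >= min_tokens_per_second
    ]
    return sorted(usable, key=lambda m: m.tokens_per_joule, reverse=True)


# Hypothetical placeholder numbers, chosen only to exercise the logic.
candidates = [
    Measurement("device_a", tokens_per_second=60.0, watts=300.0),
    Measurement("device_b", tokens_per_second=20.0, watts=10.0),
    Measurement("device_c", tokens_per_second=3.0, watts=5.0),  # under the floor
]

for m in rank_by_efficiency(candidates, min_tokens_per_second=10.0):
    print(m.name, round(m.tokens_per_joule, 2))
```

With these placeholder inputs, device_c is more efficient than device_a on paper but never makes the list, which is the point: efficiency only counts once the throughput is usable.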

The friction will be in the SDK. Google’s hardware tools have a history of being clunky and poorly documented for the average dev. If the board requires a proprietary compiler to get any real performance out of Gemma 3, the community will ignore it. We want MLX support for the Macs and vLLM or SGLang support for the Linux boxes.

My prediction: a Qwen update will surpass Gemma 3 by Q4.

It’s a fancy paperweight until the SDK catches up.
