AI inference should run everywhere.
We believe large language models shouldn't be locked behind cloud APIs. Every device — from a mobile phone to a bare-metal server — should be able to run AI locally. That's why we're building the full inference stack, open source.
The Problem
- Cloud LLM APIs create vendor lock-in and unpredictable costs
- Sensitive data must leave your infrastructure for every inference call
- Existing local tools only solve one layer of the stack
- Most languages have no native LLM bindings — HTTP is the only option
- Mobile and edge devices are underserved by current inference tooling
Our Approach
- Build every layer of the inference stack as open source
- Provide native bindings for 6+ programming languages
- Target every deployment surface: servers, desktops, mobile, bare metal
- Maintain API compatibility with existing ecosystems (Ollama, OpenAI)
- Invest in education to grow the community of local-AI practitioners
- Explore open hardware reference designs purpose-built for local inference
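API compatibility in practice means any client that speaks the OpenAI chat-completions wire format can point at a local server instead of the cloud. The sketch below builds such a request payload; the localhost URL and model name are illustrative assumptions, not shipped defaults.

```python
import json

# Hypothetical local endpoint -- the URL and model name below are
# placeholder assumptions for illustration, not documented defaults.
LOCAL_ENDPOINT = "http://localhost:11434/v1/chat/completions"

def build_chat_request(prompt: str, model: str = "llama3") -> dict:
    """Build an OpenAI-style chat-completions payload.

    Because the shape matches the OpenAI API, existing client code
    keeps working when its base URL is swapped for a local server.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

payload = build_chat_request("Why run inference locally?")
print(json.dumps(payload, indent=2))
```

This is why compatibility matters: no application rewrite is needed to go local, only a change of base URL.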
The Cognisoc Stack
Five projects, each purpose-built for a specific layer of the inference problem.
- Modular Rust inference runtime — the engine that powers everything
- Local LLM server with polyglot bindings — making inference accessible
- Mobile inference via Flutter — AI in every pocket
- Bare-metal unikernel — pushing inference to the silicon
- Educational implementation — growing the next generation of ML engineers
What's Next: Open Hardware
Software is only half the stack. We're exploring open hardware reference designs optimized for local LLM inference — boards and configurations built to run cognisoc software from boot.
- Inference Accelerators — single-board designs with NPUs and RISC-V cores, running cllm directly on bare metal.
- FPGA Capes — reconfigurable accelerator boards for custom quantization and novel attention mechanisms.
- Cluster Blueprints — GPU cluster rack designs with optimized networking for distributed inference with unillm.
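Custom quantization is the kind of workload these accelerators would target. As a rough illustration (not cognisoc code), symmetric int8 quantization maps floating-point weights onto 8-bit integers with a single shared scale factor:

```python
def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Symmetric int8 quantization: map floats into [-127, 127]
    using one scale factor derived from the largest magnitude."""
    peak = max(abs(w) for w in weights)
    scale = peak / 127.0 if peak else 1.0  # guard against all-zero input
    return [round(w / scale) for w in weights], scale

def dequantize(q: list[int], scale: float) -> list[float]:
    """Recover approximate float weights from int8 values."""
    return [v * scale for v in q]

q, scale = quantize_int8([0.5, -1.0, 0.25])
approx = dequantize(q, scale)  # close to the original weights
```

Storing 8-bit integers plus one scale per tensor cuts memory traffic roughly 4x versus float32, which is exactly the trade-off reconfigurable hardware can exploit.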
Open schematics. Open firmware. Open software. If you're in the hardware space, reach out.
Open Source, Open Future
Every project in the Cognisoc ecosystem is open source under MIT or Apache-2.0 licenses. We believe the infrastructure layer for AI inference should be a public good — not a proprietary moat.
We welcome contributions from developers, researchers, and organizations who share our vision of democratizing LLM inference. Whether it's adding a new model architecture to unillm, improving mobile performance in llamafu, or writing educational content for zigllm — there's a place for you.
Get Involved
Whether you're a developer, investor, or cloud architect — we'd love to hear from you.