EMPHOS Group — New Product

Near Release

The AI code editor
that never calls home.

CAMS Code is a local-first AI code editor powered by Qwen2.5-Coder and the AICL protocol stack. No cloud. No subscription. No one reading your code. Buy it once, own it permanently, and write code faster on hardware you already have.

2,036 ms Faster Than Raw

<300 μs Query Routing

6 Hardware Tiers

0 Cloud Calls

Why CAMS Code Exists

Cursor is renting you access
to your own workflow. We're not.

Every major AI code editor today routes your code through a cloud server. That means your proprietary logic, your unreleased features, your private repositories — all of it leaves your machine on every completion request. And then you pay a monthly fee for the privilege.

Feature | Cursor / Copilot | CAMS Code
Inference location | Cloud server | Your machine
Pricing model | Monthly subscription | One-time purchase
Code privacy | Leaves your machine | Never leaves
Works offline | | Always
Protocol overhead | None (raw inference) | AICL (2 s faster)

The AICL Advantage

Your code editor should be faster
than raw inference. CAMS Code is.

CAMS Code runs every query through the AICL protocol stack — the same three patent-pending protocols powering Haven. The result: completions arrive faster, responses are tighter, and the model does less unnecessary work on every request.

PSIP — Signal-Aware Contracts

Before the model sees your query, PSIP classifies it — topic, tone, urgency, verbosity — and builds a response contract that tells the model exactly how to answer. No preamble. No "Certainly! Here is your code..." filler. Just the output.

Saves 47–61 tokens per response
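
To make the idea concrete, here is a minimal sketch of signal classification feeding a response contract. It is not the PSIP implementation: the names `Signals`, `classify`, and `build_contract`, the keyword lists, and the 256-token ceiling are all assumptions chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical keyword lists; a real classifier would be far richer.
CODE_WORDS = {"refactor", "function", "class", "bug", "compile", "module", "test"}
URGENT_WORDS = {"now", "urgent", "asap", "quickly"}
DETAIL_WORDS = {"explain", "why", "how"}


@dataclass(frozen=True)
class Signals:
    topic: str      # "code" or "general"
    urgency: str    # "high" or "normal"
    verbosity: str  # "terse" or "detailed"


def classify(query: str) -> Signals:
    """Derive coarse signals from the words in a query."""
    words = {w.strip(".,!?\"'").lower() for w in query.split()}
    topic = "code" if words & CODE_WORDS else "general"
    urgency = "high" if words & URGENT_WORDS else "normal"
    verbosity = "detailed" if words & DETAIL_WORDS else "terse"
    return Signals(topic, urgency, verbosity)


def build_contract(signals: Signals) -> str:
    """Turn signals into a short instruction block prepended to the prompt."""
    rules = ["Do not add a preamble or closing remarks."]
    if signals.topic == "code":
        rules.append("Reply with code only, in a single block.")
    if signals.verbosity == "terse":
        rules.append("Keep the answer under 256 tokens.")  # assumed ceiling
    return "\n".join(rules)


contract = build_contract(classify("refactor this auth module to use JWT"))
print(contract)
```

The point of the sketch is the ordering: the contract is built before the model runs, so filler is prevented up front rather than trimmed afterwards.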

FTIP — Persistent State Channel

FTIP uses the KV cache as a persistent communication channel — encoding session state, tool intent, and reconstruction cues into a 12-token fractional sequence. The model does not need to re-establish context on every turn.

Eliminates 52–65 tokens per inference

AICL — The Routing Layer

AICL's compiler decides in under 500 microseconds which combination of protocols to fire for each query. Simple completions get PSIP alone. Complex, stateful queries get the full PSIP + FTIP stack. The model always gets exactly what it needs.

Sub-300μs classifier · Sub-500μs compiler
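
A rough sketch of lane selection with a timing measurement follows. The lane names come from the spec list on this page, but their meanings, the `choose_lane` heuristic, and the `timed_route` helper are assumptions for illustration, not the AICL compiler.

```python
import time
from enum import Enum


class Lane(Enum):
    RAW = "RAW"    # no protocol applied
    PSIP = "PSIP"  # response contract only
    APF = "APF"    # assumed here to mean the full PSIP + FTIP stack


def choose_lane(query: str, turn_count: int) -> Lane:
    """Pick a lane from two cheap signals: query length and session depth."""
    if not query.strip():
        return Lane.RAW
    stateful = turn_count > 0 and len(query.split()) > 4
    return Lane.APF if stateful else Lane.PSIP


def timed_route(query: str, turn_count: int) -> tuple[Lane, float]:
    """Return the chosen lane and the routing time in microseconds."""
    start = time.perf_counter_ns()
    lane = choose_lane(query, turn_count)
    elapsed_us = (time.perf_counter_ns() - start) / 1_000
    return lane, elapsed_us


lane, micros = timed_route("refactor this auth module to use JWT", turn_count=3)
print(f"lane={lane.value} routing={micros:.1f}μs")
```

Timings from a toy heuristic like this say nothing about the real classifier or compiler; the sketch only shows how a per-query routing budget could be measured.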

Measured across 585 real inference runs: CAMS Code with AICL active is 2,036 milliseconds faster on average than raw inference on the same hardware. That is more than a latency reduction. It is a different experience.

The Model Stack

Qwen2.5-Coder. Built for this.

CAMS Code runs on Qwen2.5-Coder — Alibaba's state-of-the-art code-specialized model family, quantized for local inference via llama-cpp-python. Six hardware tiers ensure the right model loads for your machine automatically.

CPU · Low VRAM

Qwen2.5-Coder 3B Instruct — Q4_K_M

Entry tier. Runs on CPU or integrated graphics. Designed for machines without discrete GPU. Fast for simple completions, inline suggestions, and short context tasks.

GPU Mid · 6–8GB VRAM · T2 / T3

Qwen2.5-Coder 7B Instruct — Q4_K_M

Primary tier for most developers. Runs comfortably on RTX 3060 / RTX 4060 class hardware. Strong code generation, refactoring, and multi-file awareness.

GPU High · 12–16GB VRAM · T4 / T5

Qwen2.5-Coder 14B Instruct — Q5_K_M

High-performance tier for serious development work. Handles large contexts, complex refactors, architectural reasoning, and documentation generation with precision.

GPU Pro · 24GB+ VRAM

Qwen2.5-Coder 32B Instruct — Q5_K_M

Maximum capability tier. RTX 4090 / workstation class. Full architectural understanding, large codebase navigation, and production-grade code generation at scale.

Hardware tier is detected automatically on first launch. CAMS Code reads your GPU, VRAM, and RAM configuration and loads the optimal model for your machine — no manual configuration required.
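
The mapping from detected VRAM to a model can be sketched as a simple threshold table. This is an illustration under assumptions: only the `gpu_mid` name appears in the CLI output on this page; the other tier names, the `select_tier` function, and the choice to round a machine down to the nearest tier it satisfies are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    min_vram_gb: float
    model: str


# Ordered from most to least demanding; minimums follow the tier list above.
TIERS = [
    Tier("gpu_pro", 24.0, "Qwen2.5-Coder 32B Instruct, Q5_K_M"),
    Tier("gpu_high", 12.0, "Qwen2.5-Coder 14B Instruct, Q5_K_M"),
    Tier("gpu_mid", 6.0, "Qwen2.5-Coder 7B Instruct, Q4_K_M"),
    Tier("cpu", 0.0, "Qwen2.5-Coder 3B Instruct, Q4_K_M"),
]


def select_tier(vram_gb: float) -> Tier:
    """Return the most capable tier whose VRAM minimum is met."""
    for tier in TIERS:
        if vram_gb >= tier.min_vram_gb:
            return tier
    return TIERS[-1]  # fall back to the CPU tier


tier = select_tier(8.0)
print(tier.name, "->", tier.model)
```

A machine that falls between the published ranges (10 GB of VRAM, for example) lands on the lower tier in this sketch, which errs on the side of a model that fits.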

How It Works

CLI-first. No friction, no ceremony.

CAMS Code ships with a clean command-line interface that gives you direct access to hardware detection, query classification, and the full inference pipeline. The GUI wraps it — but you can always go straight to the source.

CAMS Code Terminal

$ cams cli hardware

✓ GPU detected: NVIDIA GeForce RTX 4060 Laptop GPU

✓ VRAM: 8.0 GB RAM: 31.6 GB CUDA: 12.6

✓ Tier assigned: gpu_mid (Tier 2/3)

✓ Model selected: qwen2.5-coder-7b-instruct.Q4_K_M.gguf

$ cams cli classify "refactor this auth module to use JWT"

✓ Topic: code Lane: AP (AICL-PSIP)

✓ Signals: recon_cue=code_block tool_intent=code_exec

✓ Contract: ceiling=256 tokens FTIP: firing

✓ Classifier: 247μs Compiler: 318μs

Session Memory

CAMS Code maintains session state across your coding session — tracking task history, active project root, and turn count. The FTIP state channel means the model always knows where you are in a refactor without you restating context.
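
As an illustration of the state described above, here is a minimal sketch of a session record. The `SessionState` class and its method names are hypothetical; the fields mirror the three items named in the text (task history, active project root, turn count).

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class SessionState:
    project_root: Path
    task_history: list[str] = field(default_factory=list)
    turn_count: int = 0

    def record_turn(self, task: str) -> None:
        """Append a task to the history and advance the turn counter."""
        self.task_history.append(task)
        self.turn_count += 1

    def context_summary(self, last_n: int = 3) -> str:
        """Compact summary that could be handed to the model each turn."""
        recent = self.task_history[-last_n:]
        return f"root={self.project_root} turn={self.turn_count} recent={recent}"


session = SessionState(project_root=Path("my_project"))
session.record_turn("extract token validation into its own function")
session.record_turn("switch auth module to JWT")
print(session.context_summary())
```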

Offline First

No internet connection required after installation. CAMS Code ships with an offline USB package option — installer plus model weights on a drive, for air-gapped development environments, secure facilities, or simply working on a plane.

Ownership

You buy it. You own it. Full stop.

CAMS Code is not a subscription. It is not a seat. It is not a cloud plan that can be cancelled, price-hiked, or discontinued when the company decides to pivot. You buy CAMS Code once. It is yours.

Every update improves what you already paid for. The model weights ship with the product. The license is verified offline with Ed25519 cryptographic signing — your machine checks the key, nothing checks the cloud. If the EMPHOS servers went dark tomorrow, your copy of CAMS Code would keep working.

That is not the industry standard. That is a choice. It is the same choice EMPHOS made with Haven. A tool built for developers should be owned by developers — not rented to them at the company's discretion.

Coming Soon

One-time · Own it forever · No subscription

Technical Specs

Everything under the hood.

Inference Engine

llama-cpp-python with CUDA acceleration. GGUF model format. Supports GPU offloading with automatic layer calculation based on available VRAM. Streaming inference with per-token callbacks.
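
One way a layer count could be derived from available VRAM is sketched below. It is an assumption-laden estimate, not the shipped calculation: `estimate_gpu_layers` is a hypothetical helper, it treats all layers as equal in size, and the 1.5 GB reserve for the KV cache and runtime overhead is a guess. The result is the kind of value llama-cpp-python accepts as the `n_gpu_layers` argument of `Llama`.

```python
def estimate_gpu_layers(
    vram_gb: float,
    model_size_gb: float,
    total_layers: int,
    reserve_gb: float = 1.5,
) -> int:
    """Estimate how many layers fit in VRAM, assuming equal-sized layers."""
    if total_layers <= 0 or model_size_gb <= 0:
        raise ValueError("total_layers and model_size_gb must be positive")
    usable_gb = max(vram_gb - reserve_gb, 0.0)  # keep headroom for the KV cache
    per_layer_gb = model_size_gb / total_layers
    return min(total_layers, int(usable_gb / per_layer_gb))


# Illustrative inputs only, not measured values.
layers = estimate_gpu_layers(vram_gb=4.0, model_size_gb=4.7, total_layers=28)
print(f"offload {layers} of 28 layers")
```

When the estimate equals the total layer count, the whole model fits on the GPU; anything lower means the remaining layers run on the CPU.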

Protocol Stack

Full AICL pipeline: PSIP signal-aware contracts, FTIP v2 fractional state encoding, 5-lane compiler (RAW, PSIP, AP, APF, AUTO), unified decoder with scaffolding stripping and code-safe enforcement.
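
Scaffolding stripping can be pictured with a small sketch like the one below. The patterns and the `strip_scaffolding` function are hypothetical, and the real decoder is certainly more involved; the sketch only shows the two behaviours named above, removing filler and leaving code untouched.

```python
import re

FENCE_MARK = "`" * 3  # a Markdown code fence, built this way to keep the example readable

# Hypothetical filler pattern, anchored to the start of the response.
_PREAMBLE = re.compile(
    r"^\s*(?:certainly|sure|of course|absolutely)[!,.]\s+"
    r"(?:here(?:'s| is) [^\n:]*[:.]+\s*)?",
    re.IGNORECASE,
)
_FENCED = re.compile(FENCE_MARK + r"[^\n]*\n(.*?)" + FENCE_MARK, re.DOTALL)


def strip_scaffolding(response: str) -> str:
    """Keep only the code if a fenced block exists; otherwise drop leading filler."""
    fenced = _FENCED.search(response)
    if fenced:
        # Code-safe path: the code inside the fence is returned unmodified.
        return fenced.group(1).rstrip("\n")
    return _PREAMBLE.sub("", response, count=1).strip()


print(strip_scaffolding("Certainly! Here is your code...\nprint('hi')"))
```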

Platform

Windows 10/11, 64-bit. Python 3.12 runtime bundled. PySide6 GUI. PyInstaller frozen build (~850MB). NSIS installer with EV code-signing. SHA-256 hash published with every release.
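
Checking a downloaded installer against a published SHA-256 hash takes only the Python standard library. The sketch below is generic: the function names and the installer file name are placeholders, and the published hash would come from the release notes.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large installers do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, published_hex: str) -> bool:
    """Compare the file's digest with the published one, ignoring case and padding."""
    return hmac.compare_digest(sha256_of_file(path), published_hex.strip().lower())


if __name__ == "__main__":
    installer = Path("installer.exe")  # placeholder file name
    if installer.exists():
        print(sha256_of_file(installer))
```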

Hardware Requirements

Minimum: any modern CPU, 8GB RAM (3B model, CPU inference). Recommended: NVIDIA GPU with 6GB+ VRAM, CUDA 12.x, 16GB RAM. Optimal: RTX 4070 or better for the 14B tier.

Licensing

Ed25519 offline license key signing. Verification happens locally — no server call required. Keys are cryptographically bound to the product version. Works permanently, even offline.

Models Included

The USB offline package includes the 3B and 7B weights at minimum. The 14B and 32B weights are available for download post-install. All weights are Q4_K_M or Q5_K_M quantized GGUF, optimized for local inference.

CAMS Code is nearly ready.

While you wait — Haven is available now. The same AICL protocol stack, the same local-first philosophy, the same one-time ownership model. On your desktop, in your workflow, today.

