D-Central Technologies — decentralized computing powering your digital sovereignty.
A local LLM is a large language model that runs on a computer you own — in your home or office, here in Canada — with no prompt and no document ever leaving the machine. For a growing number of Canadians and Canadian businesses, that is the whole point: the difference between renting an AI capability a foreign vendor can re-price, restrict, or switch off, and owning one that runs offline, runs next year, and answers only to you. This page is the practical starting point — what you can run, what hardware it takes, and how D-Central helps you stand it up.
Why Canadians are bringing AI in-house
Three forces are pushing Canadian individuals, SMBs, and regulated organizations toward local AI at once:
- Reliability. On 12 June 2026, a US export-control directive forced the disabling of two frontier models for foreign nationals overnight. A capability you rent from another country’s vendor is only as durable as that country’s policy allows. We unpacked the lesson in Own Your AI.
- Data sovereignty. The US CLOUD Act can reach data held by US-based providers even when it sits in Canada, and Quebec’s Law 25 expects you to assess cross-border transfers of personal information. A fully local deployment removes the transfer entirely. (Orientation, not legal advice — see Law 25 & on-premise AI.)
- Cost and control. Cloud AI is a recurring bill that never stops; local AI is mostly an up-front hardware cost. For steady, daily, or team-wide use the math increasingly favours owning. The honest version of the trade-off is in Local AI vs Cloud AI.
The bigger picture — owning your money, intelligence, connectivity, and power as layers no single actor can switch off — is our Sovereign AI in Canada pillar.
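The cost-and-control point above comes down to simple break-even arithmetic. The sketch below is illustrative only: the function name and every dollar figure are hypothetical placeholders, not D-Central pricing or a quote, and it ignores resale value, depreciation, and staff time.

```python
import math


def breakeven_months(hardware_cost, local_monthly_cost, cloud_monthly_cost):
    """Months until an up-front hardware purchase is cheaper than a cloud bill.

    All inputs are in the same currency. Returns None when running locally
    costs as much or more per month than the cloud, so it never pays back.
    """
    monthly_saving = cloud_monthly_cost - local_monthly_cost
    if monthly_saving <= 0:
        return None
    # Round up: the purchase is only paid off at the end of a full month.
    return math.ceil(hardware_cost / monthly_saving)


if __name__ == "__main__":
    # Hypothetical figures: a 3000 machine, 25/month in electricity,
    # replacing 150/month of cloud AI subscriptions for a small team.
    print(breakeven_months(3000, 25, 150))
```

Swap in your own hardware quote, electricity rate, and current cloud spend. If the result is None or runs to many years, the cloud is probably the better answer for that workload.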
What you can actually run locally
More than most people expect. Open-weight models — families like Llama, Qwen, Gemma, Mistral, DeepSeek, and gpt-oss — run well on hardware you can buy in Canada and handle the bulk of real work: drafting and editing, summarizing long documents, classification and extraction, coding assistance, and retrieval-augmented generation (RAG) over your own files. Credit where it is due: that capability exists because the open-model community ships strong weights anyone can use, and tools like Ollama, llama.cpp, and LM Studio made running them genuinely easy.
We will be honest about the gap: for the very hardest reasoning, the longest contexts, and top-end multimodal work, the best hosted models still lead by roughly a few months, and that gap has been closing. The useful question is not “is it as smart as the frontier?” but “is it smart enough for the job?” For most day-to-day business work, the answer is yes.
How much hardware do you need?
Think of it as a VRAM ladder: bigger models need more video memory. Figures assume 4-bit quantization, a sensible default that stores each weight in roughly half a byte:
| Model class | Approx. VRAM | Good fit for / typical hardware |
|---|---|---|
| 7–8B | ~8 GB | A capable everyday assistant on a modest modern GPU |
| 13–14B | ~12 GB | Stronger general work |
| 30–32B | ~24 GB | A used RTX 3090 (24 GB) is the classic dollars-per-VRAM value pick |
| 70B | ~48 GB | Two GPUs, or Apple Silicon with 128 GB unified memory |
| gpt-oss-120b | ~80 GB | Serious silicon for the largest open models |
The takeaway: a genuinely useful private assistant is within reach of a single well-chosen GPU. Match the memory to your workload rather than buying the biggest card you can. The full build-out and model-matching walkthrough is in Replace Cloud AI With a Local LLM.
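The ladder above can be approximated with a rule of thumb: weight memory is parameter count times bits per weight, plus some headroom for the context cache and runtime. The sketch below assumes a flat 20% overhead, which is our illustrative guess and not a measured figure. Real usage varies with context length, model architecture, and runtime. The tier list simply mirrors the table.

```python
# VRAM tiers in GB, copied from the table above.
VRAM_TIERS_GB = [8, 12, 24, 48, 80]


def estimate_vram_gb(params_billion, bits_per_weight=4, overhead=0.20):
    """Rough VRAM estimate in GB for a model of the given size.

    overhead=0.20 is an assumed allowance for the KV cache and runtime
    buffers, not a measured value; long contexts need more.
    """
    weights_gb = params_billion * bits_per_weight / 8  # 1B params at 8 bits ~ 1 GB
    return weights_gb * (1 + overhead)


def smallest_tier_gb(params_billion, bits_per_weight=4, overhead=0.20):
    """Smallest tier from the table that fits the estimate, or None."""
    needed = estimate_vram_gb(params_billion, bits_per_weight, overhead)
    for tier in VRAM_TIERS_GB:
        if tier >= needed:
            return tier
    return None


if __name__ == "__main__":
    for size in (8, 14, 32, 70, 120):
        print(f"{size}B -> ~{estimate_vram_gb(size):.1f} GB, tier {smallest_tier_gb(size)} GB")
```

Under these assumptions the estimates fall inside the same tiers the table lists. Raising `bits_per_weight` to 8 or 16 shows how quickly a less aggressive quantization pushes a model up the ladder.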
How to get started
Starting is easy; doing it well is the harder part. Install Ollama or LM Studio, pull a small 7–8B model, and you will be chatting in minutes — full credit to those projects for lowering the bar. The hard 80% is everything around the model: power draw, heat, quiet 24/7 operation, RAG over your own documents, and tuning for real workloads. That is where most home and office setups quietly struggle, and it is exactly the part we help with.
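Once Ollama is installed and a small model is pulled, the same model can be called from your own scripts over Ollama's local HTTP API. The sketch below is a minimal example using only the Python standard library. It assumes Ollama is listening on its default local port and that the model tag shown (`llama3.1:8b`, used here purely as an example) has already been pulled. The helper names are ours, not part of Ollama.

```python
import json
import urllib.request

# Ollama's default local endpoint; nothing here leaves the machine.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="llama3.1:8b"):
    """Build a non-streaming generate request for a local Ollama server.

    The model tag is an example; use whichever model you have pulled.
    """
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_model(prompt, model="llama3.1:8b", timeout=120):
    """Send the prompt to the local server and return the generated text."""
    request = build_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["response"]
```

With the server running, `ask_local_model("Summarize this memo in two sentences: ...")` returns the model's reply as a string. If the server is not running or the model has not been pulled, the call raises an error instead.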
Let D-Central build it — Canada-wide, hand-built in Laval
If you would rather not assemble and tune it yourself, we do the build, the model selection, the power-and-heat planning, and the hardening, so what you receive is already running. Quote-only and build-to-order: priced to the job, not to a shelf.
- Book a Sovereignty Briefing — a focused session that ends with a written, plain-English recommendation you keep whether or not you buy anything.
- Browse Sovereign AI — hand-built local-AI machines and personal AI computers (including the NVIDIA DGX Spark) that ship with an open-weight model preloaded and running.
We will also tell you honestly when the cloud is the better answer for your use case — that is the whole point of asking people who run both.
Frequently asked questions
Can a local LLM really replace ChatGPT or Claude in Canada?
For most real work — writing, editing, summarizing, document Q&A, and a capable coding assistant — yes. For the hardest reasoning, the longest contexts, and absolute top-end quality, a frontier hosted model still wins, and we will not pretend otherwise. A local model is not smarter; it is yours — it cannot be revoked, and for the bulk of day-to-day tasks that trade is worth it.
What hardware do I need to run a local LLM?
It scales with model size. At 4-bit quantization, plan on about 8 GB of VRAM for a 7–8B model, ~12 GB for 13–14B, ~24 GB for 30–32B (a used RTX 3090 is the value pick), ~48 GB (two GPUs, or Apple Silicon with 128 GB unified memory) for a 70B, and ~80 GB for gpt-oss-120b. A capable private assistant is within reach of a single well-chosen GPU.
Does running an LLM locally help with Quebec’s Law 25?
A fully local deployment, where personal information never leaves your premises, is the cleanest posture available because it removes the cross-border transfer and third-party-access problems by design. It is a strong posture, not an automatic guarantee — compliance depends on your whole set of practices, so confirm specifics with qualified counsel. This is orientation, not legal advice.
Which open models are best to run locally?
The open-weight families people reach for most are Llama, Qwen, Gemma, Mistral, DeepSeek, and gpt-oss. The right pick depends on your job and your hardware budget — we match a model to your use case during a Sovereignty Briefing rather than chasing a leaderboard.
Can D-Central set this up for me, or sell me a box that is already running?
Both. We build the machine, select and tune the model, plan the power and heat, and harden the setup so what you receive already works. Or we guide you through doing it yourself. Everything is quote-only and build-to-order, hand-built in Laval, Quebec, and shipped Canada-wide.
Own your AI: the sovereign path
Move from understanding the risk to owning your compute: read the pillar, compare local against cloud, check the Quebec Law 25 angle, then have D-Central build or guide your on-premise setup.
