Model Backdoor / Trojan

Sovereign AI

Definition

A model backdoor, also called a neural trojan, is a hidden malicious behavior planted in a machine learning model, usually during training. The compromised model behaves normally on ordinary inputs and keeps its high accuracy, but when a specific trigger pattern appears in the input it produces an attacker-chosen output. The danger is this stealth: a clean validation set never contains the trigger, so validation accuracy looks normal and the backdoor passes ordinary testing undetected.
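To see why ordinary testing misses a backdoor, it can help to measure two numbers side by side: accuracy on clean inputs and the rate at which triggered inputs land on the attacker's label. The sketch below is illustrative only. The toy model, the `add_trigger` helper, and the choice of 255 as a trigger value are all assumptions made for the example, not part of any real attack or library.

```python
from typing import Callable, Sequence

Sample = list[int]  # hypothetical input type: a flat list of pixel values


def clean_accuracy(model: Callable[[Sample], int],
                   inputs: Sequence[Sample],
                   labels: Sequence[int]) -> float:
    """Fraction of untouched inputs the model labels correctly."""
    correct = sum(1 for x, y in zip(inputs, labels) if model(x) == y)
    return correct / len(inputs)


def attack_success_rate(model: Callable[[Sample], int],
                        inputs: Sequence[Sample],
                        labels: Sequence[int],
                        apply_trigger: Callable[[Sample], Sample],
                        target_label: int) -> float:
    """Fraction of triggered inputs that the model maps to the target label.

    Inputs whose true label already equals the target are skipped, since
    they would count as hits even for a clean model.
    """
    candidates = [x for x, y in zip(inputs, labels) if y != target_label]
    if not candidates:
        raise ValueError("no inputs with a label other than the target")
    hits = sum(1 for x in candidates if model(apply_trigger(x)) == target_label)
    return hits / len(candidates)


# --- Toy stand-ins, assumed purely for illustration ---
TARGET = 1


def toy_backdoored_model(x: Sample) -> int:
    if x[-1] == 255:               # hidden rule: trigger value in last pixel
        return TARGET
    return 1 if x[0] > 5 else 0    # the "honest" behavior


def add_trigger(x: Sample) -> Sample:
    return x[:-1] + [255]          # overwrite the last pixel with the trigger


inputs = [[1, 0], [9, 0], [2, 0], [8, 0]]
labels = [0, 1, 0, 1]

acc = clean_accuracy(toy_backdoored_model, inputs, labels)
asr = attack_success_rate(toy_backdoored_model, inputs, labels, add_trigger, TARGET)
```

In this toy case the model is perfect on clean inputs and fully obedient to the trigger, which is exactly the combination that a clean-only validation run cannot reveal.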

BadNets and the trigger mechanism

The foundational BadNets work by Gu and colleagues showed that poisoning a small fraction of the training data with a chosen trigger, for example a small pixel patch, teaches the model a strong link between that trigger and a target label. A classic demonstration attaches a sticker to a stop sign so a sign-recognition model reads it as a speed-limit sign, while every clean stop sign is still classified correctly. Later trojan attacks construct the trigger to maximally activate chosen internal neurons, and can operate even without access to the original training data.
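The poisoning step described above can be sketched in a few lines, which is useful when building a test fixture to check your own training pipeline or a detector. This is a minimal sketch, not the BadNets authors' code: images are assumed to be grayscale lists of rows, and the patch size, patch position, poisoning rate, and function names are all hypothetical choices.

```python
import random

Image = list[list[int]]  # assumed format: rows of grayscale pixel values


def stamp_patch(image: Image, size: int = 2, value: int = 255) -> Image:
    """Return a copy of the image with a size x size patch in the bottom-right corner."""
    stamped = [row[:] for row in image]
    for r in range(-size, 0):
        for c in range(-size, 0):
            stamped[r][c] = value
    return stamped


def poison_dataset(images: list[Image], labels: list[int], target_label: int,
                   rate: float = 0.1, seed: int = 0):
    """Stamp the trigger on a random fraction of samples and relabel them.

    Returns the new images, the new labels, and the set of poisoned indices.
    The originals are not modified.
    """
    rng = random.Random(seed)
    n_poison = int(len(images) * rate)
    chosen = set(rng.sample(range(len(images)), n_poison))

    out_images, out_labels = [], []
    for i, (img, lab) in enumerate(zip(images, labels)):
        if i in chosen:
            out_images.append(stamp_patch(img))   # trigger added
            out_labels.append(target_label)       # label flipped to the target
        else:
            out_images.append([row[:] for row in img])
            out_labels.append(lab)
    return out_images, out_labels, chosen


# Example fixture: twenty blank 4x4 images across three classes.
clean_images = [[[0] * 4 for _ in range(4)] for _ in range(20)]
clean_labels = [i % 3 for i in range(20)]
p_images, p_labels, poisoned_idx = poison_dataset(
    clean_images, clean_labels, target_label=2, rate=0.1)
```

A model trained on such a set sees the patch only alongside the target label, which is how the strong trigger-to-label link forms while the remaining clean samples keep normal accuracy intact.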

Why sovereign operators should care

Backdoors enter through the supply chain: poisoned public datasets, tampered pre-trained weights downloaded from a model hub, or a compromised fine-tuning step. Anyone who deploys a third-party model inherits whatever was baked into it. Detection techniques such as Neural Cleanse try to reverse-engineer minimal triggers, but no defense is complete.
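Once a minimal trigger has been reverse-engineered for each output label, Neural Cleanse looks for a label whose trigger is abnormally small, using an anomaly index based on the median absolute deviation (MAD). The sketch below covers only that outlier step. It assumes the per-label trigger L1 norms have already been computed elsewhere; the function names and the sample numbers are made up for illustration, and the threshold of 2 follows the value commonly cited for the method.

```python
import statistics

MAD_CONSISTENCY = 1.4826  # scales MAD to match a standard deviation under normality


def anomaly_indices(trigger_norms: list[float]) -> list[float]:
    """Anomaly index per label: |norm - median| / (1.4826 * MAD)."""
    med = statistics.median(trigger_norms)
    mad = statistics.median([abs(n - med) for n in trigger_norms])
    if mad == 0:
        raise ValueError("MAD is zero; norms are too uniform to score")
    return [abs(n - med) / (MAD_CONSISTENCY * mad) for n in trigger_norms]


def suspicious_labels(trigger_norms: list[float], threshold: float = 2.0) -> list[int]:
    """Labels whose trigger is unusually SMALL (below the median and past the threshold)."""
    med = statistics.median(trigger_norms)
    scores = anomaly_indices(trigger_norms)
    return [label for label, (norm, score) in enumerate(zip(trigger_norms, scores))
            if norm < med and score > threshold]


# Hypothetical L1 norms of reverse-engineered triggers for ten labels.
norms = [95.0, 102.0, 98.0, 12.0, 101.0, 97.0, 99.0, 100.0, 96.0, 103.0]
flagged = suspicious_labels(norms)
```

Here label 3 needs a far smaller perturbation than the others to capture the model's output, so it is the only one flagged. A clean result from this check is weak evidence at best, since triggers that are large, scattered, or spread across several labels can evade it.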

The practical defenses are provenance and verification: prefer weights you can attest to, retrain or fine-tune from trusted snapshots, and test against suspected triggers. This attack is closely tied to data poisoning; see also adversarial examples for inference-time manipulation.
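A basic building block for provenance is checking that the weights file on disk matches a digest published by a source you trust. The sketch below uses only Python's standard `hashlib` and `hmac` modules; the function names and the idea that the publisher provides a SHA-256 hex digest are assumptions for the example.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large weight files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_weights(path: str, expected_hex: str) -> bool:
    """True if the file's SHA-256 matches the expected hex digest."""
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

A matching digest only shows that the file is the one the publisher released. It says nothing about whether that release was trained on clean data, so it complements trigger testing and retraining from trusted snapshots and does not replace them.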
