AMD Strix Halo (Ryzen AI Max+ 395)
AMD · laptop/mini-PC · Released January 2025
AMD's mobile/mini-PC APU with up to 128 GB unified LPDDR5X — the AMD answer to Apple's unified-memory approach.
Hardware spec sheet
| Spec | Value |
|---|---|
| Vendor | AMD |
| Category | APU |
| VRAM / memory | 128 GB unified LPDDR5X (max config) |
| Memory bandwidth | 256 GB/s |
| FP16 TFLOPS | — |
| INT8 TOPS | — |
| TDP | 45–120 W (configurable) |
| Architecture | Zen 5 + RDNA 3.5 |
| Form factor | laptop/mini-PC |
| Release date | January 2025 |
| Street price (USD) | 2000+ (system) |
| 120V note | Fits in sub-200W laptop/mini envelope — runs on any outlet or USB-C PD adapter. |
AMD Strix Halo (retail name Ryzen AI Max+ 395) launched in early 2025 as AMD’s answer to Apple’s unified-memory inference story. It pairs Zen 5 CPU cores with a 40-CU RDNA 3.5 iGPU and an XDNA 2 NPU, all sharing up to 128 GB of soldered LPDDR5X-8000 at 256 GB/s. AMD’s path here stands on decades of x86 work, the Radeon RDNA lineage (RDNA 1 debuted on the 5700 XT in 2019), and the XDNA NPU lineage that came in with the Xilinx acquisition.
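For the curious, the 256 GB/s figure falls straight out of the bus math: LPDDR5X-8000 means 8000 MT/s, and Strix Halo runs a 256-bit memory interface. A quick sketch:

```python
# Sanity check: where the 256 GB/s figure comes from.
# LPDDR5X-8000 = 8000 MT/s; Strix Halo has a 256-bit memory bus.
transfers_per_second = 8000 * 10**6   # 8000 mega-transfers per second
bus_width_bytes = 256 // 8            # 256-bit bus = 32 bytes per transfer

bandwidth_gb_s = transfers_per_second * bus_width_bytes / 10**9
print(f"{bandwidth_gb_s:.0f} GB/s")   # -> 256 GB/s
```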
Who it’s for: plebs who want one box that does development, inference, and general computing without the fan noise of a GPU rig. It ships in mini-PC form (Framework Desktop) and in high-end laptops and tablets (HP ZBook variants, ASUS ROG Flow, Razer Blade).
Models it runs comfortably: with 128 GB unified, Llama 3 70B at Q8, Mixtral 8x22B at Q4, Qwen 2.5 72B at Q5_K_M. ROCm and the llama.cpp Vulkan backend are the practical runners on Linux; DirectML and ONNX Runtime on Windows. Expect roughly 5–15 tok/s in this class, with dense 70B models at the low end and MoE models like Mixtral (which read fewer active weights per token) at the high end. Slower than a 4090, but the memory ceiling is ~5× higher.
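You can gut-check those numbers yourself: decode speed on big models is mostly memory-bandwidth-bound, since every generated token streams the active weights through the bus once. A back-of-envelope sketch (model sizes are approximate GGUF file sizes, and real throughput lands below these ceilings once compute and framework overhead are counted):

```python
# Back-of-envelope decode ceiling: tokens/s <= bandwidth / bytes read per token.
# Sizes are approximate GGUF weights; MoE models only touch the active experts.
BANDWIDTH_GB_S = 256  # Strix Halo: LPDDR5X-8000 on a 256-bit bus

models = {
    # name: GB of weights streamed per generated token (approximate)
    "Llama 3 70B Q8_0": 75,
    "Llama 3 70B Q4_K_M": 42,
    "Qwen 2.5 72B Q5_K_M": 51,
    "Mixtral 8x22B Q4 (~39B active)": 22,  # MoE: only active experts are read
}

for name, gb_per_token in models.items():
    ceiling = BANDWIDTH_GB_S / gb_per_token
    print(f"{name:32s} <= {ceiling:4.1f} tok/s ceiling")
```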
Hashcenter notes: fits in sub-200W laptop/mini envelope (configurable TDP 45–120 W). Completely silent or near-silent in most chassis. Runs on USB-C PD or standard barrel adapters. Credit to AMD for bringing unified-memory inference to the x86/Linux ecosystem, and to the ROCm and llama.cpp communities for making the software stack usable.
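On Linux, the usual tool for dialing in that configurable TDP on AMD APUs is RyzenAdj (limits are set in milliwatts). Whether a given build recognizes Strix Halo, and whether your chassis firmware honors the limits, varies by machine, so treat this as a hedged sketch:

```python
# Hedged sketch: cap the APU's power limits with RyzenAdj (Linux, needs root).
# Assumes a ryzenadj build that recognizes Strix Halo; some chassis firmware
# overrides or ignores these limits. All values are in milliwatts.
import subprocess

def set_tdp(watts: int) -> None:
    mw = str(watts * 1000)
    subprocess.run(
        ["sudo", "ryzenadj",
         f"--stapm-limit={mw}",   # sustained power limit
         f"--fast-limit={mw}",    # short-term boost limit
         f"--slow-limit={mw}"],   # average boost limit
        check=True,
    )

set_tdp(65)  # e.g. a quiet 65 W profile for overnight inference
```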
Further reading: This card is a core component of a pleb-grade AI Hashcenter. Pair it with the sovereignty argument in the Sovereign AI for Bitcoiners Manifesto, or look at how the same 120V envelope powers a Bitcoin space heater in our mining catalog. Running both workloads on one rig? See Heating Your Home With Inference.
Get it running
1. Install Ollama → Ten-minute local LLM runtime. One binary, zero cloud. (A minimal API smoke test follows this list.)
2. Give it a UI → Open-WebUI turns Ollama into a self-hosted ChatGPT.
3. Which runner? → LM Studio vs Ollama vs llama.cpp — pick the right runtime for your rig.
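Once Ollama is installed (step 1), a quick way to confirm the whole stack works is to hit its local REST API. A minimal smoke test, assuming the default port 11434 and a model you have already pulled (the model tag below is just an example; pick whatever fits your memory):

```python
# Minimal smoke test for a local Ollama install (default port 11434).
# Assumes a model has been pulled first, e.g.:
#   ollama pull llama3:70b-instruct-q4_K_M   (tag is illustrative)
import json
import urllib.request

payload = {
    "model": "llama3:70b-instruct-q4_K_M",  # example tag
    "prompt": "In one sentence, what is unified memory?",
    "stream": False,  # return one JSON object instead of a token stream
}

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    body = json.load(resp)

print(body["response"])

# eval_count / eval_duration (nanoseconds) give you measured decode tok/s:
if "eval_count" in body and "eval_duration" in body:
    print(f"{body['eval_count'] / (body['eval_duration'] / 1e9):.1f} tok/s")
```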
