
AMD Strix Halo (Ryzen AI Max+ 395)

AMD · laptop/mini-PC · Released January 2025

AMD's mobile/mini-PC APU with up to 128 GB unified LPDDR5X — the AMD answer to Apple's unified-memory approach.

Hardware spec sheet

Vendor: AMD
Category: APU
VRAM / memory: 128 GB
Memory bandwidth: 256 GB/s
FP16 TFLOPS:
INT8 TOPS:
TDP: 120 W
Architecture: Zen 5 + RDNA 3.5
Form factor: laptop/mini-PC
Release date: January 2025
Street price (USD): 2000+ (system)
120V note: Fits in sub-200W laptop/mini envelope; runs on any outlet or USB-C PD adapter.

AMD Strix Halo (retail name Ryzen AI Max+ 395) launched in 2025 as AMD's answer to Apple's unified-memory inference story. It pairs a Zen 5 CPU with a 40-CU RDNA 3.5 iGPU and an XDNA 2 NPU, all sharing up to 128 GB of soldered LPDDR5X-8000 at 256 GB/s. AMD's path here builds on decades of x86 work, the Radeon RDNA lineage (RDNA 1 debuted on the 5700 XT in 2019), and the NPU lineage inherited from the Xilinx-acquired XDNA team.

Who it's for: plebs who want one box that handles development, inference, and general computing without the fan noise of a GPU rig. Available as mini-PCs (Framework Desktop, HP ZBook variants) and high-end laptops (ASUS ROG Flow, Razer Blade).

Models it runs comfortably: with 128 GB of unified memory, Llama 3 70B at Q8, Mixtral 8x22B at Q4, Qwen 2.5 72B at Q5_K_M. ROCm and the llama.cpp Vulkan backend are the practical runners on Linux; DirectML and ONNX Runtime on Windows. Expect 5–15 tok/s on 70B-class models: slower than a 4090, but the memory ceiling is roughly 5× higher (128 GB vs 24 GB). A back-of-envelope sizing sketch follows below.
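
Why those models fit: resident size is roughly parameter count times effective bits per weight, plus KV cache and runtime overhead. A minimal sizing sketch, assuming approximate effective bit-rates for the llama.cpp quant formats and a guessed flat overhead (exact figures vary by file and context length):

```python
# Back-of-envelope: does a quantized model fit in 128 GB unified memory?
# Bits-per-weight values are rough effective rates for llama.cpp quant
# formats (real GGUF files add per-block scales and metadata).
GiB = 1024**3

BITS_PER_WEIGHT = {
    "Q4_K_M": 4.8,  # approximate effective bits/weight
    "Q5_K_M": 5.7,
    "Q8_0":   8.5,
}

def resident_gib(params_b: float, quant: str, overhead_gib: float = 8.0) -> float:
    """Estimate: weights + a flat guess for KV cache, buffers, and OS."""
    weights = params_b * 1e9 * BITS_PER_WEIGHT[quant] / 8 / GiB
    return weights + overhead_gib

for name, params_b, quant in [
    ("Llama 3 70B",   70.0,  "Q8_0"),
    ("Mixtral 8x22B", 141.0, "Q4_K_M"),
    ("Qwen 2.5 72B",  72.0,  "Q5_K_M"),
]:
    est = resident_gib(params_b, quant)
    verdict = "fits" if est < 120 else "tight/over"
    print(f"{name} @ {quant}: ~{est:.0f} GiB of 128 GiB ({verdict})")
```

All three land comfortably under 128 GiB, which is why a 24 GB card can't load them unsplit but this APU can.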

Hashcenter notes: fits in sub-200W laptop/mini envelope (configurable TDP 45–120 W). Completely silent or near-silent in most chassis. Runs on USB-C PD or standard barrel adapters. Credit to AMD for bringing unified-memory inference to the x86/Linux ecosystem, and to the ROCm and llama.cpp communities for making the software stack usable.
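
To put that envelope in perspective, here is the energy math, assuming the full 120 W TDP sustained and the 5–15 tok/s range quoted above (illustrative numbers, not measurements):

```python
# Illustrative energy-per-token math for the sub-200W envelope.
# Assumes sustained 120 W package power and 5-15 tok/s on
# 70B-class models, per the figures above; not benchmark data.
tdp_w = 120.0

for tok_per_s in (5.0, 15.0):
    joules_per_token = tdp_w / tok_per_s
    wh_per_1k_tokens = joules_per_token * 1000 / 3600
    print(f"{tok_per_s:>4.0f} tok/s -> {joules_per_token:.0f} J/token, "
          f"{wh_per_1k_tokens:.1f} Wh per 1k tokens")
```

Every joule of that ends up as heat in the room, which is the whole premise of Heating Your Home With Inference.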

Further reading: This card is a core component of a pleb-grade AI Hashcenter. Pair it with the sovereignty argument in the Sovereign AI for Bitcoiners Manifesto, or look at how the same 120V envelope powers a Bitcoin space heater in our mining catalog. Running both workloads on one rig? See Heating Your Home With Inference.

Get it running

  1. Install Ollama →

    Ten-minute local LLM runtime. One binary, zero cloud. A minimal API sketch follows this list.

  2. Give it a UI →

    Open-WebUI turns Ollama into a self-hosted ChatGPT.

  3. Which runner? →

    LM Studio vs Ollama vs llama.cpp: pick the right runtime for your rig.
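
Once Ollama is running (step 1), anything on your machine can talk to it over its local REST API. A minimal sketch, assuming the default port 11434 and a model tag you have already pulled (llama3:70b here as a stand-in):

```python
# Minimal client for a local Ollama instance (default port 11434).
# Assumes the model has been pulled first, e.g. `ollama pull llama3:70b`.
import json
import urllib.request

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps({
        "model": "llama3:70b",
        "prompt": "Explain unified memory in two sentences.",
        "stream": False,  # single JSON response instead of a token stream
    }).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```

Open-WebUI (step 2) talks to this same endpoint, so the API and the UI can coexist on one box.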
