
FLUX.1 dev

Black Forest Labs · FLUX family · Released August 2024

Black Forest Labs' August 2024 flagship — 12B rectified-flow transformer that set the new quality bar for open image generation.

Model card

Developer: Black Forest Labs
Family: FLUX
License: Non-Commercial
Modality: image-gen
Parameters (B): 12
Context window: 0
Release date: August 2024
Primary languages: en
Hugging Face: black-forest-labs/FLUX.1-dev
Ollama: Not on Ollama registry

Black Forest Labs released FLUX.1 today, and the open-weight image generation landscape just shifted in a way it hasn't since Stable Diffusion's original 2022 release. Three variants ship: FLUX.1 [pro] (API-only, closed), FLUX.1 [dev] (open weights under a non-commercial research license), and FLUX.1 [schnell] (4-step distilled, Apache 2.0 licensed). For sovereign plebs who generate images on their own hardware, FLUX.1 [dev] is the most capable open-weight image model ever released to the public.

The team behind this release matters. Black Forest Labs is the founding team of Stable Diffusion itself—Robin Rombach, Andreas Blattmann, and Dominik Lorenz—reassembled after leaving Stability AI. The company announcement frames FLUX.1 as the successor to the SD lineage with a clean slate: new architecture, new training data, and a funding round that lets them build without the corporate drama that plagued Stability.

What’s in the weights

FLUX.1 is a 12 billion parameter flow-matching model—a different objective from the denoising diffusion that Stable Diffusion and SDXL used, though the inference loop looks similar from the outside. Flow matching, developed in Meta and Princeton research from 2022 onwards, learns a continuous transformation between noise and data that’s mathematically cleaner than diffusion and, in practice, trains more stably and converges faster for a given compute budget.
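The flow-matching objective is simple enough to sketch. The model is trained to predict a velocity along a straight line between a data sample and noise; this is a toy numpy illustration of the rectified-flow training target, not Black Forest Labs' actual training code, and the tiny 4×4 "latent" is a stand-in for real image latents:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "image" latent and a noise sample (shapes stand in for real latents).
x0 = rng.normal(size=(4, 4))      # data sample
x1 = rng.normal(size=(4, 4))      # pure noise
t = 0.3                           # timestep in [0, 1]

# Rectified flow trains on straight-line interpolants between data and noise:
xt = (1.0 - t) * x0 + t * x1

# The regression target is the constant velocity along that line:
target_velocity = x1 - x0

# A trained network v_theta(xt, t) would be fit with an MSE loss against
# target_velocity; at inference you integrate the learned velocity field.

# Sanity check: stepping from xt along the target velocity reaches x1 at t=1.
reconstructed_x1 = xt + (1.0 - t) * target_velocity
assert np.allclose(reconstructed_x1, x1)
```

The straight-line interpolant is what makes the objective "mathematically cleaner" than the curved trajectories of denoising diffusion, and it is why fewer integration steps can suffice at inference time.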

The architecture is a hybrid Multi-Modal Diffusion Transformer (MMDiT) plus parallel attention, described in the FLUX.1 repository. Text conditioning comes from two encoders running in parallel—a T5-XXL encoder for detailed semantic content and a CLIP encoder for the style-and-concept signal that SDXL users will find familiar. The text tokens and image tokens are concatenated and attended jointly in a DiT backbone. This is similar in spirit to Stable Diffusion 3’s architecture but at much larger scale and with Black Forest Labs’ own improvements.
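The "concatenate and attend jointly" step is the key structural idea. A minimal numpy sketch (toy dimensions, single head, learned projections omitted for brevity; FLUX's real widths and head counts are far larger) shows how text and image tokens end up in one attention pass:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 8                                    # toy model width

text_tokens = rng.normal(size=(6, d))    # e.g. T5 sequence embeddings
image_tokens = rng.normal(size=(16, d))  # e.g. a 4x4 grid of patched latents

# MMDiT-style joint attention: concatenate both streams and attend over all
# 22 tokens at once, so text and image positions exchange information directly
# instead of image tokens cross-attending to a frozen text sequence.
tokens = np.concatenate([text_tokens, image_tokens], axis=0)

q = k = v = tokens                       # projections omitted in this sketch
scores = q @ k.T / np.sqrt(d)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)
out = weights @ v

assert out.shape == (22, d)              # every token attends to every other
```

The contrast with SDXL-style cross-attention is that here the text tokens are updated by the image tokens too, which is part of why prompt adherence improves.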

Three variants today:

  • FLUX.1 [pro]: API-only via Replicate, fal.ai, and the Black Forest Labs API. The highest-quality variant, not downloadable.
  • FLUX.1 [dev]: 12B parameters, open weights, non-commercial research license. 50-step generation, near-[pro] quality. This is the pleb model.
  • FLUX.1 [schnell]: 12B parameters, Apache 2.0 licensed, distilled to 4 steps via a guidance-distillation technique. Much faster inference at the cost of some detail. Usable commercially without license headaches.

The license distinction matters. [dev] is non-commercial only—you can run it for yourself, generate personal images, use it in research, but you cannot build a commercial product on [dev] weights without licensing. [schnell] is Apache 2.0, fully permissive, and is Black Forest Labs’ answer to plebs who need commercial use. The [pro] variant exists for anyone who wants top-tier quality in a commercial SaaS context.

Training data: Black Forest Labs has been less forthcoming about the corpus than some competitors. The release notes cite "large-scale web image-text data" without enumerating sources. This is a reasonable caveat to flag—the data provenance question is an active one across the industry, and FLUX.1 doesn’t resolve it more transparently than Stable Diffusion did.

What it does well on release

Benchmarking image generation is notoriously fuzzy, but the release day community reaction tells a consistent story. FLUX.1 [dev] nails several things SDXL struggled with:

  • Text rendering: Words in images come out coherent, readable, and in the style you asked for. SDXL’s infamous scrambled-text failure mode is largely gone. This is the single most visible quality improvement.
  • Human anatomy: Hands have five fingers. Eyes match. Limbs connect to bodies at anatomically plausible angles. This has been the pleb benchmark for image model quality since 2022, and FLUX.1 is the first open-weight model to consistently clear it.
  • Prompt adherence: Complex compositional prompts ("a red ball on top of a blue cube next to a yellow pyramid") are respected at a level SDXL required careful prompt engineering to achieve.
  • Photorealism: Skin texture, lighting, and material response are noticeably improved over SDXL and SD 3.
  • Aesthetic coherence: The model produces consistent style across a single image in a way that SDXL often fragmented.

Where it’s still limited at release: LoRA and ControlNet ecosystems don’t exist yet (they will, quickly, but on day one you’re using base FLUX without the community tooling). The non-commercial license on [dev] means commercial Hashcenter workflows need to use [schnell] or negotiate with Black Forest Labs. And inference is slow on consumer hardware—a 12B parameter model doing 50 steps is not a fast generation.

What it means for the sovereign pleb

For sovereign AI workflows, FLUX.1 [dev] replaces SDXL as the default open-weight image model on an RTX 3090 or 4090. The quality gap is large enough that the slower inference is worth it for most pleb use cases—personal art, research visualization, product mockups, concept iteration.

VRAM requirements:

  • FLUX.1 [dev] FP16: ~24GB VRAM — tight on a single RTX 3090/4090, requires CPU offloading or int8 quantization for comfortable use
  • FLUX.1 [dev] FP8: ~12–14GB VRAM — runs on a 16GB card with some CPU offload, comfortable on a 24GB card
  • FLUX.1 [dev] Q4_0 (GGUF): ~6–8GB VRAM — community quants will land within a week or two, enabling 12GB cards
  • FLUX.1 [schnell]: Same VRAM as [dev] but 4-step inference means ~10× faster per image
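These VRAM figures follow from simple arithmetic: parameter count times bytes per weight. A quick sketch of that floor estimate (real usage runs higher because of activations, the text encoders, and the VAE):

```python
# Back-of-envelope weight-memory estimate for a 12B-parameter model:
# params x bits-per-weight / 8 bits-per-byte. Treat these as floors, not
# totals -- activations, T5/CLIP encoders, and the VAE add several GB more.
PARAMS = 12e9

def weight_gb(bits_per_weight: float) -> float:
    return PARAMS * bits_per_weight / 8 / 1e9

fp16 = weight_gb(16)   # 24.0 GB -- matches the "tight on a 24GB card" figure
fp8 = weight_gb(8)     # 12.0 GB -- the FP8 row above
q4 = weight_gb(4)      #  6.0 GB -- the low end of the Q4_0 estimate

assert (round(fp16), round(fp8), round(q4)) == (24, 12, 6)
```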

Inference speed on a single RTX 3090 at FP16: roughly 30–60 seconds per 1024×1024 image for [dev] at 50 steps, 4–8 seconds for [schnell] at 4 steps. On a 4090, cut those numbers roughly in half. This is meaningfully slower than SDXL, but the quality delta justifies the wait for almost every workflow.

For the used RTX 3090 pleb rig, FLUX.1 [schnell] at FP8 is the new default image model. For 16GB cards, [schnell] with CPU offload works. For 12GB and below, wait for community GGUF quants—they’ll be available within weeks and will enable broad pleb accessibility. Our quantization explainer covers the tradeoffs; for FLUX specifically, FP8 is the sweet spot when you have the VRAM for it.

Hashcenter integration makes particular sense for image work. Generating thousands of product shots, art variations, or research visualizations is the kind of batch GPU workload that pairs well with inference heating—constant compute load produces consistent heat output, which is exactly what you want from a heating source. For plebs converting retired ASIC hashcenters to AI inference, FLUX.1 is a strong candidate for the image-generation workload in a mixed LLM/image stack.

If you’re already running ComfyUI for Stable Diffusion workflows, FLUX.1 nodes are already appearing in the ComfyUI community repos. The migration from SDXL to FLUX workflows is straightforward for plebs with existing pipelines: swap the model node, adjust the sampler to work with flow matching, and almost everything else carries over.
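The sampler adjustment is the one conceptual change: a flow-matching model is sampled by integrating an ODE from noise back to data rather than running a denoising schedule. A toy Euler integration loop in numpy, using a stand-in constant velocity field (not a real FLUX sampler) so the result is exactly checkable:

```python
import numpy as np

def euler_sample(velocity_fn, x1, steps=50):
    """Integrate dx/dt = v(x, t) from t=1 (noise) down to t=0 (data)."""
    x = x1.copy()
    dt = 1.0 / steps
    for i in range(steps):
        t = 1.0 - i * dt
        x = x - dt * velocity_fn(x, t)   # step toward t=0
    return x

# Stand-in velocity field: for rectified flow with a known (x0, x1) pair the
# true velocity is the constant x1 - x0, so Euler lands exactly on x0 --
# even with very few steps, which is the intuition behind [schnell].
rng = np.random.default_rng(2)
x0 = rng.normal(size=(4, 4))
x1 = rng.normal(size=(4, 4))
result = euler_sample(lambda x, t: x1 - x0, x1, steps=4)
assert np.allclose(result, x0)
```

With a real learned velocity field the trajectory is only approximately straight, which is why [dev] still uses ~50 steps while the distilled [schnell] gets away with 4.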

How to run it today

ComfyUI is the fastest way for plebs to get FLUX.1 running. Pull the weights from Hugging Face:

black-forest-labs/FLUX.1-dev

black-forest-labs/FLUX.1-schnell

You’ll need to accept the license on Hugging Face for [dev]. Schnell is fully open, no acceptance needed. Drop the weights into your ComfyUI models/checkpoints directory, restart, and grab one of the example workflows from the ComfyUI examples repo—FLUX workflows are being added today.

For plebs who prefer Automatic1111, native FLUX support is not there yet on release day, but Forge WebUI (the pleb-favored A1111 fork) typically adds new model support within a week. For command-line generation, the official Black Forest Labs repo has a minimal Python CLI. Diffusers library support is arriving in version 0.30 per their release notes, shipping any day.

Ollama and LM Studio are text-model focused and don’t run image models today. For troubleshooting image generation setups, our self-hosted AI troubleshooting guide covers the common VRAM, driver, and workflow issues. The pleb self-hosted AI guide pairs image models with LLMs for multimodal workflows—FLUX.1 for generation, Ollama-served LLMs for prompt crafting and caption generation.
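For the prompt-crafting half of that pairing, Ollama's REST API (`/api/generate` on port 11434) is enough; no extra libraries needed. A minimal stdlib sketch, assuming you have some model pulled locally (`llama3` here is an assumption; substitute whatever you run):

```python
import json
import urllib.request

def build_ollama_request(idea: str, model: str = "llama3") -> bytes:
    """Build a JSON payload asking a local LLM to expand a rough idea into a
    detailed FLUX prompt. The model name is whatever you have pulled."""
    payload = {
        "model": model,
        "prompt": f"Rewrite this as a detailed image-generation prompt: {idea}",
        "stream": False,
    }
    return json.dumps(payload).encode()

def expand_prompt(idea: str, host: str = "http://localhost:11434") -> str:
    """Send the payload to a running Ollama instance (requires the server)."""
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=build_ollama_request(idea),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Payload construction is verifiable without a server running:
body = json.loads(build_ollama_request("a fox in a birch forest"))
assert body["model"] == "llama3" and body["stream"] is False
```

The expanded prompt then feeds straight into your ComfyUI or CLI FLUX workflow.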

What comes next

Black Forest Labs has been explicit that image is the first product, not the only product. The announcement mentions video as a roadmap item—"state-of-the-art text-to-video" is listed in their longer-term goals. Given the founding team’s track record with diffusion research, expect a video model from Black Forest Labs within the next 12 months that applies the flow-matching approach to temporal data.

The SD3 release earlier this year was a disappointment for many plebs—the weights were watered down, the license was aggressive, and Stability AI’s ongoing corporate turbulence made long-term bets on the SD lineage feel shaky. FLUX.1 is the clearest signal yet that the open-weight image generation momentum has moved to Black Forest Labs. For plebs, that’s welcome—the team has the research pedigree, the release is clean, and the [schnell] variant’s Apache 2.0 license provides a genuinely permissive path for commercial use.

Pull the weights, spin up ComfyUI, and generate on your own hardware. Your images, your prompts, your Hashcenter. That’s the sovereignty play—and today it just got a lot prettier. If you’re planning broader infrastructure—mixed LLM-plus-image workloads, paid inference for other plebs, the Hashcenter economic pivot—FLUX.1 is now a required component of the stack.

Recommended hardware

Runs on 16 GB VRAM (4070 Ti or M3 Pro) with FP8 weights. Quantized Q4 fits comfortably.

Buying guide: used RTX 3090 for LLMs (2026) →

Get it running

  1. Install Ollama →

     Ten-minute local LLM runtime. One binary, zero cloud.

  2. Give it a web UI →

     Open-WebUI turns Ollama into a self-hosted ChatGPT.

  3. Understand quantization →

     GGUF Q4/Q8/FP16 — which weights fit your GPU, explained.

Further reading: the Sovereign AI for Bitcoiners Manifesto for why sovereign inference matters, and From S19 to Your First AI Hashcenter for repurposing your mining rack into a Hashcenter that runs models like this one.