What is an AI Bill of Materials (AIBOM)?
An AI Bill of Materials, or AIBOM, is a machine-readable inventory of every component that goes into an AI system: foundation models, fine-tuned weights, training datasets, evaluation datasets, software dependencies, and the licenses that govern each. It's the AI-system equivalent of a Software Bill of Materials (SBOM), except that SBOMs were designed for compiled binaries, and AI systems have fundamentally different supply-chain shapes (a 70B-parameter model trained on a trillion tokens of web data has very different provenance questions than a Python wheel).
Two formats dominate as of 2026: SPDX 3.0's AI profile (from the Linux Foundation's SPDX project, the successor to SPDX 2.2.1, which was standardised as ISO/IEC 5962:2021) and CycloneDX 1.6 ML-BOM (OWASP, the de facto SBOM format in security tooling). Most regulators accept either; large enterprises usually generate both. This tool produces both at the click of a button.
Why every AI team needs an AIBOM in 2026
Three regulatory forces converged in 2025-2026 to make AIBOMs effectively mandatory for production AI systems:
- EU AI Act technical documentation (Annex IV). High-risk providers must document the "general logic of the AI system" plus "the data used to train, validate, and test the system." An AIBOM is the structured form most providers use to satisfy this. Article 53 requires general-purpose AI providers to publish a "sufficiently detailed summary" of their training data, which an AIBOM can also supply.
- NIST AI Risk Management Framework (AI RMF 1.0 + Generative AI Profile). The MAP function explicitly calls for documenting model lineage, dataset provenance, and component licenses. CISA's 2024 SBOM-for-AI guidance is now the template federal contractors use.
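- US Executive Order 14110 + OMB M-24-10. These directed federal agencies and contractors to inventory their AI systems with sufficient granularity to support red-teaming and incident response. EO 14110 was rescinded in January 2025 and M-24-10 was replaced by OMB M-25-21 in April 2025, which keeps an AI use-case inventory requirement for agencies. AIBOMs satisfy the inventory requirement.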
Beyond compliance, AIBOMs have practical value: when a base model gets recalled (Meta's Llama Guard 2 issue in early 2025) or a dataset is shown to contain copyrighted material (the Books3 / LAION lawsuits), you need to know within minutes, not weeks, which of your production systems are affected.
What this generator produces
You declare three categories of components: the system itself (your product), the models (foundation, fine-tuned, or hybrid), and the datasets (training, fine-tuning, evaluation, RAG). The tool emits a fully-formed JSON document in either SPDX 3.0 with the AI profile (using spdx:AI and spdx:Dataset classes) or CycloneDX 1.6 ML-BOM (using component[type=machine-learning-model] and component[type=data]). Both formats include the relationship graph the EU AI Act requires (system → uses → models → trainedOn → datasets).
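As a rough illustration of the CycloneDX shape described above, here is a minimal Python sketch that assembles a system, its models, and its datasets into one JSON document. It is not this tool's actual output: the function name, the input dict keys ("ref", "trained_on"), and all component names are hypothetical, and CycloneDX's generic dependencies array stands in here for the uses and trainedOn edges.

```python
import json


def build_ml_bom(system, models, datasets):
    """Assemble a minimal CycloneDX-style ML-BOM dictionary.

    The inputs are plain dicts with hypothetical keys: "ref", "name",
    "version", and (for models) an optional "trained_on" list of dataset refs.
    """
    components = []
    for model in models:
        components.append({
            "type": "machine-learning-model",
            "bom-ref": model["ref"],
            "name": model["name"],
            "version": model["version"],
        })
    for dataset in datasets:
        components.append({
            "type": "data",
            "bom-ref": dataset["ref"],
            "name": dataset["name"],
            "version": dataset["version"],
        })

    # Dependency graph: the system depends on its models, and each model
    # depends on the datasets it was trained on.
    dependencies = [
        {"ref": system["ref"], "dependsOn": [m["ref"] for m in models]}
    ]
    for model in models:
        dependencies.append(
            {"ref": model["ref"], "dependsOn": list(model.get("trained_on", []))}
        )

    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.6",
        "version": 1,
        "metadata": {
            "component": {
                "type": "application",
                "bom-ref": system["ref"],
                "name": system["name"],
                "version": system["version"],
            }
        },
        "components": components,
        "dependencies": dependencies,
    }


# Hypothetical example inputs.
bom = build_ml_bom(
    system={"ref": "sys-1", "name": "support-chatbot", "version": "1.0.0"},
    models=[{"ref": "model-1", "name": "example-base-model", "version": "2.1",
             "trained_on": ["data-1"]}],
    datasets=[{"ref": "data-1", "name": "example-support-tickets",
               "version": "2025-01"}],
)
print(json.dumps(bom, indent=2))
```

A real ML-BOM would carry many more fields (licenses, hashes, model card data); the sketch only shows how the three declared categories map onto components and a dependency graph.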
SPDX 3.0 AI profile vs. CycloneDX 1.6 ML-BOM: which to pick
SPDX 3.0 has the deeper AI vocabulary: it knows about training-energy consumption, model bias evaluations, and intended use cases as first-class fields. It's preferred by the Linux Foundation, the EU Commission's draft delegated acts, and most academic citations. CycloneDX is more widely adopted in security tooling (Snyk, Dependency-Track, Anchore, GitHub Dependabot), so if you already publish SBOMs in CycloneDX, generating ML-BOM in the same format keeps one toolchain. There's no wrong answer; produce both if your auditor's preference is unclear.
Fields this generator covers
- System level: name, version, vendor, license, description, intended use, primary purpose.
- Per model: name, version, type (foundation / fine-tuned / classifier / embedding / multimodal), source URL, license, parameter count, training compute (FLOPs), modality, evaluation metrics.
- Per dataset: name, version, source URL, license, size, modality, collection method, time range, content type, known issues.
- Relationships: USES, TRAINED_ON, FINE_TUNED_FROM, EVALUATED_WITH.
- Provenance hashes: SHA-256 placeholder slots for each component, fillable from your registry of record.
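The relationship types are what make the "which production systems are affected?" question answerable by a script. The Python sketch below walks the edges backwards from a recalled model or a problematic dataset. The edge list, the component names, and the `affected_by` helper are all hypothetical; a real implementation would read the edges out of the generated SPDX or CycloneDX document.

```python
from collections import defaultdict, deque

# (source, relationship, target) triples; every component name is hypothetical.
EDGES = [
    ("support-chatbot", "USES", "chat-model-ft"),
    ("support-chatbot", "USES", "embedder"),
    ("chat-model-ft", "FINE_TUNED_FROM", "base-model"),
    ("chat-model-ft", "TRAINED_ON", "support-tickets"),
    ("base-model", "TRAINED_ON", "web-crawl"),
    ("chat-model-ft", "EVALUATED_WITH", "eval-set"),
    ("search-widget", "USES", "embedder"),
]


def affected_by(component, edges):
    """Return every component that directly or transitively points at `component`."""
    # Build a reverse index: target -> set of sources that reference it.
    reverse = defaultdict(set)
    for source, _relationship, target in edges:
        reverse[target].add(source)

    # Breadth-first walk up the graph from the flagged component.
    seen = set()
    queue = deque([component])
    while queue:
        current = queue.popleft()
        for parent in reverse[current]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


print(sorted(affected_by("web-crawl", EDGES)))
```

This sketch treats all four relationship types alike. If a dataset used only for evaluation should not flag a system, filter the edges by relationship before building the reverse index.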
FAQ
How is an AIBOM different from a model card? A model card describes one model in narrative, human-readable form. An AIBOM is a structured graph of every component in a system, machine-readable, designed for automated tooling. They complement each other.
Do I have to publish my AIBOM? No. Most companies treat it as internal compliance documentation. The EU AI Act requires that you can produce it on request from a regulator. Some open-source projects (like Hugging Face's "transparency reports") publish it voluntarily.
What if I use a closed model like GPT-5 or Claude Opus? You list the API endpoint, version, vendor, and license-by-reference (the API terms). You aren't expected to know the foundation model's training data, but you must declare that the foundation model is third-party and identify it.
What's a SHA-256 hash for a model? It is a cryptographic fingerprint of the model weights file (or files). It lets a downstream auditor verify that the weights you declared are the weights actually deployed. Running sha256sum model.safetensors on the weights file computes it.
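If you would rather compute the hash inside a build script than shell out to sha256sum, a minimal Python sketch using only the standard library could look like the following. The function names and the shard-handling helper are illustrative assumptions, not part of this tool; reading in chunks keeps memory use flat for multi-gigabyte weight files.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sha256_of_shards(paths):
    """Map each shard's file name to its digest, for models split across files."""
    return {Path(p).name: sha256_of_file(p) for p in sorted(paths)}
```

Calling sha256_of_file("model.safetensors") should give the same hex string that sha256sum prints for that file, which can then be pasted into the component's provenance hash slot.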
Does the EU AI Act demand SPDX specifically? No. The Act is technology-neutral on format. But the Commission's harmonised standards (under preparation by CEN/CENELEC JTC 21) name SPDX 3.0 and CycloneDX 1.6 as conforming examples. Producing one of these gets you the rebuttable presumption of conformity.
Pre-loaded examples
Click "Load LLM-RAG example" to see a typical retrieval-augmented chatbot built on a third-party foundation model, a fine-tune, an embedding model, and three datasets. Most production AI systems look something like that, so the example is faster to edit than a blank form.
Limitations
- This tool is a manifest generator, not a discovery tool. If you don't know which model your team uses, this won't tell you; talk to engineering.
- SPDX 3.0 and CycloneDX 1.6 schemas evolve; this tool tracks the November 2025 specs. Always validate the output against the latest schema using the official validators (spdx-tools / cyclonedx-cli validate) before submitting to a regulator.
- The tool cannot sign your AIBOM. SPDX 3.0 supports detached signatures via spdx:Signature; sign with your code-signing certificate after generation.