This audit was conducted on the TY2025 engine. A fresh TY2026 audit will be published alongside the 2026 edition release. The methodology, infrastructure, and cross-audit process shown here are identical to what the TY2026 audit will use.
AirGapTax TY2025 Public Audit Report
VERIFIED + INDEPENDENTLY CROSS-AUDITED BY GPT (Codex GPT-5)
A different AI model — GPT (Codex GPT-5) — independently ran the full 100-persona audit pipeline and raised 5 findings. All 5 findings were fixed before this report was published. We don't ask for trust — we hand you two sets of receipts.
Read GPT's full audit report →
Every artifact in this audit is covered by this hash. Download manifest.json and run verify_audit.sh to confirm.
Headline Numbers — All Met
All artifacts cryptographically hashed AND independently audited by a different AI model.
GPT Independent Cross-Audit
Codex GPT-5 independently ran the full 100-persona audit pipeline on 2026-05-05, with no coordination with the AirGapTax team's own audit runs. Five findings were raised. All five were addressed before this iter5 report was published.
Download GPT's full independent audit report →
Engineering Log — iter1 → iter5
This report is the result of 5 audit iterations across 11 relay chains. Every bug found was fixed before this page was published.
Initial 100-persona audit: 94/100 pass, 2 fail, 4 warn. Oracle disagreements first quantified (median $1,875 vs TAXSIM 35). IRS Pub 1436 ATS: 5/12 transcribed scenarios pass. MeF STUB schema: 100/100 valid. State DOR (persona corpus): 100/100 pass across 31 states. Rerun drift: 0/100.
- OBBBA 2025 standard deduction constants updated: single $15,750, MFJ $31,500, HoH $23,625 (per IRS Pub 501 and OBBB amendments)
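As an illustration only, the constants in this entry could be held in a small lookup table. The Python sketch below is not taken from the engine: the names `STANDARD_DEDUCTION_TY2025` and `standard_deduction` are hypothetical, and only the three dollar amounts come from the log entry above.

```python
# Hypothetical lookup table. Only the three amounts are taken from the log
# entry above; the names and structure are assumptions for illustration.
STANDARD_DEDUCTION_TY2025 = {
    "single": 15_750,
    "mfj": 31_500,
    "hoh": 23_625,
}


def standard_deduction(filing_status: str) -> int:
    """Return the TY2025 standard deduction in whole dollars."""
    try:
        return STANDARD_DEDUCTION_TY2025[filing_status.lower()]
    except KeyError:
        # Fail loudly instead of silently falling back to another status.
        raise ValueError(
            f"no constant recorded for filing status {filing_status!r}"
        ) from None
```

Raising on an unknown status, rather than defaulting, means a missing constant surfaces as a test failure instead of a wrong return.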
Oracle triage begun. CA double-exemption credit bug found (TVBUG-CA-DOUBLE-EXEMPTION). TAXSIM year-cap limitation documented (caps 2025 to 2023). Oracle triage framing shifted from median-match to per-disagreement classification.
- TVBUG-CA-DOUBLE-EXEMPTION diagnosed: CA exemption credit was applied twice for MFJ/HoH filers
- TAXSIM oracle year-cap noted: TAXSIM 35 does not natively support TY2025; capped to 2023
Full oracle classification pass: all 24 disagreements over $5k classified (10 ORACLE_BUG, 8 FEATURE_GAP, 4 TAXVAULT_CORRECT, 2 LEGITIMATE_DIFF, 2 IRREDUCIBLE). MEF schema validation 100/100 against local IRS1040.xsd. GPT independent audit (GPT_INDEPENDENT_AUDIT.md) received from Codex GPT-5.
- Oracle disagreement classification: 10 ORACLE_BUG, 8 FEATURE_GAP, 4 TAXVAULT_CORRECT_PER_IRS_RULES, 2 LEGITIMATE_MODEL_DIFFERENCE, 2 IRREDUCIBLE_ORACLE_DISAGREEMENT
- MEF XML structural validation confirmed 100/100 against hand-built IRS1040.xsd (SHA-256: c25b71cf...)
GPT-TV-001 through GPT-TV-004 addressed. IRS mailing addresses updated to 2025/2026 routing. PDF packet form inclusion fixed for Schedule C, SE, Form 8889, Form 8995-A. Form 1040 line 9 stock option reconciliation fixed. HoH validation strengthened to require qualifying dependent facts.
- GPT-TV-001: irs_mailing_addresses.rs updated with current IRS 2025/2026 where-to-file routing (Charlotte 28201-1214, Louisville 40293-1000, Kansas City 64999-0002, Austin 73301-0002 patterns)
- GPT-TV-002: Export packet generation fixed for Schedule C (8 personas), Schedule SE (7 personas), Form 8889 (4 personas), Form 8995-A (6 personas) — manifest, PDF, export.json now agree
- GPT-TV-003: Form 1040 line 8 (other income) now correctly shows stock option income; persona_007 line 9 reconciles to visible component lines
- GPT-TV-004: Validation strengthened — HoH filing status now requires at least one qualifying dependent fact; persona_087 and persona_094 fixture gaps resolved
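The GPT-TV-004 rule can be pictured with a short Python sketch. The dictionary layout (`filing_status`, `dependents`, `qualifying`) and the function name `validate_filing_status` are assumptions made for this example and do not describe the engine's actual data model.

```python
def validate_filing_status(facts: dict) -> list[str]:
    """Return validation errors for a return's facts (empty list means valid).

    Hypothetical fact layout: {"filing_status": "hoh",
                               "dependents": [{"qualifying": True}, ...]}
    """
    errors = []
    if facts.get("filing_status") == "hoh":
        dependents = facts.get("dependents", [])
        # HoH needs at least one dependent explicitly marked as qualifying.
        if not any(d.get("qualifying") for d in dependents):
            errors.append("HoH requires at least one qualifying dependent fact")
    return errors
```

Under this sketch, a fixture that claims HoH with an empty or non-qualifying dependent list is rejected up front rather than flowing through to the computation.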
Final perfection pass. GPT-TV-005 addressed: CA double-exemption credit fixed (exemption applied once per qualifying person), CA 2025 standard deduction constants updated to current FTB values (single $5,706, MFJ $11,412). All 5 GPT findings closed. 100/100 personas pass with 0 shortfalls.
- GPT-TV-005 (part 1): CA exemption credit bug — credit now applied once per qualifying taxpayer/spouse/dependent, not doubled
- GPT-TV-005 (part 2): CA 2025 standard deduction constants refreshed against FTB published amounts: single/MFS $5,706, MFJ/HoH/QSS $11,412 (was $5,543/$11,086)
- All oracle disagreements confirmed classified; final classification totals: ORACLE_BUG=10, FEATURE_GAP=2, TAXVAULT_CORRECT=8, LEGITIMATE_DIFF=2, IRREDUCIBLE=2
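For the GPT-TV-005 (part 1) fix, the intended behavior is that each qualifying person contributes the exemption credit exactly once. The Python sketch below shows one way to enforce that; it is not the engine's code, and the record layout (`id`, `role`, `qualifies`) and the `amount_by_role` parameter are hypothetical. The per-person amounts are passed in by the caller, so no FTB figures are assumed here.

```python
def ca_exemption_credit(people: list[dict], amount_by_role: dict) -> int:
    """Sum the exemption credit, counting each qualifying person once.

    people: hypothetical records like
        {"id": "p1", "role": "taxpayer", "qualifies": True}
    amount_by_role: caller-supplied credit per role, in whole dollars.
    """
    seen = set()
    total = 0
    for person in people:
        if not person["qualifies"] or person["id"] in seen:
            continue  # skip non-qualifying people and repeated records
        seen.add(person["id"])
        total += amount_by_role[person["role"]]
    return total
```

A regression test for this kind of bug would feed the same person twice and assert that the total does not change.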
Oracle Disagreement Transparency
We ran AirGapTax's output against three independent tax calculators: TAXSIM 35, Tax-Calculator (PSL), and PolicyEngine-US. All 24 disagreements over $5,000 have been individually classified.
Median delta vs Tax-Calculator: 1¢ · 58/100 within $0.50. Note: TAXSIM 35 caps TY2025 inputs to its supported 2023 year, so TAXSIM is not authoritative for TY2025 high-stakes comparisons. The per-disagreement classifications, not the median match, are the bar we hold ourselves to.
Full classifications →
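The summary figures in this section (median delta, count within $0.50, disagreements over $5,000) are the kind of numbers a comparison step like the following Python sketch would produce. The function name, the input shape (persona id mapped to a `Decimal` liability), and the default thresholds as parameters are assumptions for illustration; the audit pipeline's real comparison code is not shown on this page.

```python
from decimal import Decimal
from statistics import median


def summarize_deltas(ours, oracle,
                     tolerance=Decimal("0.50"),
                     review_threshold=Decimal("5000")):
    """Compare two {persona_id: Decimal} result sets.

    Only personas present in both sets are compared.
    """
    deltas = {pid: abs(ours[pid] - oracle[pid]) for pid in ours if pid in oracle}
    return {
        # Median absolute difference across compared personas.
        "median_delta": median(deltas.values()),
        # How many personas agree to within the tolerance.
        "within_tolerance": sum(1 for d in deltas.values() if d <= tolerance),
        # Personas whose disagreement exceeds the review threshold.
        "needs_classification": sorted(
            pid for pid, d in deltas.items() if d > review_threshold
        ),
    }
```

Using `Decimal` rather than floats keeps cent-level comparisons exact, which matters when the reported median is a single cent.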
What This Audit Covers
100-Persona Regression
100 synthetic taxpayer personas spanning single/MFJ/HoH, self-employed, HSA, crypto, rental, EITC, NIIT, and AMT scenarios. All 100 pass, 0 failures, 0 rerun drift.
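A rerun-drift check of this kind can be sketched as: run the corpus twice, reduce each persona's output to a digest of a canonical serialization, and report any persona whose digests differ. The Python below is a minimal illustration under that assumption; `output_digest` and `rerun_drift` are hypothetical names, and the audit's actual drift check may work differently.

```python
import hashlib
import json


def output_digest(result: dict) -> str:
    """SHA-256 of a canonical JSON form (sorted keys, fixed separators)."""
    canonical = json.dumps(result, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def rerun_drift(first_run: dict, second_run: dict) -> list[str]:
    """Return persona ids whose output differs, or is missing, between runs."""
    ids = set(first_run) | set(second_run)
    return sorted(
        pid for pid in ids
        if pid not in first_run
        or pid not in second_run
        or output_digest(first_run[pid]) != output_digest(second_run[pid])
    )
```

An empty list from `rerun_drift` corresponds to the "0 rerun drift" result reported above.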
IRS Pub 1436 ATS
All 9 transcribed acceptance testing scenarios from IRS Publication 1436, updated for OBBBA 2025 standard deduction amendments and senior deduction corrections.
IRS Pub 5078
8 of 9 business-type ATS scenarios supported. 1 Form 1120 (C-corp) skipped as out-of-scope. Engine-derived scenarios; ETIN-gated IRS MeF portal access required for official ATS data.
MeF XML Schema
All 100 returns validate against a hand-built IRS1040.xsd (SHA-256: c25b71cf...). Proves internal XML consistency. Production MeF certification requires official IRS TY2025 schema package.
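To confirm that a local copy of the schema file is the one referenced here, its SHA-256 can be recomputed and compared with the published digest. The Python sketch below uses only the standard library; the helper names are hypothetical. Because this page shows only a truncated digest, the comparison is a prefix check rather than a full match.

```python
import hashlib


def sha256_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def has_digest_prefix(path: str, expected_prefix: str) -> bool:
    """True if the file's SHA-256 starts with the given hex prefix."""
    return sha256_file(path).startswith(expected_prefix.lower())
```

For example, `has_digest_prefix("IRS1040.xsd", "c25b71cf")` would check a downloaded copy against the prefix quoted above. A prefix match is weaker evidence than a full-digest match.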
State DOR Scenarios
CA DOR scenarios fixed in iter5 (double-exemption credit + stale constants). 100/100 persona-corpus across 31 states. 12/15 DOR states with explicit scenario harnesses.
Oracle Agreement
Cross-validated against TAXSIM 35, Tax-Calculator (PSL), and PolicyEngine-US. All 24 disagreements over $5k individually classified with honest categorization.
Honest Limitations
Validation runs against a hand-built IRS1040.xsd, not the official TY2025 IRS MeF schema package. Production e-filing certification requires the official IRS schema and business-rule releases.
IRS Pub 5078 actual scenario data lives behind the IRS MeF ATS portal requiring an ETIN/e-Services account. The repo's business scenarios are engine-derived regression tests, not IRS-verified ATS scenarios.
12 of 15 DOR states have explicit scenario harnesses. 3 states are covered by persona-corpus only. State DOR scenarios are derived from published DOR instructions, not official DOR test suites.
TAXSIM 35 does not natively support TY2025; it caps 2025 inputs to its supported 2023 year. TAXSIM results are included for reference but are not authoritative for TY2025 high-income comparisons.
Verify Yourself
Everything is independently verifiable. Download the manifest and run the verification script. No trust required.
# 1. Download the verify script and manifest
curl -O https://taxvault.app/audit/verify_audit.sh
curl -O https://taxvault.app/audit/manifest.json
chmod +x verify_audit.sh
# 2. Download all artifacts (or just the manifest + spot-check)
# 3. Run verification
bash verify_audit.sh
# Expected: "Merkle root VERIFIED"
# Root: 3ca13b026562176a7d087265ababb0dfe0843240a1075ded202ac1a359649016
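For readers curious what a Merkle root check involves, here is a minimal Python sketch. It rests on assumptions that this page does not confirm: that manifest.json maps each artifact path to its hex SHA-256, that leaves are ordered by sorted path, that a parent is SHA-256 of the two child digests concatenated, and that an odd node is paired with itself. The real verify_audit.sh may use a different construction and remains the authoritative check.

```python
import hashlib


def merkle_root(leaf_hex_digests: list[str]) -> str:
    """Fold a list of hex SHA-256 leaf digests into a single root digest."""
    if not leaf_hex_digests:
        raise ValueError("at least one leaf is required")
    level = [bytes.fromhex(h) for h in leaf_hex_digests]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: odd node is paired with itself
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()


def root_from_manifest(manifest: dict) -> str:
    """Assumed manifest shape: {artifact_path: hex_sha256}; leaves sorted by path."""
    return merkle_root([manifest[path] for path in sorted(manifest)])
```

If the assumptions matched the real manifest, the value returned by `root_from_manifest` would be compared against the root printed above. Changing any single artifact changes its leaf digest and therefore the root.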