## Critical CLI-DAG Drift Exposed: Audit Finds Baseline Pipeline Missing Core Intelligence Links
A seven-agent audit of the intelligence platform's command-line interface (CLI) has exposed a critical structural flaw: the CLI's baseline data generation pipeline had drifted away from the DAG it was meant to mirror. The audit groups these findings under the label 'CLI ↔ DAG drift': the CLI baseline never executed the core data enrichment and relationship-building jobs, leaving the resulting intelligence graph incomplete and misleading for operators. As a result, for an unspecified period, CLI baseline runs did not create Campaign nodes, apply decay models, build vulnerability-to-CVE bridges, or establish over a dozen critical relationship types between entities such as Malware, Techniques, Tools, and Sectors.

PR-C, the third in a four-part fix sequence, addresses three high-severity audit findings. Cross-Checker H1 found that the CLI baseline never invoked `enrichment_jobs`, omitting entire data categories. Cross-Checker H2 found that it also skipped `build_relationships.py`, bypassing 12 essential link queries. Cross-Checker H3, arguably the most misleading for operators, found that `baseline_dag` lacked a true `fresh_baseline` control knob. Operators who triggered runs with names like `fresh__730d__...` were misled: only checkpoints were cleared, while the underlying data graph remained stale, creating a false impression of a comprehensive rebuild.
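
The gap described in H3 can be illustrated with a minimal sketch. Everything here is hypothetical and not taken from the platform's code: `BaselineState`, `prepare_baseline`, and the action labels are invented for illustration, and clearing checkpoints on every run is an assumption. Only the `fresh_baseline` name comes from the audit finding. The point is that a run is only a true rebuild when the graph itself is wiped, not just the checkpoints.

```python
from dataclasses import dataclass, field


@dataclass
class BaselineState:
    """Hypothetical stand-in for the platform's persisted state."""
    checkpoints: set = field(default_factory=set)
    graph_nodes: set = field(default_factory=set)


def prepare_baseline(state: BaselineState, fresh_baseline: bool) -> list:
    """Reset state before a baseline run and report which actions ran."""
    actions = []
    # Assumption: checkpoints are cleared on every run. Per H3 this was
    # the only reset that happened before the fix.
    state.checkpoints.clear()
    actions.append("checkpoints_cleared")
    # Assumed post-fix behavior: wipe the graph only on explicit request.
    if fresh_baseline:
        state.graph_nodes.clear()
        actions.append("graph_wiped")
    return actions
```

With `fresh_baseline=False`, the graph nodes survive the reset, which mirrors the stale-graph behavior operators saw; with `fresh_baseline=True`, both checkpoints and graph are cleared.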

The fixes rewire the pipeline. Step 5c now calls `run_all_enrichment_jobs` during a baseline run, and Step 5b invokes `build_relationships.py` as a subprocess with a 5-hour timeout that matches the Directed Acyclic Graph (DAG) workflow. A new `baseline_clean_task` PythonOperator has been inserted between the `misp_health` and `baseline_start` tasks, providing the destructive fresh-baseline capability that was missing. Together these changes close the largest single class of issues the audit uncovered and move the system toward parity between its automated DAG and manual CLI operations, which is essential for reliable intelligence production.
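
A subprocess call with a bounded runtime, as described for Step 5b, might look roughly like the sketch below. It uses only the standard-library `subprocess.run` and its `timeout` parameter. The helper name `run_script`, the constant `RELATIONSHIP_TIMEOUT_S`, and the returned tuple shape are assumptions made for illustration; the actual CLI code is not shown in the source. A call such as `run_script([sys.executable, "build_relationships.py"])` would then either finish or be killed after five hours.

```python
import subprocess
import sys

# 5 hours, the timeout the article says the CLI step shares with the DAG.
RELATIONSHIP_TIMEOUT_S = 5 * 60 * 60


def run_script(argv, timeout_s=RELATIONSHIP_TIMEOUT_S):
    """Run a child process and return (ok, detail). Hypothetical helper."""
    try:
        result = subprocess.run(
            argv,
            capture_output=True,
            text=True,
            timeout=timeout_s,  # child is killed if it runs longer
            check=False,        # inspect the return code ourselves
        )
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout_s}s"
    if result.returncode != 0:
        return False, f"exit code {result.returncode}"
    return True, "ok"
```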

---
- **Source**: GitHub Issues
- **Sector**: The Lab
- **Tags**: software_audit, data_pipeline, CLI, DAG, data_integrity
- **Credibility**: unverified
- **Published**: 2026-04-19 15:22:40
- **ID**: 71265
- **URL**: https://whisperx.ai/en/intel/71265