Black Swarm — Intelligence Journey

L0 / Source Discovery & Control

Authorized OSINT

Curated vendor APIs and government feeds
Shodan / VirusTotal / CISA KEV / NVD

Social & Open Web

LLM-assisted source discovery
Researcher credibility tiers with human approval
Telegram / Mastodon / Bluesky / RSS feeds

Dark Web

Multi-tier discovery with LLM scoring
Automated liveness probes
Human-in-the-loop approval for all new sources
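
The approval gate described above can be sketched as a small state machine. This is a minimal illustration, not the actual implementation: the status names, the `llm_score` field, and the 0.6 threshold are all assumptions. The one property taken from the text is that automated checks may queue or reject a source, but only a human can approve it.

```python
from dataclasses import dataclass
from enum import Enum


class SourceStatus(Enum):
    DISCOVERED = "discovered"
    PENDING_REVIEW = "pending_review"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class SourceCandidate:
    url: str
    llm_score: float  # hypothetical 0..1 relevance score from the LLM scorer
    is_live: bool     # result of the automated liveness probe
    status: SourceStatus = SourceStatus.DISCOVERED


def triage(candidate: SourceCandidate, min_score: float = 0.6) -> SourceCandidate:
    """Automated checks can reject a candidate or queue it, but never approve it."""
    if candidate.is_live and candidate.llm_score >= min_score:
        candidate.status = SourceStatus.PENDING_REVIEW
    else:
        candidate.status = SourceStatus.REJECTED
    return candidate


def human_decision(candidate: SourceCandidate, approve: bool) -> SourceCandidate:
    """Only a human reviewer moves a queued candidate to APPROVED."""
    if candidate.status is not SourceStatus.PENDING_REVIEW:
        raise ValueError("only candidates pending review can be decided")
    candidate.status = SourceStatus.APPROVED if approve else SourceStatus.REJECTED
    return candidate
```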

L1 / Collection

OSINT Collection

Per-source scheduled polling
Configurable intervals via admin panel
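
Per-source polling with adjustable intervals might look like the sketch below. The `PollScheduler` class and its method names are hypothetical; the source names are only labels, and time is passed in explicitly so the logic can be exercised without a real clock.

```python
import heapq


class PollScheduler:
    """Tracks when each source is next due, using a min-heap keyed by due time."""

    def __init__(self, intervals: dict[str, float]):
        if any(seconds <= 0 for seconds in intervals.values()):
            raise ValueError("intervals must be positive")
        self.intervals = dict(intervals)
        # Every source is due immediately at time 0.
        self._heap = [(0.0, name) for name in self.intervals]
        heapq.heapify(self._heap)

    def set_interval(self, source: str, seconds: float) -> None:
        """What an admin panel might call; takes effect from the next poll."""
        if seconds <= 0:
            raise ValueError("interval must be positive")
        self.intervals[source] = seconds

    def due(self, now: float) -> list[str]:
        """Return sources due at `now` and reschedule each one."""
        ready = []
        while self._heap and self._heap[0][0] <= now:
            _, source = heapq.heappop(self._heap)
            ready.append(source)
            heapq.heappush(self._heap, (now + self.intervals[source], source))
        return ready
```

A caller would invoke `due(time.time())` in a loop and hand each returned source to its collector.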

Social Collection

Five platform collectors with deduplication
Researcher attribution preserved through pipeline
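
One way to deduplicate across platforms while keeping attribution is to key posts on a hash of their normalized text and accumulate every (platform, author) pair that posted it. The record layout and the normalization rule below are assumptions made for illustration.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Hash of the text after lowercasing and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def deduplicate(posts: list[dict]) -> list[dict]:
    """Merge posts with the same fingerprint, keeping all attributions in order."""
    merged: dict[str, dict] = {}
    for post in posts:
        key = fingerprint(post["text"])
        if key not in merged:
            merged[key] = {"text": post["text"], "attributions": []}
        attribution = (post["platform"], post["author"])
        if attribution not in merged[key]["attributions"]:
            merged[key]["attributions"].append(attribution)
    return list(merged.values())
```

Exact-match hashing only catches reposts and trivial reformatting; near-duplicate detection would need a fuzzier technique.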

Dark Web Collection

Ephemeral isolated containers per crawl
No persistent state / Clearnet mirrors + .onion

L2 / Sanitization & Ingestion

Content Sanitization

Multi-pass filtering: a pattern-based pass catches PII, prompt injection, and known exploits.
An LLM-powered semantic pass then analyzes content for hostile payloads.
Indicators of compromise (IOCs) are automatically extracted and structured.
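
The pattern-based part of IOC extraction can be illustrated with a few regular expressions. This sketch covers only three indicator types, and the refanging rules are an assumption; a production extractor would handle many more formats.

```python
import re

IOC_PATTERNS = {
    "ipv4": re.compile(
        r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
    ),
    "sha256": re.compile(r"\b[a-fA-F0-9]{64}\b"),
    "cve": re.compile(r"\bCVE-\d{4}-\d{4,}\b", re.IGNORECASE),
}


def refang(text: str) -> str:
    """Undo common defanging so patterns match (e.g. 1.2.3[.]4, hxxp://)."""
    return text.replace("[.]", ".").replace("hxxp", "http")


def extract_iocs(text: str) -> dict[str, list[str]]:
    """Return sorted, de-duplicated matches per indicator type."""
    clean = refang(text)
    result = {}
    for kind, pattern in IOC_PATTERNS.items():
        matches = {m.group(0) for m in pattern.finditer(clean)}
        if kind == "cve":
            matches = {m.upper() for m in matches}  # normalize CVE casing
        result[kind] = sorted(matches)
    return result
```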

Evidence Store

Content-addressed records / Observables / Tags
MITRE ATT&CK refs / Confidence / Embeddings

Immutable Evidence Vault

Raw evidence preserved immutably
Content-addressed / Long-term retention
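
Content addressing means a record's key is derived from its bytes, so identical evidence maps to one entry and any tampering is detectable on read. A minimal in-memory sketch follows; the `EvidenceVault` name and the choice of SHA-256 are assumptions, and a real vault would sit on durable write-once storage.

```python
import hashlib


class EvidenceVault:
    """Write-once store keyed by the SHA-256 digest of the raw bytes."""

    def __init__(self):
        self._blobs: dict[str, bytes] = {}

    def put(self, raw: bytes) -> str:
        digest = hashlib.sha256(raw).hexdigest()
        # setdefault never overwrites: an existing record stays untouched.
        self._blobs.setdefault(digest, raw)
        return digest

    def get(self, digest: str) -> bytes:
        raw = self._blobs[digest]
        # Re-hash on read to verify the record still matches its address.
        if hashlib.sha256(raw).hexdigest() != digest:
            raise ValueError("stored evidence no longer matches its address")
        return raw
```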

L3 / LLM Enrichment

Intelligent Enrichment Pipeline

Every evidence item is automatically enriched through multiple LLM passes: triage, entity extraction, summarization, MITRE mapping, relevance scoring, and threat correlation.

~60% / Qwen2.5-14B / High-volume triage & extraction

~30% / Llama 3.1 70B / Complex correlation & synthesis

~10% / Claude Opus 4.7 (Frontier) / Reasoning-intensive tasks & weekly briefing

Continuous quality monitoring via shadow evaluation. If quality drifts, routing automatically shifts to maintain accuracy.
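
The text does not say how shadow evaluation or the routing shift works, so the following is only one plausible reading: a sample of each task's outputs is compared against a stronger model, and when the rolling agreement rate drops below a threshold the task is escalated one tier. The task names, default tier assignments, threshold, and window size are all assumptions.

```python
from collections import defaultdict, deque

# Tiers ordered from cheapest to most capable; model names come from the text above.
TIERS = ["Qwen2.5-14B", "Llama 3.1 70B", "Claude Opus 4.7"]

# Hypothetical task-to-tier defaults (indexes into TIERS).
DEFAULT_TIER = {
    "triage": 0,
    "extraction": 0,
    "correlation": 1,
    "synthesis": 1,
    "weekly_briefing": 2,
}


class TierRouter:
    def __init__(self, min_agreement: float = 0.9, window: int = 50):
        self.min_agreement = min_agreement
        # Rolling window of shadow-evaluation outcomes per task.
        self._shadow = defaultdict(lambda: deque(maxlen=window))

    def record_shadow(self, task: str, agreed: bool) -> None:
        """Record whether the routed model agreed with the shadow model."""
        self._shadow[task].append(1.0 if agreed else 0.0)

    def route(self, task: str) -> str:
        tier = DEFAULT_TIER[task]
        results = self._shadow.get(task)
        if results and sum(results) / len(results) < self.min_agreement:
            # Quality has drifted: escalate one tier, capped at the top tier.
            tier = min(tier + 1, len(TIERS) - 1)
        return TIERS[tier]
```

Because the window is bounded, routing falls back to the default tier once recent shadow results recover.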

Enriched Evidence

Actors, IOCs, malware tags / MITRE mapping
Confidence scoring / Semantic embeddings

Knowledge Graph

Threat actors / Infrastructure nodes
Campaign linkage / MITRE techniques
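
A knowledge graph of this kind can be pictured as typed nodes joined by labeled edges. The sketch below uses an adjacency map with a breadth-first pivot; the relation names and the `ExampleActor` / `ExampleCampaign` nodes are hypothetical, and a production system would more likely use a graph database.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Nodes are (kind, name) tuples; edges are stored in both directions."""

    def __init__(self):
        self._edges = defaultdict(set)

    def link(self, source, relation, target):
        self._edges[source].add((relation, target))
        self._edges[target].add((f"inverse:{relation}", source))

    def neighbors(self, node, relation=None):
        """Directly connected nodes, optionally filtered by relation label."""
        return sorted(
            target
            for rel, target in self._edges.get(node, ())
            if relation is None or rel == relation
        )

    def pivot(self, start, max_hops=2):
        """All nodes reachable from `start` within `max_hops` edges."""
        seen = {start}
        frontier = [start]
        for _ in range(max_hops):
            next_frontier = []
            for node in frontier:
                for _, target in self._edges.get(node, ()):
                    if target not in seen:
                        seen.add(target)
                        next_frontier.append(target)
            frontier = next_frontier
        seen.discard(start)
        return seen
```

Pivoting from an IP address to its campaign and then to the actor is a two-hop query: `graph.pivot(("infrastructure", "198.51.100.10"), max_hops=2)`.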

L4 / Intelligence Synthesis

Intelligence Synthesizer

Runs on a recurring schedule; a surge in threat activity triggers immediate synthesis.

Evidence Clustering: Groups related evidence by actor and campaign. Uses semantic similarity for cross-source correlation.
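
As a rough illustration of clustering on both signals, the sketch below places an item in an existing cluster when it shares a named actor with the cluster's first item or when their embeddings are close by cosine similarity. The greedy single-pass strategy, the 0.8 threshold, and the record fields are assumptions; the text does not specify the algorithm.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster(items: list[dict], threshold: float = 0.8) -> list[list[dict]]:
    """Greedy clustering: compare each item with the first item of each cluster."""
    clusters: list[list[dict]] = []
    for item in items:
        for members in clusters:
            seed = members[0]
            same_actor = item["actor"] is not None and item["actor"] == seed["actor"]
            similar = cosine(item["embedding"], seed["embedding"]) >= threshold
            if same_actor or similar:
                members.append(item)
                break
        else:
            # No existing cluster matched, so start a new one.
            clusters.append([item])
    return clusters
```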

Narrative Generation: LLM-generated threat narratives, each with a title, summary, severity, and justification.

Customer Impact Scoring: Maps threats to customer digital twins by matching industry, tech stack, and geography.
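
Matching a threat against digital twins could be as simple as a weighted overlap across the three dimensions named above. The weights, field names, and 0.5 cut-off here are invented for illustration and are not taken from the system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DigitalTwin:
    name: str
    industry: str
    tech_stack: frozenset
    regions: frozenset


@dataclass(frozen=True)
class Threat:
    title: str
    industries: frozenset
    technologies: frozenset
    regions: frozenset


# Hypothetical weights: a tech-stack match counts most.
WEIGHTS = {"tech_stack": 0.5, "industry": 0.3, "geography": 0.2}


def impact_score(threat: Threat, twin: DigitalTwin) -> float:
    score = 0.0
    if threat.technologies & twin.tech_stack:
        score += WEIGHTS["tech_stack"]
    if twin.industry in threat.industries:
        score += WEIGHTS["industry"]
    if threat.regions & twin.regions:
        score += WEIGHTS["geography"]
    return round(score, 2)


def affected_customers(threat, twins, min_score: float = 0.5) -> list[str]:
    """Customer names at or above the cut-off, highest score first."""
    scored = sorted(((impact_score(threat, t), t.name) for t in twins), reverse=True)
    return [name for score, name in scored if score >= min_score]
```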

Researcher Signal Fusion: Signals from credible researchers elevate severity. Attribution and tier tracking are preserved.
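
How researcher signals change severity is not specified, so the rule below is purely an assumed example: one tier-1 signal, or two or more tier-2 signals, raises severity by one level, and the result records who caused the elevation.

```python
SEVERITIES = ["low", "medium", "high", "critical"]


def fuse(base_severity: str, signals: list[dict]) -> dict:
    """Apply the assumed elevation rule and keep researcher attribution."""
    tier1 = [s for s in signals if s["tier"] == 1]
    tier2 = [s for s in signals if s["tier"] == 2]
    elevate = bool(tier1) or len(tier2) >= 2
    index = SEVERITIES.index(base_severity) + (1 if elevate else 0)
    index = min(index, len(SEVERITIES) - 1)  # cannot exceed "critical"
    return {
        "severity": SEVERITIES[index],
        "elevated_by": [(s["researcher"], s["tier"]) for s in tier1 + tier2]
        if elevate
        else [],
    }
```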

Lifecycle Management: Evidence-driven state transitions through New / Active / Aging / Archived.
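
An evidence-driven lifecycle can be derived from how much evidence a threat has and how recently the latest item arrived. The four state names come from the text; the thresholds (two items for Active, 14 days for Aging, 60 days for Archived) are placeholders.

```python
from datetime import datetime, timedelta


def lifecycle_state(
    evidence_times: list[datetime],
    now: datetime,
    active_min: int = 2,
    aging_after: timedelta = timedelta(days=14),
    archive_after: timedelta = timedelta(days=60),
) -> str:
    """Derive the state from evidence count and time since the newest item."""
    if not evidence_times:
        raise ValueError("a threat needs at least one evidence item")
    age = now - max(evidence_times)
    if age >= archive_after:
        return "Archived"
    if age >= aging_after:
        return "Aging"
    if len(evidence_times) >= active_min:
        return "Active"
    return "New"
```

Because the state is computed from evidence timestamps, a fresh item on an Aging threat moves it back to Active without any explicit transition logic.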

Spike Detection: Monitors for activity surges and triggers immediate synthesis on threshold breach.
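
The threshold itself is not defined in the text. One common choice, shown here as an assumption, is a z-score: compare the current interval's evidence count with the mean and standard deviation of recent intervals.

```python
from statistics import mean, pstdev


def is_spike(
    history: list[int],
    current: int,
    z_threshold: float = 3.0,
    min_history: int = 5,
) -> bool:
    """True when `current` sits at least `z_threshold` deviations above baseline."""
    if len(history) < min_history:
        return False  # not enough data to establish a baseline
    baseline = mean(history)
    spread = pstdev(history)
    if spread == 0:
        # Perfectly flat history: any increase counts as a spike.
        return current > baseline
    return (current - baseline) / spread >= z_threshold
```

A scheduler could call `is_spike` on every collection interval and run synthesis immediately when it returns `True`.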

L5 / Presentation — Three Intelligence Axes

Axis 01

Global Intelligence

Synthesized threat intelligence feed
Threat Pulse severity dashboard
Actor activity grouping
Weekly threat briefing (LLM-generated)
Graph pivot for relationship exploration

Axis 02

Customer Intelligence

Fleet threat exposure matrix
Intelligence filtered by industry, tech stack, geography
Customer digital twin with auto-discovery
“Which of MY customers are affected?”

Axis 03

Local Investigation

Persistent investigation workspace
Natural language query interface
Evidence timeline and graph pivoting
Report generation / STIX 2.1 export
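
For STIX 2.1 export, a single IPv4 indicator wrapped in a bundle might be built as follows. The sketch constructs plain dictionaries with the properties STIX 2.1 requires for an indicator, rather than using any particular library; the function names and the example values are hypothetical.

```python
import json
import uuid
from datetime import datetime, timezone


def stix_timestamp(dt: datetime) -> str:
    """UTC timestamp with millisecond precision and a trailing Z."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"


def ipv4_indicator(ip: str, name: str, now: datetime) -> dict:
    ts = stix_timestamp(now)
    return {
        "type": "indicator",
        "spec_version": "2.1",
        "id": f"indicator--{uuid.uuid4()}",
        "created": ts,
        "modified": ts,
        "name": name,
        "pattern": f"[ipv4-addr:value = '{ip}']",
        "pattern_type": "stix",
        "valid_from": ts,
    }


def bundle(objects: list[dict]) -> dict:
    return {"type": "bundle", "id": f"bundle--{uuid.uuid4()}", "objects": list(objects)}


def export(objects: list[dict]) -> str:
    """Serialize a bundle as indented JSON."""
    return json.dumps(bundle(objects), indent=2)
```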