Ingestion Pipeline
Learn about our automated data ingestion pipeline that continuously monitors multiple sources to keep our vulnerability database current and comprehensive.
Continuous Monitoring
Our pipeline runs 24/7, checking every configured source for new vulnerabilities every 15 minutes.
Real-time Processing
New vulnerabilities are processed, enriched, and published to the database within minutes of discovery.
Multiple Sources
We aggregate data from NVD, the GitHub Advisory Database, Exploit-DB, CVE Details, and community submissions.
Data Validation
All ingested data goes through validation, deduplication, and quality checks before publication.
Pipeline Architecture
Source Collection
Fetchers monitor configured sources via APIs, RSS feeds, and web scraping. Each source has a dedicated adapter that handles authentication, rate limiting, and data extraction.
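As an illustration of the adapter idea, here is a minimal sketch of a per-source adapter with a simple rate limiter. The names SourceAdapter, StaticAdapter, min_interval, and the injectable clock and sleep parameters are hypothetical and are not taken from the pipeline's code; authentication is left out, and a real adapter would call the source's API or feed inside _request.

```python
import time
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Optional


class SourceAdapter(ABC):
    """Hypothetical base class: one subclass per configured source."""

    def __init__(
        self,
        name: str,
        min_interval: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.name = name
        self.min_interval = min_interval  # minimum seconds between requests
        self._clock = clock
        self._sleep = sleep
        self._last_call: Optional[float] = None

    def _wait_for_slot(self) -> None:
        # Rate limiting: pause until min_interval has passed since the last request.
        if self._last_call is not None:
            remaining = self.min_interval - (self._clock() - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
        self._last_call = self._clock()

    @abstractmethod
    def _request(self) -> List[Dict]:
        """Source-specific extraction (API call, RSS parse, scrape)."""

    def fetch(self) -> List[Dict]:
        self._wait_for_slot()
        # Tag every raw record with the source it came from.
        return [dict(record, source=self.name) for record in self._request()]


class StaticAdapter(SourceAdapter):
    """Stand-in adapter that returns fixed records, for demonstration only."""

    def __init__(self, name: str, records: List[Dict], **kwargs) -> None:
        super().__init__(name, **kwargs)
        self._records = records

    def _request(self) -> List[Dict]:
        return list(self._records)
```

Passing the clock and sleep functions in as parameters keeps the rate limiter testable without real waiting.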
Normalization
Raw data from different sources is normalized into a unified schema. CVE IDs are standardized, timestamps are normalized to UTC, and severity scores are mapped to our classification system.
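The three normalization steps could look roughly like the sketch below. The function names are hypothetical. The severity bands follow the standard CVSS v3 qualitative ratings, which is an assumption: the pipeline's own classification system may use different labels or thresholds. Treating timestamps without an offset as UTC is also an assumption.

```python
import re
from datetime import datetime, timezone

# Accepts variants such as "cve_2024_1234" or "CVE 2024 1234".
_CVE_RE = re.compile(r"cve[-_ ](\d{4})[-_ ](\d{4,})", re.IGNORECASE)


def normalize_cve_id(raw: str) -> str:
    """Return the canonical form CVE-YYYY-NNNN (four or more sequence digits)."""
    match = _CVE_RE.fullmatch(raw.strip())
    if match is None:
        raise ValueError(f"not a CVE ID: {raw!r}")
    return f"CVE-{match.group(1)}-{match.group(2)}"


def to_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp and convert it to UTC."""
    if timestamp.endswith("Z"):
        # Older Python versions do not accept the "Z" suffix in fromisoformat.
        timestamp = timestamp[:-1] + "+00:00"
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are already in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def severity_label(score: float) -> str:
    """Map a CVSS base score to a label (CVSS v3 bands, assumed here)."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS score out of range: {score}")
    if score == 0.0:
        return "none"
    if score < 4.0:
        return "low"
    if score < 7.0:
        return "medium"
    if score < 9.0:
        return "high"
    return "critical"
```

For example, normalize_cve_id("cve_2024_1234") returns "CVE-2024-1234", and to_utc("2024-03-01T12:00:00+02:00") returns 10:00 UTC on the same day.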
Enrichment
Vulnerabilities are enriched with additional context: PoC links, writeup references, affected version ranges, patch availability, and CVSS scores from multiple sources.
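A small sketch of what attaching this context might look like. The field names (cvss_scores, cvss_max, poc_links) and the choice to surface the highest reported score are assumptions made for illustration; the pipeline may combine scores differently.

```python
from typing import Dict, List


def enrich(record: Dict, cvss_by_source: Dict[str, float], poc_links: List[str]) -> Dict:
    """Return a copy of the record with CVSS scores and PoC links attached."""
    enriched = dict(record)
    # Keep every source's score so that disagreements stay visible.
    enriched["cvss_scores"] = dict(cvss_by_source)
    # Assumption: surface the most severe score reported by any source.
    enriched["cvss_max"] = max(cvss_by_source.values()) if cvss_by_source else None
    # Drop duplicate links and keep a stable order.
    enriched["poc_links"] = sorted(set(poc_links))
    return enriched
```

Keeping the per-source scores alongside the combined value lets consumers see where sources disagree instead of hiding the difference.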
Deduplication
Multiple sources often report the same vulnerability. Our deduplication engine merges entries based on CVE IDs, affected applications, and content similarity analysis.
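The merge logic can be sketched as follows, assuming each record carries cve_id, application, title, and source fields (hypothetical names). When both records have a CVE ID, the IDs decide; otherwise the sketch falls back to comparing the application name and title similarity. Python's difflib stands in for the content similarity analysis here, and the 0.85 threshold is an arbitrary choice for the example.

```python
from difflib import SequenceMatcher
from typing import Dict, List

SIMILARITY_THRESHOLD = 0.85  # assumed value, would need tuning on real data


def _similar(a: str, b: str) -> bool:
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= SIMILARITY_THRESHOLD


def _same_vulnerability(a: Dict, b: Dict) -> bool:
    # When both records carry a CVE ID, the IDs decide.
    if a.get("cve_id") and b.get("cve_id"):
        return a["cve_id"] == b["cve_id"]
    # Otherwise fall back to same application plus similar title.
    return (
        a["application"].lower() == b["application"].lower()
        and _similar(a["title"], b["title"])
    )


def deduplicate(records: List[Dict]) -> List[Dict]:
    """Merge records that describe the same vulnerability, keeping all sources."""
    merged: List[Dict] = []
    for record in records:
        for existing in merged:
            if _same_vulnerability(existing, record):
                existing["sources"].append(record["source"])
                # Fill in a CVE ID if only the later record has one.
                if not existing.get("cve_id"):
                    existing["cve_id"] = record.get("cve_id")
                break
        else:
            merged.append({**record, "sources": [record["source"]]})
    return merged
```

This pairwise comparison is quadratic in the number of records, which is fine for a sketch; a larger deployment would more likely index records by CVE ID and application first.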
Validation & Quality Control
Automated checks verify data integrity, required fields, and content quality. Suspicious entries are flagged for manual review before publication.
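A minimal sketch of such checks, assuming a hypothetical set of required fields and a simple range check on the CVSS score. Records with any problem are held back for manual review together with the reasons, rather than being dropped.

```python
from typing import Dict, List, Tuple

# Hypothetical required fields; the real schema may differ.
REQUIRED_FIELDS = ("cve_id", "title", "application", "published_at")


def validate(record: Dict) -> List[str]:
    """Return a list of problems; an empty list means the record passes."""
    problems = [
        f"missing field: {field}"
        for field in REQUIRED_FIELDS
        if not record.get(field)
    ]
    score = record.get("cvss_score")
    if score is not None and not 0.0 <= score <= 10.0:
        problems.append("cvss_score out of range")
    return problems


def triage(records: List[Dict]) -> Tuple[List[Dict], List[Tuple[Dict, List[str]]]]:
    """Split records into publishable ones and ones flagged for manual review."""
    publish: List[Dict] = []
    review: List[Tuple[Dict, List[str]]] = []
    for record in records:
        problems = validate(record)
        if problems:
            review.append((record, problems))
        else:
            publish.append(record)
    return publish, review
```

Returning the list of problems alongside each flagged record gives the reviewer the reason for the flag without re-running the checks.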
Publication
Validated vulnerabilities are published to the database, indexed for search, and distributed through our API, webhooks, and real-time notification channels.
Data Sources
Open Source
Our ingestion pipeline is open source! You can view the code, contribute improvements, or run your own instance to aggregate vulnerability data for your organization.
View on GitHub