System Architecture
Declassified overview of the TruthCheck verification pipeline.
Claim Extraction
The system first analyzes your input text using spaCy to identify factual claims. It filters out questions, opinions, and personal statements, isolating only verifiable assertions about the real world.
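The filtering step can be illustrated with a small rule-based sketch. It is a simplified stand-in, not TruthCheck's actual logic: it uses a plain regex for sentence splitting so that it runs without spaCy or a downloaded model, and the cue lists and function names (split_sentences, is_candidate_claim, extract_claims) are hypothetical.

```python
import re

# Hypothetical cue lists; a real system would rely on a parser and tuned rules.
OPINION_PHRASES = ("i think", "i believe", "i feel", "in my opinion")
FIRST_PERSON = {"i", "me", "my", "we", "our"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def is_candidate_claim(sentence):
    """Return True if the sentence looks like a verifiable assertion."""
    lowered = sentence.lower()
    if sentence.endswith("?"):
        return False  # questions are not claims
    if any(phrase in lowered for phrase in OPINION_PHRASES):
        return False  # explicit opinion markers
    words = set(re.findall(r"[a-z']+", lowered))
    if words & FIRST_PERSON:
        return False  # personal statements
    return True


def extract_claims(text):
    """Keep only the sentences that pass the claim filter."""
    return [s for s in split_sentences(text) if is_candidate_claim(s)]


sample = (
    "The Eiffel Tower is in Paris. Is it tall? "
    "I think it is beautiful. I visited it last year."
)
claims = extract_claims(sample)
```

In this sample only the first sentence survives: the question, the opinion, and the personal statement are all filtered out.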
Evidence Retrieval
Using the extracted keywords, TruthCheck scrapes trusted sources (Wikipedia, government domains, scientific journals) via standard search protocols. It specifically prioritizes high-credibility domains like .gov, .edu, and reuters.com.
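One way to picture the domain prioritization is a lookup that maps each result URL to a credibility weight and sorts by it. This is a sketch under assumptions: the numeric weights, the default value, and the names credibility and prioritize are illustrative and do not come from TruthCheck.

```python
from urllib.parse import urlparse

# Hypothetical weights; the source names the domains but not their scores.
TRUSTED_HOSTS = {"reuters.com": 0.9, "wikipedia.org": 0.8}
TRUSTED_SUFFIXES = {".gov": 1.0, ".edu": 0.9}
DEFAULT_WEIGHT = 0.3


def credibility(url):
    """Map a URL to a credibility weight based on its hostname."""
    host = (urlparse(url).hostname or "").lower()
    for trusted, weight in TRUSTED_HOSTS.items():
        # Match the host itself or any subdomain of it.
        if host == trusted or host.endswith("." + trusted):
            return weight
    for suffix, weight in TRUSTED_SUFFIXES.items():
        if host.endswith(suffix):
            return weight
    return DEFAULT_WEIGHT


def prioritize(urls):
    """Order URLs from most to least credible (stable for ties)."""
    return sorted(urls, key=credibility, reverse=True)


ranked = prioritize([
    "https://example.com/post",
    "https://www.reuters.com/world",
    "https://www.cdc.gov/flu",
])
```

The same weight could later be reused when the per-source judgments are combined, so that prioritization and voting agree on what counts as authoritative.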
NLI Classification
The core "brain" uses transformer language models (RoBERTa and DeBERTa) fine-tuned for Natural Language Inference (NLI). It compares the claim against each piece of evidence to determine whether the evidence Entails (supports), Contradicts (refutes), or is Neutral towards the claim.
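The decision an NLI model makes for one claim and one piece of evidence can be sketched as a softmax over three logits followed by picking the most probable label. The model's forward pass is omitted here; the logits are assumed to come from a classifier given the evidence as premise and the claim as hypothesis. The label order below is an assumption, since real checkpoints differ and the order should be read from the model's configuration.

```python
import math

# Assumed label order; check the actual model's configuration before relying on it.
LABELS = ("CONTRADICTION", "NEUTRAL", "ENTAILMENT")


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits):
    """Return the most probable NLI label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


# Illustrative logits for one (evidence, claim) pair, not real model output.
label, probability = classify([0.1, 0.2, 3.0])
```

Running this once per piece of evidence yields one (label, probability) pair per source, which is the input the aggregation step needs.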
Consensus Voting
Finally, all model judgments are aggregated using a weighted voting mechanism. Sources with higher domain authority carry more weight. The system calculates a final confidence score and issues a verdict: TRUE, FALSE, or LOW CONFIDENCE.
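The weighted vote can be sketched as follows. The source does not give the formula, so this assumes the confidence score is the larger of the weighted shares for support and refutation, and that a verdict below a cutoff becomes LOW CONFIDENCE. The 0.6 threshold and the function name aggregate are hypothetical.

```python
from collections import defaultdict


def aggregate(judgments, threshold=0.6):
    """Combine (label, weight) pairs into a verdict and a confidence score.

    label is one of "ENTAILMENT", "CONTRADICTION", "NEUTRAL";
    weight is the source's authority weight. threshold is a hypothetical cutoff.
    """
    totals = defaultdict(float)
    for label, weight in judgments:
        totals[label] += weight

    total_weight = sum(totals.values())
    if total_weight == 0:
        return "LOW CONFIDENCE", 0.0  # no usable evidence

    support = totals["ENTAILMENT"] / total_weight
    refute = totals["CONTRADICTION"] / total_weight
    confidence = max(support, refute)

    # Neutral-heavy, split, or tied evidence does not justify a verdict.
    if confidence < threshold or support == refute:
        return "LOW CONFIDENCE", confidence
    return ("TRUE" if support > refute else "FALSE"), confidence


verdict, score = aggregate([
    ("ENTAILMENT", 1.0),     # e.g. a .gov source
    ("ENTAILMENT", 0.9),     # e.g. reuters.com
    ("CONTRADICTION", 0.3),  # e.g. an unknown blog
])
```

Because neutral judgments count toward the total weight, a claim backed by mostly neutral evidence ends up with a low score instead of a confident verdict.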