  • MarcellLunczer 13 hours ago

    Hi HN,

    I’m the co-founder of Neutral News AI, a site that tries to answer a simple question:

    “What actually happened here, across multiple biased sources, and can we check the claims against the original articles?”

    Link: https://neutralnewsai.com

    Analyzer: https://neutralnewsai.com/analyzer

    No signup needed to read the news or run a basic analysis.

    What it does

    • Crawls multiple outlets (left / center / right + wires / gov sites) for the same story.

    • Generates a short, neutral summary constrained to those sources (no extra web search).

    • Extracts atomic claims (events, numbers, quotes) from the draft.

    • Uses an MNLI model to test each claim against the underlying articles, mapping the NLI labels to verdicts (a minimal sketch follows this list):

      • entailment → “Supported”

      • contradiction → “Refuted”

      • neutral → “Inconclusive”

    • Surfaces a “receipt ledger” per article: claim text, verdict, quote, source, timestamp.

    • Exposes the underlying models on an Analyzer page where you can paste any URL and get:

      • political bias score,

      • sentiment / subjectivity,

      • readability metrics,

      • a rough credibility signal.
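
    For the curious, here is roughly what that verdict mapping looks like in code. Treat it as a minimal sketch rather than the production pipeline: the checkpoint name ("microsoft/deberta-large-mnli") is a public stand-in for whatever DeBERTa MNLI variant is actually deployed, and real evidence passages go through chunking and snippet merging first.

      from transformers import AutoModelForSequenceClassification, AutoTokenizer
      import torch

      # Assumption: any MNLI-finetuned checkpoint with ENTAILMENT /
      # CONTRADICTION / NEUTRAL labels will do; this is a common public one.
      MODEL = "microsoft/deberta-large-mnli"
      tokenizer = AutoTokenizer.from_pretrained(MODEL)
      model = AutoModelForSequenceClassification.from_pretrained(MODEL)

      VERDICTS = {
          "ENTAILMENT": "Supported",
          "CONTRADICTION": "Refuted",
          "NEUTRAL": "Inconclusive",
      }

      def check_claim(claim: str, evidence: str) -> str:
          # MNLI convention: premise = evidence passage, hypothesis = claim.
          inputs = tokenizer(evidence, claim, return_tensors="pt", truncation=True)
          with torch.no_grad():
              logits = model(**inputs).logits
          label = model.config.id2label[logits.argmax(dim=-1).item()]
          return VERDICTS.get(label.upper(), "Inconclusive")

      print(check_claim(
          claim="The Senate approved the bill.",
          evidence="The bill passed the Senate 61-38 on Tuesday.",
      ))  # typically prints "Supported"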

    Stack and models

    • Backend: Python, PostgreSQL.

    • Crawling / aggregation: scheduled scrapers + RSS + manually curated source lists.

    • Bias / propaganda detection: transformer-based classifiers fine-tuned on public political-news datasets, plus some hand-engineered features (e.g., source-level priors, readability, sentiment). In offline tests I get 93% accuracy on bias detection (happy to share more detail if people care).

    • Claim extraction: sentence segmentation + a lightweight classifier to label check-worthy clauses (counts, quotes, time-bound events, entity claims); a rough sketch follows this list.

    • Fact-checking: MNLI model (currently DeBERTa-based) over (claim, evidence-passage) pairs with some heuristics to merge multiple snippets.

    • Frontend: Angular + server-rendered news pages for speed and SEO.
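
    To give a flavor of the claim-extraction step, here is a deliberately simplified sketch. The regex signals stand in for the lightweight classifier's features (entity claims would need NER on top), so it is illustrative only:

      import re

      # Each regex flags one kind of check-worthy signal from the list above.
      SIGNALS = {
          "number": re.compile(r"\d"),  # counts, vote tallies, percentages
          "quote": re.compile(r'["“”]'),  # direct quotes
          "time": re.compile(
              r"\b(?:Mon|Tues|Wednes|Thurs|Fri|Satur|Sun)day\b"
              r"|\b(?:January|February|March|April|May|June|July|August"
              r"|September|October|November|December)\b"
          ),  # time-bound events
      }

      def split_sentences(text: str) -> list[str]:
          # Naive segmentation; the real pipeline uses a proper splitter.
          return [s.strip() for s in re.split(r"(?<=[.!?”])\s+", text) if s.strip()]

      def extract_claims(text: str) -> list[dict]:
          claims = []
          for sentence in split_sentences(text):
              hits = [name for name, pat in SIGNALS.items() if pat.search(sentence)]
              if hits:  # keep only sentences with at least one signal
                  claims.append({"claim": sentence, "signals": hits})
          return claims

      print(extract_claims(
          'The bill passed 61-38 on Tuesday. "A historic day," the senator said.'
      ))
      # -> [{'claim': 'The bill passed 61-38 on Tuesday.',
      #      'signals': ['number', 'time']},
      #     {'claim': '"A historic day," the senator said.',
      #      'signals': ['quote']}]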

    The methodology is documented in more detail here:

    https://neutralnewsai.com/methodology

    What I’m unsure about

    • How far I can push MNLI-style models before needing a more explicit retrieval-augmented system or custom architectures.

    • Whether my current claim extraction approach is good enough for high-stakes use, or if I should move to a more formal information extraction pipeline.

    • How to expose uncertainty and failure modes in a way that’s actually useful for non-technical readers.

    Why I’m posting

    I’d like feedback from this community on:

    • ML / NLP choices you strongly disagree with.

    • Evaluation: what would be a more convincing test suite or benchmark?

    • UI/UX for showing “supported/refuted/inconclusive” without overselling model confidence.

    I’m very open to critique. If you think this is conceptually wrong or socially dangerous, I’d also like to hear that argument.

    Thanks for reading,

    Marcell