Automated PR risk scoring with LLMs

(github.com)

1 point | by KinanNasri 12 hours ago

1 comment

  • KinanNasri 12 hours ago

    I built PRScope after noticing that large pull requests often get merged without anyone fully understanding their real risk surface.

    The idea is simple:

    • Take the raw unified diff from a GitHub PR
    • Parse and structure it
    • Feed it into an LLM with a deterministic prompt
    • Generate a structured Markdown review that includes:
      • Severity levels
      • Risk assessment
      • Suggested improvements
      • Positive observations
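    For illustration, the "parse and structure" step might look like this minimal sketch (a hypothetical helper, not PRScope's actual parser), which splits a unified diff into per-file entries with hunk and change counts:

```python
def parse_unified_diff(diff_text):
    """Split a unified diff into per-file entries with hunk and line-change counts."""
    files = []
    current = None
    for line in diff_text.splitlines():
        if line.startswith("diff --git"):
            # "diff --git a/path b/path" -> take the new-side path
            current = {"path": line.split(" b/")[-1], "hunks": 0, "added": 0, "removed": 0}
            files.append(current)
        elif current is None:
            continue  # skip any preamble before the first file header
        elif line.startswith("@@"):
            current["hunks"] += 1
        elif line.startswith("+") and not line.startswith("+++"):
            current["added"] += 1
        elif line.startswith("-") and not line.startswith("---"):
            current["removed"] += 1
    return files
```

    A structured form like this also gives the LLM stable anchors (file paths, hunk headers) to reference in its findings.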

    The hard parts weren’t “using AI” — they were:

    • Handling large diffs without blowing token limits
    • Keeping output consistent across runs
    • Avoiding hallucinated issues
    • Making scoring feel rational instead of arbitrary
    • Supporting both hosted APIs and local inference (Ollama)
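    For the token-limit problem, one common approach (sketched here as an assumption, not necessarily PRScope's strategy) is to split the diff per file and greedily pack files into batches under a rough token budget, reviewing each batch in its own LLM call:

```python
def chunk_files(file_diffs, max_tokens=6000, chars_per_token=4):
    """Greedily pack (path, diff_text) pairs into batches whose estimated
    token cost stays under max_tokens. Uses a crude chars/4 token estimate."""
    batches, batch, used = [], [], 0
    for path, text in file_diffs:
        cost = len(text) // chars_per_token + 1
        if batch and used + cost > max_tokens:
            batches.append(batch)   # flush the current batch before it overflows
            batch, used = [], 0
        batch.append((path, text))
        used += cost
    if batch:
        batches.append(batch)
    return batches
```

    A single file larger than the budget still gets its own batch here; handling that case usually means further splitting by hunk.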

    One thing that significantly improved reliability was separating the prompt into two phases:

    • Analysis phase
    • Structured output phase

    It currently works as a CLI and GitHub Action.

    I’m especially curious about:

    • Deterministic scoring approaches
    • Handling monorepo-scale PRs
    • Preventing false positives in AI review
    • CI performance tradeoffs
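    On deterministic scoring, one approach worth discussing (a hypothetical sketch, not a description of PRScope's implementation) is to restrict the model to classifying findings against a fixed severity rubric, then compute the numeric score in ordinary code, so identical findings always yield identical scores:

```python
# Fixed rubric weights (illustrative values, not PRScope's)
SEVERITY_WEIGHTS = {"critical": 10, "high": 5, "medium": 2, "low": 1}

def risk_score(findings, cap=100):
    """Sum fixed weights over model-reported severity labels.
    The LLM never emits the number itself, only per-finding labels,
    so the arithmetic is deterministic and auditable."""
    raw = sum(SEVERITY_WEIGHTS.get(f.get("severity", ""), 0) for f in findings)
    return min(raw, cap)
```

    Unknown labels score zero rather than raising, which keeps a slightly malformed model response from failing the whole run.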

    Repo: https://github.com/KinanNasri/PRScope

    Happy to answer questions.