Automated PR risk scoring with LLMs

(github.com)

1 point | by KinanNasri 12 hours ago

1 comment

  • KinanNasri 12 hours ago

    I built PRScope after noticing that large pull requests often get merged without anyone fully understanding their real risk surface.

    The idea is simple:

    • Take the raw unified diff from a GitHub PR
    • Parse and structure it
    • Feed it into an LLM with a deterministic prompt
    • Generate a structured Markdown review that includes:
      • Severity levels
      • Risk assessment
      • Suggested improvements
      • Positive observations
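    For illustration, the "parse and structure" step might look like this minimal sketch (a hypothetical helper, not PRScope's actual parser), which splits a unified diff into per-file entries with hunk and change counts:

```python
def parse_unified_diff(diff_text):
    """Split a unified diff into per-file entries with hunk and line-change counts."""
    files = []
    current = None
    for line in diff_text.splitlines():
        if line.startswith("diff --git"):
            # "diff --git a/path b/path" -> take the new-side path
            current = {"path": line.split(" b/")[-1], "hunks": 0, "added": 0, "removed": 0}
            files.append(current)
        elif current is None:
            continue  # skip any preamble before the first file header
        elif line.startswith("@@"):
            current["hunks"] += 1
        elif line.startswith("+") and not line.startswith("+++"):
            current["added"] += 1
        elif line.startswith("-") and not line.startswith("---"):
            current["removed"] += 1
    return files
```

    A structured form like this also gives the LLM stable anchors (file paths, hunk headers) to reference in its findings.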

    The hard parts weren’t “using AI” — they were:

    • Handling large diffs without blowing token limits
    • Keeping output consistent across runs
    • Avoiding hallucinated issues
    • Making scoring feel rational instead of arbitrary
    • Supporting both hosted APIs and local inference (Ollama)
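    For the token-limit problem, one common approach (sketched here as an assumption, not necessarily PRScope's strategy) is to split the diff per file and greedily pack files into batches under a rough token budget, reviewing each batch in its own LLM call:

```python
def chunk_files(file_diffs, max_tokens=6000, chars_per_token=4):
    """Greedily pack (path, diff_text) pairs into batches whose estimated
    token cost stays under max_tokens. Uses a crude chars/4 token estimate."""
    batches, batch, used = [], [], 0
    for path, text in file_diffs:
        cost = len(text) // chars_per_token + 1
        if batch and used + cost > max_tokens:
            batches.append(batch)   # flush the current batch before it overflows
            batch, used = [], 0
        batch.append((path, text))
        used += cost
    if batch:
        batches.append(batch)
    return batches
```

    A single file larger than the budget still gets its own batch here; handling that case usually means further splitting by hunk.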

    One thing that significantly improved reliability was separating the prompt into two phases:

    • Analysis phase
    • Structured output phase

    It currently works as a CLI and GitHub Action.

    I’m especially curious about:

    • Deterministic scoring approaches
    • Handling monorepo-scale PRs
    • Preventing false positives in AI review
    • CI performance tradeoffs
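    On deterministic scoring, one approach worth discussing (a hypothetical sketch, not a description of PRScope's implementation) is to restrict the model to classifying findings against a fixed severity rubric, then compute the numeric score in ordinary code, so identical findings always yield identical scores:

```python
# Fixed rubric weights (illustrative values, not PRScope's)
SEVERITY_WEIGHTS = {"critical": 10, "high": 5, "medium": 2, "low": 1}

def risk_score(findings, cap=100):
    """Sum fixed weights over model-reported severity labels.
    The LLM never emits the number itself, only per-finding labels,
    so the arithmetic is deterministic and auditable."""
    raw = sum(SEVERITY_WEIGHTS.get(f.get("severity", ""), 0) for f in findings)
    return min(raw, cap)
```

    Unknown labels score zero rather than raising, which keeps a slightly malformed model response from failing the whole run.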

    Repo: https://github.com/KinanNasri/PRScope

    Happy to answer questions.