Reader – web scraping that outputs clean Markdown for LLMs

(github.com)

3 points | by nihalwashere 10 hours ago

1 comment

nihalwashere 10 hours ago

I built an open-source web scraping engine for LLMs. It is self-hostable and Docker-ready.
I kept rebuilding web scraping infrastructure for AI projects, so I open-sourced it.
Two functions:
- `scrape()` – any URL to markdown
- `crawl()` – entire sites with depth/page limits
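To make "depth/page limits" concrete, here is a minimal sketch of how the two limits can bound a breadth-first crawl. It is not Reader's API or implementation: `crawlOrder`, `LinkGraph`, `CrawlLimits`, `maxDepth`, and `maxPages` are hypothetical names, and an in-memory link map stands in for real page fetches.

```typescript
// Hypothetical sketch: NOT Reader's actual API or implementation.
// An in-memory map of page -> outgoing links stands in for real fetching.
type LinkGraph = Record<string, string[]>;

interface CrawlLimits {
  maxDepth: number; // 0 means "only the start page"
  maxPages: number; // hard cap on the number of pages visited
}

function crawlOrder(start: string, links: LinkGraph, limits: CrawlLimits): string[] {
  const visited: string[] = [];
  const seen = new Set<string>([start]);
  // Queue of [url, depth] pairs, processed first-in, first-out (breadth-first).
  const queue: Array<[string, number]> = [[start, 0]];

  while (queue.length > 0 && visited.length < limits.maxPages) {
    const [url, depth] = queue.shift()!;
    visited.push(url);

    // Pages at the depth limit are visited, but their links are not followed.
    if (depth >= limits.maxDepth) continue;

    for (const next of links[url] ?? []) {
      if (!seen.has(next)) {
        seen.add(next); // mark on enqueue so a page is never queued twice
        queue.push([next, depth + 1]);
      }
    }
  }
  return visited;
}

// A tiny made-up site for illustration.
const site: LinkGraph = {
  "/": ["/docs", "/blog"],
  "/docs": ["/docs/install", "/"],
  "/blog": ["/blog/post-1"],
};

console.log(crawlOrder("/", site, { maxDepth: 1, maxPages: 10 }));
// → [ '/', '/docs', '/blog' ]
console.log(crawlOrder("/", site, { maxDepth: 2, maxPages: 4 }));
// → [ '/', '/docs', '/blog', '/docs/install' ]
```

In the first call the depth limit stops the crawl; in the second the page cap does. A real crawler would fetch and parse pages asynchronously, but the bookkeeping for the two limits would look much the same.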
Self-hosting:
- Docker image available
- Manages its own browser pool
- Proxy support built-in
- No external dependencies
Deployment guide: https://docs.reader.dev/documentation/guides/deployment
GitHub: https://github.com/vakra-dev/reader
Built with TypeScript, runs anywhere you can run Docker. Happy to answer questions about the architecture or deployment.