In an effort to make the Epstein Files easier to navigate, I’m using OpenAI’s GPT to generate concise, searchable descriptions for each file.* The goal is simple: with solid SEO (and maybe a hug from HN), we can get these documents indexed so people can find what they need through normal search engines—no specialized tools required.
The DOJ already offers decent full-text search for the PDF documents, so I’m skipping those for now and focusing on the first batch of photos, where searchability is much weaker.
* So far I’ve only processed the first ~1,000 files from the EFTA dataset. I’m sharing early to gauge interest before I continue scaling it up.