Some secret management belongs in your HTTP proxy

(blog.exe.dev)

46 points | by tosh 5 days ago

15 comments

sakisv 3 days ago

This feels like a good idea in principle, but I can't shake the feeling that it just moves the goalposts one step away:
Now your app doesn't have direct access to your stripe/github/aws/whatever keys (which is good!) but you still need to have _some_ authentication against your proxy.
If you have per-app authentication and your app's key leaks, whoever holds it can reach every external service your app can, i.e. with one key you lose everything. On the other hand, if you have per-endpoint authentication, then you didn't really solve anything: you still have to manage X secrets.
Even worse, from the perspective of the team who owns and runs the proxy, chances are you are going to use per-app AND per-endpoint authentication, because this will allow you to revoke bad keys without breaking everyone else, etc.
What this really solves is subscription management for (big?) organisations. Now that you have a proxy, you only need a single key to talk to <external-service>, with no need to manage subscriptions, user onboarding and offboarding, etc. You just need to negotiate rate limits.
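A rough sketch of the per-app AND per-endpoint idea, assuming a hypothetical in-memory policy table on the proxy. `ProxyPolicy`, the tokens, and the hosts are all made up for illustration; a real proxy would persist this somewhere and authenticate the token properly.

```python
from dataclasses import dataclass, field


@dataclass
class ProxyPolicy:
    # Hypothetical table: app token -> set of upstream hosts it may reach.
    grants: dict[str, set[str]] = field(default_factory=dict)

    def allow(self, app_token: str, upstream_host: str) -> None:
        self.grants.setdefault(app_token, set()).add(upstream_host)

    def revoke_app(self, app_token: str) -> None:
        # Revoking one app's token leaves every other app untouched.
        self.grants.pop(app_token, None)

    def revoke_endpoint(self, app_token: str, upstream_host: str) -> None:
        # Narrower revocation: cut one app off from one upstream only.
        self.grants.get(app_token, set()).discard(upstream_host)

    def is_allowed(self, app_token: str, upstream_host: str) -> bool:
        return upstream_host in self.grants.get(app_token, set())


policy = ProxyPolicy()
policy.allow("billing-app-token", "api.stripe.com")
policy.allow("ci-app-token", "api.github.com")
```

With this shape a leaked `billing-app-token` only reaches `api.stripe.com`, and `revoke_app` kills it without touching `ci-app-token`. The upstream keys themselves never leave the proxy.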

- ithkuil 2 days ago
  
  The auth between your app and the proxy can be scoped more easily.
  For example if the proxy runs in localhost you can trust the localhost workload.
  Or you can use some other kind of workload identity proof (like cloud based metadata servers). If you leak such a key no other VM can use it, because it's scoped to your VM
  
  - arianvanp 2 days ago
    
    That's not true. Neither AWS's nor GCP's workload identity tokens are bound to the VM. If you leak the credentials, they're valid until they expire. On AWS the expiry is 6 hours (non-configurable). Even if your IAM role has a shorter expiration, the credentials assumed by the VM will always be valid for 6 hours.
    
    - ithkuil a day ago
      
      That entirely depends on the location of the proxy and the extra conditions you can express. E.g. you could bind it to a source IP and have the proxy check that, or use some overlay network (like tailscale does)
      My point was that you don't literally have to run the proxy on localhost in order to scope the request.
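      A minimal sketch of the source-IP binding, assuming the proxy keeps a hypothetical table of which networks each app token may come from. The tokens and CIDR ranges are invented, and this only helps if the proxy sees the real peer address (no spoofable forwarding headers in between).

```python
import ipaddress

# Hypothetical binding: each app token is only honored from these networks.
TOKEN_NETWORKS = {
    "billing-app-token": [ipaddress.ip_network("10.0.5.0/24")],
    "local-agent-token": [ipaddress.ip_network("127.0.0.0/8")],
}


def request_is_in_scope(app_token: str, source_ip: str) -> bool:
    """Return True only if the token is presented from a network bound to it."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in TOKEN_NETWORKS.get(app_token, []))
```

      A token copied off the VM and replayed from elsewhere then fails the check even though the token itself is still valid.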
MyUltiDev 3 days ago

The GitHub App angle is the interesting half here. It is the one integration where rotation is genuinely free, because you get first-class refresh semantics rather than bolted-on PAT expiry (the 90-day-and-forget-on-vacation failure mode you describe is painfully familiar). For the plain-header case like the Stripe curl example earlier in the post, I've been running similar setups across a few cloud providers, and rotation is where it breaks in practice: proxies that don't hot-reload the injected credential when upstream issues a new one. The TLS termination piece tends to get most of the architectural attention but is usually the easier half once you're already owning the proxy.
For the integrations that aren't GitHub-style OAuth Apps, where upstream just ships a long-lived API key and someone still has to rotate it, how are you planning to handle the refresh lifecycle on the exe.dev side? Is that declared per-integration, or is the proxy expected to notice 401s and pull a fresh credential from somewhere upstream?
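To make the hot-reload point concrete, here is one possible shape, not a claim about how exe.dev does it: the proxy reads the injected key from a file and re-reads it whenever the file's modification time changes. `ReloadingSecret` and `inject_auth` are hypothetical names, and the file-based delivery is an assumption.

```python
import os
from pathlib import Path


class ReloadingSecret:
    """Re-read a credential file whenever its modification time changes."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self._mtime_ns = -1
        self._value = ""

    def get(self) -> str:
        mtime_ns = os.stat(self.path).st_mtime_ns
        if mtime_ns != self._mtime_ns:
            # File was replaced or rewritten since the last read.
            self._value = self.path.read_text().strip()
            self._mtime_ns = mtime_ns
        return self._value


def inject_auth(headers: dict[str, str], secret: ReloadingSecret) -> dict[str, str]:
    # Build the outgoing headers per request so a rotated key is picked up.
    return {**headers, "Authorization": f"Bearer {secret.get()}"}
```

The alternative in the question (retry after a 401 with a freshly pulled credential) would wrap the upstream call instead of the header construction.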
thewisenerd 3 days ago

we recently moved to a similar approach, inspired by gondolin which does the same: https://earendil-works.github.io/gondolin/secrets/
an 'mitm' tls proxy also gives you much better firewalling capabilities [1]. not that firewalls aren't inherently leaky:
codex's is a 'wildcard' based one [2], hence "easy" to bypass [3]; github's list is slightly better [4], but ymmv
[1] than a rudimentary "allow based on nslookup $host" we're seeing on new sandboxes popping up, esp. when the backing server may have other hosts.
[2] https://developers.openai.com/codex/cloud/internet-access#co...
[3] https://embracethered.com/blog/posts/2025/chatgpt-codex-remo...
[4] https://docs.github.com/en/copilot/reference/copilot-allowli...
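a toy illustration of why wildcard entries leak, assuming made-up hostnames (these are not the actual codex or github lists): if any tenant can register a name under the wildcarded suffix, the wildcard admits it, while an exact-host check on what the proxy sees does not.

```python
from fnmatch import fnmatchcase

# Hypothetical allowlists for illustration only.
EXACT_ALLOWLIST = {"assets.storage.example.com"}
WILDCARD_ALLOWLIST = ["*.storage.example.com"]


def allowed_by_exact(host: str) -> bool:
    return host.lower() in EXACT_ALLOWLIST


def allowed_by_wildcard(host: str) -> bool:
    return any(fnmatchcase(host.lower(), pat) for pat in WILDCARD_ALLOWLIST)


# A name an attacker could plausibly register on a shared service.
attacker_host = "attacker-bucket.storage.example.com"
```

here `allowed_by_wildcard(attacker_host)` is true and `allowed_by_exact(attacker_host)` is false.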
danlitt 3 days ago

Rewriting the URL sounds like it would also allow hitting a dummy server in tests. But how does the rewrite actually happen? If you have the literal URL in your code, then fine, but what if you don't?
rtrgrd 3 days ago

Confused here - setting up certs to MITM https requests to add a header seems like a decently big security risk?

- Wuzzy 3 days ago
  
  I agree that there are downsides to this approach. NVIDIA OpenShell does the same thing: https://docs.nvidia.com/openshell/latest/sandboxes/manage-pr.... I had wondered how they deal with the fact that client programs sometimes come with their own CA bundles. Turns out OpenShell sets various common environment variables (like REQUESTS_CA_BUNDLE used by Python's requests) to try to convince as many clients as possible that the proxy's certificate is to be trusted :) I would assume exe.dev does something similar.
  (I was interested in this because I was actually working on something similar recently: https://github.com/imbue-ai/latchkey. To avoid the certificates issue, this library uses a gateway approach instead of a proxy, i.e. clients call endpoints like "http(s)://gateway.url:port/gateway/https://api.github.com/..." which can be effectively hidden behind the "latchkey curl" invocation.)
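  A sketch of the gateway URL shape described above, not latchkey's actual code: the gateway pulls the embedded upstream URL out of the request path and refuses anything that is not an absolute https URL. `GATEWAY_PREFIX` and `upstream_from_gateway_path` are hypothetical names.

```python
from urllib.parse import urlsplit

GATEWAY_PREFIX = "/gateway/"


def upstream_from_gateway_path(raw_path: str) -> str:
    """Extract the embedded upstream URL from a gateway request path."""
    if not raw_path.startswith(GATEWAY_PREFIX):
        raise ValueError("not a gateway path")
    upstream = raw_path[len(GATEWAY_PREFIX):]
    parts = urlsplit(upstream)
    # Only forward to absolute https URLs with a real host.
    if parts.scheme != "https" or not parts.hostname:
        raise ValueError("upstream must be an absolute https URL")
    return upstream
```

  Since the client talks plain HTTP(S) to the gateway's own hostname, no extra CA has to be trusted on the client side.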
  
  - thewisenerd 3 days ago
    thankfully more and more projects are supporting the "standard" SSL_CERT_DIR/SSL_CERT_FILE environment variables [1]
    i think requests is a tricky one, as it _should_ be supporting it already based on the PR [2], but looks like it was merged in the 3.x branch and idk where that is, release-wise.
    there is also native TLS on linux (idk what exactly you call it); but
```
    # update-ca-certificates only picks up files ending in .crt
    cp cert.pem /usr/local/share/ca-certificates/cert.crt && update-ca-certificates
```
    all languages also seem to have packages around providing cert bundles which get used directly (e.g., certifi [3]), which does cause some pain
    [1] https://github.com/rustls/rustls-native-certs/issues/16#issu...
    [2] https://github.com/psf/requests/issues/2899
    [3] https://pypi.org/project/certifi/
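    a small sketch of the env-var approach, assuming you launch the client as a child process. only the two file-based variables named in this thread are set; other clients read other variables, and `env_trusting_proxy_ca` is a made-up helper name.

```python
import os

# Variables named in this thread; other clients may read different ones.
CA_ENV_VARS = ("SSL_CERT_FILE", "REQUESTS_CA_BUNDLE")


def env_trusting_proxy_ca(ca_bundle_path: str, base_env=None) -> dict[str, str]:
    """Return a copy of the environment that points clients at the proxy's CA bundle."""
    env = dict(os.environ if base_env is None else base_env)
    for name in CA_ENV_VARS:
        env[name] = ca_bundle_path
    return env
```

    the result can be passed as `env=` to `subprocess.run`. clients that ship their own bundle and ignore these variables will still fail.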
    - thewisenerd 3 days ago
      
      slightly related: one of the more interesting issues i've faced due to mitm tls by the $job mandated CASB (cloud-access security broker) was when python 3.13 [1] introduced some stricter validations and the CASB issued certs were not compliant (missing AKI), which broke REQUESTS_CA_BUNDLE/SSL_CERT_FILE for us
      [1] https://discuss.python.org/t/python-3-13-x-ssl-security-chan...
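      for reference, a sketch of the usual stopgap, assuming you control how the client builds its `ssl.SSLContext` (many libraries build their own, so this may not apply): since python 3.13, `ssl.create_default_context()` turns on `VERIFY_X509_STRICT` by default, and clearing that flag restores the older, laxer checks. reissuing the CASB certs with an AKI is the real fix; `context_without_strict_x509` is a made-up helper name.

```python
import ssl


def context_without_strict_x509(cafile=None) -> ssl.SSLContext:
    """Default client context with the strict X.509 checks switched off."""
    ctx = ssl.create_default_context(cafile=cafile)
    # Clearing the flag relaxes validation; hostname and chain checks stay on.
    ctx.verify_flags &= ~ssl.VERIFY_X509_STRICT
    return ctx
```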
- HumanOstrich 3 days ago
  
  Things aren't just "good" or "bad". There are tradeoffs to consider.
thesnide 2 days ago

it somehow reminds me of stunnel, which if done over unix socket can be a very interesting proposition.
pamcake 2 days ago

s/your/a
You may not want to be doing this at the edge.