we run ~10k agent pods on k3s and went with gvisor over microvms purely for density. the memory overhead of a dedicated kernel per tenant just doesn't scale when you're trying to pack thousands of instances onto a few nodes. strict network policies and pid limits cover most of the isolation gaps anyway.
Yeah, when you run ≈10k agents instead of ≈10, you need a different solution :)
I’m curious what gVisor is getting you in your setup — of course gVisor is good for running untrusted code, but would you say that gVisor prevents issues that would otherwise make the agent break out of the kubernetes pod? Like, do you have examples you’ve observed where gVisor has saved the day?
we run ~10k agent pods on k3s and went with gvisor over microvms purely for density. the memory overhead of a dedicated kernel per tenant just doesn't scale when you're trying to pack thousands of instances onto a few nodes. strict network policies and pid limits cover most of the isolation gaps anyway.
Yeah, when you run ≈10k agents instead of ≈10, you need a different solution :)
I’m curious what gVisor is getting you in your setup — of course gVisor is good for running untrusted code, but would you say that gVisor prevents issues that would otherwise make the agent break out of the kubernetes pod? Like, do you have examples you’ve observed where gVisor has saved the day?