We did just launch this last week after a good bit of work from the team. Steve wrote up a deeper technical dive here if anyone is interested - https://flox.dev/blog/kubernetes-uncontained-explained-unloc...
Sure, it's easy to stand up a mail server in NixOS, or to just use docker/kubernetes to deploy stuff. But after a few years it felt like I didn't have a single, coherent understanding of the stack. When shit hits the fan, that makes it very difficult to troubleshoot.
I am now back on running my servers on FreeBSD/OpenBSD and jails or VMM respectively. And also dumbing the stack down to just "run it in a jail, but set it up manually".
The only outlier is Immich. For some reason they officially support only the docker images, without a single clear instruction on how to set it up manually. Sure, I could look at the Dockerfiles, but many of the scripts also expect docker to be present.
And now that FreeBSD also has reproducible builds, that removes one more reason to reach for Nix.
Also, I think it's a huge ecosystem win that FreeBSD is pushing on reproducibility too. I think we are trending in a direction where this just becomes a critical principle for certain stacks (also needed when you dive into AI stacks/infra...).
I like it when my system comes with a complete set of manpages and good docs.
But you mentioned Flox, which I didn't even know about. First I thought that's what they renamed the Nix fork to after the schism, but now I see it's a paid product and yuck... it just further deepens my belief in going for more bare-bones manual control, even if it's sometimes bothersome.
We have six dev teams and are just about done with migrating to k8s. It's an immense improvement over what we had before.
It's a version of Greenspun's tenth rule: "Any sufficiently complicated distributed system contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Kubernetes."
At orgs significantly larger than that, the kube team has to aggressively spin out platform functions that enable further layering, or risk getting overwhelmed trying to support and configure kube features to cover diverse team needs (storage software doesn't have the same needs or concerns as middleware or the frontend, say). This incubator model isn't easy in practice. Adopting kube at this scale is very challenging because it requires the kube team to spin up (and out) sub-teams at a very high rate; otherwise the migration slows to a crawl or fails outright, and teams end up purchasing off-the-shelf offerings, e.g. from AWS, because they need to offboard their previous platform.
Besides, this particular comment would need to explain why it’s likely that its unsupported opinion is correct, rather than all the counterexamples that exist in the actual highly competitive industry that’s heavily using this system.
There’s a fundamental conflict there that doesn’t work in favor of your inchoate opinion. The most likely conclusion is that there are factors at work here that you don’t understand.
The solution is to decompose the docker images and make sure that every layer is hash-equivalent, so that when people update their CUDA version, it doesn't needlessly invalidate the unchanged Python layers.
But it looks like Flox now simplifies this via Nix. Every Nix package already has a hash and you can combine packages however you would like.
With the Kubernetes shim, you can run the hash-pinned environments without building or pulling an image at all. It starts the pod with a stub, then activates the exact runtime from a node-local store.
After spending a few years using nix, the docker image situation looks pretty bonkers. If two files end up in separate layers, the system assumes a dependency, so if the lower file changes you need to build a separate copy of the higher one just in case there's an actual dependency there.
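That layer coupling can be sketched with Docker's chain-ID scheme (a simplified model with made-up layer contents): each layer's identity chains in everything below it, so an updated lower layer forces a new identity for a byte-identical upper layer.

```shell
# Simplified model of Docker/OCI layer chaining (layer contents are
# hypothetical): chainID(n) = sha256(chainID(n-1) + " " + diffID(n)),
# so a change in a lower layer changes every chain ID above it.
cuda_old=$(printf 'cuda-12.2' | sha256sum | cut -d' ' -f1)
cuda_new=$(printf 'cuda-12.3' | sha256sum | cut -d' ' -f1)
python=$(printf 'python-3.11' | sha256sum | cut -d' ' -f1)

chain_old=$(printf '%s %s' "$cuda_old" "$python" | sha256sum | cut -d' ' -f1)
chain_new=$(printf '%s %s' "$cuda_new" "$python" | sha256sum | cut -d' ' -f1)

# The Python layer content is identical in both images, yet its
# chain ID differs, so it gets stored and shipped twice:
[ "$chain_old" != "$chain_new" ] && echo "python layer duplicated"
```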
Within nix you can be more precise about what depends on what, which is nice, but you do have to be thoughtful about it or you can summon the same footgun that got you with docker, just in smaller form. A nix derivation, while a box with nicely labeled inputs and outputs, is still a black box: if you list a readme as an input to a derivation that does a build, nix will assume that the compiled binary depends on it, and when you fix a typo in the readme and rebuild, you'll end up with a duplicate binary in the nix store even though the binary's contents don't actually depend on the readme's text.
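A sketch of that footgun and one common mitigation, using a hypothetical derivation (`builtins.filterSource` is one standard way to trim a derivation's inputs):

```nix
stdenv.mkDerivation {
  name = "app";
  # Naively, `src = ./.;` would capture README.md, so fixing a typo
  # there changes the input hash and Nix rebuilds (and stores) a
  # byte-identical binary. Filtering the source avoids that:
  src = builtins.filterSource
    (path: type: baseNameOf path != "README.md")
    ./.;
  buildPhase = "cc -o app main.c";
  installPhase = "install -Dm755 app $out/bin/app";
}
```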
> you can combine packages however you would like
So this is true, more or less, but be aware that while nix lets you do this in ways that don't force needless duplication, it doesn't force you to avoid that duplication. Things carelessly packaged with nix can easily recreate the problem you mentioned with docker.
One of my issues with Nix is the black box that is the store. Maybe it's just my system, but over time I find it full of redundant files and orphans, with no obvious way to flatten or clean it safely without breaking something.
I wonder how flox solves this.
When I have space problems, I run nix-collect-garbage and they're gone; later I do it again. It could be that I'm just avoiding functionality that it breaks, though.
On store bloat: Flox makes it clearer what's in use (explicit environments vs. implicit dependencies), but you still need nix-collect-garbage.
The store accumulates cruft; that's Nix reality, and we haven't changed it.
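For reference, the standard cleanup commands (assuming a typical Nix install; anything not reachable from a GC root gets deleted, so check roots first):

```shell
nix-store --gc --print-roots   # list the GC roots keeping paths alive
nix-collect-garbage            # delete store paths unreachable from any root
nix-collect-garbage -d         # also delete old profile generations first
nix-store --optimise           # hard-link identical files to reclaim space
```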
The license file in their github seems to indicate that it is. https://github.com/flox/flox?tab=GPL-2.0-1-ov-file
I'm curious! and ignorant! help!
Is that via (centrally?) cached eval? or what? there's only so much room for magic in this arena.
When you create a Flox environment, we evaluate the Nix expressions once and store the concrete result (i.e. the exact store paths) in FloxHub. The k8s node just fetches that pre-rendered manifest and bind-mounts the packages, with no evaluation step at pod startup.
It's like the difference between giving the node a recipe to interpret vs. giving it a shopping list of exact items. Faster, safer, and the node doesn't need to know how to cook (evaluate Nix). I don't know, there's a metaphor here somewhere, I'll find it.
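To make the "shopping list" concrete: the pre-rendered result is essentially a list of exact store paths. This is a hypothetical illustration (fake hashes, made-up field names), not FloxHub's actual manifest format:

```json
{
  "environment": "example-org/web-api",
  "storePaths": [
    "/nix/store/aaaa...-glibc-2.38",
    "/nix/store/bbbb...-python3-3.11.6",
    "/nix/store/cccc...-web-api-1.4.2"
  ]
}
```

The node needs no Nix evaluator to act on this; it only has to fetch and mount paths by hash.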
Only so much room for magic, for sure, but tons of room for efficiency and optimization.
https://github.com/pdtpartners/nix-snapshotter
So allowing images to be pulled from the nix store, mounting a shared host nix store per node into each container, fast incremental rebuilds, and generating basic pod configs are all good things.
And local, CI, and remote runs share the same flows and environments.
Re: Relationship to nix-snapshotter and prior art
This is original work, though very much built on prior innovations. Our approach hooks into the upstream containerd runc shim to pull the FloxHub-managed environment and bind-mount the closure at startup. The key distinction is that we use the way Flox environments are rendered to avoid Nix evaluation entirely, making it safe and fast for a k8s node to realize packages directly on the node. It's less about images and containers, per se, and more about bringing the power of Flox and Nix from the build-time end of the SDLC to the runtime end.
The cache story is surprisingly strong: nix store paths effectively behave like layers in the node’s registry, but with dramatically higher hit rates -- often across entirely unrelated pod deployments. Because all pods rely on the same underlying system libraries drawn from the “quantized” Flox catalog, different environments naturally share glibc, core utilities, and common dependencies, where traditional containers typically share nothing.
Tools like nix-snapshotter, Nixery, and others have pioneered this space and we're grateful for that work. This rising "post-Docker" tide raises all ships.
Re: Open Source
The software is brand new -- only slightly older than Ron’s baby -- and currently in alpha. KubeCon was our first opportunity for broad feedback, and we uncovered a few issues we’re still addressing. Our intent is to open-source the project once we’ve fully vetted the approach, ideally in the coming weeks.
Yes, we launched early and the product is imperfect, but we’re doing so transparently and with a commitment to getting it right and releasing it to the community. We will continue to release early and often.
Re: Abstraction depth concerns
I appreciate @rootnod3’s point about deeper abstractions complicating debugging. We’re thinking hard about how to keep things simple for people who need to run and fix systems quickly. It’s encouraging to see the broader ecosystem (like FreeBSD) lean further into reproducibility, especially as AI-centric stacks make this increasingly important.
Re: Nix vs traditional approaches
Skilled Dockerfile authors can achieve great caching results -- you can pin, you can prune registries, etc. -- but our goal is to make these best practices the default. Nix enables finer-grained caching and a universal packaging format for building and consuming open source software.
We see intrinsic value in Flox environments -- whether on the CLI, k8s, Nomad down the road, or other platforms. Our aim is for Flox environments to be as universal and natural as Nix packages themselves -- essentially extending “flox activate” into the k8s world.
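For a sense of what extending "flox activate" into the k8s world means in practice, a sketch of the CLI flow (the package name is illustrative and exact flags may differ; the point is that one environment definition travels from laptop to cluster):

```shell
flox init                  # create an environment in ./.flox
flox install python312     # pin a package into the environment
flox activate              # enter the environment locally
flox push                  # publish it to FloxHub, where the k8s shim can fetch it
```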
We likewise got a ton of valuable feedback at KubeCon, most of which was validating and all of which was very much in line with this conversation.
A dilemma I’m facing is that the win from nix in terms of faster builds and smaller images would come largely from Python and R images (where the average size is often 1Gi or larger). However, the developers who use Python or R are less likely to “get” the point of Nix and might have a steeper learning curve than, say, F# developers (where the builds are already quite efficient).
That was the context, my question is, how’s the integration with Flox and R/RStudio? I know there’s Rix[1] for managing R packages with Nix.
It's hard for me to understand whether I should be excited about this. I think companies do themselves such a huge disservice by not being transparent with the nerds who WILL be the ones helping choose/implement these things. Instead of the current feeling I have, there could be three sentences explaining what Flox offers here beyond what *anyone* can go do right now with nix-snapshotter.
If it's ecosystem stuff (you get Flox's CI, or CLI, or whatever else), that's not very well sold to me on the landing page. Otherwise I'm feeling left empty-handed.
It's not nix-snapshotter, because we skip Nix eval entirely and get much better cache sharing across unrelated workloads (a quantized catalog means everything shares base deps). On "environments": these aren't devshells-as-prod; they're the actual runtime, the same way 'flox activate' works everywhere. You're shipping a declarative, hash-pinned runtime that happens to also work great in dev/CI.
And yeah, we should have been upfront that this is alpha and we're planning to open source it after vetting at KubeCon.
You're right that we're doing ourselves a disservice not being transparent with the technical crowd. What specific technical details would help you evaluate this?