▲We Saved $500k per Year by Rolling Our Own "S3"(engineering.nanit.com)

130 points by mpweiher 9 hours ago | 19 comments

▲varenc 1 hour ago

In HN style, I'm going to diverge from the content and rant about the company:

Nanit needs this storage because they run cloud based baby cameras. Every Nanit user is uploading video and audio of their home/baby live to Nanit without any E2EE. It's a hot mic sending anything you say near it to the cloud.

Their hardware essentially requires a subscription to use, even though it costs $200/camera. You must spend an additional $200 on a Nanit floor stand if you want sleep tracking. This is purely a software limitation since there's plenty of other ways to get an overhead camera mount. (I'm curious how they even detect if you're using the stand since it's just a USB-C cable. Maybe etags?)

Of course Nanit is a popular and successful product that many parents swear by. It just pains me to see cloud based in-home audio/video storage being so normalized. Self-hosted video isn't that hard but no one makes a baby-monitor centric solution. I'm sure the cloud based video storage model will continue to be popular because it's easy, but also because it helps justify a recurring subscription.

edit: just noticed an irony in my comment. I'm ranting about Nanit locking users into their 3rd party cloud video storage, and the article is about Nanit's engineering team moving off a 3rd party (S3) and self-hosting their own storage. Props to them for getting off S3.

▲sbrother 1 hour ago

As a happy customer, I picked nanit because it actually worked. We didn’t even use the “smart” features, but “you can turn on the app from anywhere you happen to be and expect the video feed to work” is unfortunately a bar that no competitor I tried could meet. The others were mostly made by non-software companies with outsourced apps that worked maybe 50% of the time.

I wish we could have local-first and e2ee consumer software for this sort of thing, but given the choice of that or actually usable software, I am going to pick the latter.

▲varenc 1 hour ago

I self host my "baby monitor" with UniFi Protect on UCG-Max and a G6 Instant wireless camera. It's more work to set up, but pretty easy for a techie. It has the "turn on the app anywhere and it works" feature, and with a 2TB SSD I get a month+ of video storage. Because storage is local, it doesn't need to compress the video and I get a super clear 4K image. And I use Homebridge to expose the camera over Apple HomeKit, which is a convenient and more user friendly way to access it. And HomeKit also gives you out-of-home access with a hub. I love my setup, but I couldn't in good conscience recommend it to a non-techie friend, especially if they're sleep deprived from their infant.

But I do miss having baby-specific features like sleep tracking. It has support for crying detection, but that's it.

▲sbrother 1 hour ago

Ok that’s really cool; I didn’t know you could set up Apple’s smart home thingy to forward a live feed to the cloud.

▲varenc 1 hour ago

It's pretty cool! But Homebridge is another service to run in a Docker container, so even less user friendly. But it's definitely the primary way everyone that's not me accesses the baby camera. The out-of-home access requires a "HomeKit Hub" which can just be an Apple TV that's always plugged in. And HomeKit also has a "HomeKit Secure Video" feature which is cloud based video storage, but with E2EE. But I don't really recommend their video storage.

▲spockz 38 minutes ago

Alternatively you can set up a VPN with rules that automatically enable the VPN when you try to connect to specific addresses. Works with Tailscale and on-demand VPN for me. This will work with any IP webcam.

▲vachina 1 hour ago

What competitor have you actually tried? My girlfriend’s parents have a few cheap TPlink solar powered CCTV and they work flawlessly since setup. I used to jerryrig an Android phone for Alfred and that too worked well.

My impression is live feed is a solved problem.

▲sbrother 1 hour ago

I tried a high end Philips one and a Nest camera. Both were way less reliable than the Nanit. Possibly because they didn’t play nicely with my mesh WiFi at home. But regardless I just wanted to vouch for Nanit’s software, whatever they are doing with their networking and UX is really good.

▲jaas 47 minutes ago

Their networking is awful in my experience. The WiFi chip is cheap crap, extremely sensitive, cuts out a lot, and doesn’t support WPA3.

I had to set up a dedicated Nanit-only AP in my house in order to stabilize the connection. It would not work any other way, tried many different configurations, even other APs.

▲vlovich123 1 hour ago

The vtech camera is working well enough for me for what it’s worth. But any such app solution generally implies transfer through the company’s servers.

▲sbrother 1 hour ago

Yeah that’s fair, we had one of those too which absolutely did everything it advertised. The nanit is a different product that doubles as a home camera that lets you monitor your home while you’re away. Its software/networking is impressively reliable.

▲chrismorgan 31 minutes ago

> Every Nanit user is uploading video and audio of their home/baby live to Nanit without any E2EE. It's a hot mic sending anything you say near it to the cloud.

Your way of phrasing it makes it sound like it would be fine to upload the video if it were end-to-end-encrypted. I think this is worth clarifying (since many don’t really understand the E2EE trade-off): E2EE is for smart clients that do all the processing, plus dumb servers that are only used for blind routing and storage. In this instance, it sounds like Nanit aren’t doing any routing or (persistent) storage: the sole purpose of the upload is offloading processing to the cloud. Given that, you can have transport security (typically TLS), but end-to-end encryption is not possible.

If you wanted the same functionality with end-to-end encryption, you’d need to do the video analysis locally, and upload the results, instead of uploading the entire video. This would presumably require more powerful hardware, or some way of offloading that to a nominated computer or phone.

▲BrandoElFollito 19 minutes ago

In other words, E2EE requires two or more clients, and only on these clients the information is in clear.

In the case of this product, there is only one client (and a server).

E2EE then boils down to having the traffic encrypted like you have with an HTTPS website.

▲cbg0 53 minutes ago

> Self-hosted video isn't that hard

Self-hosting video is not something the typical user of a baby monitor would ever even consider.

▲gblargg 22 minutes ago

A microSD card in the camera, like most others use?

From the product description though it sounds like sleep analysis is what you're paying for, which they do on servers analyzing the video.

▲unethical_ban 33 minutes ago

Extraordinary claims require extraordinary evidence.

I'm not leaving a baby at home while I go on vacation. I would never be on another network, even. Why need the cloud?

▲sokoloff 29 minutes ago

Because it’s easy and convenient for new parents.

The typical parent has never heard of Synology or Ubiquiti, doesn’t have a NAS, and gets whatever tech their ISP gave/rents them.

▲chii 22 minutes ago

It's more that a typical parent has not thought of the need to have a baby monitor, until they have a baby (in which case, they're too busy to build out their own baby monitor stack).

Paying money to solve a problem and save time as a parent is a valid business idea/strategy. The externalities that the parents might suffer if these businesses do not completely adhere to good security practices don't seem to come back to bite them (and most parents get lucky and don't have any bad consequences - yet).

▲unethical_ban 20 minutes ago

There is no technical requirement for an easy-to-use baby monitor to be cloud-connected. If there is no easy-to-use baby monitor which is not cloud-connected, that is a market problem, not a technical problem.

▲spockz 40 minutes ago

We just used IP cams with our kids. Now with Ubiquiti it is dead simple to also set up storage for it. I think Synology supports anything that emits RTSP.

Baby monitors around here -Alecto is a popular brand - cost twice as much and have only half the capabilities.

▲kdamica 1 hour ago

We've used an offline Infant Optics baby camera for three kids and have never wished for any of the smart features that online cameras offer. You really just want to know whether they are asleep and when they are crying. I just don't see a good use case for recording all that video for most kids. (I'm sure there are special needs situations where it is helpful)

▲jen20 1 hour ago

This is the reason I refused to buy Nanit cameras, instead opting for unconnected models. E2E encryption is table stakes.

▲hshdhdhehd 1 hour ago

By the way you don't need a video (or hell even audio) baby monitor. Source: 2 kids.

▲NetOpWibby 54 minutes ago

Same here. I wonder if the market is for first-time parents and people who work 8+ hour days.

▲wltr 41 minutes ago

I used to work with my laptop, sitting near my baby. Also, I used a timer to follow 45m sleep patterns, so technically there’s no need to react to anything within first 45m, but most times first 1h30m (45+45m).

▲Lucian6 2 hours ago

Having gone through S3 cost optimization ourselves, I want to share some important nuances around this approach. While the raw storage costs can look attractive, there are hidden operational costs to consider:

We found that implementing proper data durability (3+ replicas, corruption detection, automatic repair) added ~40% overhead to our initial estimates. The engineering time spent building and maintaining custom tooling for multi-region replication, access controls, and monitoring ended up being substantial - about 1.5 FTE over 18 months.

For high-throughput workloads (>500 req/s), we actually saw better cost efficiency with S3 due to their economies of scale on bandwidth. The breakeven point seems to be around 100-200TB of relatively static data with predictable access patterns. Below that, the operational overhead of running your own storage likely exceeds S3's markup.

The key is to be really honest about your use case. Are you truly at scale? Do you have the engineering resources to build AND maintain this long-term? Sometimes paying the AWS premium is worth it for the operational simplicity.

▲YZF 2 hours ago

Right. Having worked on a commercial S3 compatible storage I can tell y'all that there's a lot more to it than just sticking some files on JBOD. It does depend on your specific requirements though. 1.5 FTE over 18 months sounds on the low side for everything you've described.

That said the article seems to be more about an optimization of their pipeline to reduce the S3 usage by holding some objects in memory instead. That's very different than trying to build your own object store to replace S3.

▲supriyo-biswas 2 hours ago

Why do all your comments seem LLM generated? You do clearly have something to contribute, but it’s probably better to just write what you’re talking about than going through a LLM.

▲pjjpo 2 hours ago

I don't know about the commenter specifically but in general, using LLMs to format text is a game changer in the ability for English-as-Second-Language folks to contribute to tech conversations. While I get where some of the bias against anything LLM generated comes from, I would keep it for editorial content and not community comments to be fair to a global audience.

▲ashdksnndck 1 hour ago

I’m worried that LLMs could facilitate cheap, scaled astroturfing.

I understand that people encounter discrimination based on English skill, and it makes sense that people will use LLMs to help with that, especially in a professional context. On the other hand, I’d instinctively be more trusting of the authenticity of a comment with some language errors than one that reads like it was generated by ChatGPT.

▲barrell 1 hour ago

I’m not sure if that’s a realistic ask. There is ample abuse of LLM generated content, and there are plenty of ESL publishers.

Personally I would recommend including a note that English is not your native language and you had an LLM clean things up. I think people are willing to give quite a bit of grace, if it’s disclosed.

Personally, I’d rather see a response in your native language with a translation, but I’m fairly certain I’m the odd one out in that situation XD

▲nondrool 31 minutes ago

You're not alone.

▲phito 1 hour ago

It just makes everything sound bland and soulless. You don't know which part of the message actually comes from the user's brain and which part has been added/suggested by the LLM. The latter is not an original thought and it would be disingenuous to include it, but people do because it makes them look smarter. Meanwhile, on the other side, you might as well be talking to a LLM...

▲LPisGood 1 hour ago

What do you see about this comment that seems particularly LLM generated?

▲bryanrasmussen 40 minutes ago

I wondered myself, as it seemed ok, but I went through the poster's history as I was interested.

Firstly, they have a remarkably consistent style. Everything is like this. There's not very many examples to choose from, so that's maybe also to be expected, and perhaps it is just also their personality.

I worry, as I've been accused myself, that there is perhaps something in the style the accuser dislikes or finds off-putting and nowadays the suspected cause will be LLM.

Secondly, they have "extensive experience" in various areas of technology, that don't seem to be especially related to each other. I too have extensive experience in several areas of technology but there is something of a connector between them.

Perhaps it is just because of their high level of technical expertise that they have managed to move between these areas and gain this extensive experience. And because of the high level of technical expertise and their interest in only saying very technical things all the time, their communications seem less varying and human, and more LLM.

▲BoredPositron 2 minutes ago

It's the verbose writing style. I can see why you would be accused as well.

▲john01dav 1 hour ago

There are more options than using S3 or completely rolling your own on JBOD. For example, you could use a cheaper S3-compatible cloud (such as Backblaze) or you can deploy a project such as Ceph.

▲Twirrim 2 hours ago

S3 does more than 3x replica durability, as well, they use a form of erasure coding. They can lose several hard drives/servers/racks before your data becomes at risk, and have sufficient spare capacity to very quickly reproduce any missing shards before things become a problem.
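To make the erasure-coding idea concrete, here is a toy sketch, not S3's actual scheme (which the thread does not describe). It splits data into k shards plus one XOR parity shard, so any single lost shard can be rebuilt while storing only (k+1)/k of the data instead of 3x. Production systems typically use codes such as Reed-Solomon that survive several simultaneous losses; the function names here are made up for illustration.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal-size shards and append one XOR parity shard."""
    shard_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(shard_len * k, b"\0")
    shards = [padded[i * shard_len:(i + 1) * shard_len] for i in range(k)]
    return shards + [reduce(xor_bytes, shards)]


def recover(shards: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild at most one missing shard (marked None) from the survivors."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("single parity can only repair one lost shard")
    repaired = list(shards)
    if missing:
        survivors = [s for s in shards if s is not None]
        # XOR of all surviving shards (data + parity) equals the lost shard.
        repaired[missing[0]] = reduce(xor_bytes, survivors)
    return repaired


def decode(shards: List[bytes], original_len: int) -> bytes:
    """Drop the parity shard and the padding to get the original bytes back."""
    return b"".join(shards[:-1])[:original_len]
```

With k = 4 this stores 1.25x the data and tolerates one lost shard; 3x replication stores 3x the data and tolerates two lost copies.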

That said, S3 seems like a really odd fit for their workload, plus their dependency on lifecycle rules seems utterly bizarre.

> Storage was a secondary tax. Even when processing finished in ~2 s, Lifecycle deletes meant paying for ~24 h of storage.

They decided not to implement the deletion logic in their service, so they'd just leave files sitting around for hours instead needlessly paying that storage cost? I wonder how much money they'd have saved if they just added that deletion logic.
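A minimal sketch of what "just add that deletion logic" could look like, assuming a client object with boto3-style `get_object` and `delete_object` methods. The `process` callback and the function name are hypothetical, and nothing here reflects how Nanit's service is actually written.

```python
from typing import Callable


def process_then_delete(s3, bucket: str, key: str,
                        process: Callable[[bytes], None]) -> None:
    """Fetch an object, process it, and delete it immediately on success."""
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    # If process() raises, the object is left in place, so a lifecycle rule
    # can still act as a safety net for failed runs.
    process(body)
    s3.delete_object(Bucket=bucket, Key=key)


# Rough ratio of time an object sits in the bucket: ~24 h waiting for a
# lifecycle delete versus ~2 s of actual use (figures quoted from the article).
LIFECYCLE_SECONDS = 24 * 60 * 60
PROCESSING_SECONDS = 2
overhead_ratio = LIFECYCLE_SECONDS / PROCESSING_SECONDS  # 43200.0
```

Whether this would have saved much depends on how the bill splits between storage and per-request charges, which the quoted passage does not say.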

▲groundzeros2015 2 hours ago

Is spending time to optimize S3 in the manner you describe not a relevant cost?

▲Havoc 4 hours ago

Tbh I feel this is one of those that would be significantly cleaner without serverless in the first place.

Sticking something with a 2 second lifespan on disk to shoehorn it into the AWS serverless paradigm created problems and cost out of thin air here

Good solution moving at least partially to an in-memory solution though

▲tcdent 4 hours ago

Yeah, so now you're basically running a heavy instance in order to get the network throughput and the RAM, but not really using that much CPU when you could probably handle the encode with the available headroom. Although the article lists TLS handshakes as being a significant source of CPU usage, I must be missing something because I don't see how that is anywhere near the top of the constraints of a system like this.

Regardless, I enjoyed the article and I appreciate that people are still finding ways to build systems tailored to their workflows.

▲inlined 3 hours ago

Maybe they’re not using keepalives in their clients causing thousands of handshakes per second?
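A small sketch of why keepalives matter here, using only the standard library. It counts how many TCP connections a client opens against a throwaway local server, with and without connection reuse. It uses plain HTTP, so no TLS is involved; the assumption is that a TLS handshake would be paid once per new connection, so the connection count is a stand-in for the handshake count. The function name is made up for illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def count_connections(n_requests: int, reuse: bool) -> int:
    """Send n_requests GETs to a local server; return connections opened."""
    opened = []  # one entry per accepted connection

    class Handler(BaseHTTPRequestHandler):
        protocol_version = "HTTP/1.1"  # keep-alive is the default in HTTP/1.1

        def setup(self):  # called once per accepted connection
            super().setup()
            opened.append(self.client_address)

        def do_GET(self):
            body = b"ok"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):  # keep the demo quiet
            pass

    server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    try:
        conn = None
        for _ in range(n_requests):
            if conn is None:
                conn = http.client.HTTPConnection("127.0.0.1", port)
            conn.request("GET", "/")
            conn.getresponse().read()  # drain the body so the socket is reusable
            if not reuse:
                conn.close()
                conn = None
        if conn is not None:
            conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return len(opened)
```

Calling `count_connections(5, reuse=True)` opens a single connection, while `reuse=False` opens one per request.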

▲ixtli 2 hours ago

They didn’t actually do what the headline claims. They made a memory cache which sits in front of S3 for the happy path. Cool but not nearly rolling your own S3

▲anshumankmr 13 minutes ago

Some stuff like this also exists: https://www.dell.com/en-in/shop/storage-servers-and-networki...

We could just use something like that

Or there is that other object storage solution called R2 from Cloudflare.

▲dmje 16 minutes ago

I’m sufficiently old / sensible (you decide) to think that uploading video of your baby (to anywhere) is fucking weird and fucking spooky and not needed anyway. This is a solution that doesn’t have a problem. Worse: it preys on parental / young parental fears. There’s nothing here - this is not a product that’s needed. You don’t need to “track” your baby, ffs. You don’t need to watch it while it sleeps. You don’t need “every breath, seen”. People have been having babies for fucking centuries without entering them into this hyper weird surveillance state at birth.

What an appalling screwed up world we seem to have manufactured for ourselves.

▲anarsdk 25 minutes ago

Sounds like the title should have been

> We used S3 even though it wasn’t the right service

▲dxxvi 2 hours ago

So, you want a place to store many files in a short period of time and when there's a new file, somebody must be notified?

Have you ever thought of using a postgresql db (also on aws) to store those files and use CDC to publish messages about those files to a kafka topic? In your original way, we need 3 aws services: s3, lambda and sqs. With this way, we need 2: postgresql and kafka. I'm not sure how well this method works though :-)

▲none2585 4 hours ago

I'm curious how many engineers per year this costs to maintain

▲CaptainOfCoit 3 hours ago

> I'm curious how many engineers per year this costs to maintain

The end of the article has this:

> Consider custom infrastructure when you have both: sufficient scale for meaningful cost savings, and specific constraints that enable a simple solution. The engineering effort to build and maintain your system must be less than the infrastructure costs it eliminates. In our case, specific requirements (ephemeral storage, loss tolerance, S3 fallback) let us build something simple enough that maintenance costs stay low. Without both factors, stick with managed services.

Seems they were well aware of the tradeoffs.

▲nbngeorcjhe 4 hours ago

A small fraction of 1, probably? It sounds like a fairly simple service that shouldn't require much ongoing development

▲codedokode 3 hours ago

Especially if you have access to LLMs.

▲hinkley 3 hours ago

You're going to run a production system with a bus number of 1?

I think you mean a small fraction of 3 engineers. And small fractions aren't that small.

▲xboxnolifes 1 hour ago

The cost being a fraction of 1 does not imply it's one person. 3 people each spending 2 weeks a year on the service is still a fraction of 1.
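The arithmetic behind that, as a quick sketch. The 52-week working year is an assumption (it ignores holidays and leave), and the helper name is made up.

```python
WORK_WEEKS_PER_YEAR = 52  # assumption: ignores holidays and leave


def fte(people: int, weeks_each: float,
        weeks_per_year: float = WORK_WEEKS_PER_YEAR) -> float:
    """Total effort expressed as a fraction of one full-time engineer-year."""
    return people * weeks_each / weeks_per_year


# 3 people x 2 weeks each = 6 engineer-weeks, i.e. 6/52 of one engineer-year.
maintenance = fte(people=3, weeks_each=2)
```

That comes to roughly 0.12 of an engineer-year, spread across three people, so the bus factor is three even though the cost stays well under one.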

▲hinkley 29 minutes ago

It is three opportunity costs. No free lunches.

▲Dylan16807 20 minutes ago

Nobody implied it was free. Yes there are opportunity costs, and they add up to less than one sysadmin of opportunity.

▲adrianN 2 hours ago

So far I have seen a lot more production systems with a bus factor of zero than production systems with a bus factor greater than one.

▲codedokode 4 hours ago

And I am curious how many engineer years it requires to port code to cloud services and deal with multiple issues you cannot even debug due to not having root privileges in the cloud.

Without cloud, saving a file is as simple as "with open(...) as f: f.write(data)" + adding a record to DB. And no weird network issues to debug.
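Spelled out, the "write the file, then add a DB record" approach might look like the sketch below. SQLite stands in for "DB" purely so the example is self-contained; the table layout and function names are assumptions. It does nothing about redundancy, backups, or access control.

```python
import sqlite3
from pathlib import Path


def open_db(location: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the metadata database with a single files table."""
    db = sqlite3.connect(location)
    db.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "name TEXT PRIMARY KEY, path TEXT NOT NULL, size INTEGER NOT NULL)"
    )
    return db


def save_file(db: sqlite3.Connection, directory: Path,
              name: str, data: bytes) -> Path:
    """Write data to directory/name, then record it in the files table."""
    path = directory / name
    with open(path, "wb") as f:
        f.write(data)
    with db:  # commits on success, rolls back if the INSERT fails
        db.execute(
            "INSERT INTO files (name, path, size) VALUES (?, ?, ?)",
            (name, str(path), len(data)),
        )
    return path
```

Note that the file write and the INSERT are not atomic together: a crash between them leaves a file on disk with no record.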

▲rajamaka 4 hours ago

> as simple as "with open(...) as f: f.write(data)"

Save where? With what redundancy? With what access policies? With what backup strategy? With what network topology? With what storage equipment and file system and HVAC system and...

Without on-prem, saving a file is as simple as s3.put_object() !

▲AdieuToLogic 3 hours ago

>> Without cloud, saving a file is as simple as "with open(...) as f: f.write(data)" + adding a record to DB.

> Save where? With what redundancy? With what access policies? With what backup strategy? With what network topology? With what storage equipment and file system and HVAC system and...

Most of these concerns can be addressed with ZFS[0] provided by FreeBSD systems hosted in triple-A data centers.