47 points by noctarius 9 days ago | 5 comments
patrakov 9 days ago
Some search terms (try "iscsi") break the site due to JS errors:

    Uncaught TypeError: e.description is undefined
        e https://storageclass.info/storageclasses/:6
        filter self-hosted:195
        filterStoragesClasses https://storageclass.info/storageclasses/:6
        oninput https://storageclass.info/storageclasses/:6

Additionally, https://plausible.io/js/script.js is blocked by adblockers, and then the search breaks completely.

noctarius 9 days ago
Uh! Thanks for mentioning it. I also wasn't aware that Plausible is blocked; thanks for the hint, though I wonder why that breaks the search. I'll figure it out. Thanks again.
candiddevmike 9 days ago
While this may be useful to some folks, it appears to be content marketing for https://www.simplyblock.io/, FYI.
noctarius 9 days ago
Simplyblock sponsors the domain, yes. Good point, I should have mentioned that. It's not on the simplyblock GitHub account, though, and as you can see on GitHub, I built it over the weekend :)
csinode 9 days ago
Disclaimer: My full-time job involves developing a CSI driver that is on this list (not simplyblock, but I won't say more than that to avoid completely doxxing myself).

A lot of this chart seems weird - is it somehow autogenerated?

For example, what does it mean for a driver to support ReadWriteOncePod? On Kubernetes, all drivers "automatically" support RWOP if they support normal ReadWriteOnce. I then thought maybe it meant the driver supported the SINGLE_NODE_SINGLE_WRITER CSI Capability (which basically lets a CSI driver differentiate RWO vs RWOP and treat the second specially) - but AliCloud disk supports RWOP on this chart despite not doing that (https://github.com/search?q=repo%3Akubernetes-sigs%2Falibaba...).
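
(For reference, on the Kubernetes side RWOP is just a PVC access mode that the scheduler and kubelet enforce, so a driver doesn't need the CSI capability for it to "work" at all. A minimal sketch, with a made-up storage class name:)

    # Minimal PVC sketch using ReadWriteOncePod; storageClassName is made up.
    # Kubernetes itself enforces the single-pod access, which is why
    # "supports RWOP" is ambiguous as a per-driver feature.
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: rwop-example
    spec:
      accessModes:
        - ReadWriteOncePod
      storageClassName: example-sc
      resources:
        requests:
          storage: 10Gi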

Another example, what does it mean for a driver to support "Topology" on this chart? The EBS driver allegedly doesn't despite using most (all?) of the CSI topology features: https://github.com/kubernetes-sigs/aws-ebs-csi-driver/blob/7...

Also, listing "ephemeral volume" support is kinda misleading because Kubernetes has a "generic ephemeral volumes" feature that lets you use any CSI driver (https://kubernetes.io/docs/concepts/storage/ephemeral-volume...).
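
(To illustrate, a generic ephemeral volume is just a volumeClaimTemplate inlined in the pod spec, so it works with any CSI driver that can dynamically provision. A rough sketch, with made-up names and storage class:)

    # Rough sketch of a generic ephemeral volume; pod name, image and
    # storageClassName are made up. Works with any dynamically provisioning
    # CSI driver, not a per-driver feature.
    apiVersion: v1
    kind: Pod
    metadata:
      name: ephemeral-example
    spec:
      containers:
        - name: app
          image: busybox
          command: ["sleep", "infinity"]
          volumeMounts:
            - mountPath: /scratch
              name: scratch
      volumes:
        - name: scratch
          ephemeral:
            volumeClaimTemplate:
              spec:
                accessModes: ["ReadWriteOnce"]
                storageClassName: example-sc
                resources:
                  requests:
                    storage: 10Gi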

noctarius 9 days ago
I bet there are still lots of mistakes. It's not really automatically generated, but I started from the list at https://kubernetes-csi.github.io/docs/drivers.html and split the table into a YAML file, marking the features that are mentioned in the docs.

I fixed a few where I saw things in their respective docs, and added features like file or object storage. I also added a few drivers that weren't mentioned.

Topology refers to https://kubernetes-csi.github.io/docs/topology.html, and ReadWriteOncePod is supposed to mean https://kubernetes.io/blog/2023/04/20/read-write-once-pod-ac...
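
(For context, topology support typically surfaces to users as a StorageClass roughly like the sketch below; the provisioner name and zone values are made up, and the exact topology keys differ per driver:)

    # Sketch of a topology-aware StorageClass; provisioner name and zone
    # values are made up. WaitForFirstConsumer delays provisioning until the
    # pod is scheduled, so the volume lands in the right zone.
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: topology-example
    provisioner: example.csi.vendor.com
    volumeBindingMode: WaitForFirstConsumer
    allowedTopologies:
      - matchLabelExpressions:
          - key: topology.kubernetes.io/zone
            values:
              - zone-a
              - zone-b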

willcodeforfoo 9 days ago
Somewhat related: can anyone recommend a simple solution to share each node’s ephemeral disk/“emptyDir” across the cluster? Speed is more important than durability; this is just for a temporary batch job cluster. It’d be ideal if I could stripe across nodes and expose one big volume to all pods (JBOD style).
noctarius 9 days ago
You need some type of cluster-wide shared memory?
willcodeforfoo 8 days ago
Yep! But I don't have access to any block devices on the nodes, only the local paths, so I'm not sure OpenEBS or Ceph would work...
noctarius 8 days ago
I guess your biggest issue may be the multiple-writer problem, but you'd have the same issue on a local disk. As soon as multiple writers are supposed to update the same files, you'll run into issues.

Have you thought about TCP sockets between the apps to share state, or something like a Redis database?

willcodeforfoo 8 days ago
Hmm, maybe... although that wouldn't help with aggregating across multiple nodes, or it would require a lot of app-side logic.

In this example, I have 200GB of ephemeral storage available on each node, ideally I'd like something like this:

  node1: /tmp/data1 (200GB free space)
  node2: /tmp/data2 (200GB free space)
  node3: /tmp/data3 (200GB free space)
  node4: /tmp/data4 (200GB free space)
  node5: /tmp/data5 (200GB free space)
...pods could somehow mount node{1..5} as one volume, which would have 5 * 200GB ≈ 1TB of space to write to... multiple pods could mount it and read the same data.
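
(As far as I can tell, out of the box Kubernetes only lets me expose each node's path as its own volume, e.g. a local PersistentVolume pinned to that node, roughly as sketched below with made-up names; aggregating all five into one striped volume seems to need a distributed filesystem layered on top.)

    # Sketch of exposing a single node's ephemeral path as a local PV
    # (names are made up; this gives one PV per node, it does NOT
    # aggregate the five nodes into one striped volume).
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: node1-scratch
    spec:
      capacity:
        storage: 200Gi
      accessModes:
        - ReadWriteOnce
      persistentVolumeReclaimPolicy: Delete
      storageClassName: local-scratch
      local:
        path: /tmp/data1
      nodeAffinity:
        required:
          nodeSelectorTerms:
            - matchExpressions:
                - key: kubernetes.io/hostname
                  operator: In
                  values:
                    - node1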
__turbobrew__ 9 days ago
Does anyone have good experiences with a replicated storage CSI which can run on commodity hardware?

I tried out OpenEBS Replicated, and it is promising, but it doesn't really seem mature yet. I'm a bit scared to put production-critical data into it.

erulabs 9 days ago
My experience is that OpenEBS and Longhorn are cool and new and simplified, but that I would only trust my life to Rook/Ceph. If it's going into production, I'd say look at https://rook.io/ - Ceph can do both block and filesystem volumes.
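
(To give a sense of what that looks like once the rook-ceph operator and a CephCluster are up, a replicated block pool plus StorageClass is roughly the sketch below; names and the replica count are illustrative, and I've left out the CSI secret parameters from the Rook docs for brevity:)

    # Rough sketch of a Rook/Ceph RBD pool and StorageClass. Names and the
    # replica count are illustrative; assumes the rook-ceph operator and a
    # CephCluster already exist in the rook-ceph namespace, and omits the
    # csi.storage.k8s.io/* secret parameters shown in the Rook docs.
    apiVersion: ceph.rook.io/v1
    kind: CephBlockPool
    metadata:
      name: replicapool
      namespace: rook-ceph
    spec:
      failureDomain: host
      replicated:
        size: 3
    ---
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: rook-ceph-block
    provisioner: rook-ceph.rbd.csi.ceph.com
    parameters:
      clusterID: rook-ceph
      pool: replicapool
      csi.storage.k8s.io/fstype: ext4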
__turbobrew__ 9 days ago
Thanks, I will look at Rook.
candiddevmike 9 days ago
What problem are you trying to solve with replicated storage?

A lot of times, finding a solution further up the stack or settling for backups ends up being more robust and reliable. Many folks have been burnt by all the fun failure scenarios of replicated filesystems.

__turbobrew__ 9 days ago
I built a platform which hosts a few thousand services across tens of thousands of nodes. Currently, each of those services needs to manage data replication internally and be aware of failure domains (host, rack, data center).

What I would like to do is develop a system where applications just request replicated volumes that span a specific failure domain, pushing that logic down into the platform.
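
(In StorageClass terms, the idea is roughly the sketch below, where the replica count and failure domain are parameters the platform honours; the provisioner and parameter names are hypothetical, not from any real driver:)

    # Hypothetical StorageClass sketch: applications just pick a class and
    # the platform handles replication across the requested failure domain.
    # Provisioner and parameter names are made up for illustration.
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: replicated-rack
    provisioner: csi.platform.internal
    volumeBindingMode: WaitForFirstConsumer
    parameters:
      replicas: "3"
      failureDomain: rack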

iampims 9 days ago
Rook would suit your problem space very well.
__turbobrew__ 9 days ago
Thank you
withinboredom 8 days ago
Longhorn, hands down. It’s dead simple to set up and works well with production workloads. We’ve had disks fail, nodes fail, etc., and it has handled everything brilliantly. It also runs at near-native speeds, which is really nice.
ngharo 8 days ago
Near native speeds? I’m seeing an order of magnitude slower performance out of mine.

Doing anything special with your config? I'm already setting placement options and have played with the replica options.

My only hope has been to wait for the V2 engine to become stable.

withinboredom 8 days ago
I guess it depends on how you are measuring it! If you compare it to running on RAID5, it is. We are running it on RAID0, using Longhorn replication for redundancy instead of the RAID, and using the striping to get more throughput.
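
(For reference, the Longhorn side of that is basically a StorageClass with a replica count, roughly like the sketch below; the name and replica count are illustrative:)

    # Rough sketch of a Longhorn StorageClass where Longhorn handles the
    # replication (name and replica count are illustrative).
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: longhorn-replicated
    provisioner: driver.longhorn.io
    allowVolumeExpansion: true
    parameters:
      numberOfReplicas: "3"
      staleReplicaTimeout: "2880"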
noctarius 9 days ago
Guess you want block storage?