Super impressive-looking demo; it works well on my older iPhone.
As an only-dabbling hobbyist game developer who lacks a lot of 3D programming knowledge, the only feedback I can offer is that you might define what "Gaussian Splatting" is somewhere on the GitHub repo or the website. Just the one-liner from Wikipedia helps me get more excited about the project and its potential uses: Gaussian splatting is a volume rendering technique that deals with the direct rendering of volume data without converting the data into surface or line primitives.
Super high performance clouds and fire and smoke and such? Awesome!
The food scans demo (in the "Interactivity" examples section) is incredible, especially looking into the holes in the bread on Mel's Steak Sandwich.
The performance seems amazingly good for the apparent level of detail, even on my integrated graphics laptop. Where is this technique most commonly used today?
There's a community of people passionate about scanning all sorts of stuff with handheld devices, drones... Tipatat generously let us use his food scans for the demo. I also enjoy kotohibi's flower scans: https://superspl.at/user?id=kotohibi
I'm sure it's not cutting edge, but the app Scaniverse generates some very nice splats just from waving your phone around an object for a minute or so.
BabylonJS and the OP's own A-Frame [1] seem to have similar licenses and similar numbers of GitHub stars and forks, although A-Frame seems newer and more game / VR focused.
How do Babylon, A-Frame, Three.js, and PlayCanvas [2] compare, from those who have used them?
IIUC, PlayCanvas is the most mature, featureful, and performant, but it's commercial. Babylon is the featureful 3D engine, whereas Three.js is fairly raw: though it has some nice stuff for animation, textures, etc., you're really building your own kit.
Any good experiences (or bad) with any of these?
OP, your demo is rock solid! What's the pitch for A-Frame?
How do you see the Gaussian splat future panning out? Will these be useful for more than visualizations and "digital twins" (in the industrial setting)? Will we be editing and animating them at any point in the near future? Or to rephrase: when (or will) they be useful for the creative and gaming fields?
A-Frame is an entity-component system on top of THREE.js that uses the DOM as a declarative layer for the scene graph. It can be manipulated using the standard APIs and tools that Web developers are used to. The initial target was onboarding Web devs into 3D, but it found success beyond that. The super low barrier to entry (hello world below) without sacrificing functionality made it very popular both with people learning programming / 3D (it's part of the curriculum in many schools and universities) and in advanced scenarios (moonrider.xyz, with ~100k MAUs (300k at peak) the most popular WebXR content to date, is made with A-Frame).
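The canonical A-Frame hello world, for reference (pin whatever release version you like):

    <html>
      <head>
        <script src="https://aframe.io/releases/1.7.0/aframe.min.js"></script>
      </head>
      <body>
        <a-scene>
          <a-box position="-1 0.5 -3" rotation="0 45 0" color="#4CC3D2"></a-box>
          <a-sphere position="0 1.25 -5" radius="1.25" color="#EF2D5E"></a-sphere>
          <a-plane position="0 0 -4" rotation="-90 0 0" width="4" height="4" color="#7BC8A4"></a-plane>
          <a-sky color="#ECECEC"></a-sky>
        </a-scene>
      </body>
    </html>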
One of the Spark goals is exploring applications of 3D Gaussian splatting. I don't have all the answers yet, but compelling use cases are already developing quickly, e.g. photogrammetry / scanning, where splats represent high-frequency detail in an appealing and relatively compact way, as you can see in one of the demos (https://sparkjs.dev/examples/interactivity/index.html). There are great examples of video capture already (https://www.4dv.ai/). Looking forward to seeing new applications as we figure out better compression, streaming, relighting, generative models, LOD...
When you say that PlayCanvas is commercial, that's a little misleading. The PlayCanvas Engine (analogous to Three.js and Babylon.js) is free and open source (MIT). The PlayCanvas Engine is where you'll find all the cool 3DGS tech. There are two further frameworks that wrap the Engine (for those that prefer to use a declarative interface): PlayCanvas Web Components and PlayCanvas React. Again, both of these are free and open source (MIT). Only the PlayCanvas Editor (analogous to a browser-based Unity) has optional payment plans (for those that want to create private projects).
I did a test study in BabylonJS, and generally the subset of compatible features is browser-specific.
The good:
1. The Blender plugin for exporting baked mesh animations as streamable assets is cool.
2. Procedural texture tricks combined with displacement maps make reasonable-looking in-game ocean/water possible with some tweaking.
3. Adding 2D sprite swap-out for distant objects is trivial (think Paper Mario style); see the sketch after this list.
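A hedged sketch of that swap-out (mesh and texture names are illustrative, not from my study):

    // Swap a detailed mesh for a billboard sprite past a distance threshold.
    const impostors = new BABYLON.SpriteManager("impostors", "tree.png", 100, 256, scene);
    const treeSprite = new BABYLON.Sprite("treeImpostor", impostors);
    treeSprite.position = treeMesh.position;

    scene.onBeforeRenderObservable.add(() => {
      const far = BABYLON.Vector3.Distance(
        scene.activeCamera.position, treeMesh.position) > 50;
      treeMesh.setEnabled(!far);   // hide the real mesh when far away
      treeSprite.isVisible = far;  // show the 2D stand-in instead
    });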
The bad:
1. Burns GPU VRAM far faster than normal engines (dynamic paint bloats up fast when duplicating aliases, etc.).
2. JS burns CPU cycles, but the WASM support is reasonable for physics/collision.
3. All resources are exposed to end users (expect unsophisticated cheaters/cloners).
The ugly:
1. Mobile GPU support on 90% of devices is patchwork.
2. Baked lighting YMMV (we tinted the GPU smoke VFX to cheat volumetric scattering).
3. In-browser games essentially combine the worst aspects of browser memory waste and security-sandbox issues (audio sync is always bad in browser games).
Anecdotally, I would only recommend the engine for server-hosted transactional games (e.g. cards or board games could be a good fit).
Otherwise, if people want something that is performant and doesn't look awful... then just use the Unreal Engine and hire someone who has mastered efficient shader tricks. =3
Personally, I have been using Babylon.js for five years, and I just love it. For me it's so easy to program (cleanest API I have ever seen), and my 3D runtime is so light that my demos work fine even on my Android phone.
Web browsers add a lot of unnecessary overhead, and require dancing with quarterly changes in policies.
In general, iOS devices are forced to use/link Apple's proprietary JS VM implementation. While Babylon makes it easier, it has often had features nerfed by both Apple's iOS and Alphabet's Android. In the former case this is driven by the App Store walled garden, and in the latter by device-design fragmentation.
I like Babylon in many ways too, but we have to acknowledge the deployment limitations that impact end users. People often end up patching after every update Mozilla/Apple/Microsoft pushes.
Thus, it's difficult to deploy something unaffected by platform-specific codecs, media syncing, and interface-hook shenanigans.
This coverage issue is trivial to handle in Unity, Godot, and Unreal.
The App Store people always want their cut, and will find convenient excuses to nudge that policy. It is the price of admission on mobile... YMMV =3
One component of my hobby web app project is a wavetable. Below are two examples of wavetables. I want it to not tax the browser, so that other, latency-sensitive components do not suffer.
Would you have any suggestions on what JS/TS package to use? I built a quick prototype in three.js but I am neither a 3D person nor a web dev, so I would appreciate your advice.
* Recall that fixed-rate producers/consumers should lock relative phase when the garbage collector decides to ruin your day; things like software FIR filters are also fine, and a single-threaded, pre-mixed output stream will eventually buffer through whatever abstraction the local users have set up (i.e. while the GC does its thing... playback sounds continuous).
Inside a VM we are unfortunately at the mercy of the garbage collector and of whatever assumptions JIT-compiled languages make. Yet WASM should be able to push I/O transfers fast enough for software mixers on modern CPUs.
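As a sketch of that pre-mixed buffering idea, using the standard AudioWorklet API (the ring size and message protocol here are illustrative choices, not a recommendation):

    // mixer-processor.js -- runs on the audio rendering thread.
    // A producer on the main thread posts pre-mixed Float32Array chunks;
    // GC pauses there won't glitch playback while the ring stays ahead.
    class MixerProcessor extends AudioWorkletProcessor {
      constructor() {
        super();
        this.ring = new Float32Array(8192);
        this.readIdx = 0;
        this.writeIdx = 0;
        this.port.onmessage = (e) => {
          for (const s of e.data) {
            this.ring[this.writeIdx++ % this.ring.length] = s;
          }
        };
      }
      process(inputs, outputs) {
        const out = outputs[0][0];
        for (let i = 0; i < out.length; i++) {
          // On underrun, emit silence rather than stalling the graph.
          out[i] = this.readIdx < this.writeIdx
            ? this.ring[this.readIdx++ % this.ring.length]
            : 0;
        }
        return true;
      }
    }
    registerProcessor("mixer-processor", MixerProcessor);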
Cool work, but I have to say the performance is pretty bad in Firefox on my laptop with an Nvidia RTX A3000 GPU. There are enough shader cores here to cause first-degree burns.
Do you have any insights into the current performance bottlenecks? Especially around dynamic scenes. That particle simulation one seems to struggle but then improves dramatically when the camera is rotated, implying the static background is much heavier than it appears.
And as a counterpoint to the bottlenecks: that procedurally generated Sierpinski pyramid is brilliant.
The number of splats in the scene and their distribution have an impact on performance. Probably in your case you turned the camera in a direction with fewer splats. There's definitely work to do to deliver consistent performance. We'll probably look into an LOD system next.
I'm still highly skeptical of Gaussian splatting as anything more than a demo. The files are too large. The steak sandwich is 12 MB (as just one example).
There was a Gaussian-splat-based Matterport clone at last year's SIGGRAPH. Viewing a two-bedroom apartment required streaming 1.5 GB.
Thanks! Notice the 12 MB steak sandwich is the biggest of them all. The rest are < 10 MB, and several of the very compelling ones are in the 1-3 MB range (e.g. Iberico Sandwich 1 MB, Clams and Caviar 1.8 MB).
Fancier compression methods are coming (e.g. SOGS). This is 30 MB!
How much of the huge file size is because you need tons of splats to simulate a hard surface? Conceptually, splats seem flawed because Gaussians don't have hard edges: they literally go to infinity in all directions, just at vanishingly small densities. So practically everybody cuts them off at 3 sigma or something, which covers 99.7% of the volume. But real-world objects have hard edges, and splats don't.
Would the format work better if you made that cut-off at something like 1 sigma instead? Then instead of these blurry blobs you'd effectively be rendering ovals with hard edges. I speculate out loud that maybe you could get a better render with fewer hard-edged ovals than tons of blurry blobs.
It's an interesting idea, and with Spark you could test this by adjusting the maxStdDev parameter to control how far out each splat is drawn.
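If you want to try it, a sketch (assuming maxStdDev is exposed as a SparkRenderer option, per my reading of the docs; the package name may differ):

    import { SparkRenderer } from "@sparkjsdev/spark";

    // Clip each Gaussian at 1 standard deviation instead of the usual ~3,
    // turning soft blobs into hard-edged ellipses.
    const spark = new SparkRenderer({ renderer, maxStdDev: 1.0 });
    scene.add(spark);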
I agree with you that in general 3DGS is a worse representation for hard, flat, synthetic things with hard edges. But on the flip side, I would argue it's a better representation for many organic, real-world things; imagine fur or hair or leaves on a tree... These are things that can render beautifully and photorealistically in a way that would otherwise require much, much more complex polygon geometry, texturing, and careful sorting and blending of semi-transparent texels. This is one reason why 3DGS has become so popular in scanning and 3D reconstruction: you just get much better results with smaller file sizes. When 3DGS first appeared, everyone was shocked by how photorealistically you could render things in real time on a mobile device!
But one final thought I want to add: with Spark it's not an either/or. You can have BOTH in the same Three.js scene and they will blend together perfectly via the Z-buffer. So you can scan the world around you and render it with 3DGS, and then insert your hard-edged robot character polygon meshes right into that world, and get the best of both!
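A minimal sketch of that mixing (SplatMesh and the package name are my reading of the Spark docs; treat the details as assumptions):

    import * as THREE from "three";
    import { SplatMesh } from "@sparkjsdev/spark";

    // A scanned 3DGS environment plus an ordinary polygon mesh in one scene;
    // the Z-buffer composites the two correctly.
    scene.add(new SplatMesh({ url: "scanned-room.spz" }));

    const robot = new THREE.Mesh(
      new THREE.BoxGeometry(1, 2, 1),
      new THREE.MeshStandardMaterial({ color: 0x888888 })
    );
    scene.add(robot);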
Cool - thanks for explaining that. I totally see how each has its place.
I imagine it's pretty complex to take the raw scan data and generate 3DGS. Are these algorithms simple and standard, or do they take a fair amount of tuning and tweaking to do a good job? Adapting them to work well with hard-edged ovals seems like it would take some work, and a lot more work to get them to output a mix of ovals and fuzzy blobs. But if you could do that, I agree the combination would be amazingly expressive.
There are a lot of tools to do this easily today, for free! Take a look at Postshot or Brush. You can literally take a video with your mobile phone, toss it into Postshot, and a few minutes later you have a photorealistic 3DGS model you can use in Spark!
3DGS is still a rapidly evolving research field, but the "baseline" is pretty much standard these days.
The SOGS compression technique works well. You can get 1M Gaussians with full spherical harmonics in about 14 MB. There's a good article about it on the PlayCanvas blog:
Wish I could see this! My iPhone 16 blocked viewing because of, I think, an expired certificate. At least, that's the error I think I got initially; now it just says the page belongs to a category that is blocked. :(
How do you do the rendering? Is it sorted (radix?) instances? Do you amortize the sorting over a couple frames? Or use some bin sorting? Are you happy with the performance?
Yes, Spark does instanced rendering of quads, one covering each Gaussian splat. The sorting is done by 1) calculating sort distance for every splat on the GPU, 2) reading it back to the CPU as float16s, 3) doing a 1-pass bucket sort to get an ordering of all the splats from back to front.
On most newer devices the sorting can happen pretty much every frame with approx 1 frame latency, and runs in parallel on a Web Worker. So the sorting itself has minimal performance impact, and because of that Spark can do fully dynamic 3DGS where every splat can move independently each frame!
On some older Android devices it can be a few frames worth of latency, and in that case you could say it's amortized over a few frames. But since it all happens in parallel there's no real impact to the overall rendering performance. I expect for most devices the sorting in Spark is mostly a solved problem, especially with increasing memory bandwidth and shared CPU-GPU memory.
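For intuition, a sketch of what such a 1-pass bucket (counting) sort can look like on the CPU (my illustration, not Spark's actual code); it exploits the fact that positive float16 values compare the same as their raw uint16 bit patterns:

    // keys: Uint16Array of per-splat float16 distance bits (positive values)
    // Returns a Uint32Array of splat indices ordered back to front.
    function bucketSortSplats(keys) {
      const counts = new Uint32Array(1 << 16);
      for (let i = 0; i < keys.length; i++) counts[keys[i]]++;
      // Prefix-sum from the largest bucket down, so bigger distances come first.
      let offset = 0;
      for (let b = counts.length - 1; b >= 0; b--) {
        const c = counts[b];
        counts[b] = offset;
        offset += c;
      }
      const order = new Uint32Array(keys.length);
      for (let i = 0; i < keys.length; i++) order[counts[keys[i]]++] = i;
      return order;
    }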
If you say 1-pass bucket sorting... I assume you do sort the buckets as well?
I've implemented a radix sort on the GPU to sort the splats (every frame)... and I'm not quite happy with the performance yet. A radix sort (+ prefix scan) is quite involved, with lots of dedicated hierarchical compute shaders... I might have to get back to tuning it.
I might switch to float16s as well, but I'm a bit hesitant, as 1 million+ splats may exceed the precision of halfs.
We are purposefully trading off some sorting precision for speed with float16, and for scenes with large Z extents you'd probably get more Z-fighting, so I'm not sure if I'd recommend it for you if your goal is max reconstruction accuracy! But we'll likely add a 2-pass sort (i.e. radix sort with a large base / #buckets) in the future for higher precision (user selectable so you can decide what's more important for you). But I will say that implementing a sort on the CPU is much simpler than on the GPU, so it opens up possibilities if you're willing to do a readback from GPU to CPU and tolerate at least 1 frame of latency (usually not perceivable).
You might want to consider using words (16-bit integers) instead of halfs? Then you can use all 65k values of precision in a range you choose (by remapping 32-bit floats to words), potentially adjusting it every frame, or with a delay.
Yeah, you're right: using float16 gets us only 0x7C00 buckets of resolution. We could explicitly turn it into a log encoding, spread it over 2^16 buckets, and get 2x the range there! Other renderers do this dynamic per-frame range adjustment; we could do that too.
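A sketch of that remapping (zMin/zMax would come from the scene's depth range that frame; the log variant is the encoding mentioned above):

    // Remap a float32 view-space depth into a full-range 16-bit sort key.
    function depthToKey(z, zMin, zMax) {
      const t = (z - zMin) / (zMax - zMin);                    // linear 0..1
      // const t = Math.log(z / zMin) / Math.log(zMax / zMin); // log-spaced alternative
      return Math.min(65535, Math.max(0, Math.round(t * 65535)));
    }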
OBJ is traditional geometry (vertices, triangles). Gaussian splats are a different way to represent 3D information (simplifying: it's a point cloud where each point is an ellipsoid with view-dependent color).
The WebGL API is based on the OpenGL ES standard, which jettisoned a lot of the procedural pipeline calls that made it easy to write CPU-bound 3D logic.
The tradeoff is initial complexity (your "hello world" for WebGL showing one object will include a shader and priming data arrays for that shader), but as a consequence of this design the API forces more computation into the GPU layer, so the fact that JavaScript is driving it matters very little.
THREE.js adds a nice layer of abstraction atop that metal.
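The classic Three.js starter shows the contrast; one spinning object with no hand-written shaders or vertex arrays in sight:

    import * as THREE from "three";

    const scene = new THREE.Scene();
    const camera = new THREE.PerspectiveCamera(75, innerWidth / innerHeight, 0.1, 100);
    camera.position.z = 3;

    const renderer = new THREE.WebGLRenderer();
    renderer.setSize(innerWidth, innerHeight);
    document.body.appendChild(renderer.domElement);

    const cube = new THREE.Mesh(new THREE.BoxGeometry(), new THREE.MeshNormalMaterial());
    scene.add(cube);

    renderer.setAnimationLoop(() => {
      cube.rotation.y += 0.01;       // the GPU does the heavy lifting per frame
      renderer.render(scene, camera);
    });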
Spark allows you to construct compute graphs at runtime in JavaScript and have them compiled and run on the GPU, not bound by the CPU: https://sparkjs.dev/docs/dyno-overview/
WebGL2 isn't the best graphics API, but it allows anyone to write Javascript code to harness the GPU for compute and rendering, and run on pretty much any device via the web browser. That's pretty amazing IMO!
Yes. We have demos working already. Those 3D Gaussian videos (or 4D, as some people call them) are really big, so we're figuring out the best way to distribute them and make it a great experience.
Most 4DGS reconstruction methods right now are exactly that: setting up many cameras and recording them simultaneously so you can reconstruct each instant in time as a 3DGS. In the future it might be possible to use a single camera and have an AI/ML method figure out how all the 3D gaussians move over time, including parts that are occluded from the single camera!
We seem to have two poles: extreme realism, and extremely minimalistic pixel art. I prefer the second camp, but your project looks really important in the first camp.
Thanks! It works for both! An under-explored area is converting assets created in "traditional" ways (e.g. Blender) into splats, which gives better visual results in some scenarios (high-frequency detail). See the furry logo in the homepage carousel.
Cool stuff. Do you have examples with semi-transparent surfaces? Something like a toy Christmas tree inside a glass sphere, with basic reflections and refractions calculated?
Wait, you renamed Forge (https://forge.dev), released last week by World Labs, a startup that raised $230M.
Is this "I worked with some friends and I hope you find useful" or is it "So proud of the World Labs team that made this happen, and we are making this open source for everyone" (CEO, World Labs)?
Yes. I collaborated with one of the devs at World Labs on this. The goal is to explore new rendering techniques and popularize adoption of 3D Gaussian splatting. There's no product associated with it.