▲Show HN: I taught AI to commentate Pong in real time(github.com)

182 points by pncnmnp 2 days ago | 18 comments

▲mtlynch 6 hours ago

This is a fun idea, but I feel like Pong is too simple for the execution to work.

I watched the video, and it seemed like everything it was saying, you could have just pre-programmed for the very limited state space of Pong. It reminded me a little bit of the stock John Madden and Pat Summerall sound bites that would play during 90s / early 2000s Madden games.

Could you apply the same idea to chess or Texas Hold 'Em? I feel like the additional complexity of those games could lead to more interesting commentary.

▲pncnmnp 6 hours ago

Author here. I agree with you - the number of metrics I can experiment with in Pong is limited. Chess and Go are next for me.

Overall, the simplicity of this project has helped me test the waters before diving into more complex territories. The underlying pipeline isn't bad - the approach of collecting events, periodically generating metrics from them, prioritizing them, generating commentary text, queuing those outputs, and then synthesizing speech should serve as the core for similar work.
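
A minimal sketch of that pipeline shape, for readers who want something concrete. This is not the project's actual code: the names (`Metric`, `compute_metrics`, `pick_top`, `render_line`) and the priority weights are hypothetical, and a string template stands in for the LLM and TTS steps.

```python
import heapq
from collections import Counter, deque
from dataclasses import dataclass, field


@dataclass(order=True)
class Metric:
    # Lower sort key = higher priority, because heapq works as a min-heap.
    sort_key: int
    name: str = field(compare=False)
    value: int = field(compare=False)


def compute_metrics(events):
    """Aggregate raw game events into prioritized metrics (weights are made up)."""
    counts = Counter(e["type"] for e in events)
    weights = {"score": 10, "bounce": 1}  # hypothetical priorities
    return [Metric(-weights.get(name, 0) * n, name, n) for name, n in counts.items()]


def pick_top(metrics, k=1):
    """Keep only the k most important metrics for this commentary window."""
    return heapq.nsmallest(k, metrics)


def render_line(metric):
    """Stand-in for the LLM call: turn a metric into commentary text."""
    return f"{metric.value} x {metric.name} in this stretch!"


def run_window(events, speech_queue):
    """One periodic pass: events -> metrics -> priority -> text -> queue."""
    for metric in pick_top(compute_metrics(events)):
        speech_queue.append(render_line(metric))  # a TTS worker would drain this


events = [{"type": "bounce"}] * 5 + [{"type": "score"}]
speech_queue = deque()
run_window(events, speech_queue)
print(speech_queue[0])
```

Keeping each stage a plain function makes it fairly easy to swap Pong events for chess or Go events later without touching the queueing side.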

It's also given me some intuition on how I can construct an "ecosystem" of data surrounding live action, to add a layer of realism to the narratives.

▲Tade0 5 hours ago

If I may, I would like to propose an, ahem, sport:

https://m.youtube.com/@JellesMarbleRuns

Greg Woods' commentary really brings this world of marble racing to life.

▲pncnmnp 5 hours ago

Hehehe! I love Jelle's Marble Runs - long-time subscriber. John Oliver introduced me to it - https://www.youtube.com/watch?v=z4gBMw64aqk

▲rybosome 1 hour ago

This is a great premise, and that underlying pipeline you mention sounds like a generally useful system for live commentary with the appropriate abstractions.

I’m curious to know more about how you retrieve from this ecosystem of data to add color. You mentioned nearest neighbor search, is that over game state? How is the data stored and queried?

▲pncnmnp 1 hour ago

Absolutely! I can elaborate on that part.

The code starts by simulating 15 tournament years (say, 2010 to 2024), with each year containing 4 grand slam tournaments - held in a knockout format. There are 64 players in the pool, all starting with an initial Elo rating.

These players compete in the tournaments, with outcomes predicted based on their Elo ratings. Elo is then updated after each match. We rank players solely based on their Elo. Once the simulation completes, it generates a wealth of data. For each game, details such as points scored, points allowed, fastest ball speed, number of aces, point-by-point results, and more are simulated.
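
Roughly, the simulation described above could look like the sketch below. It uses the standard Elo expectation and update formulas; the starting rating of 1500, the K-factor of 32, and all function names are assumptions for illustration, not values taken from the repo.

```python
import random


def expected_score(rating_a, rating_b):
    """Standard Elo win probability for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def play_match(ratings, a, b, rng, k=32):
    """Sample a winner from the Elo expectation, then update both ratings."""
    exp_a = expected_score(ratings[a], ratings[b])
    a_won = rng.random() < exp_a
    score_a = 1.0 if a_won else 0.0
    ratings[a] += k * (score_a - exp_a)
    ratings[b] += k * ((1.0 - score_a) - (1.0 - exp_a))
    return a if a_won else b


def knockout(ratings, players, rng):
    """Single-elimination bracket; the player count must be a power of two."""
    bracket = list(players)
    rng.shuffle(bracket)
    while len(bracket) > 1:
        bracket = [play_match(ratings, bracket[i], bracket[i + 1], rng)
                   for i in range(0, len(bracket), 2)]
    return bracket[0]


rng = random.Random(0)  # seeded so the simulated history is repeatable
ratings = {f"P{i}": 1500.0 for i in range(64)}  # 1500 is an assumed start value
champions = [knockout(ratings, list(ratings), rng)
             for _year in range(2010, 2025) for _slam in range(4)]
ranking = sorted(ratings, key=ratings.get, reverse=True)
print(len(champions), ranking[:3])
```

Per-match details such as aces or fastest ball speed would be sampled inside `play_match` and stored alongside the result; they are left out here to keep the sketch short.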

We can then cache and use this information for a ton of color commentary. For example, we can identify the GOATs of the game, highlight players who are performing exceptionally well, pinpoint underdogs, find matches similar to the one currently being played, etc.

However, I am just scratching the surface. Imagine having a function that considers "age" alongside Elo. Then, you could simulate performance based on age as well - and show things like the younger generation overtaking older players, or veterans still competing despite being past their prime. With a function like this, you could simulate matches that span the past 75-100 years, generating a ton of nice data to analyze.

Data itself is not fun - you need nice metrics too - for fun correlations! See https://en.wikipedia.org/wiki/Baseball_statistics. The metrics don’t have to be perfect; after all, humans aren’t perfect. The key is engagement.

To find similar games, I store and cache all historical matches in a KD-tree, then use a NN search to find similar games - that's quite fast!
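
For anyone unfamiliar with the technique, here is a small pure-Python KD-tree with nearest-neighbor search. It is only a sketch of the idea: the feature vector (points scored, points allowed, fastest ball speed) is a hypothetical choice, and a real setup would more likely use a library implementation and normalize the features first.

```python
from math import dist


def build_kdtree(points, depth=0):
    """Recursively split points on alternating axes; returns nested tuples."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return (points[mid],
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))


def nearest(node, target, depth=0, best=None):
    """Return the stored point closest to target (Euclidean distance)."""
    if node is None:
        return best
    point, left, right = node
    if best is None or dist(point, target) < dist(best, target):
        best = point
    axis = depth % len(target)
    diff = target[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, target, depth + 1, best)
    # Only cross the splitting plane if a closer point could lie beyond it.
    if abs(diff) < dist(best, target):
        best = nearest(far, target, depth + 1, best)
    return best


# Hypothetical feature vectors: (points scored, points allowed, fastest ball speed)
history = [(11, 9, 42.0), (11, 2, 35.5), (11, 10, 58.0), (7, 11, 40.0)]
tree = build_kdtree(history)
print(nearest(tree, (11, 8, 44.0)))
```

The tree is built once from the cached history, so each lookup during a live match only visits a small part of it.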

Some commentary can also be dynamically generated at runtime - for example, locker-room whispers. It is important to provide GPT with a decent historical window to avoid generating contradictory info in such cases.

▲htrp 4 hours ago

>Could you apply the same idea to chess or Texas Hold 'Em? I feel like the additional complexity of those games could lead to more interesting commentary.

The additional complexity in something like hold 'em lends itself extremely well to LLM generated commentary.

▲vunderba 4 hours ago

Agreed. Even though the AI commentator has access to all hands, future draws are still unknown, so the "imperfect information" aspect remains. Adjusting the LLM temperature to tune how much it speculates about those draws would also be a fun experiment.

▲93po 6 hours ago

i think pong being so simple is why this is funny and interesting

▲DigiEggz 5 hours ago

I agree with this idea. The reason I visited this is that the idea of commentating Pong is inherently amusing. It makes me realize I'd be down to watch competitive pong.

▲QRe 10 hours ago

Fun experiment. Main limitation I see is the delay between actions and commentary because of the whole script generation & TTS overhead. It seems like the commentary can quickly fall behind, especially in fast-paced sports.

▲haneul 4 hours ago

Naw there are tricks you can use to pipeline these things so that apparent latency is under 500ms even with significant game state history awareness, and also to interrupt ongoing but freshly out of date commentary.
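
One simple version of the "interrupt out-of-date commentary" part might look like the sketch below. It is an assumption about how such a check could work, not a description of any particular system: the `Line` type, the function names, and the 0.5 s freshness window are all illustrative.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Line:
    text: str
    event_time: float  # game clock (seconds) of the event being described


def next_fresh_line(pending, now, max_age=0.5):
    """Pop lines until one is recent enough to say; drop the stale ones."""
    while pending:
        line = pending.popleft()
        if now - line.event_time <= max_age:
            return line
    return None


def should_interrupt(current, now, max_age=0.5):
    """True when the line being spoken describes an event that is now too old."""
    return current is not None and now - current.event_time > max_age


pending = deque([Line("Long rally building!", 10.0), Line("What a save!", 12.3)])
line = next_fresh_line(pending, now=12.5)
print(line.text)
```

The playback loop would call `should_interrupt` on each tick and cut the audio when it returns true, then ask `next_fresh_line` for a replacement.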

I couldn’t get it under 250ms though (for Rocket League), but the tech should be better now than in 2024.

▲pncnmnp 8 hours ago

Author here. TTS and script generation can be a bit of an overhead for now, which is why I've worked with metric aggregates - 30+ bounces rather than exactly 33, for example. For this game, one might ideally want this overhead to be less than the time it takes for the ball to bounce from one paddle to another, which can be around 1–2 seconds. However, there may be another strategy to (maybe?) overcome this: start synthesizing numbers (ignoring the fractional part) using TTS and cache them for both commentators. Then, patch those audio clips together after the core part is synthesized. It should be doable, I think - I just haven't gotten to it yet. Note that matching the excitement and tempo of core commentary with those numbers is key - otherwise, it will feel janky.
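
A rough sketch of both ideas, the aggregate buckets and the per-commentator number cache. Everything here is hypothetical: `bucket_count`, `number_clip`, and the `synthesize` callable are made-up names, and `fake_tts` only stands in for a real TTS call so the example runs offline.

```python
def bucket_count(n, step=10):
    """Round a count down to a speakable aggregate, e.g. 33 -> '30+'."""
    if n < step:
        return str(n)  # small counts are cheap to say exactly
    return f"{(n // step) * step}+"


clip_cache = {}


def number_clip(n, voice, synthesize):
    """Return cached audio for an integer, synthesizing at most once per voice."""
    key = (voice, int(n))  # int() drops the fractional part
    if key not in clip_cache:
        clip_cache[key] = synthesize(str(key[1]), voice)
    return clip_cache[key]


def fake_tts(text, voice):
    """Placeholder for a real TTS call; returns bytes so the sketch runs offline."""
    return f"[{voice}] {text}".encode()


print(bucket_count(33))
print(number_clip(33.9, "commentator_a", fake_tts))
```

Splicing the cached clip into the surrounding sentence, with matching tempo and excitement, is the hard part and is not attempted here.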

▲blakeburch 5 hours ago

Really fun to see! I'd love to have something similar for esports, like League of Legends or Rocket League. So much of the commentary feels like filler with stats and statements about a player.

▲haneul 4 hours ago

Have done an interactive commentator for rocket league that is also simultaneously your duo partner. Works quite well. This was in October 2024 so the tech is there and even better now.

▲vishalontheline 5 hours ago

E-Sports needs more commentators from Latin America or the Middle East.

▲croemer 7 hours ago

If you want to skip the demo's fairly boring pre-match talk, the fun starts here: https://youtu.be/i21wN6CDsE0?si=cdUs_xLCwE8B0ATq&t=153

▲kovezd 2 hours ago

It's funny that even in a simple setting the AI couldn't avoid hallucinating, as this clearly isn't a champions match.

▲croemer 1 hour ago

No, it's not a hallucination, it's explicitly what was shown in the 2.5 min you skipped :) https://www.youtube.com/watch?v=i21wN6CDsE0&t=16s

▲computerthings 6 hours ago

[dead]

▲petercooper 10 hours ago

I want this for when I'm working.

"Here we see Peter copying and pasting in some generic quick sort algorithm from.. somewhere. Stack Overflow? ChatGPT? Who knows. And he goes for the compile without writing any tests! Let's see if it compiles first time. And it's a noooooo! Bad luck, let's see how he gets out of this pickle. (I told you he should have written some tests.)"

▲qwertox 9 hours ago

It raaaaaaannnnnn, no exceptiooooonss!!! Can you believe it??? Can you believe it how he compiled that code? What a beauty, what a beautiful job he's done...

I wonder if NotebookLM's podcast function could be used for this, to comment on code with the spirit of a Latin American soccer commentator. Because having it comment code is already pretty useful if you don't want to explain to others what you have been doing. It can do that pretty well for you.

▲A4ET8a8uTh0_v2 6 hours ago

Or we could go with David Attenborough :P

dolphin mistral output:

"In the digital ecosystem, where binary code intertwines with human cognition, there exists an important ritual known as the Coding Review. This intricate dance is not dissimilar to how our ancestors gathered around a communal fire, sharing stories and experiences in order to pass on wisdom and understanding of their world.

The coding review takes place in a carefully-crafted digital habitat - often referred to as a development team's workspace. Here, the code, akin to DNA that carries the blueprint for all life forms, is meticulously examined by a group of highly specialized creatures known as developers and quality assurance analysts."

▲pacifika 59 minutes ago

Like this https://www.youtube.com/watch?v=zgq236m0bcQ

▲Retr0id 7 hours ago

Someone could totally make this as a vscode (etc.) extension

▲layer8 5 hours ago

Or an AI doing the part of the pair programmer who doesn’t have the keyboard.

▲jart 10 hours ago

Do headline games like John Madden do this? That's a great use case for LLMs.

▲Fripplebubby 7 hours ago

You might not know this if you don't actually play these games (Madden, 2K for NBA, MLB The Show), but the commentary is extremely high quality, sometimes comparable to the TV broadcast with riffs and tangents as well as describing the action. Over many years of producing these games they have continually refined the process. Of course, eventually you will hear repeating dialogue if you play the games enough, but I think the baseline quality is going to be _very_ hard to replicate with an LLM.

▲IshKebab 10 hours ago

Yeah I was thinking the same. No more "They've really got to want to win this. This is a game of two halves. Etc."

Though tbh I found it still pretty annoying. Maybe just the tone of voice though, and it's clearly not actually connected to what's happening in the game.

I imagine the major sports game players are working on this.

▲jsheard 9 hours ago

None that I can think of. The Finals has AI generated voiceovers for its announcers, but in that case the lines are pre-written and voice clips generated ahead-of-time so it just reeks of penny-pinching by cutting out real voice actors, rather than using the tech to do things that genuinely weren't possible before.

https://www.youtube.com/watch?v=kZ87wiHps9s

▲neilv 11 hours ago

Is a lot of the generated commentary pure fabrication?

▲raffael_de 9 hours ago

If there is a programmed connection to the physics for the in-game commentary then it should be here: https://github.com/pncnmnp/xpong/blob/main/main.py#L212

https://github.com/pncnmnp/xpong/blob/main/main.py#L289:

  "- **Shot Angles:** Derive each shot's angle from the (vx, vy) vector:\n"
  "    • Steep angles (>45°) become daring corner lobs or sharp cross-courts.\n"
  "    • Moderate angles (15°-45°) look like graceful arcs that test court coverage.\n"
  "    • Shallow angles (<15°) play out as direct, flat drives down the line.\n"

Didn't find where the ball's motion is communicated to the LLM.
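
For reference, the angle buckets in the quoted prompt could be computed directly from the velocity vector, as in the sketch below. The function names are made up, and measuring the angle from the horizontal (paddle-to-paddle) axis is an assumption, since the prompt text does not say which axis it means.

```python
import math


def shot_angle_deg(vx, vy):
    """Angle of the ball's velocity from the horizontal axis, in [0, 90] degrees."""
    return math.degrees(math.atan2(abs(vy), abs(vx)))


def classify_shot(vx, vy):
    """Map a velocity vector onto the three buckets from the quoted prompt."""
    angle = shot_angle_deg(vx, vy)
    if angle > 45:
        return "steep"
    if angle >= 15:
        return "moderate"
    return "shallow"


for velocity in [(1, 2), (-3, 1), (5, 0.5)]:
    print(velocity, classify_shot(*velocity))
```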

▲SillyUsername 9 hours ago

So like real commentary then :D

▲A4ET8a8uTh0_v2 6 hours ago

It is all in the delivery, as it were.

▲investa 9 hours ago

"Real time"

It does need some pointless anecdotes about past statistics, history of the game, training regimes, new managers and so on!

▲indigodaddy 4 hours ago

How about an alternative commentary to the staring competitions? It must be wry and dry of course, and hard to beat the existing commentary, but might be interesting to see how it turns out.

https://youtu.be/SWgg20IqibM?si=xP5ZpcQu8P2V2ZTc

▲sim7c00 11 hours ago

this is so funny my god haha. the intro is a bit dry but when the game is on it's fire haha :'). what an exhilarating match xD

▲smus 6 hours ago

I wouldn't say you taught the ai anything so much as wired some API calls together

▲ayongpm 11 hours ago

Pretty cool. I can see how commentary could make even Pong more interesting. Maybe there’s room for a pro Pong competition, kind of like what Tetris has.

▲hvardhan878 4 hours ago

I wonder if you can clone the voice and tonality of Peter Drury and even make a game of Pong emotional.

▲pawelduda 9 hours ago

Looks like the idea of Morgan Freeman narrating life in real time is closer to reality than ever

▲netsharc 2 hours ago

Hah, now I'm thinking of an AI that knows your personality and steers you to do things you've always been fearful or anxious about. "Steve sees a 10/10 girl, the kind he's always felt inadequate to talk to, but he's a different man now, and he's going to tell her 'Tickle your ass with a feather?' in 5... 4... 3... 2...".

Hah, my next startup is an AI-Assist Pick-Up Artist. But that's the "Lamborghini-desiring Crypto-Bro" package that's 49.95 USD/month, the entry level feature would encourage you to go to the gym and eat your vegetables.

AI voices talking to you... now the hallucinations are actually in your head!

▲danjl 4 hours ago

What's with the circular ball?

▲antonvs 3 hours ago

It’s 2025. We’ve been able to render small circles for a while now!

Seriously though, the entire graphics display is much more hi res than the original, and it’s not trying to emulate the original resolution. So one slightly more serious way to answer the question is, all the graphics are higher resolution, it’s just that you notice it more when it comes to the ball.

▲danjl 3 hours ago

Ok, then where's the ray tracing? ;-)

▲isaacremuant 8 hours ago

The idea is very fun and kudos to the author, but it definitely feels lacking on the actual game commentary itself. My perception is that it throws random fillers that don't quite feel like apt commentary.

If you're jokingly imitating filler from bad commentary I understand but I think I'd like more play by play and less color, but of course pong has a limited amount of inputs to work with for that commentary.

One thing that could very well work for the latency issue some commenters mention is to send the events and receive commentary outside of the rendering and playback loop, so that, within some max delay, it looks more immediate and in sync.
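
Something like the sketch below is what I have in mind: a worker thread consumes events and produces commentary while the game loop only touches two queues. The names are hypothetical and a lambda stands in for the LLM and TTS calls.

```python
import queue
import threading


def commentary_worker(events, lines, generate):
    """Consume game events off the render thread and publish commentary lines."""
    while True:
        event = events.get()
        if event is None:  # sentinel: shut down
            break
        lines.put(generate(event))


events, lines = queue.Queue(), queue.Queue()
worker = threading.Thread(
    target=commentary_worker,
    args=(events, lines, lambda e: f"Commentary on {e}"),  # stand-in for LLM + TTS
    daemon=True,
)
worker.start()

# A real game loop would call events.put() and lines.get_nowait() every frame.
# Here we just feed three events and wait for the worker to finish.
for frame_event in ["serve", "bounce", "score"]:
    events.put(frame_event)
events.put(None)
worker.join()

spoken = []
while not lines.empty():
    spoken.append(lines.get_nowait())
print(spoken)
```

Attaching a timestamp to each event would let the playback side enforce the max delay by skipping lines that arrive too late.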

Very fun idea. Hope to see it with more complex things with more inputs.

▲oulipo 6 hours ago

Perfect meta-commentary on why AI is 90% useless stuff. Main use-case of AI in the real-world is really not far from commenting a pong match, eg trying to painfully make something exciting out of something worthless and dull, but not succeeding

▲MontgomeryPy 8 hours ago

Tom Brady's job as a color commentator may be in jeopardy ;)

▲DonHopkins 9 hours ago

Commentator 1 (Greg “The Swatch Whisperer”): Welcome back, folks, to what can only be described as the pinnacle of human achievement: watching Disney Princess™ Pink paint dry. I haven’t been this excited since the 2002 Home Depot Black Friday Sale when I almost got my hands on a discontinued eggshell Martha Stewart Lavender.

Commentator 2 (Marsha “Two Coats” Hernandez): Greg, I still remember the way you wept in aisle 7. But let’s talk about today’s masterpiece—Disney Princess Pink, the shade officially inspired by the collective inner glow of Aurora, Cinderella, and, dare I say, Ariel's clam-bikini energy.

Greg: Absolutely, Marsha. And look at that glorious semi-damp sheen—like a freshly glazed donut at sunrise. It’s got a dreamy undertone of "your niece’s birthday party at 10 a.m. with a bouncy castle and too much Capri Sun."

Marsha: Oh-ho, what’s this? Is that… yes, I think the lower left quadrant is beginning to matte. Ladies and gentlemen, we may be witnessing the first signs of Stage 3: The Settling of the Pigment.

Greg (choked up): My god… I haven’t seen a transition like this since Elsa’s Let It Go phase. Remember that? How she emotionally dried her entire personality over a solo in under three minutes? Iconic.

Marsha: Speaking of queens, this paint owes everything to Belle’s bedroom in the lost “Live Laugh Library” deleted scene. That’s the shade they were going to use until someone spilled tea on the concept art. Literally. It was Chip. That kid is a menace.

Greg: I’m sorry but—hold on—this is huge. That patch near the window just tightened. We are witnessing micro-shrinkage. It’s subtle, it’s refined, it’s got the attitude of Mulan at a dim sum buffet. She came hungry, and this paint came to DRY.

Marsha: Greg, if this drying pace keeps up, we’re on track for a Suburban First-Timer Finish Time. I haven’t seen Disney Pink behave like this since the infamous 2017 "Frozen Themed Daycare Hallway Incident." They had to repaint in Tiana Teal—the shame.

Greg: And oh! There it is! That final middle patch—she’s going matte, folks. This wall is becoming a canvas of completion, a poetic stillness in a chaotic world. I feel like I just watched Cinderella get her slipper and a Roth IRA.

Marsha (tearfully): This… is why I do this job. For moments like this. For the shimmerless silence. For the slow, glorious commitment to finality.

Greg: And so we leave you, dear viewers, staring into a flat, fully-dry future. The room has changed… and so have we.

▲antonvs 3 hours ago

> Swatch Whisperer

According to Google, you’re only the second person in recorded human history to use these two words together.

▲aspenmayer 3 hours ago

> Greg: And oh! There it is! That final middle patch—she’s going matte, folks. This wall is becoming a canvas of completion, a poetic stillness in a chaotic world. I feel like I just watched Cinderella get her slipper and a Roth IRA.

> Marsha (tearfully): This… is why I do this job. For moments like this. For the shimmerless silence. For the slow, glorious commitment to finality.

> Greg: And so we leave you, dear viewers, staring into a flat, fully-dry future. The room has changed… and so have we.

I’m getting major Broomshakalaka vibes in the best possible way.

https://www.youtube.com/watch?v=zt2uIhAvQZ8