Three kinds of AI products work (seangoedecke.com)
90 points by emschwartz 3 hours ago | 33 comments
wongarsu 3 hours ago
This seems to be biased heavily towards products that look like an LLM. And yes, only a small number of those work. But that's because if your product is a thing I chat with, it's immediately in competition with ChatGPT/Claude/Grok/etc, leading to everything the article expressed. But those are hardly the only use cases for LLMs, let alone AI (whatever people nowadays mean by AI).

To name some of the obvious counter-examples, Grammarly and DeepL are both AI (and now partially LLM-based) products that don't fit any of the categories in the post, but seem pretty successful to me. Lots of successful applications of vision LLMs in document scanning too, whether you are deciphering handwritten text or just trying to get structured data out of PDFs.

msabalau 26 minutes ago
Yeah, the "normal" people I know use AI in Grammarly or Adobe Express, or are astonished and delighted by NotebookLM, mostly because of the audio overviews--but also because grounding chat with sources gets you better, more focused chat.

And, outside of chat, it's less clear that the big labs win all the time. People who care about making films, rather than video memes, often look to Kling or Runway, not just Sora. People who want to make images often have a passion for Midjourney that I've never seen for ImageFX. (Nano Banana for editing often sparks joy, so a big lab can play successfully in such a space, but that is different from saying it is destined to win.)

themanmaran 2 hours ago
Perhaps I'm biased since we're in a document-heavy industry, but I think the original post misses a lot of the non-tech company use cases. An insane percentage of human time is spent copy-pasting things from documents.
dbreunig 1 hour ago
Agree. I bucket things into three piles:

1. Batch/Pipeline: Processing a ton of things, with no oversight. Document parsing, content moderation, etc.

2. AI Features: An app calls out to an AI-powered function. Grammarly might send a document out for a summary, a CMS might want to generate tags for a post, etc.

3. Agents: AI manages the control flow.

So much of the discussion online is focused on agents that it skews the macro view, but these patterns are pretty distinct.

echelon 1 hour ago
> One other thing I haven’t mentioned is image generation. Is this part of a chatbot product, or a tool in itself? Frankly, I think AI image generation is still more of a toy than a product, but it’s certainly seeing a ton of use. There’s probably some fertile ground for products here, if they can successfully differentiate themselves from the built-in image generation in ChatGPT.

This guy is so LLM-biased that he's missing the entire media gen ecosystem.

I feel like image, video, music, voice, and 3D generation are a much bigger deal than text. Text and code are mundane compared to rich signals.

These tools are production ready today and can accomplish design, marketing, previz, concept art, game assets, web design, film VFX. It's incredibly useful. As a tool. Today.

Don't sleep on generative media.

taherchhabra 8 minutes ago
I am building one such tool: Flickpseed.ai. Give it a shot.
echelon 1 minute ago
Hope you don't mind the unsolicited feedback -

ComfyUI-inspired node graphs are the wrong approach for visual media. Nodes are great for the 1% of artists who get into it, but you really need to build the Adobe / Figma of image and video tools. Not Unreal Engine Blueprint / ComfyUI spaghetti.

ShaderToy and TouchDesigner and Comfy are neat toys, but they're not what the majority of people will use.

We want to mold ideas like clay.

Watch the demos Adobe just gave from their conference two weeks ago. That's what you should build. Something artists and designers and creatives intuit as an extension of themselves. Not a mathematical abacus.

PunchyHamster 4 minutes ago
Probably want to add "scamming the clueless out of their savings" via a combination of LLMs and voice generation.
aunty_helen 0 minutes ago
And the other side, detecting in real time as phishing is happening and intervening.
torlok 3 hours ago
So the only AI products that work is a chat bot you can talk to, or a chat bot that can perform tasks for you. Next thing you'll tell me is that the only businesses that work are ones where you can ask somebody to do something for you in exchange for money.
ohyoutravel 2 hours ago
Realistically there are only four types of businesses writ large: tourism, food service, railroads, and sales. People building AI-based products should focus on those verticals.
gervwyk 11 minutes ago
lol. Would love an episode on how Michael and Dwight respond to Jim's AI slop.
lelandbatey 1 hour ago
Really only two kinds:

- Energy generation and

- Expending energy to convince the folks generating energy to give you money for activating their neurons (food service, entertainment, tourism, transportation, sales).

Any other fun ways to compartmentalize an economy?

tehjoker 1 hour ago
Not shown: any activity involved in production, science, or healthcare just off the top of my head
alickz 2 hours ago
The only GUI products that work are GUIs that you can interface with, or that perform tasks for you

Maybe the real value of AI, particularly LLMs, is in the interface it provides for other things, and not in the AI itself

What if AI isn't the _thing_? What if it's the thing that gets us _to_ the thing?

owenpalmer 2 hours ago
> Next thing you'll tell me is that the only businesses that work are ones where you can ask somebody to do something for you in exchange for money.

What other type of business is there?

hobs 2 hours ago
That is the joke.
gordonhart 2 hours ago
The best kind of businesses are the ones I don’t have to ask; they’ve already built a better product than what I would have asked for. That’s kinda the point the OP is making about chat vs a [good] dedicated interface.
Animats 15 minutes ago
> So you can never give a support chatbot real support powers like “refund this customer”, because the moment you do, thousands of people will immediately find the right way to jailbreak your chatbot into giving them money.

And that's the elephant in the room. AI "agents" can't do much until someone solves that problem. Most AI "agents" work for and favor the business operating the agent, but impose the costs of their errors on the customer. Errors are an externality, like pollution. This is no good.

Shebanator 3 hours ago
Author forgot about image, video, and music creation. These have all been quite successful commercially, though maybe not as much artistically.
carsoon 1 hour ago
Recent articles seem to mean only LLMs when they reference AI. There are tons of commercial use cases for other models: image classification models, image generation models (traditionally diffusion models, although some do use LLMs for images now), TTS models, speech transcription, translation models, AI driving models (autopilot), AI risk assessment for fraud, 3D structural engineering enhancement models.

With many of the good use cases of AI, the end user doesn't know the AI exists, and so it doesn't feel like there is AI present.

throwawaymaths 21 minutes ago
I think there's a space for something that wraps an LLM (especially multimodal) to do something that's halfway to agentic. Yes, you could do it yourself, but it's not worth it to you to figure out prompts etc, especially when someone has already optimized it. Plus, it could go from 100 clicks and 10 minutes in front of ChatGPT to zero clicks, automated ingest, and an email when the result is baked.

A good example I saw recently was stripping ads from podcasts.
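The ad-stripping example splits cleanly: the LLM does the fuzzy part (flagging which transcript spans are ads), and boring deterministic code does the splice. A sketch of the deterministic half, assuming the model hands back ad spans as (start, end) seconds:

```python
def keep_segments(duration: float, ad_ranges: list) -> list:
    """Invert LLM-flagged ad ranges into the (start, end) ranges of audio to keep."""
    keep, cursor = [], 0.0
    for start, end in sorted(ad_ranges):
        if start > cursor:
            keep.append((cursor, start))  # audio before this ad survives
        cursor = max(cursor, end)         # handles overlapping ad ranges
    if cursor < duration:
        keep.append((cursor, duration))   # tail after the last ad
    return keep
```

The keep ranges can then be fed to ffmpeg (or any audio tool) to trim and concatenate, so a model hallucination can at worst mislabel a span, never corrupt the file.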

theptip 3 hours ago
I think this is kind of like saying “Only three kinds of internet products work, SaaS, webpages, and mobile apps”

At the level of granularity selected, maybe true. But too coarse to make any interesting distinctions or predictions.

levocardia 2 hours ago
Very obviously missing the mundane agentic work. I think the following things are basically already solved, and are just waiting for the right harness:

- Call this government service center, wait on hold for 45 minutes, then when they finally answer, tell them to reactivate my insurance marketplace account that got wrongly deleted.

- Find a good dentist within 2mi from my house, call them to make sure they take my insurance, and book an appointment sometime in the next two weeks no earlier than 11am

- Figure out how I'm going to get from Baltimore to Boston next Thursday, here's $100 and if you need more, ask me.

- I want to apply a posterizing filter in photoshop, take control of my mouse for the next 10sec and show me where it is in the menu

- Call that gym I never go to and cancel my membership

thisisit 36 minutes ago
> Figure out how I'm going to get from Baltimore to Boston next Thursday, here's $100 and if you need more, ask me.

I tried something like this last month. I was going on a holiday and asked an LLM to prepare a sightseeing guide on a fixed budget. The LLM's plan looked feasible until you looked closer.

The first issue was the opening/closing times of certain attractions. It kept saying stuff like "At 6pm you can go and visit place X", while in reality X closed at 5pm.

The second issue was underestimating walking speed/distance. The plans were often fully packed with lots of walking, and without Google Maps guidance it routinely underestimated the time: instead of, say, 10 minutes between A and B, it would estimate 5-6 minutes.

I kept prompting it to go back and check the opening hours. And once it took that into account, the walking routes became complicated, often doubling back to the same location. Lots of prompts and re-prompts to get it right.

So I don't know if this is already solved, at least at scale and within budget, especially given the token costs.

irq-1 1 hour ago
> - Find a good dentist within 2mi from my house, call them to make sure they take my insurance, and book an appointment sometime in the next two weeks no earlier than 11am

The web caused dentists to make websites, but they don't post their appointment calendar; they don't have to.

Will AI looking for appointments cause businesses to post live, structured data (like calendars)? The complexity of scheduling and multiple calendars is perfect for an AI solution. What other AI uses and interactive systems will come soon?

- Accounting: generate balance sheets, audit in real-time, and have human accountants double check it (rather than doing)

- Correspondence: create and send notifications of all sorts, and consume them

- Purchase selection: shifting the lack of knowledge about products in the customer's favor

- Forms: doing taxes or applying for a visa

input_sh 1 hour ago
"Basically already solved" = you've never used it for any of those purposes and have no idea if, or how well, they would work?
zkmon 3 hours ago
>> in five years time most internet users will spend a big part of their day scrolling an AI-generated feed.

Yep. Looking forward to the future where you can eat plastic pop-corn while watching the AI-generated video feeds.

pixl97 2 hours ago
Why 5 years, I'm pretty sure we're there today.
vorticalbox 2 hours ago
By Ai generated feeds do you mean a feed that is just full of AI posts or an AI generating a feed to one can scroll?
ZeroConcerns 2 hours ago
Well, the elephant in the room here is that the generic AI product being promised, i.e. "you get into your car in the morning, and on your drive to the office dictate your requirements for one of the apps that is going to guarantee your retirement, only to find it completely done, rolled out to all the app stores, and already making money by the time you arrive", isn't happening anytime soon, if ever, yet pretty much everyone is acting like it's already there.

Can "AI" in its current form deliver value? Sure, and it absolutely does, but it's more in the form of "several hours saved per FTE per week" than "several FTEs saved per week".

The way I currently frame it: I have a Claude 1/2-way-to-the-Max subscription that costs me 90 Euros a month. And it's absolutely worth it! Just today, it helped me debug and enhance my iSCSI target in new and novel ways. But is it worth double the price? Not sure yet...

madeofpalk 2 hours ago
The other part to this is that LLMs as a technology definitely have some value as a foundation to build features/products on other than chatbots. But it's unclear to me whether that value can sustain current valuations.

Is a better de-noising algorithm in Adobe Lightroom worth $500 billion?

ZeroConcerns 1 hour ago
> Is a better de-noising algorithm in Adobe Lightroom worth $500 billion?

No.

But: a tool that allows me to de-noise some images, just by uploading a few samples and describing what I want to change, just might be? Even more so, possibly, if I can also upload a desired result and let the "AI" work on things until it matches that?

But also: cool, that saves me several hours per week! Not: oh, wow, that means I can get rid of this entire department...

ansgri 2 hours ago
A bit off-topic, but denoise in LR is like 3 years behind purpose-built products like Topaz, so it's a bad example. They added ML-based denoise what, like a year ago?
vorticalbox 2 hours ago
I use Mongo at work and an LLM helped me find index issues.

Feeding it the explain output, the query, and the current indexes, it can quickly tell what Mongo was doing and why a query was slow.

It saved me a bunch of time, as I didn't have to read large amounts of JSON from explain to see what was going on.
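For anyone wanting to try the same thing: most of the value is just bundling the right artifacts into one prompt. A rough sketch (prompt construction only; the actual model call is whatever client you already use, and the wording here is mine, not a magic formula):

```python
import json

def build_index_review_prompt(query: dict, indexes: list, explain: dict) -> str:
    """Package the query, current indexes, and explain output into one diagnostic prompt."""
    return "\n\n".join([
        "This MongoDB query is slow. Using the explain output and the existing"
        " indexes, say which index (if any) was used, why the query is slow,"
        " and what index would fix it.",
        "Query:\n" + json.dumps(query, indent=2),
        "Current indexes:\n" + json.dumps(indexes, indent=2),
        "explain('executionStats') output:\n" + json.dumps(explain, indent=2),
    ])
```

The model then only has to pattern-match on stages like COLLSCAN vs IXSCAN, which is exactly the kind of needle-in-JSON reading it's good at.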

adastra22 2 hours ago
Agentic tools are already delivering an increase in productivity equivalent to many FTEs. I say this as someone in the position of having to hire coders and needing far fewer than we otherwise would have.
ZeroConcerns 2 hours ago
Well, yeah, as they say on Wikipedia: {{Citation Needed}}

Can AI-as-it-currently-is save FTEs? Sure: but, again, there's a template for that: {{How Many}} -- 1% of your org chart? 10%? In my case it's around 0.5% right now.

Or, to reframe it a bit: can AI pay Sam A's salary? Sure! His stock options? Doubtful. His future plans? Heck nah!

adastra22 56 minutes ago
400-800%. That is to say, I am hiring 4x-8x fewer developers for the same total output (measured in burn down progress, not AI-biased metrics like kLOC).
pixl97 2 hours ago
Skeptics always like to toss in 'if ever' as some form of enlightenment, as if they are aware of some fundamental limitation of the universe only they are privy to.
falseprofit 1 hour ago
Let’s say there are three options: {soon, later, not at all}. Ruling out only one to arrive at {later, not at all} implies less knowledge than ruling out two and asserting {later}.

Awareness of a fundamental limitation would eliminate possibilities to just {not at all}, and the phrasing would be “never”, rather than “not soon, if ever”.

mzajc 1 hour ago
Of the universe, perhaps, but humans certainly are a limiting factor here. Assuming we get this technology someday, why would one buy your software when the mere description of its functionality allows one to recreate it effortlessly?
leksak 31 minutes ago
I would consider being profitable a requirement to qualify as a product that works, and none of these fit the bill, I believe?
skerit 3 hours ago
The article claims Claude Sonnet 3.5 was released less than 9 months ago, but this is wrong.

Claude 3.5 was released in June 2024.

Maybe he has been writing this article for a while, or maybe he meant Claude Code or Claude 4.0.

simonw 2 hours ago
He meant Sonnet 3.7 which was released on the same day as Claude Code, Feb 24th 2025: https://www.anthropic.com/news/claude-3-7-sonnet

With hindsight, given that Claude Code turned into a billion-dollar product category, it was a bit of a miss bundling those two announcements together like that!

koliber 3 hours ago
A few more seem to work as well, because I've used them and found them valuable:

- human language translation

- summarization

- basic content generation

- spoken language transcription

notatoad 2 hours ago
> summarization

can you point me to a useful example of this? i see websites including ai-generated summaries all the time, but i've yet to see one that is actually useful. it seems like the product being sold here is simply "ai", not the summary itself - that is, companies and product managers are under pressure to implement some sort of AI, and sticking summaries in places is a way for them to fill that requirement and be able to say "yes, we have AI in our product"

koliber 1 hour ago
I sometimes get contracts, NDAs, or terms and conditions which normally I would automatically accept because they are low stakes and I don't have time to read them. At best I would skim them.

Now I pass them through an LLM and ask it to point out interesting, unconventional, or surprising things, and to summarize the document in a few bullet points. They're quite good at this, and I can use what I discover later in my relationship with the counterparty in various ways.

I also use it to "summarize" a large log output and point out the interesting bits that are relevant to my inquiry.

Another use case is meeting notes. I use fireflies.ai for some of my meetings and the summaries are decent.

I guess summarization might not be the right word for all these cases, but they all deal with going through the haystack to find the needle.

gregates 1 hour ago
Do you go through the haystack yourself first, find the needle, and then use that to validate your hypothesis that the AI is good at accomplishing that task (because it usually finds the same needle)? If not, how do you know they're good at the task?

My own experience using LLMs is that we frequently disagree about which points are crucial and which can be omitted from a summary.

koliber 1 hour ago
It depends on how much time I have, and how important the task is. I've been surprised and I've been disappointed.

One particular time I was wrestling with a CI/CD issue. I could not for the life of me figure it out. The logs were cryptic and there were a lot of them. In desperation I pasted the 10 or so pages of raw logs into ChatGPT and asked it to see if it could spot the problem. It gave me three potential things to look at, and the first one was it.

By directing my attention it saved me a lot of time.

At the same time, I've seen it fail. I recently pasted about 10 meetings' worth of conversation notes and asked it to summarize what one person said. It came back with garbage, mixed a bunch of things up, and in general did not come up with anything useful.

In some middle-of-the-road cases, what you said mirrors my experience: we disagree about what is notable and what is not. Still, this is a net positive. I take the stuff it gives me, discard the things I disagree with, and at least I have a partial summary. I generally check everything it spits out against the original and ask it to cite the original sources, so I don't end up with hallucinated facts. It's less time than writing up a summary myself, and it's the kind of work I find more enjoyable than typing summaries.

Still, the hit-to-miss ratio is good enough, and the time savings on the hits are impressive, so I continue to use it in various situations where I need a summary or need it to direct my attention to something.

notatoad 1 hour ago
for your first one, if you're just feeding docs into a chatbot prompt and asking for a summary, i think that matches what the article would call a "chatbot product" rather than a summarization product.

fireflies.ai is interesting though, that's more what i was looking for. i've used the meeting summary tool in google meet before and it was hilariously bad, it's good to hear that there are some companies out there having success with this product type.

koliber 21 minutes ago
I guess you’re right re chatbot for summaries. I was thinking about the use case and not the whole integrated product experience.

For example, for code gen I use agents like Claude Code, one-shot interfaces like Codex tasks, and chatbots like the generic ChatGPT. It depends on the task at hand, how much time I have, whether I am on the phone or on a laptop, and my mood. It’s all code gen though.

aunty_helen 1 hour ago
We built a system that uses summaries of video clips to build a shorts video against a screenplay. The customer was an events company. So think a 15-minute wedding highlights video that has all of the important parts to it: bride arrival, ring exchange, kiss the bride, first dance, drunken uncle, etc.
thewebguyd 2 hours ago
I've also found LLMs helpful for breaking down user requests into a technical spec or even just clarifying requests.

I make a lot of business reporting where I work, and dashboards for various things. When I get user requests for data, they're rarely clear or well thought out. Users struggle with articulating their actual requirements, which usually leads to a lot of back-and-forth emails or meetings and just delays things further.

I now paste their initial request emails into an LLM and tell it "This is what I think they are trying to accomplish, interpret their request into defined business metrics" or something similar and it does a pretty good job and saves a ton of the back and forth. I can usually then feed it a sample json response or our database schema and have it also make something quick with streamlit.

It's saved me (and the users) a ton of time and headaches of me trying to coerce more and more information from them, the LLMs have been decent enough at interpreting what they're actually asking for.

I'd love to see a day where I can hook them up with RO access to a data warehouse or something and make a self-service tool that users can prompt and it spits out a streamlit site or something similar for them.

loloquwowndueo 2 hours ago
> basic content generation

Dunno, man, I can spot AI-generated content a mile away; it tends to be incredibly useless, so once I spot it, I'll run in the opposite direction.

carsoon 2 hours ago
You spot bad AI content. Since there is no button that will tell you if something was AI-generated, you never know if what you read was or wasn't.
HelloUsername 2 hours ago
> once I spot it

Exactly; pretty sure you've seen media or read text that you thought was human-created.

koliber 1 hour ago
I hate what LLMs spit out and would never accept the whole output verbatim.

I love how they occasionally come up with a turn of phrase, a thought path, or a surprising perspective. I work with them iteratively to brainstorm, transform, and compose content that I incorporate into my own work.

Regarding spotting AI-generated content, I was once accused of posting AI-generated content where I bona fide typed every single letter myself, without so much as glancing at an LLM. People's confidence in spotting AI content will vary, and errs towards both false positives and false negatives. My kids now think all CG movies are AI generated, even the ones that pre-date image and video gen. They're pretty sure it's AI though.

samuelknight 1 hour ago
I look at LLMs with an engineering mindset. An LLM is an intelligence black box that goes in the toolbox alongside the old classical algorithms and frameworks. In order to use it in a solution I need to figure out:

1) Whether I can give it information in a compatible and cost-effective way

2) Whether the model is likely to produce useful output

I had used language models for years before LLMs, such as the part-of-speech classifiers in the Python NLTK framework.

EagnaIonat 1 hour ago
> This doesn’t work well because savvy users can manipulate the chatbot into calling tools. So you can never give a support chatbot real support powers like “refund this customer”, ...

I would disagree with this.

Part of how security is handled in current agentic systems is to not let the LLM have any access to how the underlying tools work. At best it's like hitting "inspect" in your browser and changing the web page.

Of course, that assumes that the agentic chatbot has been built correctly.

sgt101 54 minutes ago
The product is the LLM; the wrapper has marginal value atm.

You can write an agent, it's cool. I can copy it.

I cannot build my own LLM (although I can run open source ones).

larodi 2 hours ago
More than three kinds are then actually listed in the article
shermantanktop 2 hours ago
Formatting did not help. Three kinds, but then subheadings in the same size font, and then here come two more kinds, plus a side journey into various topics.
kken 1 hour ago
Well, considering that the long-term idea is to have AGI, general intelligence, it seems that the goal is also to have only a single product in the end.

There may be different ways to access it, but the product is always the same.

websap 2 hours ago
> Users simply do not want to type out “hey, can you increase the font size for me” when they could simply hit “ctrl-plus” or click a single button.

I would def challenge this. “Turn off private relay”, “send this photo to X”, and “add a pit stop at a coffee shop along the way” are all voice commands I would love to use.

Mikhail_Edoshin 1 hour ago
The old Apple Newton had a feature, I don't remember what it was called, but on any screen you could write "please" and then describe what to do, e.g. using one of their examples: "please fax this to Bob". And it worked. Internally it was a rather simple keyword match plus access to data, such as the system address book. New applications could register their own names for actions and relevant dictionaries.
chrisweekly 2 hours ago
Yes, this! Especially the last one. Finding coffee shop / restaurant options ALONG THE WAY seems like it should've been solved years ago. Scenario: while driving, "want to eat in about an hour, must have vegetarian options, don't add more than 10 minutes extra drive time" and get a shortlist to pick from.
hencq 1 hour ago
Yeah that one is surprisingly difficult even with a Human Intelligence in the passenger seat.
YesBox 50 minutes ago
Regarding games:

> A third reason could be that generated content is just not a good fit for gaming.

This is my current opinion as a game developer. IMO this isn't going to be fun for most once the novelty wears off. Games are goal oriented at the end of the day and the great games are masterfully curated multi-disciplinary experiences. I'd argue throwing a game wrapper around an LLM is a new LLM experience, not a new game experience.

happyopossum 2 hours ago
Very myopic view here - agents are turning out useful output in many fields outside of coding.
Xiol 2 hours ago
Such as?
carsoon 2 hours ago
Legal seems to be a big use case for AI. I think more for simplification and classification versus generation, though.
8organicbits 3 hours ago
> It’s easy to verify changes by running tests or checking if the code compiles

This is actually a low bar, when the agent wrote those tests.

Mikhail_Edoshin 2 hours ago
AI would make a very good librarian. It doesn't understand, only comprehends, but in this case it is enough.

Thing is, there is no library for it to work in.

SirensOfTitan 3 hours ago
I’ve been working on a learning / incremental reading tool for a while, and I’ve found LLMs and LLM-adjacent tech useful, but as ways of resolving ambiguity within a product that doesn’t otherwise show any use of an LLM. It’s like LLM-as-parser.
owenpalmer 2 hours ago
Is there somewhere I can try the tool out? I'm interested in that kind of thing.
bix6 3 hours ago
On agents it’s interesting but not surprising coding has seen so much initial success.

Personally I’m waiting for better O365 and SharePoint agents. I think there’s a lot of automation and helper potential there.

airstrike 3 hours ago
I'm building an opinionated take on this. It's shaping up nicely.

If you're a Rust developer reading this, interested in AI + GUI + Enterprise SaaS, and want to talk, I'm building a team as we speak. E-mail in profile.

bix6 1 hour ago
So like an o365 ServiceNow?
esseph 2 hours ago
At this point MS should probably sunset SharePoint and try again.
bix6 1 hour ago
How come?
renewiltord 2 hours ago
The classic problem online commenters face is that they only know products that are on Hacker News and Reddit. And I get why: if you're not plugged into anything, the only way to get information is social media, so you only know social media.

E.g. https://www.thomsonreuters.com/en/press-releases/2025/septem...

B2B AI company, 2 years in, sold for hundreds of millions; not an agent, chatbot, or completion. Do you know it exists? No. You only read Hacker News. How could you know?

PopAlongKid 2 hours ago
>The company’s [Additive] technology automates complex tasks such as extracting footnotes from K-1s, K-3s, and related forms, so every staff member can become a reviewer and complete work that used to take weeks in a matter of hours.

Any tax professional who takes weeks to enter footnote info from a K-1 form into their professional tax prep software is probably just as bad at other job-related tasks and either needs more training or to find another job.

Dilettante_ 2 hours ago

  Additive’s GenAI-native platform streamlines the repetitive, time-consuming task of ingesting and parsing pass-through entity documents
From TFA:

  There’s another kind of agent that isn’t about coding: the research agent. LLMs are particularly good at tasks like “skim through ten pages of search results” or “keyword search this giant dataset for any information on a particular topic”.
Aldipower 3 hours ago
In my current project the agent (GPT-5) isn't helpful at all. Damn thing lies to me all the time.
kevin_thibedeau 2 hours ago
They're idiot savants. Use them for their strengths. Know their weaknesses.
Aldipower 2 hours ago
So, what are their strengths then? I've fed it a detailed, very well documented and typed API description, asking it to construct some not-too-hard code snippets based on that. GPT-5 then pretends to do the right thing, but actually creates meaningless nonsense out of it. Even after I tried to reiterate and refine my tasks. Every junior dev is waaay better.
gherkinnn 2 minutes ago
I recently had something no longer compile. I got bored sniffing around after maybe an hour, set Claude in Zed on to it, got a snack, and by the time I was back it had found the problem.

When I am unsure how to implement something, I give an LLM a rough description and then tell it to ask me five questions it needs answered to produce a good solution. More often than not, that uncovers a blind spot.

LLMs remain unhelpful at writing code beyond trivial tasks though.

ohyoutravel 2 hours ago
Parsing a thousand line stack trace and telling me what the problem was. Writing regexes. Spitting out ffmpeg commands.
bob1029 2 hours ago
Chatbot is the only one I agree with (human in the loop).

Agents are essentially the chatbot, but without the human in the loop. Chatbot without human in the loop is a slop factory. Things like "multi-agent systems" are a clever ploy to get you to burn tokens and ideally justify all this madness.

Copilot/completion does not work in business terms for me. It looks like it works and it might feel like it's working in some localized technical sense, but it does not actually work on strategic timescales with complex domains in such a way that a customer would eventually be willing to pay you money for the results. The hypothesis that work/jobs will be created due to sloppy AI is proving itself out very quickly. I think "completion" tools like classic IntelliSense are still at the peak of efficiency.

mrweasel 2 hours ago
Chatbots in many environments simply don't work, because we won't let them, and if we did, they'd be agents. Here I'm mostly thinking in terms of things like customer service chats. A chatbot that can't reach into other systems is essentially only useful for role playing.

The copilot/completion thing also doesn't work for me. I have no doubt that a lot of developers are having a lot of benefits from the coding LLMs, but I can't make them work.

I think one glaring obvious missing kind of AI is medical image recognition, which is already deployed and working in many scenarios.

adammarples 2 hours ago
>I think there are serious ethical problems with this kind of product.

Unless there are serious ethical problems with people generating arbitrary text, i.e. writing, then no, there aren't.

theonething 1 hour ago
Seems like data analysis would be a good one. A company ingests massive amounts of disparate business data; ask AI to clean and normalize it, visualize it, and give recommendations.
baxtr 3 hours ago
From the article:

> Summary

By my count, there are three successful types of language model product:

- Chatbots like ChatGPT, which are used by hundreds of millions of people for a huge variety of tasks

- Completions coding products like Copilot or Cursor Tab, which are very niche but easy to get immediate value from

- Agentic products like Claude Code, Codex, Cursor, and Copilot Agent mode, which have only really started working in the last six months

On top of that, there are two kinds of LLM-based product that don’t work yet but may soon:

- LLM-generated feeds

- Video games that are based on AI-generated content
