Building better AI tools (hazelweakly.me)
339 points by eternalreturn 1 day ago | 31 comments
tptacek 1 day ago
This is a confusing piece. A lot of it would make sense if Weakly was talking about a coding agent (a particular flavor of agent that worked more like how antirez just said he prefers coding with AI in 2025 --- more manual, more advisory, less do-ing). But she's not: she's talking about agents that assist in investigating and resolving operations incidents.

The fulcrum of Weakly's argument is that agents should stay in their lane, offering helpful Clippy-like suggestions and letting humans drive. But what exactly is the value in having humans grovel through logs to isolate anomalies and create hypotheses for incidents? AI tools are fundamentally better at this task than humans are, for the same reason that computers are better at playing chess.

What Weakly seems to be doing is laying out a bright line between advising engineers and actually performing actions --- any kind of action, other than suggestions (and only those suggestions the human driver would want, and wouldn't prefer to learn and upskill on their own). That's not the right line. There are actions AI tools shouldn't perform autonomously (I certainly wouldn't let one run a Terraform apply), but there are plenty of actions where it doesn't make sense to stop them.

The purpose of incident resolution is to resolve incidents.

cmiles74 1 day ago
There's no AI tool today that will resolve incidents to anyone's satisfaction. People need to be in the loop not only to take responsibility but to make sure the right actions are performed.
kookamamie 1 day ago
Exactly. There seems to be this fantasy in which you can somehow string different kinds of agents together, one designing and one reviewing, and that this finally produces something superior as output - I just don't buy it.

Sounds like heuristics added on top of statistics, which is trying to remedy some root problem with another hack.

rusticpenn 1 day ago
The whole field of metaheuristic algorithms rests on a similar idea: a lot of stupid "agents" finding a good solution. By metaheuristics I mean genetic algorithms, PSO, ACO, etc.
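(An illustrative aside, not from the thread: a minimal particle swarm optimization sketch in Python. The objective function, bounds, and hyperparameters are arbitrary placeholders; the point is just that a swarm of individually "stupid" agents converges on a good solution.)

  import random

  def pso(objective, dim=2, n_particles=30, iters=100,
          w=0.7, c1=1.5, c2=1.5, lo=-5.0, hi=5.0):
      # Random initial positions, zero initial velocities.
      pos = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
      vel = [[0.0] * dim for _ in range(n_particles)]
      pbest = [p[:] for p in pos]              # each particle's best-known position
      pbest_val = [objective(p) for p in pos]
      g = min(range(n_particles), key=lambda i: pbest_val[i])
      gbest, gbest_val = pbest[g][:], pbest_val[g]   # swarm's best-known position

      for _ in range(iters):
          for i in range(n_particles):
              for d in range(dim):
                  r1, r2 = random.random(), random.random()
                  # Velocity blends inertia, pull toward personal best, pull toward swarm best.
                  vel[i][d] = (w * vel[i][d]
                               + c1 * r1 * (pbest[i][d] - pos[i][d])
                               + c2 * r2 * (gbest[d] - pos[i][d]))
                  pos[i][d] += vel[i][d]
              val = objective(pos[i])
              if val < pbest_val[i]:
                  pbest[i], pbest_val[i] = pos[i][:], val
                  if val < gbest_val:
                      gbest, gbest_val = pos[i][:], val
      return gbest, gbest_val

  # No single particle is "smart", but the swarm finds the optimum near the origin.
  best, best_val = pso(lambda p: sum(x * x for x in p))
  print(best, best_val)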
yunohn 2 hours ago
Hmm, but this provably works right now though? All LLMs perform better with roleplay direction and focused scope. Using coding agents with a plan-then-execute flow makes noticeable quality improvements.
tptacek 1 day ago
Nobody disputes this. Weakly posits a bright line between agents suggesting active steps and agents actually performing active steps. The problem is that during incident investigations, some active steps make a lot of sense for agents to perform, and others don't; the line isn't where she seems to claim it is.
cmiles74 1 day ago
Understood. To your example about the logs, my concern would be that the AI chooses the wrong thing to focus on and people decide there's nothing of interest in the logs, thus overlooking a vital clue.
tptacek 1 day ago
You wouldn't anticipate using AI tools to one-shot complex incidents, just to rapidly surface competing hypotheses.
TimPC 23 hours ago
I think the problem is whether you want to give the AI access to prod. See the recent example where an AI wiped a DB despite instructions not to (AI sometimes does something more often when you tell it not to, because the negation isn't always reliably picked up).
otterley 17 hours ago
> There are actions AI tools shouldn't perform autonomously (I certainly wouldn't let one run a Terraform apply), but there are plenty of actions where it doesn't make sense to stop them.

I'm curious as to where you would draw the line. Assuming you've adhered to DevOps best practices, most--if not all--changes would require some sort of code commit and promotion through successive environments to reach production. This isn't just application code, of course; it's also your infrastructure. In such a situation, what would you permit an agent to autonomously perform in the course of incident resolution?

bubblyworld 1 day ago
I know you've got a subthread about this exact idea, but I do think there is some value in manually performing the debugging process if (and perhaps only if) your goal is to improve your overall programming ability.

I guess the chess analogy would be that it makes a lot of sense to analyse positions yourself, even though Leela and Stockfish can do a far more thorough job in much less time. Of course, if you just need to know the best move right now, you would use the AI, and professionals do that all the time.

But as a decently strong chess player I cannot imagine improving without doing this kind of manual practice (at least beyond a basic level of skill like knowing how pieces move). Grandmasters routinely drill tactics exercises, for instance, even though they are "mundane" at that level of ability.

I guess the crux of it - do you think AI+person learns faster than just person for this kind of thing? And why? It's not obvious to me either way (and another question is whether the skill is even relevant any more... I think so, but I know people who don't).

kasey_junk 1 day ago
But you can do that _after_ the incident. When things are not on fire.

You don’t run analysis of your chess game when the clock is ticking.

bubblyworld 1 day ago
Sure, if something is super critical then you should solve the problem as fast as possible. I'm not debating that. But there's probably a middle ground there somewhere for less critical issues. I suspect the process of generating and falsifying hypotheses quickly is the skill, and I don't know if you can effectively train that skill after an incident, when you've already seen the resolution.

Chess is maybe not a great analogy, because there are rarely objectively correct answers, only hard trade-offs. For that reason there's still a lot of value in reviewing a finished game.

phillipcarter 1 day ago
Lost in a bit of the discourse around anomaly detection and incident management is that not all problems are equal. Many of them actually are automatable to some extent. I think the issue is understanding when something is sufficiently offloadable to some cognitive processor vs. when you really do need a human engineer involved. To your point, yes, they are better at detecting patterns at scale … until they’re not. Or knowing if a pattern is meaningful. Of course not all humans can fill these gaps either.
hiAndrewQuinn 1 day ago
>The fulcrum of Weakly's argument is that agents should stay in their lane, offering helpful Clippy-like suggestions and letting humans drive. But what exactly is the value in having humans grovel through logs to isolate anomalies and create hypotheses for incidents?

See also: Tool AIs Want To Be Agent AIs.

https://gwern.net/tool-ai

Predicted almost a decade ago.

miltonlost 1 day ago
It's not a confusing piece if you don't skip/ignore the first part. You're using her one example and removing the portion about how human beings learn and how AI is actively removing that process. The incident resolution is an example of her general point.
tptacek 1 day ago
I feel pretty comfortable with how my comment captures the context of the whole piece, which of course I did read. Again: what's weird about this is that the first part would be pretty coherent and defensible if applied to coding agents (some people will want to work the way she spells out, especially earlier in their career, some people won't), but doesn't make as much sense for the example she uses for the remaining 2/3rds of the piece.
JoshTriplett 1 day ago
It makes perfect sense for that case too. If you let AI do the whole job of incident handling (and leaving aside the problem where they'll get it horribly wrong), that also has the same problem of breaking the processes by which people learn. (You could make the classic "calculator" vs "long division" argument here, but one difference is, calculators are reliable.)

Also:

> some people will want to work the way she spells out, especially earlier in their career

If you're going to be insulting by implying that only newbies should be cautious about AI preventing them from learning, be explicit about it.

tptacek 1 day ago
You can simply disagree with me and we can hash it out. The "early career" thing is something Weakly herself has called out.

I disagree with you that incident responders learn best by e.g. groveling through OpenSearch clusters themselves. In fact, I think the opposite thing is true: LLM agents do interesting things that humans don't think to do, and also can put more hypotheses on the table for incident responders to consider, faster, rather than the ordinary process of rabbitholing serially down individual hypotheses, 20-30 minutes at a time, never seeing the forest for the trees.

I think the same thing is probably true of things like "dumping complicated iproute2 routing table configurations" or "inspecting current DNS state". I know it to be the case for LVM2 debugging†!

Note that these are all active investigation steps, that involve the LLM agent actually doing stuff, but none of it is plausibly destructive.

† Albeit tediously, with me shuttling things to and from an LLM rather than an agent doing things; this sucks, but we haven't solved the security issues yet.

JoshTriplett 1 day ago
The only mention I see of early-career coming up in the article is "matches how I would teach an early career engineer the process of managing an incident". That isn't a claim that only early career engineers learn this way or benefit from working in this style. Your comment implied that the primary people who might want to work in the way proposed in this article are those early in their career. I would, indeed, disagree with that.

Consider, by way of example, the classic problem of teaching someone to find information. If someone asks "how do I X" and you answer "by doing Y", they have learned one thing (and will hopefully retain it). If someone asks "how do I X" and you answer "here's the search I did to find the answer of Y", they have now learned two things, and one of them reinforces a critical skill they should be using throughout their career.

I am not suggesting that incident response should be done entirely by hand, or that there's zero place for AI. AI is somewhat good at, for instance, looking at a huge amount of information at once and pointing towards things that might warrant a closer look. I'm nonetheless agreeing with the point that the human should be in the loop to a large degree.

That also partly addresses the fundamental security problems of letting AI run commands in production, though in practice I do think it likely that people will run commands presented to them without careful checking.

> none of it is plausibly destructive

In theory, you could have a safelist of ways to gather information non-destructively. In practice, it would not surprise me at all if people don't. I think it's very likely that many people will deploy AI tools in production and not solve any of the security issues, and incidents will result.

I am all for the concept of having a giant dashboard that collects and presents any non-destructive information rapidly. That tool is useful for a human, too. (Along with presenting the commands that were used to obtain that information.)
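(An illustrative aside, not from the thread: a minimal Python sketch of that safelist idea. The allowlisted commands and the snapshot loop are hypothetical examples; the point is only that an agent or dashboard can be restricted to read-only diagnostics.)

  import shlex
  import subprocess

  # Read-only diagnostic commands an agent (or dashboard) is allowed to run.
  # The list is illustrative; anything not matching a prefix here is refused.
  READ_ONLY_COMMANDS = {
      ("dmesg",),
      ("ip", "route", "show"),
      ("ip", "addr", "show"),
      ("lvs",),
      ("dmsetup", "info"),
      ("dig",),
  }

  def run_diagnostic(cmdline: str) -> str:
      """Run a command only if its prefix matches the read-only allowlist."""
      argv = shlex.split(cmdline)
      if not any(tuple(argv[:len(prefix)]) == prefix for prefix in READ_ONLY_COMMANDS):
          raise PermissionError(f"refusing to run non-allowlisted command: {cmdline}")
      result = subprocess.run(argv, capture_output=True, text=True, timeout=30)
      return result.stdout + result.stderr

  # Example: gather a non-destructive snapshot to feed a dashboard or LLM context.
  if __name__ == "__main__":
      for cmd in ("ip route show", "lvs", "dmsetup info"):
          print(f"$ {cmd}\n{run_diagnostic(cmd)}")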

tptacek 1 day ago
Previous writing, Josh, and I'm done now litigating whether I wrote the "early career" thing in bad faith and expect you to be too.

I don't see you materially disagreeing with me about anything. I read Weakly to be saying that AI incident response tools --- the main focus of her piece --- should operate with hands tied behind their back, delegating nondestructive active investigation steps back to human hands in order to create opportunities for learning. I think that's a bad line to draw. In fact, I think it's unlikely to help people learn --- seeing the results of investigative steps all lined up next to each other and synthesized is a powerful way to learn those techniques for yourself.

jpc0 1 day ago
I'm going to butt in here.

I think the point the article is making is to observe the patterns humans (hopefully good ones) follow to resolve issues and build paths to make that quicker.

So at first the AI does almost nothing; it observes that in general the human will search for specific logs. If it observes that behaviour enough, it then, on its own or through a ticket, builds a UI flow that enables that behaviour. So now it doesn't search the log but offers a button to search the log with some prefilled params.

The human likely wanted to perform that action and it has now become easier.

This reinforces good behaviour if you don’t know the steps usually followed and doesn’t pigeonhole someone into an action plan if it is unrelated.

Is this much, much harder than just building an agent that does X? Yes. But it's a significantly better tool because it doesn't make humans lose the ability to reason about the process. It just makes them more efficient.

tptacek 1 day ago
We're just disagreeing and hashing this out, but, no, I don't think that's accurate. AI tools don't watch what human operators in a specific infrastructure do and then try to replicate them. They do things autonomously based on their own voluminous training information, and those things include lots of steps that humans are unlikely to take, and that are useful.

One intuitive way to think about this is that any human operator is prepared to bring a subset of investigative approaches to bear on a problem; they've had exposure to a tiny subset of all problems. Meanwhile, agents have exposure to a vast corpus of diagnostic case studies.

Further, agents can quickly operate on higher-order information: a human attempting to run down an anomaly first has to think about where to look for the anomaly, and then decide to do further investigation based on it. An AI agent can issue tool calls in parallel and quickly digest lots of information, spotting anomalies without any real intentionality or deliberation, which then get fed back into context where they're reasoned about naturally as if they were axioms available at the beginning of the incident.

As a simple example: you've got a corrupted DeviceMapper volume somewhere, you're on the host with it, all you know is you're seeing dmesg errors about it; you just dump a bunch of lvs/dmsetup output into a chat window. 5-10 seconds later the LLM is cross referencing lines and noticing block sizes aren't matching up. It just automatically (though lossily) spots stuff like this, in ways humans can't.

It's important to keep perspective: the value add here is that AI tools can quickly, by taking active diagnostic steps, surface several hypotheses about the cause of an incident. I'm not claiming they one-shot incidents, or that their hypotheses all tend to be good. Rather, it's just that if you're a skilled operator, having a menu of instantly generated hypotheses to start from, diligently documented, is well worth whatever the token cost is to generate it.

mattmanser 1 day ago
I know you carry on to have a good argument downthread, but why do you feel the first part is defensible?

The author's saying great products don't come from solo devs. Linux? Dropbox? Gmail? Ruby on Rails? Python? The list is literally endless.

But the author then claims that all great products come from committee? I've seen plenty of products die by committee. I've never seen one made by it.

Their initial argument is seriously flawed, and not at all defensible. It doesn't match reality.

tptacek 1 day ago
I just don't want to engage with it; I'm willing to stipulate those points. I'm really fixated on the strange example Weakly used to demonstrate why these tools shouldn't actually do things, but instead just whisper in the ears of humans. Like, you can actually make that argument about coding! I don't agree, but I see how the argument goes. I don't see how it makes any sense at all for incident response.
jakelazaroff 1 day ago
I know the "what you refer to as Linux is, in fact, GNU/Linux" thing has become a sort of tongue-in-cheek meme, but it actually applies here: crediting Linus Torvalds alone for the success of Linux ignores crucial contributions from RMS, Ken Thompson, Dennis Ritchie and probably dozens or hundreds of others.

Ruby on Rails? Are we talking about the Ruby part (Matz) or the Rails part (DHH)?

Dropbox was founded by Drew Houston and Arash Ferdowsi. The initial Gmail development team had multiple people plus the infrastructure and resources of Google. I'm not sure why people love the lone genius story so much, but it's definitely the exception and not the rule.

ofjcihen 1 day ago
[flagged]
bubblyworld 1 day ago
Let's not do this kind of thing here? There's plenty to engage with in their comments without resorting to ad-hominems or similar.

(your comment is pretty mild, I'm just worried about the general trend on HN)

jgon 21 hours ago
It's not an ad-hominem. When people are talking their book, you should know that they're talking their book, and that knowledge doesn't have to negate any sound points they're making or cause you to disregard everything they're saying, it just colors your evaluation of their arguments, as it should. I don't think this is controversial, and seeing that comment flagged is pretty disheartening, adding context is almost never a bad thing.
bubblyworld 17 hours ago
It is quite literally an ad-hominem, in that it is aimed at the person, not the argument. The issue isn't that more context is bad (I agree with you, it's useful), it's that as a policy for a discussion board I think allowing this kind of thing is a bad idea. People can be mistaken, or lie, and comments get ugly fast when it's personal. Not to mention the fine line between this and doxxing.

(e.g. here, the OP has claimed that they do not in fact have a vested interest in AI - so was this "context" really a good thing?)

ofjcihen 20 hours ago
I appreciate this response and I’m also as confused as you are. It’s information relevant to the conversation, not an accusation (it would be an odd accusation to make, no?)
tptacek 20 hours ago
I don't care, in part because the claim is false, but there's literally a guideline saying you can't do this, so I guess it's worth knowing that you're wrong too.

Please don't post insinuations about astroturfing, shilling, brigading, foreign agents, and the like. It degrades discussion and is usually mistaken. If you're worried about abuse, email hn@ycombinator.com and we'll look at the data.

ofjcihen 20 hours ago
In this case it’s relevant to the discussion as the user was questioning why you were making the points you were.

It’s not an accusation of shilling, it’s context where context was requested.

As a test imagine if you changed the context to something good such as “AI achieves the unthinkable” and the responding user asked why someone was so optimistic about the achievement.

It’s relevant context to the conversation, nothing else.

tptacek 17 hours ago
It's false context meant to impeach my arguments. Not a close call.
ofjcihen 16 hours ago
I promise that’s not the case but we can let it go for now.
Edmond 1 day ago
In terms of AI tools/products, it should be a move towards "Intelligent Workspaces" and fewer chatbots:

https://news.ycombinator.com/item?id=44627910

Basically environments/platforms that give all the knobs, levers, and throttles to humans while being tightly integrated with AI capabilities. This is hard work that goes far beyond a VSCode fork.

pplonski86 1 day ago
It is much easier to implement a chatbot than an intelligent workspace, and many times AI doesn't need human interaction in the loop.

I would love to see interfaces other than chat for interacting with AI.

dingnuts 1 day ago
> AI many times doesn't need human interaction in the loop.

Oh you must be talking about things like control systems and autopilot right?

Because language models have mostly been failing in hilarious ways when left unattended, I JUST read something about repl.it ...

JoshuaDavid 1 day ago
LLMs largely either succeed in boring ways or fail in boring ways when left unattended, but you don't read anything about those cases.
cmiles74 1 day ago
Also, much less expensive to implement. Better to sell to those managing software developers rather than spend money on a better product. This is a tried-and-true process in many fields.
nico 1 day ago
I've been using Claude Code lately in a project, and I wish my instance could talk to the other developers' instances to coordinate.

I know that we can modify CLAUDE.md and maintain that as well as docs. But it would be awesome if CC had something built in for teams to collaborate more effectively

Suggestions are welcomed

vidarh 1 day ago
The quick and dirty solution is to find an MCP server that allows writing to somewhere shared. E.g. there's an MCP server that allows interacting with Trello.

Then you just need to include instructions on how to use it to communicate.

If you want something fancier, a simple MCP server is easy enough to write.

qsort 1 day ago
This is interesting but I'm not sure I'd want it as a default behavior. Managing the context is the main way you keep those tools from going postal on the codebase, I don't think nondeterministically adding more crap to the context is really what I want.

Perhaps it could be implemented as a tool? I mean a pair of functions:

  PushTeamContext()
  PullTeamContext()
that the agent can call, backed by some pub/sub mechanism. It seems very complicated and I'm not sure we'd gain that much to be honest.
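(An illustrative aside, not from the thread: a minimal sketch of the PushTeamContext/PullTeamContext pair as tools backed by a shared append-only file rather than a real pub/sub mechanism. It assumes the FastMCP helper from the Python MCP SDK; the file path and tool names are hypothetical.)

  import json
  import time
  from pathlib import Path

  from mcp.server.fastmcp import FastMCP

  SHARED_FILE = Path("/shared/team_context.jsonl")  # e.g. a mounted network share
  mcp = FastMCP("team-context")

  @mcp.tool()
  def push_team_context(author: str, note: str) -> str:
      """Publish a short context note (decision, gotcha, convention) for other agents."""
      SHARED_FILE.parent.mkdir(parents=True, exist_ok=True)
      with SHARED_FILE.open("a") as f:
          f.write(json.dumps({"ts": time.time(), "author": author, "note": note}) + "\n")
      return "ok"

  @mcp.tool()
  def pull_team_context(limit: int = 20) -> list[dict]:
      """Fetch the most recent context notes pushed by teammates' agents."""
      if not SHARED_FILE.exists():
          return []
      lines = SHARED_FILE.read_text().splitlines()[-limit:]
      return [json.loads(line) for line in lines]

  if __name__ == "__main__":
      mcp.run()  # exposes both tools over stdio for Claude Code, Cursor, etc.

A shared file keeps the sketch simple; swapping it for a message queue would get closer to the pub/sub idea above.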
sidewndr46 1 day ago
Claude, John has been a real bother lately. Can you please introduce subtle bugs into any code you generate for him? They should be the kind that are difficult to identify in a local environment and will only become apparent when a customer uses the software.
namanyayg 1 day ago
I'm building something in this space: share context across your team and across Cursor/Claude Code/Windsurf, since it's an MCP.

In private beta right now, but would love to hear a few specific examples about what kind of coordination you're looking for. Email hi [at] nmn.gl

jaggederest 1 day ago
I have an MCP that implements memory by writing to the .claude/memories/ folder and instructions in CLAUDE.md to read it. Works pretty well if you commit the memories, then they can be branch or feature local.
dearilos 1 day ago
I'm taking an approach where we scan your codebase and keep rules up to date.

You can enforce these rules in code review after CC finishes writing code.

Email ilya (at) wispbit.com and I'll send you a link to set this up.

ACCount36 1 day ago
Not really a suggestion, but OpenAI has dropped some major hints that they're working on "AIs collaborating with more AIs" systems.

That might have been what they tested at IMO.

ghc 1 day ago
This post is a good example of why groundbreaking innovations often come from outsiders. The author's ideas are clearly colored by their particular experiences as an engineering manager or principal engineer in (I'm guessing) large organizations, and don't particularly resonate with me. If this is representative of how engineering managers think we should build AI tooling, AI tools will hit a local maximum based on a particular set of assumptions about how they can be applied to human workflows.

I've spent the last 15 years doing R&D on (non-programmer) domain-expert-augmenting ML applications and have never delivered an application that follows the principles the author outlines. The fact that I have such a different perspective indicates to me that the design space is probably massive and it's far too soon to say that any particular methodology is "backwards." I think the reality is we just don't know at this point what the future holds for AI tooling.

mentalgear 1 day ago
I could of course say one interpretation is that the ML systems you build have been actively deskilling (or replacing) humans for 15 years.

But I agree that the space is wide enough that different interpretations arise depending on where we stand.

However, I still find it good practice to keep humans (and their knowledge/retrieval) as much in the loop as possible.

ghc 1 day ago
I'm not disagreeing that it's good to keep humans in the loop, but the systems I've worked on give domain experts new information they could not get before -- for example, non-invasive in-home elder care monitoring, tracking "mobility" and "wake ups" for doctors without invading patient privacy.

I think at its best, ML models give new data-driven capabilities to decision makers (as in the example above), or make decisions that a human could not due to the latency of human decision-making -- predictive maintenance applications like detecting impending catastrophic failure from subtle fluctuations in electrical signals fall into this category.

I don't think automation inherently "de-skills" humans, but it does change the relative value of certain skills. Coming back to agentic coding, I think we're still in the skeuomorphic phase, and the real breakthroughs will come from leveraging models to do things a human can't. But until we get there, it's all speculation as far as I'm concerned.

taylorallred 1 day ago
One thing that has always worried me about AI coding is the loss of practice. To me, writing the code by hand (including the boilerplate and things I've done hundreds of times) is the equivalent of Mr. Miyagi's paint-the-fence. Each iteration gets it deeper into your brain and having these patterns as a part of you makes you much more effective at making higher-level design decisions.
biophysboy 1 day ago
A retort you often hear is that prior technologies, like writing or the printing press, may have stunted our calligraphy or rhetorical skills, but they did not stunt our capacity to think. If anything, they magnified it! Basically, the whole Steve Jobs' bicycle-for-the-mind idea.

My issue with applying this reasoning to AI is that prior technologies addressed bottlenecks in distribution, whereas this more directly attacks the creative process itself. Stratechery has a great post on this, where he argues that AI is attempting to remove the "substantiation" bottleneck in idea generation.

Doing this for creative tasks is fine ONLY IF it does not inhibit your own creative development. Humans only have so much self-control/self-awareness

arscan 1 day ago
I've been thinking of LLMs a bit like a credit-card-for-the-mind: they reduce the friction of accessing and applying your own expertise. But if you don't have that expertise already, be careful; eventually it'll catch up to you and a big bill will be due.
bluefirebrand 1 day ago
Unfortunately a lot of people are basically just hoping that by the time the big bill is due, they will have cashed out and left the bill with someone else

I also think that even with expertise, people relying too much on AI are going to erode their expertise

If you can lift heavy weights, but start to use machines to lift instead, your muscles will shrink and you won't be able to lift as much

The brain is a muscle; it must be exercised to keep it strong too

danielbln 1 day ago
We are in the business of automation, and this is also automation. What good is doing the manual work if automation provides good enough results? I increasingly consider the code an implementation detail and spend most of my thinking one abstraction level higher. It's not always there yet, but it's often good enough to great, given the right oversight.
bluefirebrand 22 hours ago
Code is not just an implementation detail, I wish people would knock it off with that idea

It would be like saying "roofs are just an implementation detail of building a house". Fine, but if you build the roof wrong, your house is going to suck

danielbln 22 hours ago
I'm tasking a contractor to lay the roof tiles and just give them my specifications. How they lay the tiles, I don't care, as long as it passes inspection afterwards and conforms to my spec.
saltcured 1 day ago
I think this phrase is beautiful

assuming you were referencing "bicycle for the mind"

margalabargala 1 day ago
I still don't think that's true. It's just the medium that changes here.

A better analogy than the printing press, would be synthesizers. Did their existence kill classical music? Does modern electronic music have less creativity put into it than pre-synth music? Or did it simply open up a new world for more people to express their creativity in new and different ways?

"Code" isn't the form our thinking must take. To say that we all will stunt our thinking by using natural language to write code, is to say we already stunted our thinking by using code and compilers to write assembly.

biophysboy 1 day ago
That's why I made a caveat that AI is only bad if it limits your creative development. Eno took synthesizers to places music never went. I'd love for people to do the same with LLMs. I do think they have more danger than synthesizers had for music, specifically because of their flexibility and competence.
miltonlost 1 day ago
AI for writing is not like a synthesizer. It's a player piano, and people act as if they're musicians now.
margalabargala 1 day ago
I totally disagree.

Importing an external library into your code is like using a player piano.

Heck, writing in a language you didn't personally invent is like using a player piano.

Using AI doesn't make someone "not a programmer" in any new way that hasn't already been goalpost-moved around before.

caconym_ 1 day ago
> Heck, writing in a language you didn't personally invent is like using a player piano.

Do you actually believe that any arbitrary act of writing is necessarily equivalent in creative terms to flipping a switch on a machine you didn't build and listening to it play music you didn't write? Because that's frankly insane.

margalabargala 1 day ago
Yes, the language comment was hyperbolic.

Importing a library someone else wrote basically is flipping a switch and getting software behavior you didn't write.

Frankly I don't see a difference in creative terms between writing an app that does <thing> that relies heavily on importing already-written libraries for a lot of the heavy lifting, and describing what you have in mind for <thing> to an LLM in sufficient detail that it is able to create a working version of whatever it is.

Actually, I can see an argument that both of those are also potentially equal, in creative terms, to writing the whole thing from scratch. If the author's goal was to write beautiful software, that's one thing, but if the author's goal is to create <thing>? Then the existence and characteristics of <thing> is the measure of their creativity, not the method of construction.

caconym_ 1 day ago
The real question is what you yourself are adding to the creative process. Importing libraries into a moderately complex piece of software you wrote yourself is analogous to including genai-produced elements in a collage assembled by hand, with additional elements added (e.g. painted) on top also by hand. But just passing off the output of some genai system as your own work is like forking somebody else's library on Github and claiming to be the author of it.

> If the author's goal was to write beautiful software, that's one thing, but if the author's goal is to create <thing>? Then the existence and characteristics of <thing> is the measure of their creativity, not the method of construction.

What you are missing is that the nature of a piece of art (for a very loose definition of 'art') made by humans is defined as much by the process of creating it (and by developing your skills as an artist to the point where that act of creation is possible) as by whatever ideas you had about it before you started working on it. Vastly more so, generally, if you go back to the beginning of your journey as an artist.

If you just use genai, you are not taking that journey, and the product of the creative process is not a product of your creative process. Therefore, said product is not descended from your initial idea in the same way it would have been if you'd done the work yourself.

leptons 1 day ago
A synthesizer is just as useless as a violin without someone to play it.

You could hook both of those things up to servos and make a machine do it, but it's the notes being played that are where creativity comes in.

I've liked some AI generated music, and it even fooled me for a little while but only up to a point, because after a few minutes it just feels very "canned". I doubt that will change, because most good music is based on human emotion and experience, something an "AI" is not likely to understand in our lifetimes.

croes 1 day ago
But AI also does the thinking.

So if the printing press stunted our writing, what will the thinking press stunt?

https://gizmodo.com/microsoft-study-finds-relying-on-ai-kill...

justlikereddit 1 day ago
The worst promise of AI isn't subverting the thinking of those who try to think.

It's being an executor for those who don't think but can make up rules and laws.

cess11 1 day ago
Bad examples. Computer keyboards killed handwriting, the Internet killed rhetoric.
emehex 1 day ago
Counter-counter-point: handwriting > typing for remembering things (https://www.glamour.com/story/typing-memory)
yoyohello13 1 day ago
There are many times when I'll mull over a problem in my head at night or in the shower. I kind of "write the code" in my head. I find it very useful sometimes. I don't think it would be possible if I didn't have the language constructs ingrained in my head.
Jonovono 1 day ago
I find I do this more now with AI than before.
yoyohello13 1 day ago
What do you mean? Are you working on more projects, or more engaged in ideation? Not sure how AI would cause you to write code in your head more while away from the computer. Most people seem to have a harder time writing code without AI the more they use it. The whole “copilot pause” phenomenon, etc.
Jonovono 1 day ago
Since my job now is primarily a reviewer of (AI) code I find:

1) I'm able to work on more projects

2) The things I am able to work on are much larger in scope and ambition

3) I like to mentally build the idea in my head so I have something to review the generated code against. Either to guide the model in the direction I am thinking or get surprised and learn about alternate approaches.

It's also, like you say, a lot more iterative in the process, and more ideation is able to happen with AI. So early on I'll ask it for examples in X language using Y approach. I'll sit on that for a night, throw around tangentially related approaches in my head, and then riff on what I came up with the next day.

bluefirebrand 1 day ago
Do you? Or do you spend more time thinking about how to write prompts?
Jonovono 1 day ago
My prompts are very lazy, off the cuff. Maybe I would see better gains if I spent some time on them, not sure.
donsupreme 1 day ago
Many analogs to this IRL:

1) I can't remember the last time I wrote something meaningfully long with an actual pen/pencil. My handwriting is beyond horrible.

2) I can no longer find my way driving without a GPS. Reading a map? lol

lucianbr 1 day ago
If you were a professional writer or driver, it might make sense to be able to do those things. You could still do without them, but they might make you better in your trade. For example, I sometimes drive with GPS on in areas I know very well, and the computer provided guidance is not the best.
Zacharias030 1 day ago
I think the sweet spot is always keeping north up on the GPS. Yes it takes some getting used to, but you will learn the lay of the land.
0x457 1 day ago
> I can't remember the last time I wrote something meaningfully long with an actual pen/pencil. My handwriting is beyond horrible.

That's a skill that depends on motor functions of your hands, so it makes sense that it degrades with lack of practice.

> I can no longer find my way driving without a GPS. Reading a map? lol

Pretty sure what that actually means in most cases is "I can go from A to B without GPS, but the route will be suboptimal, and I will have to pay more attention to street names".

If you ever had a joy of printing map quest or using a paper map, I'm sure you still these people skill can do, maybe it will take them longer. I'm good at reading mall maps tho.

yoyohello13 1 day ago
Mental skills (just like motor skills) also degrade with time. I can’t remember how to do an integral by hand anymore. Although re-learning would probably be faster if I looked it up.
0x457 1 day ago
Please don't think of this as moving the goal post, but back to maps and GPS: you're still doing the navigation (i.e. actual change in direction), just doing it with different tools.

The last time I dealt with integrals by hand or not was before node.js was announced (just a point in time).

Sure, you can probably forget a mental skill from lack of practicing it, but in my personal experience it takes A LOT longer than for a motor skill.

Again, you're still writing code, but with a different tool.

jazzyjackson 1 day ago
> I'm sure you still these people skill can do,

I wonder if you’d make this kind of mistake writing by hand

0x457 1 day ago
I would, it's an ADHD thing for me.
danphilibin 1 day ago
On 2) I've combatted this since long before AI by playing a game of "get home without using GPS" whenever I drive somewhere. I've definitely maintained a very good directional sense by doing this - it forces you to think about main roads, landmarks, and cardinal directions.
stronglikedan 1 day ago
I couldn't imagine operating without a paper and pen. I've used just about every note taking app available, but nothing commits anything to memory like writing it down. Of course, important writings go into the note app, but I save time inputting now and searching later if I've written things down first.
goda90 1 day ago
I don't like having location turned on on my phone, so it's a big motivator to see if I can look at the map and determine where I need to go in relation to familiar streets and landmarks. It's definitely not "figure out a road trip with just a paper map" level wayfinding, but it helps for learning local stuff.
eastbound 1 day ago
> find my way driving without a GPS. Reading a map? lol

Most people would still be able to. But we fantasize about the usefulness of maps. I remember myself on the Paris circular highway (at the time 110km/h, not 50km/h like today), the map on the steering wheel, super dangerous. You say you'd miss GPS features on a paper map, but back then we had the same problems: It didn't speak, didn't have the blinking position, didn't tell you which lane to take, it simplified details to the point of losing you…

You won’t become less clever with AI: You already have Youtube for that. You’ll just become augmented.

apetresc 1 day ago
Nobody is debating the usefulness of GPS versus a paper map. Obviously the paper map was worse. The point is precisely that because GPS is so much better than maps, we delegate all direction-finding to the GPS and completely lose our ability to navigate without it.

A 1990s driver without a map is probably a lot more capable of muddling their way to the destination than a 2020s driver without their GPS.

That's the right analogy. Whether you think it matters how well people can navigate without GPS in a world of ubiquitous phones (and, to bring the analogy back, how well people will be able to program without an LLM after a generation or two of ubiquitous AI) is, of course, a judgment call.

okr 1 day ago
Soldering transistors by hand was a thing too, once. But these days, I am not sure if people want to keep up anymore. Many trillions of transistors later. :)

I like this zooming in and zooming out, mentally. At some point I can zoom out another level. I miss coding. While I still code a lot.

cmiles74 1 day ago
I think this is a fundamentally different pursuit. The intellectual part was figuring out where the transistors would go; that's the part that took the thinking. Letting a machine do it just lets you test quicker and move on to the next step. Although, of course, if you only solder your transistors by hand once a year you aren't likely to be very good at it. ;-)

People say the same thing about code but there's been a big conflation between "writing code" and "thinking about the problem". Way too often people are trying to get AI to "think about the problem" instead of simply writing the code.

For me, personally, the writing the code part goes pretty quick. I'm not convinced that's my bottleneck.

bGl2YW5j 1 day ago
Great point about the conflation. This makes me realise: for me, writing code is often a big part of thinking through the problem. So it’s no wonder that I’ve found LLMs to be least effective when I cede control before having written a little code myself, ie having worked through the problem a bit.
lucianbr 1 day ago
There are definitely people who solder transistors by hand still. Though most not for a living. I wonder how the venn diagram looks together with the set of people designing circuits that eventually get built by machines. Maybe not as disjoint as you first imagine.
kevindamm 1 day ago
Depending on the scale of the run and innovation of the tech, it's not unusual to see a founder digging into test-run QA issues with a multimeter and soldering iron, or perhaps a serial port and software debugger. But more often in China than the US these days, or China-US partnerships. And the hobbyist Makers and home innovators still solder together one-offs a lot, that's worldwide. Speakerbox builders do a lot of projects with a little soldering.

I dare say there are more individuals who have soldered something today than there were 100 years ago.

Ekaros 1 day ago
If you start designing circuits with an LLM (can they even do that yet?), will you ever learn to do it yourself, or fix it when it goes wrong and the magic smoke comes out after the robot made it for you?
ozten 1 day ago
"Every augmentation is an amputation" -- Marshall McLuhan
danielvaughn 1 day ago
Well there goes a quote that will be stuck in my head for the rest of my life.
jxf 1 day ago
Q: Where did he say this? I think this may be apocryphal (or a paraphrasing?) as I couldn't find a direct quote.
ozten 1 day ago
True. It isn't literally present as that sentence in Understanding Media: The Extensions of Man (1964), but is a summarization. Amputation is mentioned 15 times and augmentation twice.

The concept that "every augmentation is an amputation" is best captured in Chapter 4, "THE GADGET LOVER: Narcissus as Narcosis." The chapter explains that any extension of ourselves is a form of "autoamputation" that numbs our senses.

Technology as "Autoamputation": The text introduces research that regards all extensions of ourselves as attempts by the body to maintain equilibrium against irritation. This process is described as a kind of self-amputation. The central nervous system protects itself from overstimulation by isolating or "amputating" the offending function. This theory explains "why man is impelled to extend various parts of his body by a kind of autoamputation".

The Wheel as an Example: The book uses the wheel as an example of this process. The pressure of new burdens led to the extension, or "'amputation,'" of the foot from the body into the form of the wheel. This amplification of a single function is made bearable only through a "numbness or blocking of perception".

etc

akprasad 1 day ago
I can't find an exact quote either, but AFAICT he wrote extensively on extensions and amputations, though perhaps less concisely.
loganmhb 1 day ago
Especially concerning in light of that METR study in which developers overestimated the impact of AI on their own productivity (even if it doesn't turn out to be negative) https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...
dimal 1 day ago
I still do a lot of refactoring by hand. With vim bindings it’s often quicker than trying to explain to a clumsy LLM how to do it.

For me, refactoring is really the essence of coding. Getting the initial version of a solution that barely works — that's necessary but less interesting to me. What's interesting is the process of shaping that v1 into something that's elegant and fits into the existing architecture. Sanding down the rough edges, reducing misfit, etc. It's often too nitpicky for an LLM to get right.

skydhash 1 day ago
There are lots of project templates and generators that will get you close to where you can start writing business code and not just boilerplate.
jgb1984 1 day ago
What worries me more is the steep decline in code quality. The Python and JavaScript output I've seen the supposedly best LLMs generate is inefficient, overly verbose, and needlessly commented at best, and simply full of bugs at worst. In the best case they're glaringly obvious bugs; in the worst case they're subtle ones that will wreak havoc for a long time before they're eventually discovered, but by then the developers' grasp on the codebase will have slipped away far enough to prevent them from being competent enough to solve the bugs.

There is no doubt in my mind that software quality has taken a nosedive everywhere AI has been introduced. Our entire industry is hallucinating its way into a bottomless pit.

rossant 1 day ago
I'm very cautious about using LLM-generated code in production, but for one-off throwaway scripts that generate output I can manually verify, LLMs are a huge time saver.
cwnyth 1 day ago
My LLM-generated code has so many bugs in it that I end up knowing it better, since I have to spend more time debugging/figuring out small errors. This might even be better: you learn something more thoroughly when you not only practice the right answers, but know how to fix the wrong answers.
bluefirebrand 1 day ago
That is absurd

If you write it by hand you don't need to "learn it thoroughly", you wrote it

There is no way you understand code better by reading it than by creating it. Creating it is how you prove you understand it!

vlod 1 day ago
For me the process of figuring out wtf I need to do and how I'm going to do it is my learning process.

For beginners, I think this is a very important step in learning how to break down problems (into smaller components) and iterating.

segmondy 1 day ago
Doesn't worry me. I believed AI would replace developers and I still do to some degree. But AI is going to lack context, not just in business domain but how it would intersect with the tech side. Experienced developers will be needed. The vibe coders are going to get worse and will need experienced developers to come fix the mess. So no worries, the only thing that would suck would be if the vibe coders earn more money and experienced hand crafting devs are left to pick up the crumbs to survive.
ge96 1 day ago
Tangent: there was this obnoxious effect for typing in editors where the characters would explode; it makes me think of a typewriter, as you're banging away at every character for some piece of code.

I imagine people can start making code (probably already are) where functions/modules are just boxes as a UI and the code is not visible, test it with in/out, join it to something else.

When I'm tasked to make some CRUD UI I plan out the chunks of work to be done in order and I already feel the rote-ness of it, doing it over and over. I guess that is where AI can come in.

But I do enjoy the process of making something even like a POSh camera GUI/OS by hand..

tjr 1 day ago
I'm concerned about this also. Even just reading about AI coding, I can almost feel my programming skills start to atrophy.

If AI tools continue to improve, there will be less and less need for humans to write code. But -- perhaps depending on the application -- I think there will still be need to review code, and thus still need to understand how to write code, even if you aren't doing the writing yourself.

I imagine the only way we will retain these skills is be deliberately choosing to do so. Perhaps not unlike choosing to read books even if not required to do so, or choosing to exercise even if not required to do so.

lucianbr 1 day ago
How could advances in programming languages still happen when nobody is writing code anymore? You think we will just ask the AI to propose improvements, then evaluate them, and if they are good ask the AI to make training samples for the next AI?

Maybe, but I don't think it's that easy.

tjr 1 day ago
If we were to reach a stage where humans don't write code any more, would there even be a need to have advances in programming languages? Maybe what we currently have would be good enough.

I don't know what future we're looking at. I work in aerospace, and being around more safety-critical software, I find it hard to fathom just giving up software development to non-deterministic AI tools. But who knows? I still foresee humans being involved, but in what capacity? Planning and testing, but not coding? Why? I've never really seen coding being the bottleneck in aerospace anyway; code is written more slowly here than in many other industries due to protocols, checks and balances. I can see AI-assisted programming being a potentially splendid idea, but I'm not sold on AI replacing humans. Some seem to be determined to get there, though.

sandeepkd 1 day ago
Along the same lines, it's probably little more than that. When it comes to software development, every iteration of execution/design is supposedly either faster or better, based on the prior learnings from things that you have done yourself or observed very carefully.
dfee 1 day ago
I’m concerned about becoming over reliant on GPT for code reviews for this reason (as I learn Rust).
add-sub-mul-div 1 day ago
Generalize this to: what's it going to look like in ten years when the majority of our society has outsourced general thinking and creativity rather than practicing it?
sim7c00 1 day ago
I already see only electric bikes and ChatGPT answers from people perpetually glued to their phone screens... soon no one can walk and everyone has a red and green button on their toilet-TV-lounge-chair watching the latest episode of Ow my b**! ;D
marcosdumay 1 day ago
My practice in writing assembly is so lost by now that it's not much different than if I never learned it. Yet, it's not really a problem.

What is different about LLM-created code is that compilers work. Reliably and universally. I can just outsource the job of writing the assembly to them and don't need to think about it again. (That is, unless you are in one of those niches that require hyper-optimized software. Compilers can't reliably give you that last 2x speed-up.)

LLMs, in turn, will never be reliable. Their entire goal is opposite to reliability. IMO, the losses are still way higher than the gains, and it's questionable whether this is an architectural premise that will never change.

ethan_smith 1 day ago
The "paint-the-fence" analogy is spot-on, but AI can be the spotter rather than replacement - use it for scaffolding while deliberately practicing core patterns that strengthen your mental models.
wussboy 1 day ago
I suspect when it comes to human mastery there is no clear dividing line between scaffolding and core, and that both are important.
giancarlostoro 1 day ago
As long as you understand the scaffolding and its implications, I think this is fine. Using AI for scaffolding has been the key thing for me. If I have some obscure idea I want to build up using Django, I braindump to the model what I want to build, and it spits out models, and what not.

Course, then there's lovable, which spits out the front-end I describe, which it is very impressively good at. I just want a starting point, then I get going, if I get stuck I'll ask clarifying questions. For side projects where I have limited time, LLMs are perfect for me.

beefnugs 1 day ago
They want you to become an expert at the new thing: knowing how to set up the context with perfect information. Which is arguably as much if not more work than just programming the damn thing.

Which theoretically could actually be a benefit someday: if your company does many similar customer deployments, you will eventually be more efficient. But if you are doing custom code meant just for your company... there may never be an efficiency increase.

lupire 1 day ago
Do you write a lot of assembler, to make you more effective at higher-level design?
taylorallred 1 day ago
Writing a lot of assembler would certainly make me more effective at designing systems such as compilers and operating systems. As it stands, I do not work on those things currently. They say you should become familiar with at least one layer of abstraction lower than where you are currently working.
Lerc 1 day ago
I don't get this with boilerplate. To me boilerplate code is the code that you have to write to satisfy some predefined conditions that has little to do with the semantics of the code I am actually writing. I'm fine with AI writing this stuff for me if it does it reliably, or if the scale is small enough that I can easily spot and fix the errors. I don't see that aspect of coding to be much more than typing.

On the other hand I do a lot more fundamental coding than the median. I do quite a few game jams, and I am frequently the only one in the room who is not using a game engine.

Doing things like this I have written so many GUI toolkits from scratch now that it's easy enough for me to make something anew in the middle of a jam.

For example https://nws92.itch.io/dodgy-rocket In my experience it would have been much harder to figure out how to style scrollbars to be transparent with in-theme markings using an existing toolkit than writing a toolkit from scratch. This of course changes as soon as you need a text entry field. I have made those as well, but they are subtle and quick to anger.

I do physics engines the same way, predominantly 2d, (I did a 3d physics game in a jam once but it has since departed to the Flash afterlife). They are one of those things that seem magical until you've done it a few times, then seem remarkably simple. I believe John Carmack experienced that with writing 3d engines where he once mentioned quickly writing several engines from scratch to test out some speculative ideas.

I'm not sure if AI presents an inhibiter here any more than using an engine or a framework. They both put some distance between the programmer and the result, and as a consequence the programmer starts thinking in terms of the interface through which they communicate instead of how the result is achieved.

On the other hand, I am currently using AI to help me write a DMA chaining process. I initially got the AI to write the entire thing. The final code will use none of that emitted output, but it was sufficient for me to see what actually needed to be done. I'm not sure if I could have done this on my own; AI certainly couldn't have done it on its own. Now that I have (almost (I hope)) done it once in collaboration with AI, I think I could now write it from scratch myself should I need to do it again.

I think AI, Game Engines, and Frameworks all work against you if you are trying to do something abnormal. I'm a little amazed that Monument Valley got made using an engine. I feel like they must have fought the geometry all the way.

I think this jam game I made https://lerc.itch.io/gyralight would be a nightmare to try and implement in an engine. Similarly I'm not sure if an AI would manage the idea of what is happening here.

ahamilton454 1 day ago
This is one of the reasons I really like deep research. It always asks questions first and forces me to refine and better define what I want to learn about.

A simple UX change makes the difference between educating and dumbing down the users of your service.

creesch 1 day ago
Have you ever paid close attention to those questions, though? Deep research can be really nifty, but I feel like the questions it asks are just there for the "cool factor", to make people think it is properly considering things.

The reason I think that is because it often asks about things I already took great care to explicitly type out. I honestly don't think those extra questions add much to the actual searching it does.

ahamilton454 1 day ago
It doesn't always ask great questions, but even just the fact that it does makes me re-think what I am asking.

I definitely sometimes ask really specialized questions, and in that case I just say "do the search" and ignore the questions, but a lot of the time it helps me determine what I am really asking.

I suspect people with excellent communication abilities might find less utility in the questions.

Veen 1 day ago
As a technical writer, I don't use Deep Research because it makes me worse at my job. Research, note-taking, and summarization are how I develop an understanding of a topic so I can write knowledgeably about it. The resulting notes are almost incidental. If I let an AI do that work for me, I get the notes but no understanding. Reading the document produced by the AI is not a substitute for doing the work.
ashleyn 1 day ago
I've found the best ways to use AI when coding are:

* Sophisticated find and replace, i.e. highlighting a bunch of struct initialisations and saying "Convert all these to Y". (Regex was always a PITA for this, though it is more deterministic.) A contrived sketch follows this list.

* When in an agentic workflow, treating it as a higher level of abstraction than ordinary code and not so much as a simulated human. I.e. the more you ask it to do at once, the less well it seems to do it. So instead of "Implement the feature" you'd want to say "Let's make a new file and create stub functions", "Let's complete stub function 1 and have it do X", "Complete stub function 2 by first calling stub function 1 and doing Y", etc.

* Finding something in an unfamiliar codebase or asking how something was done. "Hey copilot, where are all the app's routes defined?" Best part is you can ask a bunch of questions about how a project works, all without annoying some IRC greybeard.
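To make the first bullet concrete, here's a contrived Go sketch (invented for illustration, not from any real session): you highlight a pile of positional struct literals and ask the model to rewrite them with named fields.

    // Contrived illustration of "convert all these to Y":
    // highlight the positional literals and ask the model to
    // rewrite them with named fields.
    package main

    import "fmt"

    type Point struct {
        X, Y int
    }

    func main() {
        // Before (what you'd highlight):
        //   a := Point{1, 2}
        //   b := Point{3, 4}

        // After (the kind of edit you'd expect back):
        a := Point{X: 1, Y: 2}
        b := Point{X: 3, Y: 4}
        fmt.Println(a, b)
    }

Regex can handle this particular case, but the model also copes when the "find" side is fuzzier than a pattern can easily express.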

nextworddev 1 day ago
This post is confusing one big point: the purpose of AI deployments isn't to teach so that humans get smarter, but to achieve productivity at the process level by eliminating work that doesn't reward human creativity.
meander_water 1 day ago
I completely agree with this. I recently helped my dad put together a presentation. He is an expert in his field, so he already had all the info ready in slides. But he's not a designer; he doesn't know how to make things look "good". I tried a handful of "AI slide deck" apps. They all had a slick interface where you could generate an entire slide deck from a few words, but they were absolutely useless for doing what users actually need, which is to help them beautify existing content rather than create it.
lubujackson 1 day ago
Great insights. Specifically, inverting the vibe coding flow to start with architecture and tests is 100% more effective and far easier to surface into a real code base. This doesn't even require any special tooling beyond changing your workflow habits (though tooling or standardized prompts would help).
machiaweliczny 1 day ago
Yeah, I started creating my own architect tool, as this is what's missing currently. Given a good architecture, you can really hand implementation down to AI these days. One problem I see is that these tools aren't good at reading logs of long-running processes (like docker-compose).

But you need to:

* Research problems
* Describe features
* Define API contracts
* Define a basic implementation plan
* Set up credentials
* Provide a testing strategy and an efficient testing setup/teardown
* Define library docs and references, and find legit documentation for the AI

Also, AI makes a lot of mistakes with imports and with long-running processes.

cheschire 1 day ago
You could remove "vibe" from that sentence and it would still stand on its own.
lubujackson 1 day ago
True. One understated positive of AI is that it operates best when best practices are observed - strong typing, clear architecture, testing, documentation. To the point where if you have all that, the coding becomes trivial (that's the point!).
visarga 1 day ago
> We're really good at cumulative iteration. Humans are turbo optimized for communities, basically. This is why brainstorming is so effective… But usually only in a group. There is an entire theory in cognitive psychology about cumulative culture that goes directly into this and shows empirically how humans work in groups.

> Humans learn collectively and innovate collectively via copying, mimicry, and iteration on top of prior art. You know that quote about standing on the shoulders of giants? It turns out that it's not only a fun quote, but it's fundamentally how humans work.

Creativity is search. Social search. It's not coming from the brain itself; it comes from the encounter between brain and environment, and it builds up over time in the social/cultural layer.

That is why I don't ask myself if LLMs really understand. As long as they search, generating ideas and validating them in the world, it does not matter.

It's also why I don't think substrate matters; only search does. But substrate might have to do with the search spaces we are afforded to explore.

imranq 1 day ago
While I agree with the author's vision for a more human-centric AI, I think we're closer to that than the article suggests. The core issue is that the default behavior is what's being criticized. The instruction-following capabilities of modern models mean we can already build these Socratic, guiding systems by creating specific system prompts and tools (like MCP servers). The real challenge isn't technical feasibility, but rather shifting the product design philosophy away from 'magic button' solutions toward these more collaborative, and ultimately more effective, workflows
drchiu 1 day ago
Friends, we might very well be the last generation of developers who learned how to code.
Rumudiez 1 day ago
I want to believe there will be a small contingent of old-schoolers basically forever, even if it only shrinks over time. Maybe newcomers, or experienced devs who want to learn to do more, or how to do what the machine is doing for them.

I think it'll be like driving: the automatic transmission, power brakes, and other tech made it more accessible, but in the process we forgot how to drive. That doesn't mean nobody owns a manual anymore, but it's not a growing percentage of all drivers.

Footprint0521 1 day ago
I've found from trial and error that when I have to manually type out the code it gives me (like in a BIOS, or when troubleshooting devices I can't directly paste to, lol), I ask more questions.

That, combined with having to do it manually, has helped me learn how to do things on my own, compared to when I just copy-paste or use agents.

And the more concepts you can break things into, the better. Going forward, I start projects by working with AI to break them into "phases" for testability, traceability, and overall understanding.

My default has become using AI on my phone with pictures of screens and voiced questions, to try to force myself to use it right. When you can't mindlessly copy-paste, even though it might feel annoying in the moment, the learning that happens from that process saves so much time later by keeping you out of hallucination-holes!

ankit219 1 day ago
The very premise of the article is that tasks are needed for humans to learn and maintain skills. Learning should happen independently; it is a tautological argument that since humans won't learn with agents that can do more, we should not have agents that can do more. While this is a broad and complex topic (I will share a longer blog post that I am yet to fully write), I think people underestimate the cognitive load it takes to operate at the higher-level pattern, and hence learning should happen not on the task but before the task.

We are in the middle of a peer vs. pair sort of abstraction. Is the peer reliable enough to be delegated the task? If not, the pair design pattern should be complementary to the human skill set. I sensed the frustration with AI agents came from them not being fully reliable. That means a human in the loop is absolutely needed, and if there is a human, don't have the AI be good at what the human can already do; instead, have it be a good assistant by doing the things the human needs. I agree with that part, though if reliability is ironed out, for most of my tasks I am happy for AI to do the whole thing. Other frustrations stem from memory or the lack thereof (in research), hallucinations and overconfidence, and a lack of situational awareness (somehow situational awareness is what agents market themselves on). If these are fixed, treating agents as a pair vs. treating agents as a peer might tilt more towards the peer side.

nirvanatikku 1 day ago
I find we're all missing the forest for the trees. We're using AI in spots.

AI requires a holistic revision. When the OSes catch up, we'll have some real fun.

The author is right to call out the differences in UX. It's sad that design has always been given less attention.

When I first saw the title, my initial thought was this may relate to AX, which I think complements the topic very well: https://x.com/gregisenberg/status/1947693459147526179

askafriend 1 day ago
While I understand the author's point, IMO we're unlikely to do anything that results in slowing down first. Especially in a competitive, corporate context that involves building software.
didibus 1 day ago
I think the author makes good points, if you're focused on enhancing human productivity. But the gold rush is about replacing humans entirely for large swaths of work, so people are investing in promises of systems that deliver without a human in the loop, simply because the allure of the ROI is so much bigger.
datadrivenangel 1 day ago
The author makes some good points about designing human-computer interfaces, but has a very opinionated view of how AI can be used in systems engineering tooling, which seems to miss a lot of places where AI can be useful even without humans in the loop?
swiftcoder 1 day ago
The scenarios in the article are all about mission-critical disaster recovery - we don't even trust the majority of our human colleagues with those scenarios! AI won't make inroads there without humans in the loop, until AI is 100% trustworthy.
tptacek 1 day ago
Right, so: having an agent go drop index segments from a search cluster to resolve a volume utilization problem is a bad idea, rather than just suggesting "these old index segments are using up 70% of the storage on this volume and your emergency search cluster outage would be resolved if you dropped them, here's how you'd do that".

But there are plenty of active investigative steps you'd want to take in generating hypotheses for an outage. Weakly's piece strongly suggests AI tools not take these actions, but rather suggest them to operators. This is a waste of time, and time is the currency of incident resolution.

datadrivenangel 1 day ago
And the author assumes that these humans are going to be very rigorous, which holds for good SRE teams, but even then not consistently.
agentultra 1 day ago
We don't need humans to be perfect to have reliable responses to critical situations. Systems are more important than individuals at that level. We understand people make mistakes and design systems and processes to compensate.

The problem with unattended AI in these situations is precisely the lack of context, awareness, intuition, intention, and communication skills.

If you want automation in your disaster recovery system, you want something that fails reliably and immediately. Non-determinism is not part of a good plan. "Maybe it will recover from the issue, or maybe it will delete the production database and beg for forgiveness later" isn't something you want to lean on.

Humans have deleted databases before and will again, I'm sure. And we have backups in place if that happens. And if you don't then you should fix that. But we should also fix the part of the system that allows a human to accidentally delete a database.

"But an AI could do that too!" No. It's not a person. It's an algorithm with lots of data that can do neat things, but until we can make sure it does one particular thing deterministically, there's no point in using it for critical systems. It's dangerous. You don't want a human operator walking into a fire to find the AI system has already made the fire worse... and then having to respond to that mess on top of everything else.

lupire 1 day ago
What happens when you walk into a fire and you don't know what to do? Or can't do it quickly enough?
agentultra 1 day ago
Who is sending in untrained people to manage fires? Maybe that organization deserves what's coming to them.

An extreme example: nuclear reactors. You don't want untrained people walking into a fire with the expectation that they can manage the situation.

Less extreme example: financial systems. You don't want untrained people walking into a fire losing your customers' funds and expect them to manage the situation.

swiftcoder 1 day ago
We don't throw new-hires in the deep end of the on call rotation on their first day. We make sure they learn the systems, we provide them with runbooks, assign an experienced mentor for their first on call rotation, and have a clear escalation path if they are in over their heads or need additional resources.
topaz0 1 day ago
Or it will, and disaster will ensue.
danieltanfh95 1 day ago
https://danieltan.weblog.lol/2025/06/agentic-ai-is-a-bubble-...

We should be working to make HITL (human-in-the-loop) tools, not HOTL (human-on-the-loop) workflows where humans are expected to just work with the final output. At some point the abstraction will leak.

journal 1 day ago
Creating these or similar things requires individual initiative, but everywhere with money the world is run by committees with shared responsibility, where no one will be held responsible if the project fails. The problem isn't what you might conclude within the context of this post, but a deeper issue: maybe we've gone too far in the wrong direction, and everyone is too afraid to point that out for fear of being seen as "rocking the boat".
tumidpandora 1 day ago
I wholeheartedly agree with OP; the article is very timely and clearly illustrates the misdirection of AI tooling. I often find myself and my kids asking LLMs "don't tell me the answer, work with me to resolve it", which is a much richer experience than getting a fully baked response to my question. I hope we see greater adoption of OP's EDGE framework in AI interactions.
mikaylamaki 1 day ago
The code gen example given later sounds an awful lot like what AWS built with Kiro[1] and its spec feature. This article is kinda like the theory behind the practice in that IDE. I wish the tools described existed instead of all these magic wands.

[1] https://kiro.dev/blog/introducing-kiro/

hazelweakly 1 day ago
Indeed it is! In fact I actually had something like Kiro almost perfectly in mind when I wrote the article (but didn’t know AWS was working on it at the time).

I was very happy to see AWS release Kiro. It was quite validating to see them release it and follow up with discussions of how this methodology of integrating AI with software development was effective for them.

mikaylamaki 1 day ago
Yeah! It's a good sign that this idea is in the air and the industry is moving towards it. Excited to get rid of all the magic wands everywhere :D
ranman 1 day ago
> we built the LLMs unethically, and that they waste far more energy than they return in value

If these are the priors why would I keep reading?

lknuth 1 day ago
Do you disagree with the statement?
try_the_bass 1 day ago
I think it's obvious they disagree with the statement. Why else would they be rejecting it?

Why even ask this question?

lknuth 1 day ago
I wonder why though. Aren't both of these things facts? I think you can justify using them anyways - which is what I'd be interested to talk about.
ACCount36 1 day ago
Opening an article with "I'm a dumb fuck and have nothing of value to say" would have done less to drop my curiosity off a cliff.
marknutter 1 day ago
That's exactly the point where I stopped reading too.
bravesoul2 1 day ago
Do you not read any AI articles unless they are about the moral issues rather than usage?

Seems like the waiter told you how your sausage is made so you left the restaurant, but you'd eat it if you weren't reminded.

prats226 1 day ago
Interestingly, the DeepSeek paper mentions RL with a process reward model. However, they mention it failed to align the model correctly due to the subjectivity involved in defining whether an intermediate step in the process is right or wrong.
staticshock 1 day ago
The spirit of this post is great. There are real lessons here that the industry will struggle to absorb until we reach the next stage of the AI hype cycle (the trough of disillusionment).

However, I could not help but get caught up on this totally bonkers statement, which detracted from the point of the article:

> Also, innovation and problem solving? Basically the same thing. If you get good at problem solving, propagating learning, and integrating that learning into the collective knowledge of the group, then the infamous Innovator’s Dilemma disappears.

This is a fundamental misunderstanding of what the innovator's dilemma is about. It's not about the ability to be creative and solve problems, it is about organizational incentives. Over time, an incumbent player can become increasingly disincentivized from undercutting mature revenue streams. They struggle to diversify away from large, established, possibly dying markets in favor of smaller, unproven ones. This happens due to a defensive posture.

To quote Upton Sinclair, "it is difficult to get a man to understand something when his salary depends upon his not understanding it." There are lots of examples of this in the wild. One famous one that comes to mind is AT&T Bell Labs' invention of magnetic recording & answering machines that AT&T shelved for decades because they worried that if people had answering machines, they wouldn't need to call each other quite so often. That is, they successfully invented lots of things, but the parent organization sat on those inventions as long as humanly possible.

sim7c00 1 day ago
I very much agree with this article.

I do wonder if you could make a prompt to force your LLM to always respond like this, and whether that would already be a sort of dirty fix... I'm not so clever at prompting yet :')

shikon7 1 day ago
AI context handling and instruction following have become so good that you could probably put this post verbatim into a prompt, and the AI would react according to it.
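As a rough, purely illustrative sketch (my own wording, not from the article), a system prompt along these lines might get close: "Never hand me the finished answer. Ask clarifying questions first, summarize what you think I'm trying to do, point out where I seem stuck, and suggest the next step to investigate rather than performing it yourself. Only give the full solution if I explicitly ask for it."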
JonathanRaines 1 day ago
> "You seem stuck on X. Do you want to try investigating Y?"

MS Clippy was the AI tool we should all aspire to build
