Are LLMs Replacements for Developers or Productivity Tools? (Part 4)

In the previous posts (Part 1, Part 2, Part 3) I followed a structured approach suggested by Harper Reed to generate a greenfield (completely new) application. My secondary goal was to generate a useful application that could extract statistics from published PDF papers. But the primary goal was to determine whether the hype is true and LLMs can replace professional software engineers. Now I am ready to summarize my thoughts on the current and future states!

What Does This All Mean?

The idea that LLMs will replace all developers any time soon is clearly premature. While they can generate large amounts of code, that code often lacks nuance, robustness, or even basic coherence once you look beyond surface functionality. LLMs don’t “understand” projects the way humans do; they pattern-match, remix, and guess based on language statistics. That means they can produce output that seems plausible but isn’t always logically sound or complete.

That said, they can absolutely accelerate and augment a developer’s workflow, particularly when the task is well-bounded, small in scope, or exploratory in nature. In these contexts, they behave less like seasoned engineers and more like tireless, well-informed interns: fast, helpful, and occasionally baffling.

At this point in time, LLMs excel at supporting the software lifecycle, particularly in areas that don’t require deep architectural reasoning or nuanced design decisions. Their strengths lie in:

  • Documentation: Writing and explaining docstrings, README files, usage examples.
  • Specifications: Turning rough ideas into structured outlines or developer-ready specs.
  • Prototypes & Demos: Rapidly building out proof-of-concept versions to validate ideas.
  • Small Tools & Apps: Generating end-to-end code for focused use cases with manageable complexity.
  • Testing: Writing basic unit tests and suggesting edge cases.
  • Investigations & Research: Exploring APIs, libraries, or unfamiliar tech stacks.
  • Experiments: Quickly trying out new ideas to see what works.
  • Coding Assistance: Filling in boilerplate, writing helpers, answering “how do I…” questions.
  • Research: Summarizing technical docs, papers, or providing background context on a domain.

These are the tasks where speed and iteration matter more than perfection, and where a helpful, tireless coding partner (even if a bit forgetful) makes all the difference.
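To make the documentation and testing bullets concrete, here is the kind of small, self-contained artifact LLMs tend to handle well: a helper function with a docstring plus a basic unit test that covers the obvious edge cases. The function and its behavior are invented for this illustration, not taken from the project in this series.

```python
import unittest


def word_count(text: str) -> int:
    """Return the number of whitespace-separated words in `text`.

    Empty or whitespace-only strings count as zero words.
    """
    return len(text.split())


class TestWordCount(unittest.TestCase):
    def test_simple_sentence(self):
        self.assertEqual(word_count("LLMs are useful tools"), 4)

    def test_edge_cases(self):
        # The sort of edge cases an LLM will often suggest unprompted
        self.assertEqual(word_count(""), 0)
        self.assertEqual(word_count("   "), 0)


if __name__ == "__main__":
    unittest.main()
```

Tasks at this scale are well-bounded: the spec fits in a sentence, and correctness is easy to eyeball, which is exactly where the "tireless intern" behavior shines.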

Vibe Coding

These strengths are embraced in a new paradigm called ‘Vibe Coding’.

Vibe coding is less about carefully architected systems and more about improvisation. You start with a general idea — not a full spec — and then collaborate with the LLM to shape and reshape your code on the fly. You’re effectively “jamming” with the model. Like musicians riffing off each other, the coder and the model bounce ideas, snippets, and tweaks back and forth until something functional and hopefully elegant emerges.

This style of development feels like pair programming with an extremely eager junior partner: the LLM is fast, always responsive, and ready to refactor or regenerate entire components at your request. But unlike traditional development, where you might carefully plan out your architecture or data model, vibe coding often starts with something much looser — and that’s both its strength and its weakness.

What It Feels Like

  • You begin with a vague description: “Let’s build a Flask API that can summarize news articles.”
  • The LLM generates boilerplate code and a basic implementation.
  • You skim it, say “Okay, now make it async,” and it does.
  • You ask for tests, and they show up.
  • You start changing directions — “Actually, let’s use FastAPI instead” — and the code morphs again.
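To ground the sequence above, here is a minimal sketch of the kind of code the model might hand back at the first step, reduced to a plain function rather than a full Flask or FastAPI app. The naive first-sentences heuristic is a placeholder I invented for this illustration, not output from an actual session.

```python
import re


def summarize(article: str, max_sentences: int = 2) -> str:
    """Naive extractive summary: keep the first few sentences.

    An LLM-generated version would typically wrap something like this
    in a Flask or FastAPI route; the heuristic itself is a stand-in.
    """
    # Split on sentence-ending punctuation followed by whitespace
    sentences = re.split(r"(?<=[.!?])\s+", article.strip())
    return " ".join(sentences[:max_sentences])


article = (
    "LLMs can speed up prototyping. They also make mistakes. "
    "Careful review is still required."
)
print(summarize(article))
```

From a starting point like this, the "make it async," "add tests," and "switch to FastAPI" requests are each small, mechanical transformations, which is why the jam-session loop feels so fluid.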

The process is quick, fluid, and remarkably satisfying, at least until something breaks, or you realize you’re not entirely sure what the code is doing anymore.

The Pros and Cons

  • Pro: It enables rapid prototyping and exploration. You can get from idea to running code shockingly fast.
  • Pro: It reduces the friction of trying new libraries, frameworks, or patterns — the model does the integration work.
  • Con: The code quality is uneven, especially if you don’t take time to review and test.
  • Con: Architecture tends to emerge as a side effect, not as a plan. This can make long-term maintenance painful.
  • Con: You can end up with a “cargo cult” codebase — things work, but no one knows why.

Vibe coding isn’t going to replace rigorous software engineering any time soon, but it’s a useful, fast-moving tool in the modern developer’s toolkit. It’s great for exploration, prototyping, and learning. But I don’t believe you can currently Vibe Code your way to an enterprise-quality professional application.

Enterprise Quality Software

For the past 20 years I’ve spent most of my time working on enterprise systems: software that must work securely and reliably for major customers and millions of users. This is significantly different from Vibe Coding and the sorts of work that LLMs currently excel at, as it involves far more complicated applications with many specifications and numerous non-functional requirements.

For these sorts of professional applications LLMs are currently still lacking. They struggle to keep all of the details ‘in mind’ and produce high-quality code. They can hallucinate and mislead. As clearly demonstrated throughout this exercise, they can write basic logs, comments, and tests, but lack the depth and insight that professionals bring.

Greater Context to the Rescue?

So was it a mistake to try to follow Harper Reed’s more structured approach to application development? Not necessarily. LLMs are rapidly improving, and may soon overcome some of their current limitations.

Some of the challenges I noted, both with the approach the models took to defining the project and with their ability to understand and work with complex code, might be addressed by the significantly larger context windows that the latest LLMs are achieving.

With enough context you can point an LLM to a wide variety of source material related to your application and the model might, in some fashion, consider this material as it works through the process of creating questions, specifications, and blueprints. In this way it might be able to achieve a deeper understanding of the requirements closer to that of humans.

Likewise, you could point the system to reference documents (such as coding standards), best-practice information, and source code for large production programs such that it would ‘learn’ the paradigms your teams use when developing enterprise-quality products. With the recent introduction of the MCP (Model Context Protocol) standard by Anthropic it is already possible to point their models to up-to-date language definitions.

I have not yet seen any of this in practice. I’ll look forward to finding out if these advances result in significantly better results.

A Replacement for Developers or Productivity Tools?

LLMs can replace, and already are replacing, some Developers, QA, Design, and Product folks. However, they cannot reliably work as well as skilled professionals. In this sense the initial question was always slightly misleading: LLMs are both replacements for some people, and tools that help others work more productively.

This should not be too surprising: all good tools work like this. A hand saw can help one person cut more wood than many people with stone tools, and a chainsaw dramatically improves the performance over the hand saw. So in both cases you can say that improved tools may replace people. Of course, if you are not careful you can more easily cut off a limb when using the chainsaw…

As Andrej Karpathy, one of my favorite AI experts, said regarding LLMs: “I don’t fully trust what is coming out here (from the model), this is just a probabilistic statistical recollection of the internet.”

For complicated, significant applications, particularly those that must be Enterprise Quality, you need to maintain a healthy skepticism like Andrej and have professionals specify, review, and test the work. Despite current improvements, including better algorithms (such as Thinking LLMs) and vastly increased context, LLMs still need developer assistance to successfully work with larger, more complicated projects and code bases. But they are already too useful to ignore and will get even more capable as time goes on.

Modern LLMs are amazing tools, but like all tools they provide the best results when wielded by skilled professionals.