How business leaders should think about AI coding tools and what to expect from their engineering teams.
You don't need to understand AI coding tools. You need to understand what to demand from the people who use them and what you are paying for.
Every business leader I talk to has seen the same LinkedIn video. Someone types a sentence into a tool, and sixty seconds later there’s supposedly a “working application” on screen. The implication is always the same: building software is now trivially easy, and if you’re still paying significant money for custom technology, you’re either behind the times or getting fleeced.
That implication is wrong, but it’s wrong in a specific and instructive way. The tools have gotten dramatically better. The part of building software that involves typing code has been compressed by 50 to 80 percent, depending on who you ask and what they’re measuring. What has not been compressed is everything else: understanding what the business actually needs, designing systems that hold up under real operational load, integrating with the five other platforms you already run, handling the edge cases that only surface when real customers use the thing, and maintaining it after it ships. That “everything else” is where most of the cost and risk has always lived, and no AI tool has changed that.
This article is not a buyer’s guide to AI coding tools. You don’t need one, for the same reason you don’t need a buyer’s guide to surgical instruments before a surgery. What you need is enough understanding of the landscape to ask intelligent questions of the people building technology on your behalf, whether that’s an internal CTO, an IT team, an outsourced development firm, or some combination.
Three ways software gets built now
The current AI coding tools market can be bucketed into three distinct categories, each built on a different theory about who should be involved in the process of creating software and how much human judgment is required.
The first category is tools for professional developers.
Products like Cursor, Claude Code, GitHub Copilot, and OpenAI Codex sit inside a developer’s working environment and accelerate what they do. Think of them as power tools in a carpenter’s shop. A table saw does not eliminate the need for a carpenter who understands joinery, load-bearing, and finishing. It lets a skilled carpenter work faster. These tools range from $10 to $200 per month per developer, and the best engineering teams use two or three of them in combination, matching each tool’s strengths to different categories of work.
The critical point for you as a business leader: these tools do not change who you need on the team. They change how fast a competent team can move. A recent Anthropic study found that developers who used AI to generate code without understanding what it produced scored 17 percent lower on comprehension tests than developers who used AI to learn and then wrote the code themselves. Speed without understanding is a cost, not a benefit, because the code still has to be maintained, debugged, and extended by someone who knows what it does.
The second category is “vibe coding” platforms.
Lovable, Bolt.new, Replit, and Vercel’s v0 let non-developers describe an application in plain English and receive working code. The marketing is compelling: describe your idea, and a functional prototype appears in minutes. For certain use cases, the marketing understates the reality. These tools are genuinely extraordinary for building prototypes, proof-of-concept demos, simple internal tools, and marketing landing pages.
The part the marketing omits is what practitioners call “the last 10 percent problem.” Getting from a working demo to a production application that handles real users, real data, real security requirements, and real integration with your existing systems is where these tools run out of road. One widely cited practitioner review put it plainly: vibe coding tools are brilliant for building something you can show to someone, and then you should pay a developer. Multiple independent assessments converge on the same conclusion. The initial build phase is largely commoditized across tools, meaning any of them can produce a demo. What differs is whether the result can survive contact with your actual business.
The third category is outcome-based services.
Rather than giving you a tool to build software, a firm builds the software for you and delivers the finished application, using whatever tools are the latest and fastest available. You get the ERP, the customer portal, the AI-powered workflow, the data platform. You do not get a development environment, a subscription to an AI coding tool, or a codebase you’re expected to maintain yourself. The firm uses the professional developer tools from the first category internally, which is how they deliver faster and at lower cost than traditional development shops. But the tools stay on their side of the glass. Your interaction is with the finished product. You don’t have to buy a power plant to turn on a light bulb.
The strategic difference between these categories is not price or speed. It is who absorbs the risk. In the first category, your team absorbs it. In the second, you absorb it personally. In the third, the firm absorbs it contractually.
What is actually happening versus what you are being told
The narrative in the market right now has two risks that you need to understand, because both of them cost real money.
The science project.
A team experiments with a vibe coding platform or an AI coding tool, builds an impressive demo, presents it internally with great enthusiasm, and then the project stalls. It stalls because the demo cannot handle the integration with your existing ERP. It stalls because the authentication model doesn’t meet your compliance requirements. It stalls because the thing that worked perfectly with ten test records falls over with ten thousand real ones. The demo was the easy part. Getting from demo to production was always the expensive part, and the demo created the false impression that the expensive part had already been solved. I have always believed that if a piece of software does not spend more than 90 percent of its life in production, that software was not needed.
I see this pattern in mid-market companies with alarming frequency. Someone on the team builds a prototype over a weekend, the CEO sees it on Monday and asks why the IT department can’t ship this fast, and the organization spends the next three months trying to make a prototype do production work. The cost of that exercise, measured in time, morale, and opportunity, routinely exceeds what it would have cost to build the thing properly from the start.
The productivity mirage.
A development team adopts AI coding tools, reports a significant increase in lines of code produced per day, and leadership declares the investment a success. Meanwhile, the codebase is accumulating technical debt at an accelerated rate because the code is being generated faster than it can be reviewed. The 17 percent comprehension gap from the Anthropic study is the empirical signal: developers who delegate code generation to AI understand the resulting code less well than developers who write it themselves. Multiply that gap across an entire team producing code at twice the previous rate, and you have a codebase that is growing faster than anyone’s ability to understand it.
This is not an argument against AI coding tools. It is an argument that the tools are only as good as the review discipline, the testing infrastructure, and the architectural judgment of the team using them. The binding constraint on software quality has shifted from writing speed to review capacity, and most organizations have not adjusted their processes accordingly.
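For readers who want the review-capacity arithmetic spelled out, here is a minimal back-of-the-envelope sketch in Python. Every number in it is a hypothetical illustration, not a measurement: it assumes a team that used to write and review about 1,000 lines a week, then doubles its output with AI tools while review capacity stays flat. The function name and parameters are invented for this example.

```python
def unreviewed_backlog(weekly_output, weekly_review_capacity, weeks):
    """Return the lines of code still awaiting review after `weeks` weeks.

    Assumption (illustrative only): output and review capacity are constant
    each week, and reviewers work through any existing backlog first.
    """
    backlog = 0
    for _ in range(weeks):
        # New code arrives, reviewers clear what they can, backlog never goes negative.
        backlog = max(0, backlog + weekly_output - weekly_review_capacity)
    return backlog


# Hypothetical team, one quarter (12 weeks).
before_ai = unreviewed_backlog(weekly_output=1000, weekly_review_capacity=1000, weeks=12)
after_ai = unreviewed_backlog(weekly_output=2000, weekly_review_capacity=1000, weeks=12)

print(f"Unreviewed lines before AI tools: {before_ai}")
print(f"Unreviewed lines after AI tools:  {after_ai}")
```

Under these assumed numbers the backlog is zero before the tools arrive and grows by a full week of review work every week afterward. The point is not the specific figures but the shape: unless review capacity grows with output, the gap compounds.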
What you should ask your CTO, IT lead, or development partner
You do not need to understand the technical specifics of these tools. You need to ask questions that force the people who do understand them to give you honest answers about risk, cost, and outcomes. Here are the questions that matter, along with what a credible answer sounds like.
“We’re using AI coding tools. How has our review process changed to account for the increase in output?”
This is the single most revealing question you can ask. If the answer is that nothing has changed, you have a problem. If the team is producing code twice as fast but reviewing it at the same rate, the gap is being filled with unreviewed AI-generated code. A credible answer describes specific changes: dedicated review time, automated testing requirements before code is merged, architectural review gates for AI-generated components. An evasive answer talks about how the tools are “really accurate” and “almost never wrong.” The tools are impressive. They are not infallible. And the cost of their mistakes compounds in ways that are invisible until something breaks in production.
“Someone on the team built a prototype with Lovable (or Bolt, or Replit, or ChatGPT). They want to take it to production. What would that actually involve?”
The honest answer is usually that the prototype needs to be substantially rebuilt. The vibe coding tools generate clean-looking applications that are structurally fragile. They typically use a single database provider (usually Supabase), a narrow technology stack (usually React and TypeScript), and minimal error handling. Taking a prototype to production means integrating it with your existing systems, adding authentication and authorization that meets your security requirements, handling the edge cases that the AI didn’t anticipate, building the administrative and reporting features that no demo ever includes, and setting up monitoring, backups, and deployment infrastructure.
None of this means the prototype was a waste. Prototypes are enormously valuable as communication tools. The prototype showed everyone what was possible and aligned the team around a shared vision. That alignment is worth something. What the prototype is not, and was never going to be, is the production system.
“If we’re building this in-house with AI tools, what does the team need to look like?”
The answer to this question has changed less than you might expect. You still need someone who understands software architecture, meaning how systems are structured, how components communicate, and what happens when load increases or a component fails. You still need someone who understands your business domain well enough to translate business requirements into technical specifications. You still need someone who can review code at depth, not just for syntax errors but for security vulnerabilities, performance implications, and maintainability.
What AI tools have changed is the ratio. Where a project might have previously required six developers, it might now require three, with each developer producing roughly twice the output. But you cannot replace three senior developers with six junior developers and AI tools and expect the same outcome. The tools amplify expertise. They do not substitute for it.
“What is the total cost of ownership, not just the build cost?”
Business leaders consistently underestimate the cost of maintaining software after it ships. Industry data suggests that maintenance, enhancement, and support typically account for 60 to 80 percent of a software system’s total lifetime cost. When you evaluate a proposal to build something, whether internally or through a vendor, ask explicitly about year-two and year-three costs. Ask who will be available to fix bugs, add features, and update the system when the underlying platforms and APIs change. Ask what happens if the person or team who built the system is no longer available.
This question is particularly important when evaluating vibe-coded prototypes and freelance-built applications. The build cost may be low. The cost of maintaining a system that was built quickly, without documentation, without tests, and without architectural forethought, is rarely low.
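The 60 to 80 percent figure translates into a simple multiplier, sketched below in Python. This is a rough illustration under one stated assumption: that the build cost is whatever share of lifetime cost is not maintenance, enhancement, and support. The $100,000 build cost and the function name are hypothetical.

```python
def lifetime_cost(build_cost, maintenance_percent):
    """Estimate total lifetime cost from the build cost alone.

    Assumption (illustrative only): the build cost makes up the remaining
    (100 - maintenance_percent) percent of the system's lifetime cost.
    """
    if not 0 <= maintenance_percent < 100:
        raise ValueError("maintenance_percent must be between 0 and 99")
    return build_cost * 100 / (100 - maintenance_percent)


build = 100_000  # hypothetical build quote, in dollars

low = lifetime_cost(build, 60)   # maintenance is 60 percent of lifetime cost
high = lifetime_cost(build, 80)  # maintenance is 80 percent of lifetime cost

print(f"Lifetime cost at 60 percent maintenance: ${low:,.0f}")
print(f"Lifetime cost at 80 percent maintenance: ${high:,.0f}")
```

Under that assumption, a system quoted at $100,000 to build implies a lifetime cost of roughly $250,000 to $500,000, or 2.5 to 5 times the build quote. A proposal that shows only the first number is showing you the smaller part of the bill.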
“Are we buying a tool or buying an outcome?”
This is the framing question that clarifies the entire decision. If you are buying a tool (an AI coding platform, a development environment, a set of licenses), you are accepting responsibility for turning that tool into a business outcome. You need the team, the process, and the judgment to do that. If you are buying an outcome (a working application that meets defined requirements), the vendor is accepting that responsibility, and their tools and processes are their problem, not yours.
Neither answer is universally right. For organizations with strong internal engineering teams, buying tools and building internally can be faster and more cost-effective. For organizations without deep technical leadership, buying outcomes eliminates the risk of the science project and the productivity mirage. The wrong answer is the one that conflates the two: buying a tool while expecting an outcome, or paying for an outcome while insisting on controlling the tools.
“What happens when something breaks at 2 a.m. on a Saturday?”
The question sounds operational, but it is actually a strategic question about dependency and risk. A vibe-coded application built by a non-technical team member has no support structure. An application built by a freelancer who has moved on to other projects has no support structure. An application built by an internal team has the support structure of that team’s availability and retention. An application built by a firm with a contractual support agreement has the support structure of that contract.
The 2 a.m. question forces a conversation about what kind of dependency you are willing to accept. Every technology decision creates a dependency. The question is whether that dependency is explicit, contractual, and manageable, or implicit, unstructured, and fragile.
How to evaluate what you are being shown
When someone presents you with a technology proposal, a vendor pitch, or an internal prototype, here is a framework for evaluating it that does not require you to understand the technology itself.
Ask how the system handles the unhappy path. Every demo shows the happy path: the user does exactly the right thing, the data is clean, the system responds perfectly. Ask what happens when a user enters invalid data, when the payment processor is down, when two people try to update the same record simultaneously, when the system needs to process ten times the expected volume. If the answer is “we haven’t gotten to that yet,” you are looking at a demo, not a production system. That is fine, as long as everyone in the room acknowledges it.
Ask about integration. Your business does not run on a single application. It runs on an ecosystem of tools, platforms, and data sources that need to talk to each other. Any new technology that cannot integrate with your existing ERP, CRM, billing system, or data warehouse is creating a silo, and silos have costs that compound over time. Ask specifically how data will flow between the new system and the systems you already have.
Ask about the exit strategy. What happens if you need to change vendors, bring the system in-house, or migrate to a different platform? Can the code be exported and maintained independently? Is it built on open standards, or is it locked into a proprietary platform that you cannot leave without starting over? The vibe coding platforms vary significantly on this dimension. Some generate portable, standard code that any developer can continue working on. Others couple the application tightly to their own infrastructure in ways that make migration expensive.
Ask for the second-year budget. Not the build budget. The budget for the year after the system is live. If that number is zero, the proposal is either incomplete or dishonest, because every software system requires ongoing maintenance, security updates, and adaptation to changing business requirements.
The question underneath all the other questions
The AI coding tools revolution has done something important and genuinely positive: it has compressed the distance between an idea and a working prototype from months to hours. That compression is real and valuable. But it has also created a dangerous illusion, which is that the prototype is the hard part. It never was. The hard part is understanding what the business actually needs, building something that works under real conditions, integrating it with everything else, and keeping it running after the excitement of the launch has faded.
The question underneath all the other questions is: who is responsible for that hard part? If the answer is clear, contractual, and backed by demonstrated capability, you are probably making a sound technology investment. If the answer is vague, or if no one has asked the question at all, the investment is speculative regardless of how impressive the demo looked.



