You Didn't Use AI. AI Used Your Afternoon.

The Rabbit Hole Has No Bottom. Why AI makes everyone a better researcher and a worse finisher, and how to avoid it.

Jun 12, 2026

A product manager at a Series B fintech company told me recently that she had spent four hours with Claude preparing for a board presentation. She had the deck open the entire time. When she closed the laptop, the deck had two new slides. The conversation log had 68 messages. She could not account for where the time went, except that every ten minutes she understood her market slightly better and her slides were no closer to done.

She is not unusual. She may, in fact, be the median AI user in any knowledge work environment today. And the reason her experience deserves serious scrutiny is that she was not procrastinating. She was not on Twitter, not in her inbox, not doing any of the things that productivity advice is designed to combat. She was working. The tool she was working with simply had no interest in helping her finish.

What research used to cost

Before generative AI, investigating a question required effort that escalated with depth. If you wanted to understand how a competitor priced their implementation services, you might start with their website, then read a Gartner comparison, then call a colleague who had evaluated them recently. Each step took time, demanded context switching, and often hit a dead end that forced you back to the task that prompted the question. The dead end was doing important work: it told you that the incremental value of knowing one more thing had dropped below the cost of finding it out.

Economists would recognize this as a natural price signal regulating consumption. You stopped researching when research got expensive relative to its value, which meant that people with finite time budgets allocated between exploring and producing in roughly sensible proportions, not because they were disciplined but because the environment imposed the discipline for them.

AI removed the price without replacing the signal. A follow-up question in ChatGPT or Claude costs five seconds and returns something plausible and detailed regardless of whether the question serves your original purpose. The background calculation that once said “this is probably far enough” never fires, because there is no cost input to trigger it. What you get, predictably, is the same thing that happens in any market when you remove the price of a good: people consume past the point where the marginal unit adds value. In this case the good is information, the consumption is asking questions, and the point of negative returns is the moment when learning more actively delays the work that learning was supposed to support.
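The arithmetic behind that claim can be sketched in a few lines. This is a toy model, not a measurement: it assumes each follow-up question is worth a fixed fraction of the one before it, and that you keep asking while the next question is worth at least what it costs. The function name and every number below are made up for illustration.

```python
def questions_asked(first_value, decay, cost_per_question, max_questions=1000):
    """Count the follow-ups asked before marginal value drops below cost.

    Toy assumptions: the first question is worth `first_value`, and each
    later question is worth `decay` times the previous one.
    """
    asked = 0
    value = first_value
    while asked < max_questions and value >= cost_per_question:
        asked += 1
        value *= decay  # diminishing returns on each further question
    return asked


# Illustrative numbers only: same curiosity, same diminishing returns,
# and the only thing that changes is the price of one more question.
library_era = questions_asked(first_value=10.0, decay=0.7, cost_per_question=2.0)
chat_era = questions_asked(first_value=10.0, decay=0.7, cost_per_question=0.01)

print(library_era, chat_era)
```

Under these assumed numbers the expensive environment stops you after 5 questions and the near-free one lets you run to 20. Nothing about the person changed between the two runs. Only the price did.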

Why it feels like working

The part that makes this problem genuinely difficult, and distinguishes it from ordinary distraction, is that every individual message in the conversation is useful. You asked something reasonable. You got something informed. Your understanding of the subject improved. If someone looked over your shoulder at any single exchange, they would conclude you were being productive.

The trouble is visible only at the level of the session. Forty messages of individually productive exchanges can, in aggregate, produce nothing. A long conversation thread with an AI model creates a kind of phantom artifact: it has weight, it has substance, it represents real intellectual labor, and nobody will ever read it. The deck, the memo, the email that someone is actually waiting for remains untouched while the conversation log grows into a document that serves no audience and meets no deadline.

This is where a concession is warranted, and it is not a small one. Sometimes the rabbit hole is the point. Some of the most valuable strategic thinking happens precisely when a person follows an unexpected thread and arrives at a connection they would not have made through linear execution. The product manager with the 68-message conversation may have developed an insight about her market that reshapes her company’s positioning for the next two years. That does happen. The problem is that it happens perhaps one time in fifteen, and the other fourteen times produce what a generous observer would call background knowledge and an honest one would call motion without progress. Most people cannot tell, in the moment, which session they are in.

The abstraction ladder

One pattern recurs with particular consistency. A person sits down to write something specific, say a two-paragraph justification for a budget increase, and asks the model for help with the opening sentence. The model’s response mentions a concept, perhaps zero-based budgeting or total cost of ownership framing. The person asks about the concept. The model explains the broader framework. Within fifteen minutes the conversation has ascended from “write me two paragraphs about why we need $40K more for cloud infrastructure” to a discussion of how CFOs at PE-backed companies evaluate technology spending requests.

Each rung of that ladder felt like a reasonable step. Understanding how CFOs think should, in theory, make the two paragraphs more persuasive. In practice, the person who spent the fifteen minutes climbing the abstraction ladder has worse paragraphs than the person who spent fifteen minutes writing three drafts, because the second person has something to edit and the first person has only theory. The theory might be correct and still useless, because the bottleneck was never insufficient understanding. It was insufficient text on the page.

What makes this particular failure mode so persistent is that AI models are equally willing to discuss frameworks and to draft the actual paragraph. They do not distinguish between the two. The person’s own preference does the sorting, and most people prefer frameworks because frameworks are more intellectually engaging than grinding out a paragraph whose content they already roughly know. Nobody makes a conscious decision to choose theory over output. It happens through a sequence of individually reasonable follow-up questions, each one moving upward when the task needed them to move forward.

The forking architecture of AI conversations

A Google search returns ten links and you pick one. A conversation with an AI model returns a response that contains, embedded in its paragraphs, three or four ideas that could each become the next question. The model does not mark any of them as the one most relevant to your original task. It treats them with equal depth, equal confidence, and equal implicit invitation to explore further.

The selection pressure this creates is worth taking seriously. When a response contains one thread that would help you finish and another that opens a surprising new question, the surprising new question wins almost every time. Novelty is more compelling than closure. An unfamiliar concept is more engaging than the familiar grind of converting knowledge into a document. So the conversation forks toward the novel, and then forks again at the next response, and within four forks you are in a conversation whose connection to your original task requires genuine archaeological effort to reconstruct.

This would matter less if people abandoned unproductive conversations easily. They do not. Thirty messages into a conversation, switching to a new one means re-establishing context, re-explaining your situation, losing the thread of whatever useful things appeared in messages 8 through 14. The switching cost is real even if the conversation has drifted somewhere unhelpful, and so people stay in conversations they should leave for the same reason they stay in meetings that lost their agenda twenty minutes ago: the sunk investment feels too large to abandon.

When to stop has always been the hard problem

The thing that rarely gets said plainly in discussions about AI productivity is that most people were already mediocre at knowing when to stop researching and start producing. AI did not create this weakness. It removed the environmental constraints that had been compensating for it. A person who would have spent three hours in a library before writing a report now spends three hours in Claude before writing the same report; the difference is that the library closed at 5pm and Claude does not.

The operating challenge, then, is not about using AI less. It is about rebuilding, through deliberate practice, the stopping function that the old information environment provided for free. Some of what follows will sound like productivity advice, and in a narrow sense it is. But the underlying argument is structural: AI shifted the burden of scope management from the environment to the individual, and most individuals have not yet adjusted.

Operating Principles

These are not tips. They are constraints designed to replace the ones that AI removed.

Declare the deliverable before you open the chat. One sentence, written down, describing the specific output. “Write the pricing section of the Acme proposal, three paragraphs, referencing the TCO analysis” is a deliverable. “Explore pricing strategy” is an invitation to spend two hours learning things that do not need to go in the proposal. The sentence is your scope boundary. Any time the conversation wanders past it, that should trigger a conscious decision to stop.

Prompt: “I need to write a 3-paragraph pricing justification for a $180K annual engagement. The client is a PE-backed healthcare operator comparing us to buying a SaaS platform. Draft those three paragraphs. Do not explain pricing theory or suggest alternative approaches.”

Set a message budget. Decide before you start how many exchanges this task warrants. Five is reasonable for a single deliverable. Ten for something with multiple sections. When you hit the number, stop prompting and start editing what you have. The budget is not a productivity hack; it is a synthetic price signal replacing the one that used to exist naturally.

Prompt (message 1 of 5): “I have five messages to complete this. Message 1: Draft an executive summary of our Q2 pipeline for the board. Two paragraphs. Here are the numbers: [data].”
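If you script your prompts rather than typing them into a chat window, the budget can be enforced instead of remembered. What follows is a minimal sketch under that assumption. `MessageBudget` and its fields are hypothetical names invented here, and the class only labels and counts prompts. It does not call any model or vendor API.

```python
from dataclasses import dataclass


@dataclass
class MessageBudget:
    """Hypothetical helper: a hard cap on exchanges for one deliverable."""

    deliverable: str  # the one-sentence output you declared up front
    limit: int = 5    # five for a single deliverable, per the rule above
    used: int = 0

    @property
    def remaining(self) -> int:
        return self.limit - self.used

    def spend(self, prompt: str) -> str:
        """Label the prompt with its position, or refuse once the budget is gone."""
        if self.used >= self.limit:
            raise RuntimeError(
                f"Budget spent on '{self.deliverable}': stop prompting, start editing."
            )
        self.used += 1
        return f"(message {self.used} of {self.limit}) {prompt}"


budget = MessageBudget(deliverable="Q2 pipeline summary for the board", limit=5)
first = budget.spend("Draft an executive summary of our Q2 pipeline. Two paragraphs.")
print(first)
print(budget.remaining)
```

The refusal is the whole point of the design. A counter you can ignore is a suggestion, while an exception at message six is a price.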

Give the model a draft, not a question. Starting with “how should I structure this proposal?” invites exploration. Starting with “here is my rough draft, clean it up” anchors the conversation to an artifact that already exists. Editing something is structurally different from discussing something, and the former has a natural endpoint that the latter lacks.

Prompt: “Here is my rough draft of the partnership proposal introduction. Tighten the language, fix logical gaps, and keep it under 200 words. Do not add sections or raise new considerations.”

Name the tangent out loud when you see one. The drift from task to rabbit hole happens silently, through a series of follow-up questions that each feel relevant. Making it visible breaks the pattern. When a response opens an interesting but off-scope thread, acknowledge it explicitly and defer it. This is not about being rigid; it is about making the departure from scope a conscious choice rather than an unconscious slide.

Prompt: “That point about channel conflict is interesting and I want to come back to it. For now, stay on the current task: finish the competitive positioning section.”

Separate research from production, in different sessions. If a task genuinely requires exploratory research before you can produce output, do the research in one session and the production in another. Research sessions are for branching, following tangents, and building understanding. Production sessions are for generating output and shipping it. The rabbit hole lives in the gap between these two activities, and the gap exists only when they share a conversation window.

Research prompt: “I need to understand the competitive landscape for NEMT technology platforms before I write a strategy memo. Key players, their positioning, primary technology differentiators.”

Production prompt (separate session): “Write a 1-page strategy memo recommending Medicaid managed care plan integration as our primary platform differentiator. Direct, assertive tone. No background section.”

Ban the improvement loop after the second pass. AI will always find something to refine. A first review is almost always valuable. A second catches things the first missed. A third is where diminishing returns begin to dominate, and most people cannot perceive the decline because each individual suggestion still looks reasonable. Two rounds of AI-assisted revision, then ship.

Prompt (second and final pass): “This is my final revision pass. Read the full document and flag only errors of fact or logic. Do not suggest stylistic improvements. Do not expand any section.”

Time-box the session, not the scope. “I will work on this until it is good enough” is a commitment with no boundary, because AI makes improvement perpetually available at near-zero cost. “I have 30 minutes for this, and I ship whatever exists at 10:30” imposes the same kind of external constraint that used to come from libraries closing, meetings starting, and colleagues needing your draft by end of day. The quality ceiling drops slightly. The completion rate rises dramatically. For most knowledge work, that is the better trade.
