The Closest Thing to Sustainable AI Is Knowing When Not to Use It
April 11, 2026 · Restorative · 7 min read
Everyone's hitting their Claude limits in minutes. My Pro plan runs fine. The difference isn't the plan. It's intention.
The past week has been full of people complaining that their Claude plans are burning out in minutes. Posts everywhere: "Anthropic broke something," "the limits are too tight," "I need to upgrade."
My Pro plan keeps running perfectly fine.
I'm not surprised. And I'm not special. I just use the tool like I understand what it actually is.
A Statistical Model Is Not a Calculator
The core confusion driving most of this is a category error. People are feeding deterministic work to a statistical model and then wondering why it takes so many tokens to get a reliable result.
Deterministic work is anything with a correct answer that doesn't change. Sorting a list. Parsing a JSON structure into a predictable format. Validating whether a string matches a regex. Running a database query. These things have exactly one right output for any given input. A large language model is not built for this. It's built to navigate probability distributions across language. When you ask it to do deterministic work, you're using a jet engine to boil water.
You get there eventually. You waste a lot of fuel.
The people running out of tokens in twenty minutes are the ones asking Claude to do things that a five-line Python function, a shell script, or a basic SQL query would handle instantly, deterministically, and for free.
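To make the contrast concrete, here is a minimal sketch of that kind of deterministic work in standard-library Python. The function names and the slug rule are illustrative, not from any particular codebase: each task has exactly one right answer per input, runs in microseconds, and costs nothing.

```python
import json
import re

def sort_items(items):
    # Sorting has exactly one correct output for a given input.
    return sorted(items)

def parse_user(record: str) -> dict:
    # Parsing JSON into a predictable shape is deterministic.
    data = json.loads(record)
    return {"name": data["name"], "email": data["email"]}

def is_valid_slug(s: str) -> bool:
    # Regex validation: same input, same answer, every time.
    return re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)*", s) is not None

print(sort_items([3, 1, 2]))
print(parse_user('{"name": "Jin", "email": "jin@example.org"}'))
print(is_valid_slug("peace-therapist"))
```

No probability distributions, no tokens, no regeneration loops. That is the whole argument in fifteen lines.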
The MCP Problem Nobody Wants to Admit
The MCP (Model Context Protocol) situation is its own category of abuse. The tool is genuinely useful when applied with care. What's happening instead is that developers are stuffing every MCP server they can find into every call, regardless of whether those servers are relevant to the task at hand.
Most of them have no clear idea how MCP actually works under the hood. They see "more connections" as "more capability," the same way someone might think that having twelve browser tabs open makes their research better. It doesn't. It makes the model reason over a larger, noisier context window, which burns tokens, increases latency, and often produces worse results because the signal-to-noise ratio drops.
MCP works when you're precise. You include the server that's relevant to the current task. You remove it when the task is done. You treat the context window like a workspace you're responsible for keeping clean, not a junk drawer you keep stuffing.
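In practice, precision can be as simple as keeping a minimal per-task configuration instead of a permanent everything-enabled one. A sketch in the style of a Claude Desktop `claude_desktop_config.json` (server name, package, and connection string are illustrative):

```json
{
  "mcpServers": {
    "postgres": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres", "postgresql://localhost/mydb"]
    }
  }
}
```

One server, scoped to the database task at hand. When that task is done, the entry comes out, and the next session starts with a clean context instead of a junk drawer.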
Vague Requirements Are an Energy Tax
I was at ChangeNOW 2026 in Paris, and I had a conversation with Jin from Peace Therapist, an NGO working on mental recovery for people affected by war. Her use case for AI was careful and considered. She had a specific problem. She came with context. She knew what output she needed.
That interaction stuck with me because it's exactly the opposite of how most people approach AI tools. They arrive vague. They type something like "help me with my marketing" and then iterate through five rounds of clarification that could have been avoided by spending three minutes writing a proper brief.
Every clarification round is tokens. Every token is compute. Compute is energy. Energy is water, cooling, carbon.
I keep coming back to something I said at ChangeNOW: being ADHD is not an excuse for mistreating energy and water resources. I was talking about vibe coding specifically, the habit of building yet another to-do app or habit tracker because you can, not because you should. But the same logic applies to how people prompt. Vagueness isn't a personality trait you're allowed to export onto a data center.
Write the brief. Know what you want before you open the chat window. The model will reward you with a shorter, better answer. Your plan will last the month.
The Actual Limit Isn't the Plan
When someone tells me they've burned through their Claude limit by noon, my first question is never "which tier are you on?" It's always "what were you doing with it?"
Nine times out of ten the answer involves at least one of these:
- Using Claude as a search engine (use Perplexity or just Google)
- Using Claude to write boilerplate code that a template or a snippet library would generate in seconds
- Using Claude to summarise documents they haven't read, when the summary they need is in the first two paragraphs
- Running long regeneration loops because the output wasn't what they wanted, when the real problem is the prompt didn't say what they wanted
- Keeping a conversation thread alive for days and treating it like a persistent memory system it was never designed to be
None of this is a plan limit problem. All of it is an intention problem.
Sustainable AI Is Just AI With Standards
The sustainability conversation in AI usually focuses on the infrastructure side. Data centre water consumption. Chip manufacturing emissions. Grid energy sources. These are real and worth addressing.
But there's a demand side that gets almost no attention, and it's the one individual practitioners actually control.
Every unnecessary token request is a small withdrawal from a shared resource. When millions of people make those withdrawals carelessly, the aggregate is not small. And the irony is that the people making the most careless withdrawals are also getting the worst results, because intentionless use produces intentionless output.
This is the argument I keep making across my work, whether I'm advising organisations on AI adoption, finishing a production, or thinking through what tools belong in a workflow. AI used without intention extracts more than it creates. From the infrastructure. From your own skill development, because you never learned to write the brief or build the function. From the creative or business problem you were trying to solve, because the output reflects the quality of your thinking, not the capability of the model.
The model is capable. The question is whether you showed up capable too.
What Actually Works
For the record, here is what keeps my Pro plan running fine:
**I use Claude for genuinely ambiguous work.** Synthesis across conflicting sources. Finding the argument in a pile of notes. Editing for tone when I can't see the text clearly anymore. These are problems where probabilistic reasoning over language is exactly the right tool.
**I write the context once, at the start.** Not across six messages. One clear brief. Role, goal, constraints, format. The model doesn't need to guess, so it doesn't waste tokens guessing.
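A brief in that shape might look like this (the contents are illustrative):

```
Role: You are an editor for a sustainability blog.
Goal: Tighten this 900-word draft to roughly 700 words without losing the argument.
Constraints: Keep British spelling. Keep the ChangeNOW anecdote. Add no new claims.
Format: Return the edited draft only, no commentary.
```

Four lines, three minutes of thinking, and the first answer is usually the usable one.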
**I don't ask it to remember things.** I keep a local notes file. If I need context carried across sessions, I paste the relevant part at the start of the next session. Clean context, not accumulated noise.
**I route deterministic work elsewhere.** If it's a calculation, I use a calculator. If it's data transformation, I write the code or use a purpose-built tool. If it's a search, I search.
**I load MCP servers for the task, then remove them.** Not a permanent configuration that runs with everything.
None of this is secret knowledge. It's just treating a precision tool with the respect a precision tool requires.
The People Who Get This Already
The Jins of the world get this. When you're building something that actually matters, something with a clear problem at the centre and real consequences if it fails, you can't afford to be vague. You can't afford waste. The constraint of purpose sharpens how you work.
The people burning through their plans in twenty minutes are usually, if I'm honest, the ones building things they don't need, for reasons they haven't fully examined, using tools they don't fully understand. The token limit isn't their enemy. It's feedback.
Sustainable AI doesn't start in a data centre. It starts in the moment before you open the chat window, when you ask yourself whether this is actually the right tool for what you're about to do.
Most of the time, if you answer that question honestly, you use less. And you get more.
Want the next post?
Leave your email and I’ll send you the next one. No spam, no nonsense.