21 April 2026 · Nick Finch

Opus 4.7 has a new parameter called task_budget. Pay attention to it.

Anthropic deprecated the parameters developers used to tune how the model thinks, and shipped a single new one for bounding how much thinking it can afford. The change of direction matters. Here is what it means for what you build next.

Agentic AI · Enterprise AI · Bounded Autonomy · Architecture

On 16 April, Anthropic released Claude Opus 4.7. Most of the coverage focused on the headline capabilities. A million-token context window at standard API pricing. A new high-resolution vision mode. A tokenizer change. Benchmark wins across the board.

However, a quieter change to parameters matters more.

Anthropic removed temperature, top_p, top_k, and the extended thinking budget. Four parameters developers had been using to tune how the model sampled and reasoned. In their place, one new parameter. task_budget. An advisory cap on how much thinking the model should do for a given task, with a running countdown the model can see as it works. The deletions and the addition tell the same story. Stop tuning how the model thinks. Start bounding how much thinking it can afford.

That is a different kind of relationship with the model, and it is the direction the frontier has just committed to.

From configuration to delegation

The removed parameters were all knobs for shaping the model’s internal behaviour. Temperature tuned the randomness of token selection. Top-p and top-k shaped the sampling distribution. Extended thinking budgets let callers dial how deeply the model reasoned before answering. Each of them treated the model as a machine whose internal workings the developer was expected to configure.

Deprecating them is a statement. The model will handle that itself, adaptively, based on what the task needs. What the developer is being asked to specify instead is not how the model should think, but how much thinking is appropriate for the task at hand.

That is delegation, not configuration. A more adult relationship with the model. You are no longer adjusting dials on the machine. You are telling it what the job is worth and trusting it to plan within that constraint.

The first thing most technical readers will assume is that task_budget is max_tokens with better marketing. It is not, and the distinction matters.

max_tokens is a hard per-request ceiling the model is not aware of. It is a firewall. When the request hits it, the response gets truncated, regardless of whether the model was in the middle of a sentence or a reasoning step. task_budget is a budget the model can see, with a running countdown, that it reasons over as it plans. It shapes how work is done inside the request, not whether the request completes. The difference is the difference between an HTTP timeout and a priority input to a scheduler. One kills the work. The other shapes how the work happens.
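The distinction can be sketched in a few lines. This is an illustration of the two control styles only, with hypothetical names, not Anthropic's API: a hard cap truncates work the worker never saw coming, while a visible budget is a countdown the worker plans against.

```python
# Illustration only: two ways to bound work. Names are hypothetical.

def hard_cap(steps, max_steps):
    """max_tokens-style firewall: work is cut off mid-stream,
    regardless of where the worker happened to be."""
    return steps[:max_steps]  # truncation, possibly mid-reasoning

def budget_aware(plan_step, budget):
    """task_budget-style countdown: the worker sees what is left
    and chooses each next step accordingly."""
    done, remaining = [], budget
    while remaining > 0:
        step, cost = plan_step(remaining)  # planner sees the countdown
        if step is None or cost > remaining:
            break                          # stop cleanly, never truncate
        done.append(step)
        remaining -= cost
    return done
```

A `plan_step` that switches from deep analysis to a cheap summary as the countdown runs low is the whole point: the same request completes either way, but the shape of the work changes.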

That shift, from external cap to internal planning input, is what makes this meaningful.

The catch

The parameter is only useful if the agent has something for the countdown to act on.

An agent that was architected as “query all the relevant sources, assemble the context, produce the answer” has no trade-off surface. There are no decisions for the budget to shape. The agent will either ignore the countdown and blow through the budget, or see the countdown, lack any way to adapt, and produce worse output or refuse the task entirely.

Anthropic’s own docs admit this. If the model is given a task budget that is too restrictive for the task at hand, the documentation warns, it may complete the task less thoroughly or refuse it entirely.

That is not a model limitation. It is an architecture problem. If the loop has no trade-off surface, no budget can fix it.

The trade-off surface

An agent has a trade-off surface when its decision loop is structured so that it can choose between doing more and stopping.

Iteration limits that let the agent stop when it has gathered enough. Prioritisation rules that decide which tool calls matter most. Confidence thresholds that trigger early completion when the answer is clear. Tool calls weighted by expected value rather than made by default. All of these give the agent something to vary when pressure is applied.

Without them, the agent has one shape. With them, the agent can flex. task_budget is pressure applied through the planner. The planner needs something to flex.
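Concretely, a trade-off surface is just a decision loop with branch points. A minimal sketch, with hypothetical component names, of a gather loop that prioritises sources, stops early at a confidence threshold, and respects an iteration cap:

```python
# Hypothetical sketch of a decision loop with a trade-off surface.
# score_source, query, and update_confidence stand in for real components.

def gather(sources, score_source, query, update_confidence,
           max_iters=5, confidence_target=0.9):
    findings, confidence = [], 0.0
    # Prioritisation: most valuable sources first, not all by default.
    ranked = sorted(sources, key=score_source, reverse=True)
    for i, source in enumerate(ranked):
        if i >= max_iters:                    # iteration cap: bounded cost
            break
        if confidence >= confidence_target:   # early completion
            break
        result = query(source)
        findings.append(result)
        confidence = update_confidence(confidence, result)
    return findings
```

Every branch in that loop is somewhere a budget can push. Lower the budget and `max_iters` drops or `confidence_target` relaxes; the loop flexes instead of breaking.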

This is the bit most teams will get wrong. They will see task_budget in the release notes, add it to their API call, and discover that their agent either ignores it or chokes. The problem will not be the parameter. The problem will be that the agent was architected to run until it felt like stopping, and there is no way to retrofit a planning decision into a loop that was never designed to make planning decisions.

What this looks like when it works

We built a chat-based expert system for a coffee manufacturing business in the Netherlands. I have written about it before, but the architectural detail is worth revisiting now, because it is the exact shape task_budget was designed to work with.

The agent had access to 23 data sources. Demand data from the ERP. Demand predictions. Crop forecasts for raw coffee. Weather forecasts. Currency exchange rates. Market intelligence reports. Users could ask questions like “what coffee buying opportunities are there over the next few months” or “what issues do we need to address.”

The agent was instructed to query these sources iteratively. Assess the question. Decide which sources are most likely to be relevant. Pull data. Assess what it has learned. Decide if it needs more. Pull again. Build its own understanding of the situation before answering.

Without explicit iteration limits, the agent would keep going. Not because it was broken. Because it was doing what we asked. There is always another source that might be relevant. Another angle to check. Another forecast to cross-reference. We capped the iterations. The agent works within a cost envelope. It still queries multiple sources, still builds knowledge iteratively, still gives thorough answers. But it does so within bounds we set deliberately.

We did not build this system with task_budget in mind. The parameter did not exist when we built it. We built it with iteration caps because we understood that agents with unbounded decision loops produce unpredictable costs and unpredictable behaviour. Constraint was the engineering discipline that made the system viable.

The architecture that makes it disciplined is exactly the architecture that makes it compatible with task_budget. The trade-off surface is already there. The agent prioritises. The agent stops when it has enough. The planner has places where it can choose. Applying pressure through a budget parameter would shape the work, not break it.
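What "applying pressure through a budget parameter" means in practice is a small mapping layer: the external budget is translated into the loop's existing knobs. A hedged sketch, with illustrative numbers that are not from any real system:

```python
# Hypothetical: map an external task budget onto the loop's existing
# knobs, so outside pressure shapes work the loop already knows how
# to trade off.

def envelope(task_budget, cost_per_query=10):
    """Derive an iteration cap and confidence target from a budget.
    The constants are illustrative only."""
    max_iters = max(1, task_budget // cost_per_query)
    # Under tight budgets, accept an earlier stopping point.
    confidence_target = 0.9 if max_iters >= 3 else 0.7
    return max_iters, confidence_target
```

The point is that nothing inside the loop changes. The budget only moves knobs that were already there, which is exactly why a retrofit fails when they are not.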

This is what budget-awareness looks like when it has not been retrofitted.

Do the work now

The organisations that have been architecting for bounded autonomy will find their existing systems already compatible with what just shipped. They will add task_budget to their API calls and their agents will do roughly what they were already doing, but now the caller can shape the envelope from outside.

The organisations that built agents designed to run unbounded are about to discover something harder. The parameter is not a config change for them. It is a rewrite. The planner has to be restructured so that trade-offs are possible in the first place, and that kind of work does not get done in a sprint.

Start with the loops that matter most. Look for the points in the decision cycle where the agent is doing something by default rather than by choice. Those are the places trade-off surfaces need to exist. The agent that queries every source because it does not know how to prioritise needs to learn to prioritise. The agent that thinks for as long as it takes needs to learn to weigh the value of more thinking against the cost of producing an answer now. The agent that calls a tool because it always calls that tool needs to learn when the tool is worth calling.
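One way to turn a default into a choice is to gate each tool call on expected value against remaining budget. A sketch under assumed names, where the expected-gain scores would come from the planner:

```python
# Hypothetical: call tools by expected value, not by habit.

def worth_calling(expected_gain, cost, remaining, min_ratio=1.0):
    """Call only if it fits the budget and the expected information
    gain justifies the spend."""
    if cost > remaining:
        return False                  # cannot afford it at all
    return expected_gain / cost >= min_ratio

def run_tools(candidates, budget):
    """candidates: list of (name, expected_gain, cost), pre-scored."""
    called, remaining = [], budget
    # Highest expected value per unit cost first.
    for name, gain, cost in sorted(candidates,
                                   key=lambda c: c[1] / c[2],
                                   reverse=True):
        if worth_calling(gain, cost, remaining):
            called.append(name)
            remaining -= cost
    return called, remaining
```

With a budget of 3, a high-value ERP query and a cheap FX lookup make the cut while a low-value weather call is skipped, which is the behaviour the paragraph above asks for: the tool is called when it is worth calling.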

Do that work now and the next parameter Anthropic ships, because there will be more, lands on a foundation that can use it. Anthropic has shipped the first version of what will become a family. Risk budgets. Tool budgets. Time budgets. Whatever comes next will ask the same question task_budget asks. Can your agent make a trade-off? If the answer is no, nothing in the API will save you.

Want to discuss this?

We're always happy to talk about AI, data, and what it takes to ship real systems.

Get in touch