Lost at the Intersection: Re-Thinking AI Tooling When Everything Changes Monthly
Tracking the developments in AI tooling is like driving in a foreign country and navigating a busy intersection after busy intersection. As soon as I learn and comprehend what one tool does or does not, a new tool with a new design philosophy comes available. It is making me do a re-think about how to use AI.
Let's take Cline's new Kanban tool for managing multiple agents. It nudges the human-in-the-loop verification more towards automation than Cline CLI. It offers a much improved reviewing capability during code generation. Having the ability to make CR comments in-line with the diff output is a familiar workflow. There are other tools such as Nimbalyst and Vibe Kanban (recently open-sourced). The catch with Cline Kanban is being aware that changes are tracked on independent Git working trees. Those changes then have to be reconciled with the original local repo. A series of dynamic prompts are issued to the model guiding it on how best to merge or cherry-pick the code changes with the original repo. The new point of workflow friction is verifying the Git worktree reconciliation was successfully completed.
RAG and vector databases are becoming obsolete. The advancement in model capabilities is pushing out the point where adding complexity is warranted. Except for edge cases, having AI-generated code that's good enough will become the norm.
I have already thrown away my own earlier MCP+RAG work and looking at Docker Agents. And disbanded my concerns about managing YAML frontmatter associated with sharing skill files. With Microsoft releasing a Go-compiled version of Typescript, providing a 10x performance improvement, I now wonder if it will flag the trajectory of Python's popularity.
Nanoclaw is a good example of the emerging model-centric agent implementations. Amazon's Strands is another example of leveraging the model over conventional GenAI management tools. Feyman, an agentic researching tool is another. It's built with pi and its pi extensions are – you guessed it – written in Markdown.
My rethink: LLM providers aren't selling compute, they're minting currency. The token has a fixed dollar price, but its purchasing power—how much real work it buys—floats with the model behind it. A token of frontier reasoning can be worth ten tokens of a less-capable model on a hard task, and the ratio inverts on an easy one. Model provider selection is an arbitrage problem.
The exchange rate moves on three axes: capability per token, rate limits, and subscription structure. The teams that will benefit from this realization will be the ones treating model choice the way a trading desk treats FX exposure. Continuously monitoring, per workload, with a clear view of where each provider's currency is strong and where it's weak.
The practice of switching between models without losing context, mid-workflow, will be the norm. Tools similar to Opencode's multi-provider strategy or pi-multi-pass, a pi extension, will become more common place.
For developers, coding without AI-assistance is going the way of the hammer in favor of nail-guns. Spec-driven AI design, phased implementations and extensive unit testing are the way forward.
Taking the time to optimize code is harder to justify. Instead, it's worth waiting until the tooling and models improve given the current rate of product evolution.
I am both excited and unsettled.
NOTE: I have updated this article since its original publication.