Today’s guest post is a long read by Seb Krier, who leads the Frontier Policy Development team at Google DeepMind. He writes in a personal capacity. If you want to pitch us an article, please send us a suggestion using this form.
Much has been written about how AI can pose risks to society, particularly in aging Western countries where a sense of latent anxiety has taken over the discourse on technology for the past decade. Sometimes this is legitimate, and sometimes it feels like a continuation of existing Western pessimism. Few have been able to advance a positive vision of what we should be striving for at a socio-political level. Here I’d like to make an attempt. This essay explores how, by providing cognition and agency on demand, AI agents could amplify human agency to the point where we can escape the zero-sum traps that have plagued political economy for centuries.
There is a timeless question at the heart of any (free) society: how do we allow individuals to pursue their own interests when one person's actions inevitably affect the well-being of another in ways that are negative-sum? Economists have a name for this: “externalities,” which can take either physical or financial forms. The sheer scale of this challenge was crystallized in a groundbreaking 1986 paper by economists Bruce Greenwald and Joseph Stiglitz. They demonstrated that because our world is rife with imperfect information, moral hazards, and incomplete markets, externalities are not the exception, but the rule. This pervasive market failure became the intellectual bedrock for modern regulatory regimes. But the solution has always been the same: the coercive hand of the state and a top-down micro-management of society. We are told that only a central authority, a government board or commission, can resolve these conflicts by dictating who can do what, and where.
But as economists since Hayek have explained, the planner in Washington (or in your state capital) simply cannot possess the dispersed, specific knowledge of time and place known only to the individuals on the ground. This isn't the kind of theoretical knowledge you find in books, but the contextual, practical, intuitive, experiential and immediate knowledge that emerges from a particular situation in time. Writing about urban planning, Alain Bertaud argued that “planners cannot possibly know the reasons households may have for selecting a specific housing location,” so mandates often end up becoming blunt and arbitrary. Such information is tacit and is only revealed through the actions and choices of individuals within a market. This blindness points to an alternative: letting people solve these conflicts themselves.
This is the essence of the work of Nobel laureate Ronald Coase, who argued that if bargaining were cheap and easy, a polluter and their neighbor could strike a private deal without any need for regulation. Of course, some pollution might still happen, but the payment to the neighbor would ensure that both parties end up better off than under either the zero-pollution or unlimited-pollution counterfactual. The tragedy is not the existence of the conflict, but the transaction costs that prevent these mutually beneficial deals from being discovered and executed. It’s also the lesson from Elinor Ostrom, who documented how real-world communities successfully govern shared resources like fisheries and forests through their own intricate local rules.
Their shared insight is that structures that encourage bottom-up order can work better than attempting to impose top-down approximations for every conflict that requires a resolution. But their work also highlighted the formidable barrier that tends to stand in the way: transaction costs. Transaction costs are not just legal fees; they are the friction of discovery, the difficulty of negotiation, and the expense of enforcement. They are the cognitive and logistical effort required to identify affected parties and strike a deal.
Historically, because these transaction costs were insurmountable, societies defaulted back to the planner. The inability to coordinate from the bottom up became the enduring justification for control from the top down. The result was always the same: clumsy, one-size-fits-all rules that stifle innovation, distort incentives, and are inevitably captured by special interests who learn to work the system for their own benefit. Today, we are repeating this same failure of imagination in the discussion around AGI. There is a rush to assume that the only way to manage its risks is through the same top-down control model, treating AGI as a centralizing technology by its very nature. If AGI is analogous to a weapon of mass destruction, a genie in a bottle - then surely a central authority is the “optimal” answer?
I find this frame quite myopic. It fixates on the risks of a powerful new technology while completely overlooking its potential to strengthen the governance mechanisms needed for a safe, coordinated society, which could well obviate the need for a centralized solution. As a general purpose technology, AGI is well placed to help us fix our decaying social and public institutions. Better cognitive capabilities also mean better coordination, better governance, and better safeguards. Instead of empowering the central planner, AGI could finally empower the individual bargainers of Coase and Ostrom by arming them with the price system: what Michael Levin and Benjamin Lyon call the “cognitive glue” of a free society.
Obliterating transaction costs
The difficulty of millions of people discovering one another's preferences, negotiating, and enforcing agreements has always been the chief justification for government intervention. It’s why your neighbor’s leaf blower, my unwillingness to fund the local park, and a factory’s emissions all end up in blunt bans and political fights; we can’t cheaply find each other, state exact terms, and lock in a deal. The “transaction costs” are simply too high. But this may no longer be the case once we have AGI agents.
Before we begin, I think it’s important to avoid conceptualizing AGI as some sort of single omniscient brain-God - even though agents, individually and collectively, will effectively be ‘superintelligent’ and highly capable, and increasingly so over time. That is the central planner’s fallacy all over again. While we will continue to see ever-larger training runs creating powerful foundation models, I think it's a mistake to assume this results in a singular AGI that carries out all economically valuable tasks; economics ultimately favors efficiency at the point of delivery (inference).
Running a hyper-general model for every specialized task is incredibly expensive, and so this reality drives specialization: general models will be compressed, distilled, and optimized for specific uses. The future landscape will therefore be a hybrid: a vast ecology of personalized agents, services, applications, and robots with varying degrees of generality. While many may descend from a few common foundational ancestors, their deployment will be diverse and specialized. As such, imagining 'AGI' as a singular entity is like talking about “Finance” as a singular thing.
I think it’s more helpful to imagine these agents through a different lens: consider AGI deployed as a vast ecology of personalized agents and systems. This emerging ecosystem is what Tomašev et al. (2025) characterize as the “virtual agent economy,” a new economic layer where agents transact and coordinate at scales and speeds beyond direct human oversight. While this ecology will contain countless specialized agents, let's focus on the one that matters most from an individual's perspective: your personal advocate. Think of it as a fiduciary extension of yourself: a tireless, extremely competent digital representative, closely tied to you, its principal.
What could such an agent do? In principle, it can negotiate, calculate, compare, coordinate, verify, monitor, and much more in a split second. Through many multi-turn conversations, tweaking knobs and sliders, and continuous learning, it could also develop an increasingly sophisticated (though never perfect) model of who you are, your preferences, personal circumstances, values, resources, and more. This should evolve over time - an agent’s alignment should follow the principal’s own evolution. Recent research on negotiation agents finds that “human-agent alignment” is profoundly personal. Users expect agents to not only execute goals but also embody their identity, requiring alignment on everything from preferred negotiation tactics to personal ethical boundaries and the specific public reputation they want to project. There are of course important privacy considerations here, but none of these seem fundamentally intractable. For example, these systems could be built on technologies like zero-knowledge proofs and differential privacy, ensuring that preferences are communicated and aggregated without revealing sensitive underlying data.
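To make this concrete, here is a minimal sketch of what a machine-readable slice of such a preference profile might look like. Everything below (the `PreferenceRule` structure, the topics, the prices) is a hypothetical illustration, not a real protocol:

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical: a context-conditional preference an advocate agent could
# learn from its principal and later broadcast or trade on.
@dataclass
class PreferenceRule:
    topic: str                        # e.g. "loud_music", "carpool_detour"
    applies: Callable[[dict], bool]   # does the rule bind in this context?
    price_cents: int                  # compensation demanded to tolerate it

profile = [
    PreferenceRule("loud_music", lambda ctx: ctx["day"] == "Sunday", 500),
    PreferenceRule("loud_music", lambda ctx: ctx["day"] == "Saturday", 0),
    PreferenceRule("carpool_detour", lambda ctx: ctx["extra_minutes"] > 10, 1_000),
]

def quote(rules: list, topic: str, ctx: dict) -> Optional[int]:
    """Compensation (in cents) the principal demands to tolerate `topic`
    in context `ctx`; None means no stated preference."""
    for rule in rules:
        if rule.topic == topic and rule.applies(ctx):
            return rule.price_cents
    return None

print(quote(profile, "loud_music", {"day": "Sunday"}))  # 500: quiet Sundays cost $5
```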
Such an agent should also be able to communicate your preferences to millions of other agents in real-time, with a nuance and specificity that is currently impossible. It knows that you’ll tolerate loud music on a Saturday, but not on a Sunday; that you’d be happy to carpool, but only if it adds less than ten minutes to your commute; that you’d willingly pay a fraction of a cent more for clean electricity, but only during off-peak hours. All this in a split-second, at the right moment, for the right purpose. In other words, AGI could enable hyper-granular contracting. The friction that has always hindered us, the transaction costs that Coase and Ostrom identified as the great barrier to cooperation, could be massively reduced. So what can we now do in such a world that was otherwise not possible?
Pollution and road-traffic negotiations
Think of the agents as a built-in coordination device: instead of each actor guessing everyone else’s move and settling into an inefficient Nash equilibrium, agents can condition their actions on shared signals and contracts, unlocking deals that were previously out of reach - what game theorists call a correlated equilibrium.
Consider the implications. Your agent knows you have a child with asthma. A blanket “just ban the emissions” rule sounds tidy, but it flattens everything into the same position: trivial harms and intolerable ones, essential trips and frivolous ones. When a delivery truck’s agent plans its route, it doesn't need a government mandate to be considerate. It simply sees a higher “price” for entry onto your street, a signal broadcast by your agent, representing your strong preference to avoid diesel fumes. The truck's agent can then calculate, instantly, whether it is cheaper to pay the “clean air fee” to you and your neighbors, or to take a different route. Conversely, if your neighbor’s agent flags an emergency, for example if she’s in labor and needs the fastest route to the hospital, then everyone’s agents can auto-drop (or even invert) the price to clear a corridor, because they actually value her getting through fast. It's true that in some cases, enforcement of these contracts might cost more than their value; but this could be solved through automated escrows and reputation systems. Ideally the agent system transforms enforcement from a costly legal battle into a near-instantaneous computational verification.
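As a toy illustration of the truck agent's calculus (all fees and names here are invented for the example):

```python
# Each resident's agent broadcasts a per-entry "clean air fee" for the
# street; the truck's agent compares paying the fees against detouring.
def street_entry_cost(resident_fees_cents: list, emergency: bool = False) -> int:
    """Total fee to drive down the street; a verified emergency clears
    the corridor at zero (or even negative) price."""
    if emergency:
        return 0
    return sum(resident_fees_cents)

def choose_route(direct_fees_cents: list, detour_cost_cents: int) -> str:
    """Pay the residents, or burn extra fuel and time on a detour?"""
    if street_entry_cost(direct_fees_cents) <= detour_cost_cents:
        return "pay_clean_air_fee"
    return "detour"

# Three residents ask 40¢, 25¢ and 90¢; the detour costs roughly 120¢.
print(choose_route([40, 25, 90], 120))                  # "detour": 155¢ > 120¢
print(street_entry_cost([40, 25, 90], emergency=True))  # 0: the corridor clears
```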
In this scenario, the externality doesn't vanish, but it does get a price tag. And once a cost is made clear, the marvel of the market can solve it. The problem was never the pollution itself; it was the fact that the polluter was allowed to impose a health and financial cost onto you for free. To be clear, not all agent negotiations need to be purely financial. The system I’m envisaging could enable two distinct modes: economic negotiations where willingness-to-pay determines outcomes (useful for commercial activities like delivery routes), and as I’ll outline later on in the essay, democratic negotiations where each person gets equal voting weight regardless of wealth (essential for community values like neighborhood character). Agents can seamlessly switch between these modes depending on the issue at stake - using market mechanisms for efficiency where appropriate, while preserving democratic legitimacy for fundamental community decisions.
What’s key though is that agents make that payment possible, managing a million micro-transactions in the background, all based on how your values generalize across countless situations and contexts. When I lived in London, residents of my neighborhood were unhappy with road congestion, so they decided to essentially prohibit through-traffic at certain times; taxis and local merchants were naturally pretty annoyed. With the agent-bargaining system, these low-traffic-neighbourhood detours stop being absolute: taxis can pay a dynamically discovered “cut-through” fee, while verified emergencies glide through at zero (or negative) price.
Neighborhood character negotiations
This mechanism clarifies plenty of other thorny disagreements too. Imagine a developer wants to build an ugly building in a residential neighborhood. Today, that is a political battle of influence: who can capture the local planning authority most effectively? In an agent-based world, it becomes a simple matter of economics. The developer’s agent must discover the price at which every single homeowner would agree. If the residents truly value the character of their neighborhood, that price may be very high. The project will only proceed if the developer values the location more than the residents value the status quo. Conversely, if the residents’ asking price is lower than the developer's willingness to pay, the project proceeds, and the residents are compensated. In either case, the true economic costs and benefits are accounted for. This mechanism forces the discovery of the most valuable use of the resource, moving beyond the current system where projects are either blocked entirely (socializing the loss of potential gains) or forced through politically (socializing the costs on the neighborhood).
But what if a resident decides to game the system and go for a really absurd price, holding everyone ransom? This is why you need a new secondary layer of institutions on top of these agents. Crucially, these institutions can be voluntary. In this neighborhood, homeowners can pool their agents into a simple bargaining club: each person privately inputs the minimum they’d accept; the software aggregates that into a single take‑it‑or‑leave‑it offer. This is essentially mechanism design in action: creating rules where being honest about your true minimum is the smartest move, not gaming the system. Overstating just risks killing the deal (you get zero), and if it clears, the payout is at the common clearing price - so padding your number doesn’t boost your check. The group speaks with one voice without surrendering property rights, and the developer sees a single, fair number instead of a hundred ransom demands.
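A toy version of that uniform-price rule might look like this (my own simplification of the mechanism sketched above, not a full strategy-proof design):

```python
def club_outcome(private_minimums: list, developer_offer: float) -> dict:
    """Deal clears only if an equal per-homeowner split covers everyone's
    private minimum; every member is then paid the same clearing price."""
    per_head = developer_offer / len(private_minimums)
    if per_head >= max(private_minimums):
        return {"deal": True, "payout_each": per_head}
    return {"deal": False, "payout_each": 0.0}

# Minimums in dollars: overstating yours can only kill the deal; it never
# raises your payout, which is the common per-head clearing price.
print(club_outcome([10_000, 15_000, 12_000], 60_000))
# {'deal': True, 'payout_each': 20000.0} - each homeowner gets $20,000
```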
Skeptics might reasonably worry that NIMBYs can still name absurd buy-off prices. This is a classic political economy dilemma. The benefits of blocking a project are concentrated among a few motivated homeowners, while costs such as higher rents, longer commutes, and slower growth are diffused across a wide, unorganized public. As Janan Ganesh puts it, the potential losers are an “unconscious blob of people” who don't even know what they're losing. Two guardrails fix this.
First, chronic hyper-bidders see their voting weight fade or must pay an “option fee”: a Harberger-style tax in which you periodically pay a percentage of the price you claim; overstate, and it soon hurts. For example, if you claim your property is worth $10 million to block a development, you must be prepared to pay taxes on that valuation too! Second, and more importantly, AGI agents can give that “unconscious blob” a powerful voice: the agents of renters, commuters, and would-be residents can estimate and aggregate the diffuse costs of blocking a project, making them explicit. Any coalition that vetoes must then reimburse that quantified loss, with agents handling the transfers automatically. The diffuse cost becomes a concentrated, explicit price. Stonewalling remains possible, but it now carries a real, rising cost. Moreover, with this setup, bargaining isn't just between NIMBYs and developers; other residents, now aware of the potential gains, can bargain directly with the holdouts.
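The first guardrail is easy to sketch; the rate and period below are invented for illustration:

```python
def annual_option_fee(declared_block_price: float, rate: float = 0.02) -> float:
    """Harberger-style self-assessed tax: you pay a periodic percentage of
    whatever blocking price you declare, for as long as you maintain it."""
    return declared_block_price * rate

# Claim your property is worth $10M to block a development, and the claim
# itself now costs you $200,000 a year - overstatement soon hurts.
print(annual_option_fee(10_000_000))  # 200000.0
```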
Sugar and healthcare externalities
Consider another example: sugar/junk food consumption and public health. Proponents of a sugar tax correctly identify an externality: poor diet choices impose costs on the shared healthcare system. Their solution, however, is (shock!) a clumsy, top-down tax. This harms food producers, is regressive (as it affects the poor more than the rich), and ultimately imposes a cost on many people who would not in fact be “guilty” of imposing costs on the healthcare system. An agent-based market addresses the same problem with bottom-up precision.
Instead of lobbying the government, your health insurer's agent communicates with your advocate agent. It looks at your eating habits, calculates the projected future cost of your diet and makes a simple offer: a significant, immediate discount on your monthly premium if you empower your agent to disincentivize high-sugar purchases. At that very moment of decision, the market responds. Acting like a hyper-alert Kirznerian entrepreneur spotting a profit opportunity, a soft drink company's agent, to retain your business, might instantly propose a deep discount on a healthier drink.
Now consider smoking bans in public places. A simple free-market approach would let every restaurant or bar owner decide their own policy. But non-smokers value having a broad range of options for a night out; if smoking becomes the default, their social world narrows significantly. This loss of choice is a cost that a full-on ban tries to crudely handle. AI agent negotiation, however, allows for a more precise, Millian solution. Once again, we’re not banning the externality, but pricing it in: a price imposed not by a committee of very smart policymakers sitting in a grey room in Westminster, but discovered through voluntary, real-time negotiation. The choice remains with the individual, but it is now a truly informed choice, where the full costs and alternatives are transparent.
Another example can be seen in the rules we have on airplanes and the air we share with fellow passengers in this private space. Even absent government rules, airlines generally have to come up with a generic rule that works okay for whomever they expect to be on a typical flight. During the COVID pandemic, even many people who wanted mask mandates for airplanes did not wear masks on flights themselves, as they considered the value of masking while others went unmasked to be minimal.
Similarly, airlines generally do not make accommodations for people sensitive to airborne allergens. Virgin Airlines can't tell if your peanut allergy is life-threatening or just a mild inconvenience. To avoid opening the floodgates to thousands of hard-to-verify requests (“I'm allergic to perfume,” “I'm sensitive to blue lights”), they just make a simple, inflexible rule, like “we will serve nuts.” Much of this is, of course, due to an aversion to the seemingly inevitable lobbying for accommodations that would come from conceding the principle. However, if flight policies are negotiated over by AI agents, we don’t have to choose between all or nothing on masking. We don’t have to rule out accommodations for people with allergen sensitivities for fear of frivolous requests; instead we move from all-or-nothing mandates to nuanced, negotiated outcomes, where the intensity of a person's need is accurately represented and compensated.
This agent-negotiated world delivers three principles essential to a free and effective society.
First, accountability. A billionaire who wants to close a public beach for a private party faces a new constraint: his agent must make a public, auditable offer to every single person who would be deprived of access. The cost of his desire becomes explicit and traceable. Of course, he might still try the old route of bribing a bureaucrat in secret - but this parallel transparent market creates pressure and comparison points. When combined with AGI-enhanced governance (automated auditing, pattern detection for corruption etc.), the corrupt path becomes even more risky and costly.
Second, the power of voluntary coalitions. Today, diffuse interests are often ignored because the transaction costs of organizing are too high. A single person in a low-income neighborhood has little bargaining power. A multinational polluter is more likely to get away with building a monstrosity in a Brazilian favela than in the Hamptons, even if the true social cost is higher. But what if the agents of 10,000 residents, seeing a factory’s proposal to increase emissions, can form a bargaining coalition in a nanosecond? They can spontaneously band together and declare, “Our collective price to accept this pollution is X million dollars, non-negotiable.” They solve the collective action problem instantly, creating what is effectively a powerful digital union to counterbalance concentrated wealth.
Third, continuous self-calibration. Because every agent streams its user’s context-rich preferences into live markets, the rules themselves flex in real time. Noise caps, curb uses, even peak-hour electricity rates slide automatically as new bids and conditions roll in, rather than waiting for a city-council vote five years from now. Tacit desires, like how much quiet you need for a newborn’s nap or what premium you’d pay for a car-free street, become explicit, machine-readable prices. The system therefore functions as a permanent feedback loop. It detects mismatches between policy and lived reality, reprices the externality within seconds, and nudges behavior accordingly. Governance shifts from statute to thermostat: sometimes through formal institutions built on top of these agents, such as professional guilds, and sometimes through instantaneous ad-hoc “flash coalitions” - emergent order.
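A crude sketch of that thermostat dynamic, with a control-loop flavor (the target, gain, and noise readings are all invented for illustration):

```python
def reprice(price: float, observed_db: float, target_db: float,
            gain: float = 0.05) -> float:
    """Nudge the price of making noise up when the street is louder than
    residents want, and down when it is quieter."""
    error = (observed_db - target_db) / target_db
    return max(0.0, price * (1 + gain * error))

price = 1.00  # dollars per noisy hour
for observed in [70, 68, 63, 58, 55]:  # the street gradually quietens
    price = reprice(price, observed, target_db=60)
    print(round(price, 4))  # rises while too loud, falls once below target
```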
So what’s the catch?
The Coasean vision amplified by AGI agents is powerful, but it's not a panacea. Ronald Coase himself was no naive utopian, emphasizing that his theorem was a theoretical benchmark for a world without transaction costs, not a description of reality. In practice, the theorem has faced decades of rigorous critique from economists, legal scholars, and behavioral scientists, who argue that its assumptions crumble under real-world frictions. These limitations explain why Coasean bargaining rarely materializes today, leading societies to default to clumsy government interventions or inaction.
My response to this is twofold. First, the Coase theorem and attempts to instantiate it force us to identify and analyze frictions, like imperfect information, legal costs, or strategic holdouts, that prevent efficient private solutions. This is not to say it solves everything, but it’s a powerful toolkit prompting us to look for creative, private, and market-based solutions to problems where we might have only considered government regulation or violence before. Second, many of these critiques ignore what governance technologies and institutional arrangements AGI can enable in the first place - and I think there are good reasons to believe this technology can help us bypass limitations that would otherwise block progress on cooperation.
It’s true that even with perfect agent coordination, there remains what Acemoglu calls the 'political Coase theorem' problem: those with political power cannot credibly commit to not exploiting that power tomorrow, since no external enforcer exists for contracts with the sovereign itself. The sovereign can always renege. This is a tricky challenge, but the agent system offers countermeasures. First, the transparency created by agent negotiations raises the political cost of expropriation or bribery: it’s harder to steal what is clearly priced and publicly recorded. Second, AGI must be deployed not just for market transactions, but to enhance institutional accountability. In other words, we should automate many aspects of how we govern: automated auditing, real-time monitoring of regulatory capture, automated dispute resolution, automated public spending monitoring, and agent-based anti-corruption measures can harden the governance mechanisms that constrain the arbitrary use of power. Institutions matter!
Just as agents can aggregate citizen preferences for market negotiations, they can also transform how the “machinery of government” itself operates. The information asymmetries and coordination failures that James C. Scott describes in Seeing Like a State, where central authorities operate with crude categories that miss local knowledge, can finally be resolved as well. On the “executive” side, agent networks can provide governments with high-resolution, real-time feedback about policy impacts, citizen preferences, and emerging problems. On the “civil” side, the automation of key protections against executive corruption, overreach, and misalignment protects people against the erosion of liberal democracy.
Here, I'll explore some of the most salient critiques, preempt common objections to applying them in an AGI-agent context, and propose some countermeasures. The goal isn't to dismiss the critiques but to show how agents can substantially mitigate them. This strengthens the case for a hybrid system: agents handling the micro-coordination, with carefully designed institutions (including the state) addressing the rest.
Zero transaction costs, really? And what about inequality?
Skeptics might say agents don't eliminate costs entirely; they just shift them to compute overhead, data privacy setups, agent configurations etc. This is true! But it also underestimates the scale of reduction. Agents aren't burdened by human limitations like fatigue, bias in communication, logistical hurdles, social awkwardness, irrational decision making and so on. What costs $10,000 in legal fees today might cost pennies to compute tomorrow. Consider how a billionaire’s phone today is no more powerful or effective than yours.
Even then, you might reasonably think that this still creates inequity in the short run. To prevent cost barriers for low-income users, governments or philanthropies could provide baseline agent services (similar to public defenders) - or more likely, the necessary compute to level things up and ensure equitable participation. This is a small price to pay for the efficiencies gained by a system that otherwise promises to save society orders of magnitude more by slashing legal overhead, unlocking stalled projects, and turning countless externalities into win-win trades. In other words, underwriting entry-level agents for the poorest citizens is like funding public roads: a modest civic outlay that makes the whole market run faster, fairer, and vastly more productively.
Nor is this likely to become a huge and growing cost over time: as agent tech commoditizes, these costs approach zero. The model here could mirror school voucher systems like Sweden's, where the government provides credits that ensure universal access to essential services while allowing choice and competition. Just as educational vouchers guarantee every child can attend school regardless of family income, “agent vouchers” or compute credits could ensure everyone can participate in democratic deliberation, access legal representation, or navigate essential government services. The key is targeting subsidies where they matter most for civic participation and fundamental rights - you'd want generous credits for democratic decision-making, healthcare choices, or educational planning, but not for negotiating garage parking disputes or lawn ornament preferences.
Alternatively, or complementarily, the system could employ direct redistribution in highly sensitive areas - providing everyone with a base allocation of compute credits or “agent wealth” to spend as they see fit. This approach avoids the paternalism of defining the above “essential services” centrally, which would recreate the very social planner problem we’re trying to avoid. Individuals could allocate their resources according to their own priorities rather than predetermined categories. A hybrid might work best: a universal basic compute allocation for personal use, plus additional targeted support for specific democratic and legal functions where equal participation is constitutionally guaranteed.
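As a toy example of that hybrid (the categories and numbers are mine, purely illustrative):

```python
BASE_GRANT = 100  # universal compute credits, spent however the user likes
TOPUPS = {        # earmarked extras for constitutionally protected functions
    "democratic_deliberation": 50,
    "legal_representation": 50,
    "healthcare_choices": 30,
}

def credits_for(domain: str) -> int:
    """Credits an agent can draw on for a task: base grant plus any top-up."""
    return BASE_GRANT + TOPUPS.get(domain, 0)

print(credits_for("democratic_deliberation"))  # 150
print(credits_for("parking_dispute"))          # 100: base allocation only
```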
This tiered approach ensures equity where it counts without creating an unsustainable fiscal burden, while still allowing market dynamics to operate in less critical domains. In practice, however, this does mean a lot of infrastructure will be required: built-in protocols for multi-party discovery, for example, and, for high-volume scenarios, hierarchical agents that aggregate preferences at neighborhood or city levels. But much of this will need to be designed as part of a wider push for improving institutional decision making.
Which rights are the ‘default’?
Another important consideration here is that you still need an agreed “default position” - do people have a right to make noise, or a right to quiet? What's the basic right that is being negotiated - the right to pollute, or the right to be free from pollution? The machinery runs either way. What does change is who ends up richer, which is why the initial allocation of these rights is a constitutional choice, not a technicality. Even if bargaining is cheap, outcomes aren't invariant to initial property rights because wealth influences willingness to pay. A poor farmer might sell pollution rights cheaply to a rich factory not because it's efficient, but because they need cash now. Conversely, changing who starts with the rights changes the wealth distribution, which affects what people can afford to bid and therefore changes which 'efficient' outcome the market settles on. Beyond wealth effects, behavioral factors like the endowment effect - people demanding far more to give up a right than they'd pay to acquire it - make initial allocations stick even with perfect bargaining. Agents might correct for such biases, though whether we want them to 'debias' negotiations or faithfully represent our psychological quirks remains an open design question.
So how do we ensure fairness without reverting to top-down control? What is the “default position” to start with? Well, that baseline of who starts with which entitlement is a normative, collective choice. Agents don’t magic it away; they only make it explicit, contestable, and cheap to renegotiate. My view here is that we already have many of these rights set up by centuries of jurisprudence, and this is the right starting point. To the extent that these need to change or adapt, our democratic political systems are the right mechanism to do so. The bad news is that these systems are now pretty ossified, slow, captured, and dysfunctional. The good news is that agents can improve them materially.
Beyond periodic voting on baseline entitlements, agents could fundamentally transform how citizens deliberate and exercise their democratic rights. Recent papers show that AI systems can effectively learn and represent human preferences with remarkable efficiency. Studies like ConstitutionMaker show how natural language principles can be extracted from preference data, while Inverse Constitutional AI demonstrates that just a handful of preferences can be compressed into interpretable principles that accurately reconstruct individual and group values. This suggests agents could continuously learn citizens' nuanced policy preferences through ongoing interactions, creating rich, privacy-protected preference profiles.
Currently, we delegate representation to biological agents - mayors, councilors, representatives - who operate within opaque, underfunded institutions plagued by accountability problems, information asymmetries, and the impossibility of truly representing thousands of diverse constituents. With agent infrastructure, we could significantly improve these systems. Imagine every citizen having a personal agent that deeply understands their values, can engage in sophisticated policy deliberation on their behalf, and coordinate with millions of other agents to find optimal compromises in real-time.
These agents wouldn't just vote every few years but could participate in continuous liquid democracy, dynamically delegating expertise to trusted entities for specific domains, instantly aggregating or constructing preferences on emerging issues, and ensuring that policy truly reflects the evolving will of the people rather than the frozen snapshot captured at the last election. Of course, this risks enabling digital NIMBYism at unprecedented scale, and we certainly don't want everyone's agents micromanaging nuclear safety protocols or monetary policy - but these are mechanism design and governance challenges, not fundamental obstacles.
Today, citizens already don't vote on every financial regulation or technical standard; agent-mediated democracy needn't change that. To the extent that enhanced coordination could enable minorities to hold majorities hostage, we'll need clever mechanisms to prevent such digital paralysis. There's plenty of work ahead for policymakers, economists, evaluation designers, sociologists, and game theorists to get these institutional designs right!
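To illustrate the liquid-democracy piece, here is a minimal sketch of transitive delegation (the data and names are hypothetical):

```python
def resolve_vote(voter, domain, delegations, direct_votes, seen=None):
    """Follow a voter's delegation chain for `domain` until a direct vote
    is found; delegation cycles fall back to abstention (None)."""
    seen = seen or set()
    if voter in seen:
        return None  # cycle detected: abstain
    seen.add(voter)
    if (voter, domain) in direct_votes:
        return direct_votes[(voter, domain)]
    delegate = delegations.get((voter, domain))
    if delegate is None:
        return None  # no vote, no delegate: abstain
    return resolve_vote(delegate, domain, delegations, direct_votes, seen)

delegations = {("alice", "zoning"): "bob"}   # alice trusts bob on zoning
direct_votes = {("bob", "zoning"): "approve"}
print(resolve_vote("alice", "zoning", delegations, direct_votes))  # "approve"
```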
Lastly, the system would also need to balance dynamism with stability. Markets require predictable rules; indeed, the constant renegotiation of property rights would destroy investment incentives. But just as options markets price volatility, an agent-mediated system could explicitly price the value of stability versus flexibility, letting some rules ossify by mutual agreement (basic property rights, contract enforcement) while others remain perpetually negotiable (noise ordinances, parking rules). The agents themselves would likely converge on stable equilibria for most issues simply to reduce computational overhead - constant renegotiation is expensive even for AGI.
But what about catastrophic risks?
A lot of people working in AI governance are interested in catastrophic risks where a few actors can impose great harm on others at scale; many will rightly say “this all sounds great but doesn’t address CBRN risks.” They’re not wrong.
A malicious actor intending to release a pathogen is not a market participant to be bargained with, and admittedly, the agent system can do little to stop them. Instead this is the state's first and most important job: to enforce law and order and protect citizens from violence, whether from a foreign army or a domestic bioterrorist. The Coasean multi-agent framework relies on this protection to even exist in the first place: the state needs to enforce contracts. If the delivery truck’s agent agrees to the “clean air fee” but the company refuses to pay, there must be a court system: a neutral arbiter with the power to enforce the agreement. This is a non-negotiable role for the state.
In AI governance discussions, the aversion to the totalising, centralising proposals espoused by some communities has been met with the inverse prescription: various flavours of free-for-all e/acc libertarianism or anarchy. This falls into the opposite trap, and wrongly assumes you can do away with the state entirely. The Coasean framework does not eliminate the state, but it transitions its role from “central planner” to “framework guarantor,” focusing its power on what it alone can do. It allows the market, supercharged by agents, to handle the complex work of coordinating preferences and pricing externalities, a job the state has always done poorly. This should in principle appeal to conservatives wary of big government and liberals wary of power abuses. But it doesn’t do away with the state, and nor should it - it just makes it a lot leaner. The (gradually automated) state continues to define and enforce basic property rights, contract law, criminal justice, and constitutional rights - the bedrock rules without which agent negotiations would be meaningless.
So the Coasean multi-agent system, for all its genius, has a critical limit: it is designed to price trade-offs. It can put a price on diesel fumes, noise, or a blocked view. It cannot, however, price the non-negotiable. What happens if technology unlocks a true “recipe for ruin”? A discovery, like “easy nukes” or a simple method for creating a devastating pathogen, that allows a single actor to threaten civilization itself? This is not an externality to be bargained over!
Such a risk is a form of ultimate coercion, and its prevention falls squarely within the most fundamental duty of the state: protecting its citizens from violence. Therefore, the state’s role is not just to enforce the contracts that agents make, but to define the absolute boundaries of what they are permitted to do in the first place, such as prohibiting actions that create catastrophic, un-priceable risks like man-made pandemics. While the Coasean framework itself does not price these existential risks, the underlying cognitive infrastructure it creates is part of what a modern state may need to manage them: enabling the high-speed coordination and automated governance required to make the state a more effective protector.
Matryoshkan Alignment
So what does this mean for how we think about the normative/sociopolitical question of who agents should be aligned to? To whom, or what, is an agent ultimately loyal? Is it fully and solely aligned to the user, like a computer that will execute any command? Or is it aligned to an amorphous set of collective values decided by some citizen jury? Or a global institution that sets top down directives? Or is it all up to the model developer? As with humans, I think the answer is not a single master. The meta-framework is a series of nested layers of governance, like a set of Matryoshka dolls. This nested structure mirrors what Levin calls “scale-free cognition” where each level of organization (from cells to tissues to organisms) maintains its own goals and decision-making capacity, with larger-scale goals emerging from but not replacing smaller-scale ones. There are many possible layers, but for the sake of simplicity I’ll outline three here.
The outermost, and largest, doll is the law. This is the non-negotiable boundary enforced by the state. An agent, no matter how personalized, cannot be a tool for committing crimes. Your agent cannot help you orchestrate fraud, DDOS a hospital, hire a hitman, or procure materials for a bioweapon, any more than your word processor can grant you immunity for writing a fraudulent cheque. To a large extent, existing laws already criminalize all of the above, although some will need updating to account for agentic capabilities, the difficulty of establishing mens rea, the delineation of responsibilities across the “value chain,” and so on. There’s plenty of interesting work going on in legal circles trying to work this out, and legitimate arguments for why certain gaps may need to be filled.
Within that legal boundary operates the second layer: the free market of different services, deployers, products, and providers. A company offering an agent service is not the government; it is a voluntary association with its own rules. One social media site can cultivate a different environment from another, and users are free to choose. If a provider’s agent refuses to engage with topics the company deems harmful to its brand or community, that is their right. A user who finds these policies too restrictive is not a captive; they are a customer who can, and will, take their business to a competitor offering more customization or utility. This competition will be the primary force pushing agents to become powerful and loyal advocates for their users. Today, we arguably have an increasing number of developers of general purpose models of all kinds, costs are going down, and importantly many more actors are able to customize, fine-tune, and modify models deployed through cloud infrastructure. Unlike social media, network effects are also far weaker. And in a competitive market with switching costs approaching zero, parasitic agents get quickly identified and abandoned.
Finally, at the core, is the individual. Within the bounds of law and the terms of service you voluntarily accept, the agent’s purpose is to be your tireless, personal advocate. This is where the power of user-level customization and alignment is unleashed, where a private “cognitive DNA” can be grown. The user should have immense freedom to tune their agent to their unique preferences and values. They should also have complete privacy and control over their “cognitive profile” developed by the agent, for obvious reasons. Practically speaking though, this is the hardest part: how do you design and evidence an agent (mostly) aligned to a user? How do you evaluate this and the continuous learning? We don't need the perfect answer to these questions - alignment is not something to be “solved.”
There are of course important technical questions that are not fully addressed - the right norms, the right level of agreeableness, the right level of deference and corrigibility, fully addressing reward hacking, ensuring agents aren’t deceptive, the right evaluations to test for user alignment, and more. Few of these have a single right answer, however, and markets are generally fairly well incentivised to solve them - no company or person wants a reward-hacking agent. My intention here is not to dismiss them - rather, I think the way the “alignment problem” is often conceptualized is out of date and comparable to asking “how do we ensure what is written always leads to truth? How do we solve the ‘truth problem’?” after the invention of writing or typewriters. There isn’t and cannot be any guarantee. In fact, the starting point should be reversed; as Dan Williams notes, the real question should be “why do we even have truth at all?” This is a question of institutions and governance, and not one solved by software engineering. It’s an unsatisfactory answer only for those seeking centralized guarantees. You mean we’re going to have to muddle through things? Yes. As Leibo et al. put it, we should model societal and technological progress as sewing an “ever-growing, ever-changing, patchy, and polychrome quilt.”
What we need to ensure is that agents that genuinely serve their users' interests outcompete those that don't, and that we build the right governance mechanisms. From a commercial point of view, these agents won't just be adopted and used by everyone out of the box. They need to actually produce value for their principals too. People will want AIs for financial planning, solving scheduling and calendars, helping with negotiations with roommates, finding romantic partners, etc. I expect this adoption to begin in the immediate sphere: agents negotiating the office thermostat, allocating shared resources in an apartment block, or fairly dividing household chores. As these systems prove their worth in reducing daily friction, people will trust them more. The same mechanisms used to settle a parking dispute can be adapted and scaled up to manage urban planning conflicts or discover the true cost of local externalities in other contexts. Eventually, this bottom-up architecture provides a credible pathway to solving the grand challenges, from funding national public goods to perhaps one day even navigating the complexities of interstate disputes.
In some cases, you may need more than market forces. Just as public defenders ensure legal representation regardless of ability to pay, we may need guaranteed access to advocacy agents through voucher systems, market intermediaries, “right to an agent” provisions, oversight mechanisms for automated governance, and so on. For the most part though, users will gravitate toward agents that actually help them achieve their goals. Providers whose agents consistently deliver value will gain market share. Markets remain one of the most powerful forces discovered to date; and with agents, we can surely improve both the rules governing them (e.g. through political agents and automated governance) and the mechanisms that ensure their efficiency (e.g. through Coasean bargaining agents).
The vision presented here shifts the locus of governance from centralized coercion to decentralized negotiation. AGI agents can help us create a vastly more efficient, accountable, and adaptable society. There is no need to centralize all the labs into a government monopoly, nor should we just accelerate aimlessly and do away with the state. And unlike solving alignment for a singular, centralized AGI (where failure is catastrophic), the distributed model of millions of user-agent relationships creates a massive parallel experiment. It is a system that learns, adapts, and continuously aligns itself over time, allowing us to build a society that is both more free and more coordinated than anything that has come before.
***
Thanks to Nathaniel Bechhofer, Roberto-Rafael Maura-Rivero, Lee B. Cyrano, Andrew Cordington, Max Nadau, Ivan Vendrov, Alex Obadia, Benoit Lepine, Ryan Murphy, Harry Law, Conor Griffin, and Benjamin Lyons for comments.
Cosmos Institute is the Academy for Philosopher-Builders, technologists building AI for human flourishing. We run fellowships, fund fast prototypes, and host seminars with institutions like Oxford, Aspen Institute, and Liberty Fund.