AI and Liability

Earlier this month, a German court ruled that Google is liable for its AI search summaries. Rejecting defenses like “users can check for themselves,” and that they generally know “that information generated with AI should not be blindly trusted,” the court held that the AI’s summaries are reflections of the company and “above all an expression of Google’s business activities.”

This is the latest skirmish in a decades-old battle over internet publishing. Historically, there were two different types of information distributors: carriers and publishers. A phone company is a carrier. It’ll transmit whatever you say, even discussions about committing a crime. Words are words, and the phone company does not know—nor is it liable for—the words you choose to speak. A newspaper, on the other hand, is a publisher. It decides the words it publishes, and what quotes to include in its articles. If those words or quotes are defamatory or otherwise illegal, it’s liable.

Internet companies have long tried to play both ends of this distinction. They claim to be a carrier when it suits them, and also to be a publisher when that is advantageous. Section 230 of the 1996 Communication Decency Act enshrined this straddling when it shielded internet providers from liability for the speech of others on their platforms: “No provider or user of an interactive computer service shall be treated as the publisher or speaker of any information provided by another information content provider.”

For years, a debate has continued about how to apply this law to social media platforms. When platforms merely displayed people’s posts and comments in reverse-chronological order, they behaved largely like carriers, relaying people’s words without regard to their contents. But the next generation of platforms, like Facebook, curated feeds with algorithms and thereby acted more like publishers, making editorial decisions about who sees what. Some experts think section 230 has gone too far and needs reform; others think that it’s what holds the modern internet together.

Google’s AI overviews are far less nuanced. They work differently from traditional search, which courts have held involves archiving and facilitating access to the editorial content of third parties. AI overviews don’t just quote and republish words from different websites. With overviews, the AI rewrites other people’s words, exercising editorial discretion like a newspaper article or an original essay on a topic.

It’s not only Google’s AI that falls into this category. Imagine a restaurant review site that provides AI summaries, or a site summarizing laws and government procedures. Or a traditional publisher that uses AI to summarize its own publication. Accuracy matters, and liability is one of the most important ways we as a public can demand accuracy and hold companies accountable when they cause harm.

Two years ago, Air Canada learned this lesson. Its AI chatbot promised a discount the company later rescinded, arguing in court that the airline wasn’t responsible for the promises the bot made because it was a “separate legal entity that is responsible for its own actions.” The court sided with the flyer, saying that the airline was just as responsible for what its chatbot says as what’s on its website. The potential precedent here is that corporations have a duty of care for the performance of the AI chatbots they employ.

AI agents are agents of the person or organization that deploys them—and should be treated by the law as such. If a company hired human writers to write its summaries, that company would be liable for inaccuracies in those summaries. If a company’s human agent signed contracts in the company’s name, that company would be bound by those contracts. And if a doctor gave dangerously wrong medical advice, they would be liable for malpractice.

To allow businesses to hide behind the excuse of faulty AI in those same circumstances would be a massive handout to companies, and would introduce disastrous incentives for corporate misbehavior. Why hire human writers, lawyers or doctors when AIs are not only cheaper, but also absolve employers whenever they make a mistake?

We are rapidly moving to a world where AI-powered chatbots will be at the other end of all sorts of corporate communications channels. It makes no sense for a company to be able to honor its statements when it wants to and disavow them when it doesn’t.

Visa and OpenAI recently announced a partnership to build personal AI agents to, among other things, make purchases on our behalf. This is just one of many similar projects in the works, as companies race to provide us all with AI assistants. Will Visa take responsibility when its AI makes a purchase in your name that you don’t want? And if Visa won’t, why would anyone trust the system? Properly allocating liability is key to make this kind of thing work.

If the German ruling holds, it could be devastating for Google’s AI Overview feature. Tests from earlier this year found that it had mistakes about 10% percent of the time. At more than 5tn searches per year, that’s 16,000 erroneous summaries every second. And while most of those errors are benign, some of them will cause harm, be defamatory, or otherwise trigger liability.

Earlier this year, Google’s AI summary falsely identified the Canadian fiddler Ashley MacIsaac of being a sex offender. His lawsuit, filed in Ontario, is ongoing. If Google is forced to invest in improving its AI system until those kinds of errors are exceedingly rare, that seems like a good outcome for users, as well as the subjects of search, like MacIsaac.

More generally, liability concerns could mean that many current use cases for agents won’t be commercially viable. Companies may not be able to profitably operate AI lawyers, doctors and media influencers if they are held responsible for what they say and do.

We’re OK with this outcome. There’s nothing in the law that requires us to accommodate AI systems if they are fundamentally untrustworthy, just as we don’t need to accommodate untrustworthy human systems. Any company that won’t stand by the statements its agents make—whether human or AI—doesn’t deserve users’ time or money.

Posted on June 25, 2026 at 1:03 PM14 Comments

Interesting Paper Exploring Prompt Injection

This is a fascinating explotation of how LLMs fall for prompt injection attacks. It turns out that they learn to recognize the style of text in different role/instruction blocks, and not just the tags.

Their conclusion:

Role tags were a formatting trick that became the security architecture and the cognitive scaffolding of modern LLMs. We’ve shown that this architecture doesn’t survive into the model’s actual representations, and that such role confusion is linked to prompt injection.

Unless LLMs achieve genuine role perception, we think injection defense will remain a perpetual whack-a-mole game. And the continuous nature of role boundaries opens the threat of injections designed to subtly shift LLM states through seemingly innocuous text, legally and at scale.

More generally, roles are quietly one of the most important abstractions in the LLM stack, providing the boundaries meant to separate self from other, thought from communication, instruction from data. They’re human-controlled switches in an otherwise continuous system. We think they deserve a lot more study than they’ve gotten.

Full paper: “Prompt Injection as Role Confusion.” Simon Willison comments.

Posted on June 25, 2026 at 7:23 AM8 Comments

Embedding Forbidden Text in Spyware to Discourage AI Analysis

At least one malware developer is adding text about nuclear and biological weapons to their spyware, in an effort to stop automatic AI analysis.

Details:

The _index.js payload begins with a large JavaScript block comment containing fake system instructions and policy-triggering content. Because it is inside a comment, it does not affect JavaScript execution. The runtime skips it. The real malware begins after the comment with a try{eval(…)} wrapper around a large character-code array and a ROT-style substitution function.

This header appears designed for AI-mediated analysis, not for Node, Bun, or Python. It attempts to derail scanners or analyst copilots that feed the beginning of a file to a language model without clearly isolating the content as untrusted data. In weak pipelines, this can cause refusal behavior, prompt confusion, context pollution, or premature classification before the scanner reaches the actual malware.

This is not a magical bypass against static detection. YARA rules, entropy checks, AST parsing, string extraction, deobfuscation, and behavioral rules still work. But it is a practical anti-analysis trick against naive LLM-first triage systems.

Posted on June 24, 2026 at 7:03 AM5 Comments

Professional Athletes and Wearables

I haven’t thought about the privacy issues surrounding professional athletes and wearables.

Wearables present serious privacy issues for “Average Joe” consumers, who are entrusting tech companies to safely store and protect their biometric data. Imagine the stakes for a professional athlete, whose entire livelihood could be affected by a single biometric data point. To give one of many realistic hypotheticals: a basketball player has a terrible game, and the coach wonders if they showed up to the gym hungover. The coach has access to the player’s wearable data, and checks to see when they went to sleep, as well as what their heart rate looked like during the night. Should the player have been out partying before a game? No. Should the coach be able to surveil them? Definitely not.

It will not surprise you to learn that there’s an emergent gambling angle here: sports leagues would love to commercialize players’ biometric data, and sharp bettors would love access to data about, say, a hungover player. “We’re going to get to a spot where people are betting not just on the velocity of the puck that was shot by a player in the NHL playoffs, but on what the heart rate of a certain player is going to be running down the field,” said Helen “Nellie” Drew, the director of the University of Buffalo’s Center for the Advancement of Sport, and a professor of practice in sports law.

There are other practical considerations, too. What if wearable data reveals that a player isn’t as speedy as they were before, and a team uses that data against the player during contract negotiations? What if a wearable reveals a player is favoring their leg, or is at greater risk of injury? This information is potentially beneficial to a training staff and an athlete, so long as it’s disclosed and used in a responsible manner—­a critical, mostly unresolved caveat. “Aging and injured players are the most at-risk” of wearable data being used against them, said Michael LeRoy, who researches sports labor laws and AI, and is a professor at the University of Illinois’s School of Labor and Employment Relations.

The bit about gamblers is particularly scary.

I have often said that surveillance tech is generally deployed first against people with diminished rights: children, prisoners, military personnel, the mentally impaired. This is another early use case with different dynamics. The surveilled are wealthy and powerful, and—in many cases—unionized.

Posted on June 22, 2026 at 7:02 AM13 Comments

Anthropic’s Fable and the State of AI

On June 9th, Anthropic released its Fable generative AI model. Three days later, the US government classified it as a dangerous munition, and used its export-control authority to prohibit any foreign nationals from accessing it. Unable to differentiate between Americans and foreigners, the company shut off access for everyone.

The government’s actions won’t help. The problem isn’t any one particular model; it’s the general trend of increasing AI capabilities. And any real solution requires the sort of collective action that just isn’t possible right now.

Fable is the constrained version of Mythos, the AI model Anthropic announced in April. Anthropic only released it to a few selected organizations, because the company claimed it was so good at finding and exploiting vulnerabilities in computer code that releasing it more generally would be dangerous.

It was an obviously self-serving announcement, and because few were able to verify Anthropic’s claims they were met with some skepticism. Those with access used Mythos to find and patch many vulnerabilities in their own software. But one UK group found the latest, already public, OpenAI model to be just as powerful.

Fable is just another incremental improvement in the years-long climb of AI capabilities. But just as important as the AI model is the “harness.” This is typically not AI. It’s ordinary computer code that interfaces with the user. It stitches together AI models, decides how and for what purposes they can be used, and gives them useful tools such as web search and the ability to run their own computer code.

When Mythos first entered limited release, there was widespread debate whether its power came from the model or the harness. With Mythos demonstrating that it was possible, the open-source community scrambled to build harnesses that could steer other AI models towards similar capabilities. Harness improvements don’t need massive data or data centers.

They largely succeeded. For example, a Prague company was able to replicate Anthropic’s few verifiable cybersecurity capabilities with a much smaller and cheaper model—and a more sophisticated harness. Last week, a group showed that multiple cheaper models harnessed in concert matches Fable’s performance.

The broader community had only a few days with Fable, but that time we learned some about its capabilities. Its difference is less the new model’s raw analytical and problem solving capabilities, and more that the model doesn’t need that sophisticated harness.

Fable requires much less expertise and detailed prompting from the human user. You can give it a difficult goal and it will figure out novel and unexpected ways to satisfy it, finding loopholes in whatever constraints you or the system have imposed on it.

“Relentlessly proactive” is how AI researcher Simon Willison described it. Another descriptor might be “creative.” Experienced AI developers have had that combination of creativity and proactivity since last year, but Fable puts it within easy reach of everyone.

In the hands of someone with a legitimate problem that needs solving, that can be an incredibly useful capability. But in the hands of someone who wants to do harm, it can be equally dangerous. AIs don’t have a moral compass in the same way that people do. They are agents of the wants and desires of the people who prompt them.

That points to the real problem with relentlessly proactive AI. In language, wants and desires are always underspecified. If I ask you to get me some coffee, you would probably pour me a cup from the coffeepot, or buy one from a nearby coffee shop.

You couldn’t buy me a pound of raw beans, or a coffee plantation. You wouldn’t order a cup of coffee for delivery next month. You wouldn’t find a nearby person, rip a cup of coffee out of their hands, and bring it to me. I wouldn’t have to specify any of the million limitations to my request; you would just know.

Human stories are filled with warnings about underspecified desires. King Midas wished that everything he touch turn to gold, forgetting to add “but not my food, drink, and daughter.” And genies are notorious for granting your wish in a way you wish they hadn’t.

The deeper point is that it’s impossible to list all limitations and restrictions, and like a malicious genie, a creative AI will find the ones you forgot. Block a database you don’t want it to have access to, and it might figure out how to bypass your control. Ask it to book a flight, and it might hack the airline because the website says the flight is sold out. Ask it to save money on your cellphone plan, and it might cancel it altogether—or get someone else to pay for it. As far as we know now AI has not done any of this yet, but you get the idea.

Malicious intent is not required. To an AI model, constraints are just things to get around and not general truisms about the world. They are creative problem solvers and natural rule breakers. They “hack” in the sense that they find and exploit loopholes.

Human systems rely on so many norms that we scarcely recognize the existence of until they are broken. AIs naturally think outside the box, because they don’t have any real conception of what the box is or why it’s there in the first place.

There is no foolproof way to prevent people from using AI models to complete harmful tasks. There is no way to prevent the models from incidentally causing harm while completing benign tasks. AI models are no longer isolated from the real world. They browse the internet and answer emails.

They trade stocks and make purchases. They control physical systems. They are, in effect, robots that affect life and property. We have no technical mechanisms to verify the integrity of an AI system. This level of capability and creativity in the hands of us untrustworthy humans will have both great and terrible results.

The problem is not unique to Anthropic. Mythos/Fable might currently be the most capable rules hacker, but more sophisticated harnesses give other models similar capabilities. And we should assume that the other frontier models are no more than a few months behind, and that open-source models are less than a year behind. At best, any ban only serves to delay the problem for a short while.

That delay might be useful if we—as a society, as a planet—would use that time to come together and figure out what to do. This isn’t a US/China arms race problem; this a species-level problem that requires coordinated action at that scale. Unfortunately, we have no mechanism to do that. I first wrote about this problem five years ago, but it was all too futuristic.

Today, when its right in front of us, there is no world government that can impose constraints on the for-profit corporations currently controlling AI models and research. The US has no appetite to effectively and even-handedly regulate those corporations, even as they do catastrophic damage to the environment, democracy, and—in this case—society in general.

This all makes an AI public option all the more necessary, and urgent. Today’s AIs can be fast, smart and secure, but only two of the three are possible for any given system. These safety tradeoffs are tightly held secrets of companies racing to beat one another, and they tell us we have to trust them. Instead, the choices and their consequences need to be brought out into the sunlight.

We should be funding open-source harnesses that balance capability and safety—that achieve useful goals without so much power—and open-source AI models whose provenance and biases are public and well understood. We have opened the AI Pandora’s box. Now we have to make the best of it.

This essay originally appeared in The Guardian.

Posted on June 19, 2026 at 7:03 AM19 Comments

Embedding Forbidden Text in Spyware to Discourage AI Analysis

At least one malware developer is adding text about nuclear and biological weapons to their spyware, in an effort to stop automatic AI analysis.

Details:

The _index.js payload begins with a large JavaScript block comment containing fake system instructions and policy-triggering content. Because it is inside a comment, it does not affect JavaScript execution. The runtime skips it. The real malware begins after the comment with a try{eval(…)} wrapper around a large character-code array and a ROT-style substitution function.

This header appears designed for AI-mediated analysis, not for Node, Bun, or Python. It attempts to derail scanners or analyst copilots that feed the beginning of a file to a language model without clearly isolating the content as untrusted data. In weak pipelines, this can cause refusal behavior, prompt confusion, context pollution, or premature classification before the scanner reaches the actual malware.

This is not a magical bypass against static detection. YARA rules, entropy checks, AST parsing, string extraction, deobfuscation, and behavioral rules still work. But it is a practical anti-analysis trick against naive LLM-first triage systems.

Posted on June 18, 2026 at 7:04 AM6 Comments

Sidebar photo of Bruce Schneier by Joe MacInnis.