The Hallucination Defense

(niyikiza.com)

26 points | by niyikiza 3 hours ago

68 comments

  • JohnFen 3 hours ago

    > “The AI hallucinated. I never asked it to do that.”

    > That’s the defense. And here’s the problem: it’s often hard to refute with confidence.

    Why is it necessary to refute it at all? It shouldn't matter, because whoever is producing the work product is responsible for it, no matter whether genAI was involved or not.

    • nerdsniper 2 hours ago

      The distinction some people are making is between copy/pasting text and agentic action. Generally, mistakes in "work product" (output from ChatGPT that the human then files with a court, etc.) are not forgiven, because if you signed the document, you own its content. Versus some vendor-provided AI agent which simply takes an action on its own that a "reasonable person" would not have expected it to; often we forgive those kinds of software bloopers.

      • ori_b an hour ago

        If you put a brick on the accelerator of a car and hop out, you don't get to say "I wasn't even in the car when it hit the pedestrian".

        • Shalomboy an hour ago

          This is true for bricks, but it is not true if your dog starts up your car and hits a pedestrian. Collisions caused by non-human drivers are a fascinating edge case for the times we're in.

          • ori_b an hour ago

            In the USA, at least, it seems pet owners are liable for any harm their pets do.

          • cess11 13 minutes ago

            Legally, in a lot of jurisdictions, a dog is just your property. What it does, you did, usually with presumed intent or strict liability.

            • gowld 7 minutes ago

              What if you planted a bush that attracted a bat that bit a child?

          • victorbjorklund an hour ago

            I don’t know where you’re from, but at least in Sweden you have strict liability for anything your dog does.

          • freejazz an hour ago

            Prima facie negligence = liability

      • niyikiza 30 minutes ago

        > if you signed the document, you own its content. Versus some vendor-provided AI Agent which simply takes action on its own

        Yeah, that's exactly the approach I think we should adopt for AI agent tool calls as well: cryptographically signed, task-scoped "warrants" that stay traceable even across multi-agent delegation chains.

        • embedding-shape 16 minutes ago

          Kind of like https://github.com/cursor/agent-trace but cryptographically signed?

          > Agent Trace is an open specification for tracking AI-generated code. It provides a vendor-neutral format for recording AI contributions alongside human authorship in version-controlled codebases.

          • niyikiza 10 minutes ago

            Similar space, different scope/approach. Tenuo warrants track who authorized what across delegation chains (human to agent, agent to sub-agent, sub-agent to tool) with cryptographic proof and proof-of-possession (PoP) at each hop. Trace tracks provenance; warrants track authorization flow. Both are open specs. I could see them complementing each other.
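
            To make the shape of the idea concrete, here's a minimal sketch (illustrative only, not the actual Tenuo spec; the field names, scope strings, and issue() helper are all my own assumptions), using Ed25519 signatures from Python's cryptography package:

                import hashlib
                import json

                from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

                def issue(signer_key, issuer, subject, scope, parent_hash=None):
                    # Issuer authorizes subject for exactly `scope`; parent_hash
                    # links this warrant to the one that authorized the issuer.
                    body = {"issuer": issuer, "subject": subject,
                            "scope": sorted(scope), "parent": parent_hash}
                    payload = json.dumps(body, sort_keys=True).encode()
                    return {"body": body,
                            "hash": hashlib.sha256(payload).hexdigest(),
                            "sig": signer_key.sign(payload).hex()}

                # Human -> agent -> tool, each hop narrowing (never widening) the scope.
                human_key = Ed25519PrivateKey.generate()
                agent_key = Ed25519PrivateKey.generate()
                w1 = issue(human_key, "analyst", "report-agent",
                           {"read:reports", "write:summary"})
                w2 = issue(agent_key, "report-agent", "email-tool",
                           {"write:summary"}, parent_hash=w1["hash"])

                # "Email the competitor" would need a scope that no warrant in
                # the chain grants -- that's what makes a denial checkable.
                assert set(w2["body"]["scope"]) <= set(w1["body"]["scope"])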

      • kazinator 22 minutes ago

        That's the same thing. You signed off on the agent doing things on your behalf; you are responsible.

        If you gave a loaded gun to a five year old, would "five-year-old did it" be a valid excuse?

      • observationist an hour ago

        To me, it's 100% clear - if your tool use is reckless or negligent and results in a crime, then you are guilty of that crime. "It's my robot, it wasn't me" isn't a compelling defense - if you can prove that it behaved significantly outside of your informed or contracted expectations, then maybe the AI platform or the robot developer could be at fault. Given the current state of AI, though, I think it's not unreasonable to expect that any bot can go rogue and that huge, trivially accessible jailbreak risks exist, so there's no excuse for deploying an agent onto the public internet to do whatever it wants outside direct human supervision. If you're running moltbot or whatever, you're responsible for what happens, even if the AI decided the best way to get money was to hack the Federal Reserve and assign a trillion dollars to an account in your name. Or if Grok goes mechahitler and orders a singing telegram to Will Stancil's house, or something. These are tools; complex, complicated, unpredictable tools that need skillful and careful use.

        There was a notorious dark web bot case where someone created a bot that autonomously went onto the dark web and purchased numerous illicit items.

        https://wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww.bitnik.or...

        They bought some ecstasy, a Hungarian passport, and random other items from Agora.

        >The day after they took down the exhibition showcasing the items their bot had bought, the Swiss police “arrested” the robot, seized the computer, and confiscated the items it had purchased. “It seems, the purpose of the confiscation is to impede an endangerment of third parties through the drugs exhibited, by destroying them,” someone from !Mediengruppe Bitnik wrote on their blog.

        In April, however, the bot was released along with everything it had purchased, except the ecstasy, and the artists were cleared of any wrongdoing. But the arrest had many wondering just where the line gets drawn between human and computer culpability.

        • b00ty4breakfast an hour ago

          That darknet bot one always confuses me. The artists/programmers/whatever specifically instructed the computer, through the bot, to perform actions that would likely result in breaking the law. It's not a side effect of some other, legal action they were trying to accomplish; its entire purpose was to purchase things on a marketplace known for hosting illegal goods and services.

          If I build an autonomous robot that swings a hunk of steel on the end of a chain and then program it to travel to where people are likely to congregate and someone gets hit in the face, I would rightfully be held liable for that.

        • cess11 8 minutes ago

          "computer culpability"

          That idea is really weird. Culpa (and dolus) in occidental law is a thing of the mind, what you understood or should have understood.

          A database does not have a mind, and it is not a person. If it could have culpa, then you'd be liable for assault, perhaps murder, if you took it apart.

    • ibejoeb an hour ago

      Yeah. Legal will need to catch up to deal with some things, surely, but the basic principles for this particular scenario aren't that novel. If you're a professional and have an employee acting under your license, there's already liability. There is no warrant concept (none that I can think of right now, at least) that will obviate the need to check the work and carry professional liability insurance. There will always be negligence and bad actors.

      The new and interesting part is that while we have incentives and deterrents to keep our human agents doing the right thing, there isn't really an analog to check the non-human agent. We don't have robot prison yet.

    • doctorpangloss 2 hours ago

      Wait till you find out about “pedal confusion.”

    • salawat 3 hours ago

      Except for the fact that that very accountability sink is relied on by senior management/CxOs the world over. The only difference is that before AI, it was the middle manager's fault. We didn't tell anyone to break the law. We just put in place incentive structures that require it, and play coy, then let anticipatory obedience do the rest. Bingo. Accountability severed. You can't prove I said it in a court of law, and skeevy shit gets done because some poor bloke down the ladder is afraid of getting fired if he doesn't pull out all the stops to meet productivity quotas.

      AI is just better because no one can actually explain why the thing does what it does. Perfect management scapegoat without strict liability being made explicit in law.

      • pixl97 an hour ago

        Hence why many life-and-death things require licensing and compliance, and tend to come with very long paper trails.

        The software world has been very allergic to getting anywhere near the vicinity of a system like that.

    • niyikiza 3 hours ago

      You're right, they should be responsible. The problem is proving it. "I asked it to summarize reports, it decided to email the competitor on its own" is hard to refute with current architectures.

      And when sub-agents or third-party tools are involved, liability gets even murkier. Who's accountable when the action was executed three hops away from the human? The article argues for receipts that make "I didn't authorize that" a verifiable claim.
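
      The verification side is what makes those receipts useful: a verifier replays the chain and rejects any hop whose signature fails or whose scope widens. A sketch, using the same assumed warrant shape as my sketch upthread (not a real implementation):

          import hashlib
          import json

          def verify_chain(warrants, pubkeys):
              # warrants: root-first list; pubkeys: issuer name -> Ed25519PublicKey.
              # A forged signature makes verify() raise InvalidSignature.
              parent_scope, parent_hash = None, None
              for w in warrants:
                  payload = json.dumps(w["body"], sort_keys=True).encode()
                  pubkeys[w["body"]["issuer"]].verify(bytes.fromhex(w["sig"]), payload)
                  if hashlib.sha256(payload).hexdigest() != w["hash"]:
                      raise ValueError("tampered warrant body")
                  if w["body"]["parent"] != parent_hash:
                      raise ValueError("broken delegation chain")
                  scope = set(w["body"]["scope"])
                  if parent_scope is not None and not scope <= parent_scope:
                      raise ValueError("scope escalation at " + w["body"]["subject"])
                  parent_scope, parent_hash = scope, w["hash"]

      A failed check pinpoints the hop that exceeded its authorization, which is exactly what turns "I didn't authorize that" into a testable claim.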

      • bulatb 2 hours ago

        There's nothing to prove. Responsibility means you accept the consequences for its actions, whatever they are. You own the benefit? You own the risk.

        If you don't want to be responsible for what a tool that might do anything at all might do, don't use the tool.

        The other option is admitting that you don't accept responsibility, not looking for a way to be "responsible" but not accountable.

        • tossandthrow 2 hours ago

          Sounds good in theory, doesn't work in reality.

          Had it worked, we would have seen many more CEOs in prison.

          • walt_grata 2 hours ago

            There being a few edge cases where it doesn't work doesn't mean it doesn't work in the majority of cases, or that we shouldn't try to fix the edge cases.

          • freejazz an hour ago

            This isn't a legal argument and these conversations are so tiring because everyone here is insistent upon drawing legal conclusions from these nonsense conversations.

          • bulatb an hour ago

            We're talking about different things. To take responsibility is volunteering to accept accountability without a fight.

            In practice, almost everyone is held potentially or actually accountable for things they never had a choice in. Some are never held accountable for things they freely choose, because they have some way to dodge accountability.

            The CEOs who don't accept accountability were lying when they said they were responsible.

          • NoMoreNicksLeft 2 hours ago

            The veil of liability is built into statute, and it's no accident.

            No such magic forcefield exists for you, though.

      • phoe-krk 2 hours ago

        > "I asked it to summarize reports, it decided to email the competitor on its own" is hard to refute with current architectures.

        If one decided to paint a school's interior with toxic paint, it's not "the paint poisoned them on its own", it's "someone chose to use a paint that can poison people".

        Somebody was responsible for choosing to use a tool that has this class of risks and explicitly did not follow known and established protocol for securing against such risk. Consequences are that person's to bear - otherwise the concept of responsibility loses all value.

        • im3w1l 2 hours ago

          > otherwise the concept of responsibility loses all value.

          Frankly, I think that might be exactly where we end up going. Finding a responsible person to punish is just a tool we use to achieve good outcomes, and if scare tactics are no longer applicable to the way we work, it might be time to discard them.

          • phoe-krk an hour ago

            A brave new world that is post-truth, post-meaning, post-responsibility, and post-consequences. One where the AI's hallucinations eventually drag everyone with it and there's no other option but to hallucinate along.

            It's scary that a nuclear exit starts looking like an enticing option when confronted with that.

            • im3w1l 2 minutes ago

              Ultimately the goal is to have a system that prevents mistakes as much as possible, and adapts and self-corrects when they do happen. Even with science we acknowledge that mistakes happen and people draw incorrect conclusions, but the goal is to make that a temporary state that is fixed as more information comes in.

              I'm not claiming to have all the answers about how to achieve that, but I am fairly certain punishment is not a necessary part of it.

      • LeifCarrotson 2 hours ago

        > "I asked it to summarize reports, it decided to email the competitor on its own" is hard to refute with current architectures.

        No, it's trivial: "So you admit you uploaded confidential information to the unpredictable tool with wide capabilities?"

        > Who's accountable when the action executed three hops away from the human?

        The human is accountable.

        • gowld 4 minutes ago

          What if you carried a stack of papers between buildings on a windy day, and the papers blew away?

        • pixl97 an hour ago

          As the saying goes

          ----

          A computer can never be held accountable

          Therefore a computer must never make a management decision

      • QuadmasterXLII 2 hours ago

        This doesn't seem conceptually different from running

            [ $((RANDOM % 6)) -eq 0 ] && rm -rf / || echo "Click"
        
        on your employer's production server, and the liability doesn't seem murky in either case.

        • staticassertion 2 hours ago

          What if you wrote something more like:

              # terrible code, never use this
              import os

              def cleanup(dir):
                  # no validation: whatever `dir` holds goes straight to the shell
                  os.system(f"rm -rf {dir}")

              def main():
                  work_dir = os.environ["WORK_DIR"]
                  cleanup(work_dir)

          and then, due to a misconfiguration, "$WORK_DIR" was truncated to just "/"?

          At what point is it negligent?
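
          For contrast, a version with even minimal guardrails moves that line considerably (a sketch; the specific checks are illustrative, not from any standard):

              import os
              import shutil

              def cleanup(work_dir: str) -> None:
                  resolved = os.path.realpath(work_dir)
                  # refuse the catastrophic targets a truncated env var can produce
                  if resolved in ("/", os.path.expanduser("~")):
                      raise ValueError(f"refusing to delete {resolved!r}")
                  shutil.rmtree(resolved)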

          • direwolf20 2 hours ago

            This is not hypothetical. Steam and Bumblebee did it.

            • a_t48 22 minutes ago

              Bungie, too, in a similar way.

            • extraduder_ire 2 hours ago

              That was the result of an additional space in the path passed to rm, IIRC.

              Though rm /$TARGET where $TARGET is blank is a common enough footgun that --preserve-root exists and is default.

      • groby_b an hour ago

        "And when sub-agents or third-party tools are involved, liability gets even murkier."

        It really doesn't. That falls straight on Governance, Risk, and Compliance. Ultimately, the CISO, CFO, and CEO are in the line of fire.

        The article's argument happens in a vacuum of facts. The fact that a security engineer doesn't know that is depressing, but not surprising.

      • freejazz an hour ago

        The burden of substantiating a defense is upon the defendant and no one else.

      • groby_b 2 hours ago

        "Our tooling was defective" is not, in general, a defence against liability. Part of a company's obligations is to ensure all its processes stay within lawful lanes.

        "Three months later [...] But the prompt history? Deleted. The original instruction? The analyst’s word against the logs."

        One, the analyst's word does not override the logs; that's the point of logs. Two, it's fairly clear the author of the fine article has never worked close to finance. A three-month retention period for AI queries by an analyst is not an option.

        SEC Rule 17a-4 & FINRA Rule 4511 have entered the chat.

  • thedudeabides5 a few seconds ago

    a machine can never be held accountable

    but the person who turned it on can

    simple as

  • stronglikedan 7 minutes ago

    If one of my reports came to me with that defense, I'd write them up twice. Once for whatever they did wrong, and once for insulting my intelligence and wasting my time with that "defense".

    On the contrary, if they just owned up to it, chances are I wouldn't even write them up once.

  • kazinator 23 minutes ago

    This article is well-written insanity.

    No amount of detailed logging makes "the AI did it" a valid excuse.

    It's just a tool.

    It's like blaming a loose bolt in a Boeing 737 on "screwdriver did it".

  • andrewflnr 38 minutes ago

    How does the old proverb go?

    > A computer must never make a management decision, because a computer cannot be held accountable.

  • RobotToaster 2 hours ago

    If an employee does something in the course of his employment, even if he wasn't told directly to do it, the company can be held vicariously liable. How is this any different?

    • apercu 2 hours ago

      I agree with you, but you can’t jail a gen-AI model; I guess that's where the difference lies?

      • LeifCarrotson an hour ago

        "The company can be held vicariously liable" means that in this analogy, the company represents the human who used AI inappropriately, and the employee represents the AI model that did something it wasn't directly told to do.

      • phailhaus an hour ago

        Nobody tries to jail Microsoft Word, they jail the person using it.

        • gorjusborg an hour ago

          Nobody tries to jail the automobile when it hits a pedestrian while on cruise control. The driver is responsible for knowing the limits of the tool and adjusting accordingly.

  • tboyd47 2 hours ago

    Anytime someone gives you unverified information, they're asking you to become their guinea pig.

  • 0xTJ an hour ago

    Why would that be any better of a defense than "that preschooler said that I should do it"? People are responsible for their work.

  • noitpmeder an hour ago

    This is some absolute BS. In the current day and age you are 1000% responsible for the externalities of your use of AI.

    Read the terms and conditions of your model provider. The document you signed, regardless of whether you read or considered it, explicitly prevents any negative consequences from being passed to the AI provider.

    Unless you have something equally explicit, e.g. "we do not guarantee any particular outcome from the use of our service" (it probably needs to be significantly more explicit than that, IANAL), all responsibility ends up with the entity that, itself or through its agents, foists unreliable AI decisions on downstream users.

    Remember, you SIGNED THE AGREEMENT with the AI company that explicitly says its outputs are unreliable!!

    And if you DO have some watertight T&C that absolves you of any responsibility for your AI-backed service, then I hope either a) your users explicitly realize what they are signing up for, or b) once a user is significantly burned by your service and you try to hide behind this excuse, you lose all your business.

    • ceejayoz an hour ago

      T&Cs aren't ironclad.

      One in which you sell yourself into slavery, for example, would be illegal in the US.

      All those "we take no responsibility for the [valet parking|rocks falling off our truck|exploding bottles]" disclaimers are largely attempts to dissuade people from trying.

      As an example, NY bans liability waivers at paid pools, gyms, etc. The gym will still have you sign one! But they have no enforcement teeth beyond people assuming they're valid. https://codes.findlaw.com/ny/general-obligations-law/gob-sec...

      • noitpmeder an hour ago

        So I can pass on contract breaches due to bugs in software I maintain, caused by hallucinations of the AI I used to write it?? Absolutely no way.

        "But the AI wrote the bug."

        Who cares? It could be you, your relative, your boss, your underling, your counterpart in India, ... Your company provided some reasonable guarantee of service (whether explicitly enumerated in a contract or not) and you cannot just blindly pass the buck.

        Sure, after you've settled your claim with the user, maybe TRY to go after the upstream provider, but good luck.

        (Extreme example) -- If your company produces a pacemaker dependent on AWS/GCP/... and everyone dies as soon as Cloudflare has a routing outage that cascades to your provider, oh boy, YOU are fucked, not Cloudflare or your hosting provider.

        • ceejayoz an hour ago

          More than one person/organization can be liable at once.

          • noitpmeder an hour ago

            The point of signing contracts is you explicitly set expectations for service, and explicitly assign liability. You can't just reverse that and try to pass the blame.

            Sure, if someone from GCP shows up at your business and breaks your leg or burns down your building, you can go after them, as it's outside the reasonable expectation of the business agreement you signed.

            But you better believe they will never be legally responsible for damages caused by outages of their service beyond what is reasonable, and you better believe "reasonable outage" in this case is explicitly enumerated in the contract you or your company explicitly agreed to.

            Sure they might give you free credits for the outage, but that's just to stop you from switching to a competitor, not any explicit acknowledgement they are on the hook for your lost business opportunity.

            • ceejayoz an hour ago

              > The point of signing contracts is you explicitly set expectations for service, and explicitly assign liability.

              Sure, but not all liability can be reassigned; I linked a concrete example of this.

              > But you better believe they will never be legally responsible for damages caused by outages of their service beyond what is reasonable, and you better believe "reasonable outage" in this case is explicitly enumerated in the contract you or your company explicitly agreed to.

              Yes, on this we agree. It'd have to be something egregious enough to amount to intentional negligence.

            • freejazz an hour ago

            "Can" isn't the same as "is"

  • bitwize 32 minutes ago

    aka the Shaggy Defense for the 2020s.

  • rpodraza an hour ago

    What problem is this guy trying to solve? Sorry, but in the end, someone's gonna have to be responsible, and it's not gonna be a computer program. Someone approved the program's use; it's no different from any other software. If you know the agent can make mistakes, then you need to verify everything manually, simple as.

    • pixl97 41 minutes ago

      While we're a long way off from the day science fiction becomes fact, the world is going to shit itself if a self-actionable AI bootstraps and causes havoc.

  • gamblor956 an hour ago

    It's not a legal defense at all.

    Licensed professionals are required to review their work product. It doesn't matter if the tools they use mess up--the human is required to fix any mistakes made by their tools. In the example given by the blog, the financial analyst is either required to professionally review their work product or is low enough on the ladder that someone else is required to review it. If they don't, they can be held strictly liable for any financial losses.

    However, this blog post isn't about AI hallucinations. It's about the AI doing something else, separate from the output.

    And that's not a defense either. The law already assigns liability in situations like this: the user will be held liable (or more correctly, their employer, for whom the user is acting as an agent). If they want to go after the AI tooling vendor (i.e., an indemnification action), the courts will happily let them do so after any plaintiffs are made whole (or as part of an impleader action).

  • freejazz an hour ago

    What a stupid article from someone who has no idea when liability attaches.

    It is the burden of a defendant to establish their defense. A defendant can't just say "I didn't do it"; they need to show they did not do it. In this (stupid) hypothetical, the defendant would need to show the AI acted on its own, without prompting from anyone, themselves in particular.