All it takes to poison AI training data is to create a website

(schneier.com)

5 points | by u1hcw9nx 8 hours ago

2 comments

  • ticulatedspline 4 hours ago

    In fairness, it may take even less than that to poison a human. Judging the legitimacy of a data source is not a new problem. Still, this is interesting for a couple of reasons, and I'm surprised Google would fall for it.

    Google's bread and butter is rankings, so I would think their skill in ranking sites would trickle into some algorithm for weighting whether a source has some legitimacy. It's interesting that it would pick this up and weight it so heavily with an N of 1 from some random website, and that a claim with no corroboration literally anywhere else, about something that would clearly have at least some other presence, would be ranked so high.

    I actually wonder if this is an artifact of a naive implementation of "search and regurgitate" or if the system had good reason to believe information from whoever "Thomas Germain" is was trustworthy.

  • andsoitis 8 hours ago

    > These things are not trustworthy, and yet they are going to be widely trusted.

    Sure, but this is also the Achilles heel. The resistance has a simple method of sabotage.

    Human minds are able to jump outside the system, unlike the system itself.