There is no evidence that poisoning has had any effect on LLMs. It’s likely that it never will, because garbage inputs aren’t likely to get reinforced during training. It’s all just wishful thinking from the haters.
Every AI will always have bias, just as every person has bias, because humanity has never agreed on what is “truth”.
There is no evidence that poisoning has had any effect on LLMs
But it is possible, right? As an example from my quick search, here is a paper about medical large language models:
We find that replacement of just 0.001% of training tokens with medical misinformation results in harmful models more likely to propagate medical errors.
It’s probably hard to change major things, like e.g. that Trump is the president of the USA, without it being extremely obvious or degrading performance massively. But smaller random facts? For example, I have little to no online presence under my real name, so I’d imagine it wouldn’t be too hard to add some documents to the training data with made-up facts about me. It wouldn’t be noticeable until someone actively looked for it, and even then they’d need to know the truth beforehand to judge those claims, or at least demand sources.
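To get a feel for the scale, here is a rough back-of-the-envelope sketch in Python. The corpus size and document length are my own assumptions for illustration; only the 0.001% figure comes from the quoted paper.

```python
# Back-of-the-envelope: how many fabricated documents a 0.001% token
# poisoning budget corresponds to. All concrete numbers are assumptions.

corpus_tokens = 10_000_000_000_000   # assumed pretraining corpus size (~10T tokens)
poison_fraction = 0.001 / 100        # the 0.001% figure from the paper
tokens_per_doc = 500                 # assumed length of a fabricated web page

poison_tokens = corpus_tokens * poison_fraction
poison_docs = poison_tokens / tokens_per_doc

print(f"poisoned tokens: {poison_tokens:,.0f}")   # 100,000,000
print(f"fabricated docs: {poison_docs:,.0f}")     # 200,000
```

Even with these made-up numbers, the point stands: the poisoned material is a vanishing fraction of the corpus, far too small to stand out in any aggregate statistics, so it would only surface if someone checked those specific claims against a source they already trusted.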
Every AI will always have bias, just as every person has bias, because humanity has never agreed on what is “truth”.
That’s true, but since we are in a way actively condensing knowledge with LLMs, I think there is a difference if someone has the ability to influence things at this step without it being noticeable.