Searchable Database of the 183,000 Pirated Books Meta, et al., Used to Train AI
www.theatlantic.com
Posted by haxor@derp.foo to Hacker News@derp.foo · English · 1 year ago · cross-posted to: generative_ai
akrot@lemmy.world · English · 1 year ago
For anyone interested, books3 was part of The Pile, a dataset used to train LLMs. It used to be hosted by The Eye, but was recently removed due to a DMCA takedown. Their torrent link is still up, though.