• bionicjoey@lemmy.ca

    I really liked this article. If you know how AI works, it’s tempting to call what it does “lying”, but I like “bullshit” as a distinct concept for describing being totally indifferent to facts. And it’s interesting that 30% of people in the UK view themselves as having a “bullshit” job, one where they don’t think they contribute anything of value to society. You can totally see why language models would be appealing to so many people.

    • cheese_greater@lemmy.world

      confabulating

      I think this is a better term for it

      Edit: ChatGPT == Trump [consummate bullshitter]? I feel like its “intentions” or “programming” are slightly above that, but maybe not…

      Edit: it’s interesting how a program can be written that replicates Trump’s speech patterns in a consistent, recognizable way. Where did Trump derive his speech pattern? Like, I know he reads Hitler speeches on his nightstand, but is it only Hitler that informs his parole (his individual speech pattern)?

      • bionicjoey@lemmy.ca

        The only issue I have there is that it isn’t as widely known and understood. Also if you say someone is confabulating, it means they don’t realize that they are bullshitting, whereas language models are literally designed to bullshit in this manner.

        • cheese_greater@lemmy.world

          Confabulation

          • An informal conversation
          • (psychiatry) a plausible but imagined memory that fills in gaps in what is remembered

          Like there’s no perfect analogy but I am partial to this characterization, I dunno.

          Discuss aha

          Edit: I really hope I’m not the originator of this, don’t need that in my life right now. It just seems to fit better imo.

          • bionicjoey@lemmy.ca

            I know what it means. For me it also implies an attempt at saying something that is true, even if only by interpolating from other facts. It doesn’t fully reflect the total indifference to reality that language models exhibit.

  • huginn@feddit.it

    It’s interesting to me how many people I’ve argued with about LLMs vehemently insist that this is a world-changing technology and the start of the singularity.

    Meanwhile, whenever I attempt to use one professionally, it has to be babied and tightly scoped down or else it goes way off the rails.

    And structurally, LLMs seem like they’ll always be vulnerable to that. They’re only useful because they bullshit, but that also makes them impossible to rely on for anything else.

    • mkhoury@lemmy.ca

      I’ve been using LLMs pretty extensively in a professional capacity, and with the proper grounding work they become very useful and reliable.

      LLMs on their own are not the world-changing tech; LLMs + grounding (what is now being called a cognitive architecture), that’s the world-changing tech. So while LLMs can be vulnerable to bullshitting, there is a lot of work around them that can qualitatively change their performance.

      • huginn@feddit.it

        I’m a few months out of date on the latest in the field, and I know it’s changing quickly. What progress has been made towards solving hallucinations? Feeding the output into another LLM for evaluation never seemed like a tenable solution to me.

        • mkhoury@lemmy.ca

          Essentially, you don’t ask them to use their internal knowledge; in fact, you explicitly ask them not to. The technique is generally referred to as Retrieval-Augmented Generation (RAG). You take the context/user input, retrieve relevant information from the web/your DB/a vector DB/whatever, and give it to an LLM along with instructions on how to transform that information (summarize it, answer a question, etc.).

          So you try as much as you can to “ground” the LLM in knowledge that you trust, and to have it use only that information to perform the task.

          You end up with a system that does a really good job of transforming the data you have into the right shape for the task(s) you need to perform, without requiring the LLM to act as a source of information, only as a great data massager.
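
          To make that concrete, here is a minimal sketch of the RAG pattern in Python. Everything in it is illustrative: the documents, the keyword-overlap scorer, and the call_llm() stub are placeholders for whatever store and model endpoint you actually use, not any particular vendor’s API.

          ```python
          # Minimal retrieval-augmented generation (RAG) sketch.
          # The documents, the scorer, and call_llm() are illustrative placeholders.

          DOCUMENTS = [
              "The warranty covers manufacturing defects for 24 months.",
              "Returns are accepted within 30 days with the original receipt.",
              "Shipping to EU countries takes 3 to 5 business days.",
          ]

          def retrieve(query, docs, top_k=2):
              """Naive keyword-overlap retrieval; a real system would use a vector DB."""
              q_words = set(query.lower().split())
              ranked = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())), reverse=True)
              return ranked[:top_k]

          def build_prompt(question, context):
              """Ground the model: it may only use the supplied context."""
              joined = "\n".join("- " + c for c in context)
              return (
                  "Answer the question using ONLY the context below. "
                  "If the context is insufficient, say so.\n\n"
                  "Context:\n" + joined + "\n\nQuestion: " + question + "\nAnswer:"
              )

          def call_llm(prompt):
              """Placeholder for the actual model call (an API, a local model, etc.)."""
              return "(model output would go here)"

          question = "How long is the warranty?"
          print(call_llm(build_prompt(question, retrieve(question, DOCUMENTS))))
          ```

          The division of labour is the point: the retrieval step supplies the facts, and the model is only asked to reshape them.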

          • Blóðbók@slrpnk.net

            That seems like it should work in theory, but having used Perplexity for a while now, it doesn’t quite solve the problem.

            The biggest fundamental problem is that it doesn’t understand in any meaningful capacity what it is saying. It can try to restate something it sourced from a real website, but because it doesn’t understand the content it doesn’t always preserve the essence of what the source said. It will also frequently repeat or contradict itself in as little as two paragraphs based on two sources without acknowledging it, which further confirms the severe lack of understanding. No amount of grounding can overcome this.

            Then there is the problem of how LLMs don’t understand negation. You can’t reliably reason with them using negated statements, and you can’t ask them to tell you about things that do not have a particular property. They can’t filter based on statements like “the first game in the series, not the sequel” or “Game, not Game II: Sequel” (however you put it, you will often get results pertaining to the sequel snuck in).

            • BluesF@feddit.uk

              Yeah, it’s just back exactly to the problem the article points out: refined bullshit is still bullshit. You still need to teach your LLM how to talk, so it still needs that bullshit input cast into its “base” before you feed it the “grounding” or whatever… And since it doesn’t actually understand any of that grounding, it’s just yet more bullshit.

          • huginn@feddit.it

            Definitely a good use for the tool: NLP is what LLMs do best, and pinning the inputs down to only rewording or compressing ground truth avoids hallucination.

            I expect you could use a much smaller model than GPT to do that, though. Even Llama might be overkill depending on how tightly scoped your DB is.

    • Conradfart@lemmy.ca

      They are useful when you need to generate quasi-meaningful bullshit in large volumes easily.

      LLMs are being used in medicine now, not to help with diagnosis or correlate seemingly unrelated health data, but to write responses to complaint letters or generate reflective portfolio entries for appraisal.

      Don’t get me wrong, outsourcing the bullshit and waffle in medicine is still a win; it frees up time and energy for the highly trained organic general intelligences to do what they do best. I just don’t think it’s the exact outcome the industry expected.

      • sugar_in_your_tea@sh.itjust.works

        I think it’s the outcome anyone really familiar with the tech expected, but that rarely translates to marketing departments and c-suite types.

        I did an LLM project in school, and while that was a limited introduction, it was enough for me to doubt most of the claims coming from LLM orgs. An LLM is good at matching its corpus and that’s about it. So it’ll work well for things like summaries, routine text generation, and similar tasks (and it’s surprisingly good at forming believable text), but it’ll always disappoint with creative work.

        I’m sure the tech can do quite a bit more than my class went through, but the limitations here are quite fundamental to the tech.

      • huginn@feddit.it

        That’s kinda the point of my above comment: they’re useful for bullshit: that’s why they’ll never be trustworthy

    • WasPentalive@beehaw.org

      I use ChatGPT to make up stuff and imagine things that don’t exist, just for fun: a ‘pitch’ for the next new Star Trek series, rewording my much-too-succinct prose for the manual of a program I am writing (‘Calcula’ on GitLab), or ideas for a new kind of restaurant (the chef teaches you how to cook the meal you are about to eat). But I never have it write code or ask it about facts; it makes those up just as easily as the stuff I just described.

    • Damage@feddit.it

      It’s a computer that understands my words and can reply, even complete tasks upon request, never mind the result. To me that’s pretty groundbreaking.

      • huginn@feddit.it

        It’s a probabilistic network that generates a response based on your input.

        No understanding required.

        • 0ops@lemm.ee

          It’s a probabilistic network that generates a response based on your input.

          Same

        • Eheran@lemmy.world

          Ask it to write code that replaces every occurrence of “me” in every file name in a folder with “us”, excluding occurrences that are part of a word (“medium” should not become “usdium”), and it will give you code that does exactly that.

          You can ask it to write code that does a heat simulation in a plate of aluminum, given that one side is heated and the other cooled. It will get there with some help. It works. That’s absolutely fucking crazy.
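
          For reference, the first task is small enough to sketch by hand; something along these lines (a regex with word boundaries, so “medium” is left alone) is roughly the shape of answer you would hope to get back:

          ```python
          import re
          from pathlib import Path

          def rename_me_to_us(folder):
              """Replace the whole word "me" with "us" in every file name in a folder."""
              for path in Path(folder).iterdir():
                  if not path.is_file():
                      continue
                  # \b marks word boundaries, so "medium" does not become "usdium".
                  new_name = re.sub(r"\bme\b", "us", path.name)
                  if new_name != path.name:
                      path.rename(path.with_name(new_name))

          rename_me_to_us(".")  # current directory; point it wherever you need
          ```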

          • sugar_in_your_tea@sh.itjust.works

            Maybe, that really depends on if that task or a very similar task exists in sufficient amounts in its training set. Basically, you could get essentially the same result by searching online for code examples, the LLM might just make it a little faster (and probably introduce some errors as well).

            An LLM can only generate text that exists in its training data. That’s a pretty important limitation, which has all kinds of copyright-related issues associated with it (e.g. I can’t just copy a code example from GitHub in most cases).

            • Eheran@lemmy.world

              No, it does not depend on preexisting tasks, which is why I gave you those two random examples. You can come up with new, never-before-seen questions if you want to: how to stack a cable, a car battery, a beer bottle, a welding machine, and a teapot to get the highest tower. Whatever. It is not always right, but it’s also much more capable than you think.

              • huginn@feddit.it

                It is dependent on preexisting tasks; you’re just describing encoded latent space.

                It’s not explicit but it’s implicitly encoded.

                And you still can’t trust it because the encoding is intrinsically lossy.

          • huginn@feddit.it

            Ask it to finish writing the code to fetch a permission and it will make a request with a non-existent code. Ask it to implement an SNS API invocation and it’ll make up calls that don’t exist.

            Regurgitating code that someone else wrote for an aluminum simulation isn’t the flex you think it is: that’s just an untrustworthy search engine, not a thinking machine

        • iopq@lemmy.world

          Yet it can outperform humans on some tests involving logic. It will never be perfect, but that implies you can test its IQ.

          • huginn@feddit.it
            1. Not consistently and not across truly logical tests. They abjectly fail at abstract reasoning. They do well only in very specific cases.
            2. IQ is an objectively awful measure of human intelligence. Why would it be useful for artificial intelligence?
            3. For these tests that are so centered around specific facts: of course a model that has had the entirety of the Internet encoded into it has the answers. The shocking thing is that the model is so lossy that it doesn’t ace the test.
            • Feathercrown@lemmy.world

              IQ correlates with a good number of things though. It’s not perfect, but it’s not meaningless either.

              • huginn@feddit.it

                And global warming correlates with the decline in piracy rates. IQ is a garbage statistic invented by early 20th century eugenicists to prove that white people were the best.

                You can’t boil down the nuance of the most complex object in the known universe to a single number.

                • Feathercrown@lemmy.world

                  Not perfectly you can’t. But similarly to how people’s SAT scores predict their future success, IQ tests in aggregate do have predictive power.

            • iopq@lemmy.world

              IQ is objectively a good measure of human intelligence. High IQ people have higher educational achievement, income, etc.

                • iopq@lemmy.world

                  I never said it’s the cause. We’re trying to find a measure that correlates well with actual intelligence, g.

                  IQ correlates with g, but income/education also correlate with g, because smarter people do better on those metrics.

                  IQ doesn’t make you smarter, but smarter people do better on IQ tests.

          • exponential_wizard@lemm.ee

            “Test it’s IQ”. The fact that you think IQ is a useful test for intelligence tells me everything I need to know

            • iopq@lemmy.world

              The fact you went out of your way to write it’s when I wrote the correct “its” tells me everything I need to know about your educational achievement

      • amki@feddit.de

        That is exactly what it doesn’t do. There is no “understanding”, and that is exactly the problem. It generates output that is similar to what it has already seen in the dataset it was fed, output that might correlate with your input.
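
        As a toy illustration of “output similar to what it has seen that correlates with your input”: generation is just repeated sampling from a probability table over what tended to come next in the training data. The counts below are invented, and a real LLM uses a neural network over tokens rather than a bigram lookup, but the loop is the same idea.

        ```python
        import random

        # Invented bigram counts standing in for "what the model has seen".
        BIGRAM_COUNTS = {
            "the": {"cat": 4, "dog": 3, "moon": 1},
            "cat": {"sat": 5, "chased": 2},
            "dog": {"chased": 4, "sat": 1},
            "sat": {"down": 3},
            "chased": {"the": 3},
        }

        def next_word(word):
            """Sample a continuation in proportion to how often it followed `word`."""
            options = BIGRAM_COUNTS.get(word)
            if not options:
                return None
            return random.choices(list(options), weights=list(options.values()), k=1)[0]

        def generate(start, max_len=8):
            out = [start]
            while len(out) < max_len:
                nxt = next_word(out[-1])
                if nxt is None:
                    break
                out.append(nxt)
            return " ".join(out)

        print(generate("the"))  # e.g. "the dog chased the cat sat down"
        ```

        No step in that loop involves knowing what a cat or a dog is; it only reproduces patterns that were already in the data.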

  • RobotToaster

    I’ve said before it writes like a corporate middle manager.

  • memfree@beehaw.org

    “Godfather of AI” Geoff Hinton, in recent public talks, explains that one of the greatest risks is not that chatbots will become super-intelligent, but that they will generate text that is super-persuasive without being intelligent, in the manner of Donald Trump or Boris Johnson. In a world where evidence and logic are not respected in public debate, Hinton imagines that systems operating without evidence or logic could become our overlords by becoming superhumanly persuasive, imitating and supplanting the worst kinds of political leader.

    Why is “superhumanly persuasive” always being deployed for stupid stuff and not, I don’t know, for getting people to drive fuel-efficient cars instead of giant pickups and SUVs?

    • huginn@feddit.it

      Because superhuman persuasion generally only works on things people don’t have to change to accept.

      Racist dogma is so persuasive to some because it lets them recontextualize their hardships as inflicted on them by {insert race here} without materially changing their lives in any way, and despite it being horseshit.

      Ultimately it’s not superhuman persuasion: it’s extremely human. It only works when it plays to your existing biases and hooks into the lazy, gut-feeling way of navigating life rather than logic or reason.

      But if it gets popular enough, it can be extremely dangerous, because the reason the persuasion machine is wrong will not be readily apparent until after the mistakes have been made.

      • Waraugh@lemmy.dbzer0.com

        Great write-up. I would also include the establishment of an ‘enemy’ to blame shortcomings or adversity on. I’m not going to be able to explain it as well as you do, but your post had me thinking of how young, depressed, disadvantaged, etc. people wind up indoctrinated into racism, incel communities, and other such hate groups.

    • bionicjoey@lemmy.ca

      Because there’s no money in getting people to use less. Persuasion in capitalism is about marketing. It’s about convincing people to buy things they don’t need.

      • memfree@beehaw.org

        Wanna be the bigwig on your block? Have I got a product for YOU! Solar Panels! Make your house shine with newfangled tech that’ll be the envy of all your neighbors! Go solar, baby! Stick it to the electric company and make THEM pay for a change. Solar! You’ll be beaming.

        ok, I suck at faking ai chat

  • flop_leash_973@lemmy.world

    To me, things like ChatGPT are just more efficient ways to search the sites they scrape data from, like Stack Overflow. If they ever drive enough traffic away from their sources to kill those sources, the likes of ChatGPT will become mostly useless.

  • nayminlwin@lemmy.ml

    Not an AI expert, but I’ve never been convinced by AI that’s trained on human-provided data. It’s just gonna be garbage in, garbage out. To get something substantially useful from AI, it needs to be… axiomatic, I guess. A few years ago, there was AlphaZero, which learned only the rules of chess, yet within just a few hours it learned the chess openings/theories that took human chess masters centuries to formulate. It even has its own effective opening lines that used to be considered wasteful/unsound. Granted, chess rules and win conditions are relatively simple compared to real-life problems. So maybe it’s too early to pour billions into general-purpose AI research.

  • 0x4E4F@infosec.pub

    It says what you want to hear. Sure, it can help sometimes, but those cases are very rare, and they’re mostly info-related things that you (or I) might be too lazy to google and dig through pages of results for.

    I’ve fed it scripts and programs with flaws in them and asked it to check them; it says they will run… each and every time. If the mistakes are syntactic, it will point to them, but otherwise it won’t catch logical mistakes (what the output will actually be) unless you point it to the mistake, in which case, yes, it will correct it. But that’s not the point. The point is for it to tell me that this thing might run but won’t do anything of value, or will return garbled results.

  • MxM111@kbin.social

    I think the author of this article really likes the word “bullshit” for clickbait reasons.

    • sab@kbin.social

      Well, he does justify it by rooting his definition in Harry Frankfurt’s On Bullshit and elaborating both why bullshit is different from lies and why Frankfurt’s definition of bullshit is applicable to AI even when it happens to be right about something.

      I’m not sure how you could make the word bullshit any less clickbaity. Personally, I suspect he might just have a valid point.

      • MxM111@kbin.social

        The industry has used the word “hallucination” for a long time, and successfully. This is clickbait, but I agree that the author borrowed the word from another author.

        • sab@kbin.social

          The hallucination concept refers to something different - it’s when the language model starts digging itself into a hole and making up stuff to fit its increasingly narrow story. Nobody would say a language model is hallucinating if what it says is accurate.

          The author here makes a different case - that the AI is constantly bullshitting, similar to what you’d expect from Boris Johnson. It doesn’t matter whether it’s wrong or right in any particular case - it goes ahead with the exact same confidence no matter what. There’s no real difference between an LLM when it’s “hallucinating” and when it’s working correctly.

          LLMs are not always hallucinating, but they’re always bullshitting. Frankly, I think using the term hallucination to describe a spiralling algorithm might be a load of bullshit in its own right, fashioned by people who are desperate to liken their predictive model to the workings of a human brain.

          • MxM111@kbin.social

            I can say exactly the same thing: LLMs always hallucinate, it’s just that sometimes they do it correctly and sometimes they don’t.

            • sab@kbin.social

              So you’d say it’s a hallucination machine rather than a bullshit generator?

              I think you’re on to a good point - the industry seems to say their model is hallucinating whenever it does something they don’t approve of, but the fact of the matter is that it’s doing the exact same thing it always does.