• 1stTime4MeInMCU
    1 year ago

    This is an oversimplification, but imagine you have an algebraic function in which every word in English is assigned a number.

    a*x + b*y + c*z = n, where x, y, and z are the numbers for three words in a sentence and a, b, and c are coefficients. n is the number of the predicted next word, based on the coefficients applied to the previous 3.
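To make that concrete, here is a rough sketch of the toy formula in Python. Everything in it is made up for illustration: the five-word vocabulary, the word numbers, the three coefficients, and the `predict_next` helper are all assumptions, and picking the "closest" word number is just one simple way to turn n back into a word. A real LLM does not work on single numbers per word like this.

```python
# Toy illustration only: the vocabulary, word numbers, and coefficients
# are hypothetical values chosen for this sketch, not from any real model.
VOCAB = {"the": 1, "cat": 2, "sat": 3, "on": 4, "mat": 5}
COEFFS = (0.5, 1.0, 0.5)  # hypothetical coefficients a, b, c


def predict_next(words, vocab=VOCAB, coeffs=COEFFS):
    """Return the vocab word whose number is closest to a*x + b*y + c*z."""
    if len(words) != len(coeffs):
        raise ValueError("need exactly one input word per coefficient")
    # Weighted sum of the input word numbers: n = a*x + b*y + c*z
    n = sum(c * vocab[w] for c, w in zip(coeffs, words))
    # Map n back to a word by picking the nearest word number.
    return min(vocab, key=lambda w: abs(vocab[w] - n))


print(predict_next(["the", "cat", "sat"]))  # 0.5*1 + 1.0*2 + 0.5*3 = 4.0 -> "on"
print(predict_next(["cat", "sat", "on"]))   # 0.5*2 + 1.0*3 + 0.5*4 = 6.0 -> "mat"
```

With only three hand-picked coefficients this works for exactly the sentences it was rigged for, which is the point: the interesting behavior comes from having vastly more coefficients, and from those being learned rather than chosen by hand.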

    Now imagine you have 10 trillion coefficients instead of 3. That’s an LLM, more or less. Except it’s done procedurally, and there actually aren’t that many input variables (the context window), just a lot of coefficients per input.