• BB84 · +2 / −5 · edited 2 days ago

    Can someone explain why I am being downvoted and attacked in this thread? I swear I am not sealioning. Genuinely confused.

    @sc_griffith@awful.systems asked how request frequency might impact cost per request. Batch inference is one reason (ask anyone in the self-hosted LLM community): more requests per second means fuller batches, and a fuller batch amortizes the fixed cost of each forward pass across more requests. I noted that this effect only applies at very small scale, probably much smaller than the scale OpenAI operates at.
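
    To make the batching arithmetic concrete, here is a minimal sketch. Every constant in it is hypothetical and purely illustrative; none of these numbers are OpenAI's actual costs, batch sizes, or traffic figures.

    ```python
    # Illustrative only -- all constants are made-up assumptions, not real
    # figures for any provider. Cost per request falls as request rate rises,
    # because the fixed cost of one batched forward pass is amortized over a
    # bigger batch -- until the batch is full, after which rate stops mattering.

    GPU_COST_PER_SECOND = 0.002   # hypothetical $/s for one inference server
    STEP_TIME_S = 0.5             # hypothetical time for one batched forward pass
    MAX_BATCH = 64                # hypothetical optimal batch size
    WINDOW_S = 2.0                # a "few-seconds window" to collect requests

    def cost_per_request(requests_per_second: float) -> float:
        # Batch fills with whatever arrives in the window, capped at MAX_BATCH.
        batch = min(MAX_BATCH, max(1.0, requests_per_second * WINDOW_S))
        return GPU_COST_PER_SECOND * STEP_TIME_S / batch

    for rate in (0.5, 2, 8, 32, 1000):
        print(f"{rate:>7} req/s -> ${cost_per_request(rate):.6f} per request")

    # Past 32 req/s the batch is already full, so additional traffic no longer
    # changes per-request cost -- the frequency effect only matters at small scale.
    ```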

    @dgerard@awful.systems why did you say I am demanding someone disprove the assertion? Are you misunderstanding “I would be very very surprised if they couldn’t fill [the optimal batch size] for any few-seconds window” to mean “I would be very very surprised if they are not profitable”?

    The tweet I linked shows that good LLMs can be served much more cheaply. My point is that OpenAI is very inefficient and therefore economically “cooked”, as the post title would have it. How does that make me FYGM? @froztbyte@awful.systems

    • self@awful.systems · +9 · 2 days ago

      Can someone explain why I am being downvoted and attacked in this thread? I swear I am not sealioning. Genuinely confused.

      my god! let me fix that