• tellmeaboutit@lemmygrad.ml
    5 days ago

    That might change now that companies are creating “reasoning” models like DeepSeek R1. They aren’t all that different architecturally, but they produce much longer outputs, and every extra generated token costs another forward pass, so they simply require more compute.
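
To put rough numbers on the "longer outputs means more compute" point, here is a back-of-the-envelope sketch. It assumes the common rule of thumb that generating one token costs about 2 FLOPs per active parameter, and it ignores attention and KV-cache overhead. The parameter count and token counts are made-up illustrative values, not figures for DeepSeek R1 or any other real model.

```python
def decode_flops(active_params: float, output_tokens: int) -> float:
    """Rough forward-pass FLOPs to generate `output_tokens` tokens.

    Uses the approximation of ~2 FLOPs per active parameter per token;
    attention cost over the growing context is ignored.
    """
    return 2.0 * active_params * output_tokens


# Hypothetical values, chosen only for illustration.
ACTIVE_PARAMS = 7e9        # assumed active parameters per token
SHORT_ANSWER = 500         # tokens in a direct answer
REASONING_ANSWER = 10_000  # tokens when the model "thinks" first

short_cost = decode_flops(ACTIVE_PARAMS, SHORT_ANSWER)
long_cost = decode_flops(ACTIVE_PARAMS, REASONING_ANSWER)
ratio = long_cost / short_cost

print(f"short answer:     {short_cost:.2e} FLOPs")
print(f"reasoning answer: {long_cost:.2e} FLOPs")
print(f"ratio:            {ratio:.0f}x")
```

Under this approximation the cost scales linearly with output length, so the same model answering with 20 times as many tokens needs about 20 times the inference compute per query.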