• Lvxferre
    10 months ago

    That’s perhaps why image generators are comparatively better than text generators. But there’s still something off: going by your example, it seems that the model cannot reliably use clues like position to understand “this is a «leg»”. I don’t know much about image generators, but I think that they’re still statistics- and probability-based.