• j4k3@lemmy.world
    English
    8 months ago

    I think the main issue may become control over new models, perhaps less in the image diffusion space and more with LLMs. The models are static. Right now a Llama2 70B is both accessible and current. The question is whether these models will stay relevant if large models are not continuously released. We can patch and merge, but at what point are we going to need an updated base model?

    Personally, I think most of current AI is analogous to the early days of microprocessors like the 6502, 68k, and 8088. The real future is in the expansion and integration of AI peripherals into more complex agents and model loader code. We need easily accessible ways to run a small model that handles gatekeeping with boolean outputs, integrated into the larger model. We need something like a model that can spot errors faster than real time while the large model's output is streaming, with the smaller model running in parallel on a single logical CPU core. If an error is found, step the generation back by a single step and try again. Stuff like this can be done with langchain, but it needs to be made easy for end users.
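
    To make the step-back idea concrete, here is a rough sketch of the loop I have in mind. Everything in it is hypothetical: `propose` stands in for the large model, `gate` for the small boolean gatekeeper, and neither is a real langchain or model-loader API. It also runs the two one after the other instead of in parallel, just to keep the control flow readable.

```python
from typing import Callable, List, Sequence

# Hypothetical interfaces, not a real library API:
# propose(tokens_so_far, attempt) -> candidate next token (the "large model")
# gate(tokens_so_far, candidate)  -> True if the candidate looks fine (the "small model")
ProposeFn = Callable[[Sequence[str], int], str]
GateFn = Callable[[Sequence[str], str], bool]


def generate_with_gate(
    propose: ProposeFn,
    gate: GateFn,
    max_tokens: int,
    max_retries: int = 3,
    stop_token: str = "<eos>",
) -> List[str]:
    """Generate token by token; redo a single step whenever the gate rejects it."""
    tokens: List[str] = []
    while len(tokens) < max_tokens:
        for attempt in range(max_retries + 1):
            candidate = propose(tokens, attempt)
            if gate(tokens, candidate):
                break  # gatekeeper approved this step
            # Rejected: drop the candidate and retry the same step.
        else:
            # The gate never approved anything; stop instead of emitting a flagged token.
            break
        if candidate == stop_token:
            break
        tokens.append(candidate)
    return tokens


# Toy stand-ins so the sketch runs without any model.
def toy_propose(tokens: Sequence[str], attempt: int) -> str:
    script = ["the", "sky", "is", "blue", "<eos>"]
    if len(tokens) == 3 and attempt == 0:
        return "green"  # deliberate mistake on the first try at step 3
    return script[len(tokens)]


def toy_gate(tokens: Sequence[str], candidate: str) -> bool:
    return candidate != "green"  # boolean output: accept or reject


result = generate_with_gate(toy_propose, toy_gate, max_tokens=10)
print(" ".join(result))
```

    In a real setup the gate would have to be cheap enough to keep up with the stream, and the retry would need a different sample (new seed or temperature) so the big model does not just repeat the same mistake.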

    I think we’ll see more models with neural networks designed to run on specific hardware because this is more marketable. Open Source (I’m a diehard fan of FOSS here) may struggle to keep up with models designed for devices like phones where the only option is to design the model for the device hardware. I really hope that does not happen.

    I would like to see people start selling high-quality LoRAs for niche stuff. I would spend a few bucks on a LoRA generated specifically for my favorite 70B chat model that covers all of Isaac Arthur’s YT content on Science and Futurism.