I didn’t say that using LoRA makes it more open; I was pointing out that you don’t need the original training data to extend the model.
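Just to illustrate what I mean, here's a minimal sketch of attaching LoRA adapters to a released checkpoint with the Hugging Face peft library. The model name is a stand-in for whatever open checkpoint you're extending, and the config values are arbitrary; the point is that only your own fine-tuning data is needed, not the original training set:

```python
# Sketch: fine-tune an open checkpoint with LoRA using only your own data.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Placeholder checkpoint name; swap in whichever open model you're extending.
base = AutoModelForCausalLM.from_pretrained("some-org/some-open-model")

lora_cfg = LoraConfig(
    r=8,                                   # rank of the low-rank update matrices
    lora_alpha=16,                         # scaling factor for the update
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # only the small adapter weights are trainable
# ...then train on your own dataset as usual; the base weights stay frozen.
```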
Basically, what you’re talking about is being able to replicate the original model from scratch given the code and the data. Since the data component is missing, you can’t replicate the original model exactly. I personally don’t find this to be much of a problem, because people could train a comparable model from scratch on an open dataset if they really wanted to.
The actual innovation with DeepSeek lies in its mixture-of-experts approach. While the model has 671 billion parameters overall, only 37 billion are active for any given token, which makes it very efficient to run. For comparison, Meta’s Llama 3.1 is a dense 405-billion-parameter model, so all of its parameters are used on every token. That’s the really interesting part of the whole thing, and it’s the part where openness really matters.
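To make the "only a fraction of the parameters are active" point concrete, here's a toy top-k routing layer in PyTorch. The sizes and routing scheme are made up for illustration and aren't DeepSeek's actual architecture; it just shows how a router sends each token to a couple of experts while the rest sit idle:

```python
import torch
import torch.nn as nn

class ToyMoELayer(nn.Module):
    """Toy mixture-of-experts layer: many experts exist, but each token
    is routed to only top_k of them, so most parameters are unused per token."""
    def __init__(self, dim=64, num_experts=16, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_experts))
        self.router = nn.Linear(dim, num_experts)  # scores each expert for each token
        self.top_k = top_k

    def forward(self, x):                          # x: (tokens, dim)
        scores = self.router(x)                    # (tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = weights.softmax(dim=-1)          # normalize over the chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e in range(len(self.experts)):
                mask = idx[:, k] == e              # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k, None] * self.experts[e](x[mask])
        return out

layer = ToyMoELayer()
tokens = torch.randn(8, 64)
print(layer(tokens).shape)  # torch.Size([8, 64]); only 2 of 16 experts ran per token
```

A dense model like Llama 3.1 is the degenerate case where every "expert" runs on every token, which is why its full parameter count is paid on every forward pass.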
And I fully expect that OpenAI will incorporate this idea into their models. The disaster for OpenAI is that their whole business model of selling subscriptions is now dead in the water. When models were really expensive to run, only a handful of megacorps could afford to run them. Now it turns out you can get comparable results at a fraction of the cost.