django@sh.itjust.works to LocalLLaMA@sh.itjust.works · English · 1 year ago

Any way to prune LLMs?


Hey, I’m working on some local LLM applications, and my goal is to run the smallest model possible without crippling performance. I’m already using 4-bit GPTQ, but I want something smaller. These models were trained on a massive amount of data, but my specific use case touches only a very small fraction of it, so I’d imagine it’s possible to cut away large chunks of the model that I don’t care about. I’m wondering whether there has been any work on runtime pruning of LLMs (not just static pruning based on model weights) driven by “real world” data. Something like: you run the model many times on your actual data and monitor the neuron activations to inform some kind of pruning process. Does anyone here know of something like that?
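
To make the idea concrete, here is a minimal sketch of activation-based pruning on a toy two-layer MLP in plain Python. It is not an LLM and not an established method from any library; all names (`forward`, `activation_scores`, `prune_hidden`, the weights and the calibration inputs) are hypothetical and chosen only for illustration. It runs the model on calibration data, averages each hidden unit's absolute activation, and drops the least-active units. In a real framework the recording step would probably be done with something like PyTorch forward hooks, which is an assumption about the setup rather than something stated in this thread.

```python
def relu(x):
    return x if x > 0.0 else 0.0


def forward(w1, b1, w2, x, totals=None):
    """Toy MLP: hidden = relu(w1 @ x + b1), output = w2 @ hidden.

    If `totals` is given, add each hidden unit's |activation| to it.
    """
    hidden = [
        relu(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w1, b1)
    ]
    if totals is not None:
        for i, h in enumerate(hidden):
            totals[i] += abs(h)
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


def activation_scores(w1, b1, w2, calibration):
    """Mean absolute activation of each hidden unit over the calibration set."""
    totals = [0.0] * len(w1)
    for x in calibration:
        forward(w1, b1, w2, x, totals=totals)
    return [t / len(calibration) for t in totals]


def prune_hidden(w1, b1, w2, scores, keep):
    """Keep the `keep` most active hidden units and slice the weights to match."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:keep])
    new_w1 = [w1[i] for i in kept]                   # rows of the first layer
    new_b1 = [b1[i] for i in kept]
    new_w2 = [[row[i] for i in kept] for row in w2]  # columns of the second layer
    return new_w1, new_b1, new_w2, kept


# Hypothetical weights: units 4 and 5 have a large negative bias, so they
# never fire on inputs in [0, 1] and stand in for "parts I don't care about".
W1 = [
    [0.5, -0.2, 0.1],
    [0.3, 0.8, -0.5],
    [-0.4, 0.6, 0.9],
    [0.7, 0.1, 0.2],
    [0.2, 0.3, 0.1],
    [0.1, -0.3, 0.4],
]
B1 = [0.1, 0.1, 0.1, 0.1, -10.0, -10.0]
W2 = [
    [1.0, -1.0, 0.5, 0.25, 2.0, -3.0],
    [0.3, 0.2, -0.7, 1.5, 1.0, 1.0],
]
# Stand-in for "your actual data".
CALIBRATION = [
    [1.0, 1.0, 1.0],
    [0.2, 0.9, 0.4],
    [0.8, 0.1, 0.5],
    [0.0, 0.5, 1.0],
]

scores = activation_scores(W1, B1, W2, CALIBRATION)
p_w1, p_b1, p_w2, kept = prune_hidden(W1, B1, W2, scores, keep=4)
print("kept hidden units:", kept)
```

In this toy case the pruned units never activate on the calibration inputs, so the outputs on that data are unchanged. On a real model the removed units would usually contribute something, so some accuracy loss (and possibly a fine-tuning pass afterwards) should be expected.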


Zeth0s@lemmy.world
1 year ago
The closest thing I know of is distillation; a quick search will turn up a few resources (e.g. https://huggingface.co/papers/2306.08543). I don’t know if that’s what you’re looking for.
minipasila@lemmy.fmhy.ml
1 year ago
I don’t know about that, but you could try GGML (llama.cpp). It supports quantization down to 2 bits, so that might be small enough.
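
For a rough sense of what lower bit widths buy, here is a back-of-the-envelope sketch. The 7B parameter count and the helper name `approx_weight_gb` are assumptions for illustration, not figures from this thread, and the estimate covers raw weights only: real quantized formats also store scales and other metadata, so actual files come out somewhat larger.

```python
def approx_weight_gb(n_params, bits_per_weight):
    """Approximate raw weight storage in GB (10**9 bytes), ignoring format overhead."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 7e9  # hypothetical 7B-parameter model

sizes = {bits: approx_weight_gb(N_PARAMS, bits) for bits in (16, 8, 4, 2)}
for bits, gb in sizes.items():
    print(f"{bits:>2}-bit: ~{gb:.2f} GB")
```

Going from 4 bits to 2 bits halves the raw weight storage, but the quality cost at 2 bits can be significant, so it is worth testing on your own data.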
