Smaug-72B-v0.1: The New Open-Source LLM Roaring to the Top of the Leaderboard

hexual@lemmy.world · 9 months ago

Fisch@lemmy.ml · edit-2 9 months ago

Unfortunately, LLMs need a lot of VRAM. You could try koboldcpp: it runs on the CPU but lets you offload layers onto the GPU. That way you might be able to stay within those 4 GB even with larger models.
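
To make the layer-offloading idea concrete, here is a rough sketch for guessing how many layers might fit in a given VRAM budget. It assumes every layer takes an equal share of the model file, which is only approximately true, and the function name, the example sizes, and the 1 GB reserve for context and scratch buffers are made up for illustration. The result is a starting point for koboldcpp's `--gpulayers` option, not a guarantee.

```python
import math


def estimate_gpu_layers(model_file_gb: float, n_layers: int,
                        vram_gb: float, reserve_gb: float = 1.0) -> int:
    """Rough guess at how many layers fit on the GPU.

    Assumption: each layer uses an equal share of the model file size.
    reserve_gb is headroom left for context and scratch buffers
    (an illustrative value, not a measured one).
    """
    if model_file_gb <= 0 or n_layers <= 0:
        raise ValueError("model_file_gb and n_layers must be positive")
    per_layer_gb = model_file_gb / n_layers
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    # Never report more layers than the model has, or fewer than zero.
    return max(0, min(n_layers, math.floor(usable_gb / per_layer_gb)))


# Hypothetical example: a 4 GB quantized model with 32 layers on a 4 GB card.
layers = estimate_gpu_layers(model_file_gb=4.0, n_layers=32, vram_gb=4.0)
print(f"try offloading about {layers} layers")
```

If that still runs out of memory, lower the number and try again; the remaining layers just stay on the CPU.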

Edit: I forgot to mention there’s a fork of koboldcpp with ROCm support for AMD cards, which is about twice as fast if I remember correctly. Only relevant if you have an AMD card, though.

Edit 2: This is the model I use btw

Smaug-72B-v0.1: The New Open-Source LLM Roaring to the Top of the Leaderboard

abacusai/Smaug-72B-v0.1 · Hugging Face