Abacus.ai:

We recently released Smaug-72B-v0.1, which has taken first place on the Open LLM Leaderboard by HuggingFace. It is the first open-source model to achieve an average score above 80.

  • simple@lemm.ee · 8 months ago

    I’m afraid to even ask about the minimum specs on this thing; open-source models have gotten so big lately.

    • TheChurn@kbin.social · 7 months ago

      Every billion parameters needs about 2 GB of VRAM if using bfloat16 representation: 16 bits per parameter, 8 bits per byte -> 2 bytes per parameter.

      1 billion parameters ~ 2 billion bytes ~ 2 GB.

      From the name, this model has 72 billion parameters, so ~144 GB of VRAM.
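
      A quick sketch of that back-of-the-envelope arithmetic in Python (weights only, ignoring KV cache and activations; the 4-bit line is just for comparison with quantized variants):

      ```python
      # Rough VRAM needed for the weights alone, ignoring KV cache and activations.
      def weight_vram_gb(params_billion: float, bits_per_param: int = 16) -> float:
          bytes_per_param = bits_per_param / 8       # bfloat16 -> 2 bytes per parameter
          return params_billion * bytes_per_param    # billions of params x bytes each = GB

      print(weight_vram_gb(72))     # bfloat16: ~144 GB
      print(weight_vram_gb(72, 4))  # 4-bit quantized, for comparison: ~36 GB
      ```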

  • Miss Brainfarts@lemmy.blahaj.zone · 7 months ago

    That’s nice and all, but what are some FOSS models I can run on a GPU with only 4 GB of VRAM?

    I’ve tried Deepseek Coder, and it’s pretty nice for what I use it for. Then there’s TinyLlama, which… well it’s fast, but I need to be veeeery exact in how I prompt it.

    • Fisch@lemmy.ml · 7 months ago

      Unfortunately, LLMs need a lot of VRAM. You could try koboldcpp: it runs on the CPU but lets you offload layers onto the GPU. That way you might be able to stay within those 4 GB even with larger models (rough sketch at the end of this comment).

      Edit: I forgot to mention there’s a fork of koboldcpp with ROCm support for AMD cards, which is about twice as fast if I remember correctly. Only relevant if you have an AMD card tho.

      Edit 2: This is the model I use btw
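
      Since the thread doesn’t spell out what partial offloading looks like, here’s a minimal sketch of the same idea using llama-cpp-python (the same llama.cpp backend that koboldcpp builds on); the GGUF path and layer count are placeholders you’d tune until the model fits in your 4 GB:

      ```python
      # Minimal sketch of partial GPU offloading, assuming llama-cpp-python is installed
      # and you have a GGUF model file (path and layer count are placeholders).
      from llama_cpp import Llama

      llm = Llama(
          model_path="./tinyllama-1.1b-chat.Q4_K_M.gguf",  # hypothetical local GGUF file
          n_gpu_layers=20,  # layers offloaded to the GPU; lower this if you run out of VRAM
          n_ctx=2048,       # context window; larger contexts also cost VRAM
      )

      out = llm("Explain GPU layer offloading in one sentence:", max_tokens=64)
      print(out["choices"][0]["text"])
      ```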