NVIDIA’s new AI chatbot runs locally on your PC

catculation@lemmy.zip · 9 months ago

General_Effort@lemmy.world · 9 months ago

That was an annoying read. It doesn’t say what this actually is.

It’s not a new LLM. Chat with RTX is specifically software to do inference (=use LLMs) at home, while using the hardware acceleration of RTX cards. There are several projects that do this, though they might not be quite as optimized for NVIDIA’s hardware.

Go directly to NVIDIA to avoid the clickbait.

Chat with RTX uses retrieval-augmented generation (RAG), NVIDIA TensorRT-LLM software and NVIDIA RTX acceleration to bring generative AI capabilities to local, GeForce-powered Windows PCs. Users can quickly, easily connect local files on a PC as a dataset to an open-source large language model like Mistral or Llama 2, enabling queries for quick, contextually relevant answers.

Source: https://blogs.nvidia.com/blog/chat-with-rtx-available-now/

Download page: https://www.nvidia.com/en-us/ai-on-rtx/chat-with-rtx-generative-ai/
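
For anyone wondering what the RAG part of that quote means in practice, here is a rough toy sketch of the retrieval step. This is not how Chat with RTX does it. A real setup would most likely use embedding vectors and an actual LLM, while this uses plain word-count similarity from the Python standard library. The file names, texts, and function names are all made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=1):
    """Return the names of the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda name: cosine(q, Counter(tokenize(documents[name]))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Put the retrieved text in front of the question (the 'augmented' part)."""
    names = retrieve(query, documents, k)
    context = "\n".join(f"[{n}] {documents[n]}" for n in names)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical local files, stored here as plain strings.
docs = {
    "gpu_notes.txt": "The RTX card accelerates inference for local language models.",
    "recipes.txt": "Mix flour, water and yeast, then bake the bread for forty minutes.",
}

question = "How long do I bake the bread?"
prompt = build_prompt(question, docs)
print(prompt)  # this string is what would be handed to the LLM
```

The point is just that the model never gets trained on your files. The relevant snippets are looked up first and pasted into the prompt.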

GenderNeutralBro@lemmy.sdf.org · 9 months ago

Pretty much every LLM you can download already has CUDA support via PyTorch.

However, some of the easier-to-use frontends don’t use GPU acceleration because it’s a bit of a pain to configure across a wide range of hardware models and driver versions. IIRC GPT4All does not use GPU acceleration yet (might be outdated; I haven’t checked in a while).

If this makes local LLMs more accessible to people who are not familiar with setting up a CUDA development environment or Python venvs, that’s great news.
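
As a minimal sketch of what "CUDA support via PyTorch" looks like from the user side, assuming PyTorch may or may not be installed: the helper name pick_device is made up for this example, and torch.cuda.is_available() is the standard PyTorch check.

```python
import importlib.util


def pick_device():
    """Return "cuda" if PyTorch is installed and sees a CUDA GPU, else "cpu"."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch is not installed, so fall back to CPU
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Would run the model on: {device}")
```

Getting this to say "cuda" is the part that depends on matching drivers and a CUDA-enabled PyTorch build, which is where the setup pain comes from.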

General_Effort@lemmy.world · 9 months ago

I’d hope that this uses the hardware better than PyTorch. Otherwise, why the specific hardware demands? Then again, it could just be marketing.

There are several alternatives that offer 1-click installers, e.g. these from this thread:

AGPL-3.0 license: https://jan.ai/

MIT license: https://ollama.com/

MIT license: https://gpt4all.io/index.html

(There are more.)

BertramDitore@lemmy.world · 9 months ago

They say it works without an internet connection, and if that’s true this could be pretty awesome. I’m always skeptical about interacting with chatbots that run in the cloud, but if I can put this behind a firewall so I know there’s no telemetry, I’m on board.

furzegulo1312@lemmy.dbzer0.com · 9 months ago

i have no need to talk to my gpu, i have a shrink for that

whodatdair@lemm.ee · 9 months ago

Idk I kinda like the idea of a madman living in my graphics card. I want to be able to spin them up and have them tell me lies that sound plausible and hallucinate things.

gaifux@lemmy.world · 8 months ago

Your shrink renders video frames?

femboy_bird@lemmy.blahaj.zone · 9 months ago

it gives the chatbot access to your files and documents

I’m sure nvidia will be trustworthy and responsible with this

Coldgoron@lemmy.world · 9 months ago

I recommend jan.ai over this. Last I heard, it was a decent option.

PlexSheep@feddit.de · 9 months ago

I use https://huggingface.co/chat, and you can also easily host open-source models on your local machine.