How to use GPUs over multiple computers for local AI?

marauding_gibberish142@lemmy.dbzer0.com · 3 days ago

How to use GPUs over multiple computers for local AI?

marauding_gibberish142@lemmy.dbzer0.com · 3 days ago

Thanks but I’m not going to run supercomputers. I just want to run 4 GPUs separately because of inadequate PCIe lanes in a single computer to run 24B-30B models

vane@lemmy.world · edit-2 2 days ago

I believe you can run 30B models on single used rtx 3090 24GB at least I run 32B deepseek-r1 on it using ollama. Just make sure you have enought ram > 24GB.

marauding_gibberish142@lemmy.dbzer0.com · 2 days ago

Heavily quantized?

vane@lemmy.world · edit-2 2 days ago

I run this one. https://ollama.com/library/deepseek-r1:32b-qwen-distill-q4_K_M with this frontend https://github.com/open-webui/open-webui on single rtx 3090 hardware 64gb ram. It works quite well for what I wanted it to do. I wanted to connect 2x 3090 cards with slurm to run 70b models but haven’t found time to do it.

marauding_gibberish142@lemmy.dbzer0.com · 2 days ago

I see. Thanks