Who here is messing with FOSS AI? What ya playing with?

@TheOtherJake@beehaw.org

What hardware does it take to run a 30B?

Veraxus

I’m running it in GPT4All (CPU-based) with 64GB of RAM, and it runs pretty well. I’m not sure what you’d need if you were running it on GPU instead.

@TheOtherJake@beehaw.org

WizardLM 30B at 4 bits, using the GGML version on Oobabooga, runs almost as fast as Llama2 7B does on the GPU alone. I set it up with 10 threads on the CPU and ~20 layers offloaded to the GPU. That leaves plenty of room for a 4096 context with a batch size of 2048. I can even run a 2GB Stable Diffusion model at the same time in my 3080’s 16GB of VRAM.

Have you tried any of the larger models? I just ordered 64GB of RAM. I also got Kobold mostly working. I hope to use it to try Falcon 40B. I really want to try a 70B model at 2-4 bits and see how accurate it is.
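
For a rough sense of why those sizes matter, here is a back-of-the-envelope sketch of how much memory the weights alone take at a given quantization. The function name and the round parameter counts are assumptions for illustration. It ignores the context cache, runtime overhead, and the per-block scale data that real quantized files carry, so actual usage will be somewhat higher.

```python
def estimate_weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GB (1 GB = 1e9 bytes).

    Hypothetical helper: params (in billions) * bits per weight / 8 bits per byte.
    """
    return params_billion * bits_per_weight / 8


# Nominal parameter counts from the model names; real counts differ slightly.
for name, params, bits in [
    ("30B @ 4-bit", 30, 4),
    ("70B @ 2-bit", 70, 2),
    ("70B @ 4-bit", 70, 4),
]:
    print(f"{name}: ~{estimate_weight_gb(params, bits):.1f} GB for weights")
```

By this estimate a 4-bit 30B model needs about 15 GB just for weights, which is why splitting layers between system RAM and the GPU helps, and a 70B model at 2-4 bits lands somewhere around 17.5 to 35 GB.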

