I don’t consider myself very technical. I’ve never taken a computer science course and don’t know python. I’ve learned some things like Linux, the command line, docker and networking/pfSense because I value my privacy. My point is that anyone can do this, even if you aren’t technical.

I tried both LM Studio and Ollama. I prefer Ollama. Then you download models and use them to have your own private, personal GPT. I access it both on my local machine through the command line but I also installed Open WebUI in a docker container so I can access it on any device on my local network (I don’t expose services to the internet).

Having a private ai/gpt is pretty cool. You can download and test new models. And it is private. Yes, there are ethical concerns about how the model got the training. I’m not minimizing those concerns. But if you want your own AI/GPT assistant, give it a try. I set it up in a couple of hours, and as I said… I’m not even that technical.

Yeah, I like it too. My only issue is ollama’s lack of intel support. I have been looking at issue 1590 on their GitHub. For now I have a 1050ti in a cardboard box PC with other hardware being 10+ years old and a mixed set of RAM totalling 12G. It also has a 100Mbit nic, so I can’t take advantage of full internet speed when downloading models. The worst part is they can support intel, but haven’t merged the solution because of an issue with the windows intel drivers. Linux is fine but I can 't have it. I wasn’t planning to rant, but I already typed it so… enjoy?

@klopstock@lemmy.specksick.com
link
fedilink
English
3
edit-2
1M

There is ipex-llm from Intel which you can use with your intel IGPU/GPU/CPU for llms which also supports ollama.

@chagall@lemmy.world
creator
link
fedilink
English
11
edit-2
1M

Yeah, I have an NVDIA GPU and it is magic. The best part is when you are using Ollama, open a second terminal window and enter the command, watch -n 0.5 nvidia-smi and you can see your GPU usage go up and down in real-time as you ask the GPT questions. Pretty cool.

Hopefully they get the ARC folks up and running soon.

BlueKey
link
fedilink
31M

You can also achieve this with nvtop and have a pretty TUI (terminal UI).

Possibly linux
link
fedilink
English
11M

I switched from OpenWebUI to Alpaca as I had no use for multi accounts.

@chagall@lemmy.world
creator
link
fedilink
English
21M

Open WebUI now has a docker environment variable so you can, by default, turn off the login page. You just declare it when you’re spinning up the container and you’re good to go.

Possibly linux
link
fedilink
English
21M

I like libadwaita which Alpaca has

I am going to be buying a monster high end machine and I want to do all the AI stuff on it.

Very cool! You can use something like Tailscale to access your local services remotely without exposing them to the internet.

@DavidGarcia@feddit.nl
link
fedilink
English
481M

people need to take a step back and realize we have the capability to trap quasi-omnipotent quasi-demons in our personal computers

yeah they lie a lot and rarely do what you want them to, but that’s just what demons do

And it’s all powered by some dark crystals created with light magic that slowly poison the planet

that’s some arcane bullshit

Last
link
fedilink
English
31M

How long can something like that really last, though? I wish we had a better idea of the timeline, before the quasi-demons start freelancing lol

BlackLaZoR
link
fedilink
81M

I access it both on my local machine through the command line

You really don’t have to - There’s GPT4ALL designed for normal users with very simple GUI

Also, with minimal command line knowledge you can install InvokeAI - probably the best UX for image generating AI on the market. Works both on Linux and Windows

@chagall@lemmy.world
creator
link
fedilink
English
61M

It’s so great that there is so much ongoing development of these types of tools out there. I’m currently using openweb ui as my GUI but I’ll give your suggestion a try next week. I haven’t figured out a use case for stable diffusion except for creating new content for the shitposting community on lemmy lol. But if you have any ideas, please let me know… I’d love to test it out if I have a good use case.

BlackLaZoR
link
fedilink
31M

But if you have any ideas

Both my avatar and channel cover are made with AI models - so this is a good start.

IMO the biggest potential is indie game dev - AI image generation is amazing for static backgrounds, character design, and with certain loras it absolutely shreds pixelart - I even saw entire workflows for building pixelart animations (I think it was for ComfyUi tho).

Also local image models are uncensored so… porn XD

Zos_Kia
link
fedilink
English
21M

If you like to write, I find that story boarding with stable diffusion is definitely an improvement. The quality of the images is what it is, but they can help you map out scenes and locations, and spot visual details and cues to include in your writing.

@Goodtoknow@lemmy.ca
link
fedilink
English
101M

Have you found much practical use for small models yet? I love the idea that even the 1.1B tinyllama model can run on my phone, but haven’t found much real world use for it yet. Llama3 8b feels better, but not much better for even emails as it’s a bit dumb

Imo it’s worthwhile to just run the biggest model available and rent expensive GPU time. It still amounts to very little overall and you get much better results. Project dependent of course

@chagall@lemmy.world
creator
link
fedilink
English
61M

I use my phone all the time, but I just use a wireguard VPN to tunnel into my home container of Open WebUI. Then I can interact with my desktop machine using a NVIDIA gpu. I’m currently testing mistral-nemo. It’s pretty great but it gets a bit verbose sometimes.

@kureta@lemmy.ml
link
fedilink
English
101M

I am also using open webui. Most LLMs are too verbose for me, so I created a model in open-webui with system prompt “Do not repeat the questions. Avoid giving lists as answers. Do not summarize the answer at the end. If asked a follow-up question, respond with only new information, do not repeat previously stated information.” and named it No Nonsense.

@chagall@lemmy.world
creator
link
fedilink
English
31M

That’s really smart. I just found out about fabric yesterday and it is helping me with things like what you stated. Prompt engineering is a huge thing.

kate
link
fedilink
English
21M

for some reason chatgpt responds well to “no yapping”

BlackLaZoR
link
fedilink
11M

deleted by creator

“learned some things like Linux, command line, docker, and networking/pfsense” “I don’t consider myself technical”

Don’t sell yourself short, I work in IT and have colleagues on our helpdesk who would struggle endlessly with those concepts.

I hereby dub you a tech person, like it or not, those skills can and do pay the bills.

Thank you for this. I consider myself technical and those words felt like a punch in the gut.

@chagall@lemmy.world
creator
link
fedilink
English
121M

I’m sorry if I offended. I can’t code or understand existing code and have always felt that technical people code. I guess I should expand my definition. Again, sorry that my words felt like a punch in the gut… wasn’t my intention at all.

@IsoKiero@sopuli.xyz
link
fedilink
English
111M

It depends heavily on what you do and what you’re comparing yourself against. I’ve been making a living with IT for nearly 20 years and I still don’t consider myself to be an expert on anything, but it’s a really wide field and what I’ve learned that the things I consider ‘easy’ or ‘simple’ (mostly with linux servers) are surprisingly difficult for people who’d (for example) wipe the floor with me if we competed on planning and setting up an server infrastructure or build enterprise networks.

And of course I’ve also met the other end of spectrum. People who claim to be ‘experts’ or ‘senior techs’ at something are so incompetent on their tasks or their field of knowledge is so ridiculously narrow that I wouldn’t trust them with anything above first tier helpdesk if even that. And the sad part is that those ‘experts’ often make way more money than me because they happened to score a job on some big IT company and their hours are billed accordingly.

And then there’s the whole other can of worms on a forums like this where ‘technical people’ range from someone who can install a operating system by following instructions to the guys who write assembly code to some obscure old hardware just for the fun of it.

@GBU_28@lemm.ee
link
fedilink
English
431M

It is done.

Flax
link
fedilink
English
101M

This gave me confidence as well, thank you 😆

With how low the average person is with tech skills, it’s very easy to be top 10%.

Scrubbles
link
fedilink
English
81M

I was just talking to a member of my devops team and I was talking about this exact thing and they said “I didn’t know you could attach a GPU to a container”. So, yup, just stay on top of this stuff at home and you’ll do fine

Possibly linux
link
fedilink
English
21M

Who uses GPUs for AI anyway. They cost more than a car sometimes

Can confirm, the GPU in my laptop costs more than all but my newest car.

@chagall@lemmy.world
creator
link
fedilink
English
401M

This made me smile. Thank you. The grass is always greener and I sometimes daydream of working in IT instead of healthcare. Maybe someday.

Join us. We have cookies (well at least until the end of our sessions)!

Biezelbob
link
fedilink
English
131M

Nah dont.

@GissaMittJobb@lemmy.ml
link
fedilink
English
81M

Healthcare is pretty rough, I’d be willing to bet that the grass actually is greener in this case.

Biezelbob
link
fedilink
English
4
edit-2
1M

I am actually considering switching to healthcare (been a professional programmer)

I’ve had a burnout: I wish it was due caring for people in need instead of a stupid deadline.

Besides, you can always do IT as a hobby/for free. Harder with healthcare, except maybe volunteering

@barsquid@lemmy.world
link
fedilink
English
11M

You’ll be saving lives, yeah, but between dealing with entitled assholes that won’t follow directions and then yell at you because they didn’t.

It’s maybe easy to burn out in any career. Society has deprioritized individual fulfillment for most of us because it harms the nesting levels of billionaires’ yachts.

@ZeroHora@lemmy.ml
link
fedilink
English
51M

hahahaha best advice ever.

Now that you’ve dubbed OP a tech person…

Hey OP, can you help me fix my printer? It’s only printing “RED RUM RED RUM” for some reason.

Sabata
link
fedilink
English
21M

Have you replaced the blood cartridge?

Have you tried giving it red rum?

Oh, and make sure you hold it out with the insides of your arms exposed, it’ll feel less threatening that way.

Uncensored models are so much better, too. chatGPT is like one of those plastic children’s toy hammers vs real models are titanium hammers

Together.ai has a number of uncensored models too. I’ve found that those are so cheap that it’s not worth trying to self just models unless you really need more privacy.

What kinds of specs do you need to run it well? I’ve got a laptop with a 3070.

You probably want 48gb of vram or more to run the good stuff. I recommend renting GPU time instead of using your own hardware, via AWS or other vendors - runpod.io is pretty good.

Terrasque
link
fedilink
English
41M

Llama3 8b can be run at 6gb vram, and it’s fairly competent. Gemma has a 9b I think, which would also be worth looking into.

@31337@sh.itjust.works
link
fedilink
English
31M

IDK, looks like 48GB cloud pricing would be 0.35/hr => $255/month. Used 3090s go for $700. Two 3090s would give you 48GB of VRAM, and cost $1400 (I’m assuming you can do “model-parallel” will Llama; never tried running an LLM, but it should be possible and work well). So, the break-even point would be <6 months. Hmm, but if Severless works well, that could be pretty cheap. Would probably take a few minutes to process and load a ~48GB model every cold start though?

ffhein
link
fedilink
English
11M

Assuming they already own a PC, if someone buys two 3090 for it they’ll probably also have to upgrade their PSU so that might be worth including in the budget. But it’s definitely a relatively low cost way to get more VRAM, there are people who run 3 or 4 RTX3090 too.

Kinda defeats the purpose of doing it private and local.

I wouldn’t trust any claims a 3rd party service makes with regards to being private.

@dan@upvote.au
link
fedilink
English
91M

It’s a much smaller scale but I use a Coral TPU with CodeProject AI to detect when people or animals are in front of my house. Works well with Blue Iris (NVR software for security cameras). I like it. That’s all the self-hosted AI I’ve got for now.

@toynbee@lemmy.world
link
fedilink
English
21M

With all respect, the first paragraph seems self contradictory.

Very technical vs not can be very subjective.
It can be a 50 year old sysadmin vs Adam I pulled from the street or a graybeard linux admin vs a beginner sysadmin only in it for thr career instead of the passion (those can be very non-technical but good problem solver folks)

I know my comparison is flawed

Dataprolet
link
fedilink
English
51M

Isn’t this using a lot of computing power?

@Swedneck@discuss.tchncs.de
link
fedilink
English
5
edit-2
1M

you hear that said about AI because companies are desperately throwing more and more resources at it to get 0.3% better results, and people are collectively running an insane amount of prompts all the time.

but on a personal level it’s not really any different from any other computations, people render videos all the time and no one complains about the resource usage from that, because companies aren’t trying to sell bloated video rendering services to gardening businesses.

Not really, it uses some GPU power when it’s actively generating a response, but otherwise it just sits idle.

@Toribor@corndog.social
link
fedilink
English
5
edit-2
1M

I’ve been testing Ollama in Docker/WSL with the idea that if I like it I’ll eventually move my GPU into my home server and get an upgrade for my gaming pc. When you run a model it has to load the whole thing into VRAM. I use the 8gb models so it takes 20-40 seconds to load the model and then each response is really fast after that and the GPU hit is pretty small. After I think five minutes by default it will unload the model to free up VRAM.

Basically this means that you either need to wait a bit for the model to warm up or you need to extend that timeout so that it stays warm longer. That means that I cannot really use my GPU for anything else while the LLM is loaded.

I haven’t tracked power usage, but besides the VRAM requirements it doesn’t seem too intensive on resources, but maybe I just haven’t done anything complex enough yet.

@chip@feddit.rocks
link
fedilink
English
31M

I also recently got into selfhosting LLM. Having an AMD card meant I had to scourge for solutions since everything expects to have CUDA suppport which means having Nvidia cards. Koboldcpp has a fork with ROCM support which works on my machine, so I’m content with that for now.

@Toribor@corndog.social
link
fedilink
English
21M

Do you have any links or guides that you found helpful? A friend wanted to try this out but basically gave up when he realized he’d need an Nvidia GPU.

Noctis
link
fedilink
English
31M

Look up kobold cpp yellow rose fork. It’s pretty easy to set up and run

Wasnt there a solution by AMD or someone close to them implementing a translation of CUDA for AMD hardware?

AMD asked them to shut it down. So the guy is going to go back to the pre-AMD release and work independently from there.

I really hate when companies do that kind of crap. I just imagine a little toddler stomping around going “No! No! Nooo!”

NVIDIA didn’t ask to shut it down, but AMD lawyer probably weren’t that hot to what the project had become and AMD asked the creator to shut down the project l, which he did.

But yeah, lots of work wasted caused by pencil pushers and bean counters.

Create a post

A place to share alternatives to popular online services that can be self-hosted without giving up privacy or locking you into a service you don’t control.

Rules:

  1. Be civil: we’re here to support and learn from one another. Insults won’t be tolerated. Flame wars are frowned upon.

  2. No spam posting.

  3. Posts have to be centered around self-hosting. There are other communities for discussing hardware or home computing. If it’s not obvious why your post topic revolves around selfhosting, please include details to make it clear.

  4. Don’t duplicate the full text of your blog or github here. Just post the link for folks to click.

  5. Submission headline should match the article title (don’t cherry-pick information from the title to fit your agenda).

  6. No trolling.

Resources:

Any issues on the community? Report it using the report flag.

Questions? DM the mods!

  • 1 user online
  • 279 users / day
  • 589 users / week
  • 1.34K users / month
  • 4.55K users / 6 months
  • 1 subscriber
  • 3.47K Posts
  • 69.3K Comments
  • Modlog