Faster Ollama alternative

RandomlyRight@sh.itjust.works · 6 days ago

Super cool! I’d be interested in how to fit this to my head shape too. It’s now on my list of contenders for the concert

RandomlyRight@sh.itjust.works · 2 months ago

I’ve read about this method in the GitHub issues, but to me it seemed impractical to maintain different models just to change the context size, and that was the point where I started looking for alternatives

RandomlyRight@sh.itjust.works · 2 months ago

It was multiple models, mainly 32-70B

RandomlyRight@sh.itjust.works · 2 months ago

There are many projects out there that optimize the speed significantly. Ollama is unbeaten in convenience, though

RandomlyRight@sh.itjust.works · 2 months ago

Yeah, but there are many open issues on GitHub related to these settings not working right. I’m using the API, and just couldn’t get it to work. I used a request to generate a JSON file, and it never generated one longer than about 500 lines. With the same model on vLLM, it worked instantly and generated about 2000 lines
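
For anyone following along, here is a minimal sketch of what passing these settings per request can look like, assuming the settings in question are the context size (`num_ctx`) and output length (`num_predict`) options of Ollama's `/api/generate` endpoint. The helper names, the default values, and the model name in the usage note are placeholders, and nothing here is claimed to fix the truncation described above:

```python
import json
import urllib.request

# Ollama's default local endpoint; adjust if your server listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model, prompt, num_ctx=8192, num_predict=4096):
    """Build the JSON body for /api/generate with per-request options.

    num_ctx and num_predict are illustrative defaults, not recommendations.
    """
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for one JSON object instead of a stream
        "options": {"num_ctx": num_ctx, "num_predict": num_predict},
    }


def generate(model, prompt, **options):
    """Send the request to a running Ollama server and return the text."""
    body = json.dumps(build_generate_request(model, prompt, **options))
    request = urllib.request.Request(
        OLLAMA_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Usage would be something like `generate("some-model", "Write the JSON file", num_ctx=16384)` with a model you have already pulled. Setting the options in the request body avoids creating a separate model just to change the context size.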

RandomlyRight@sh.itjust.works · 2 months ago

Faster Ollama alternative

RandomlyRight@sh.itjust.works · 3 months ago

Take a look at NVIDIA Project Digits. It’s supposed to be released in May for 3k USD and will be kind of the only sensible way to host LLMs then:

https://www.nvidia.com/en-us/project-digits/

RandomlyRight@sh.itjust.works · 8 months ago

Am I crazy or are you just completely wrong?

https://github.com/waydabber/BetterDisplay/wiki/MacOS-scaling,-HiDPI,-LoDPI-explanation

RandomlyRight@sh.itjust.works · 9 months ago

Someone tell me pls which browsers are developed by actually decent people? I’ll switch

RandomlyRight@sh.itjust.works · 9 months ago

What is the software in image 3, the one with the graph that looks like it’s for your notes or something?