Doing the Lord’s work in the Devil’s basement

  • 0 Posts
  • 21 Comments
Joined 11 months ago
Cake day: May 8th, 2024

  • When you read that stuff on Reddit, there’s a parameter you need to keep in mind: these people are not really discussing Lemmy. They’re rationalizing and justifying why they are not on Lemmy. Totally different conversation.

    Nobody wants to come out and say “I know mainstream platforms are shit and destroying the fabric of reality, but I can’t bring myself to be on a platform unless it is the Hip Place to Be”. So they’ll invent stuff that paints them in a better light.

    You’ll still see people claiming that Mastodon is unusable because you have to select an instance - even though you don’t have to: you can just type “Mastodon” into Google, click the first link, and create an account in two clicks. It’s been that way for ages. But the people still using Twitter need the excuse, because otherwise what does that make them?

  • Yeah, I did some looking up in the meantime, and indeed you’re gonna have a context-size issue. That’s why it’s only summarizing the last few thousand characters of the text: that’s the size of its attention window.
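
    The truncation effect above can be sketched in a few lines. The token count and the chars-per-token ratio are illustrative assumptions, not tied to any particular model; real tokenizers are subword-based, and ~4 characters per token is just a common rough estimate for English.

```python
# Minimal sketch of why a fixed context window only "sees" the tail of a
# long document. All numbers here are illustrative assumptions.

def truncate_to_context(text: str, max_tokens: int, chars_per_token: int = 4) -> str:
    """Keep roughly the last `max_tokens` worth of characters."""
    max_chars = max_tokens * chars_per_token
    return text[-max_chars:]

doc = "x" * 100_000  # a long document, ~25K tokens at 4 chars/token
visible = truncate_to_context(doc, max_tokens=4096)
print(len(visible))  # everything before these ~16K characters is simply dropped
```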

    There are some models fine-tuned for an 8K-token context window, some even for 16K, like this Mistral brew. If you have a GPU with 8 GB of VRAM you should be able to run it using one of the quantized versions (Q4 or Q5 should be fine). Summarization should still be reasonably good.
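
    Here’s a rough back-of-envelope check of why a quantized 7B-class model fits in 8 GB where the full-precision one doesn’t. The bits-per-weight figures and the overhead allowance are approximations I’m assuming (Q4/Q5 formats store roughly 4.5–5.5 bits per weight once scaling factors are counted), not exact numbers for any specific quantization scheme.

```python
# Back-of-envelope VRAM estimate: weights + a rough allowance for the
# KV cache and activations. All constants are assumptions, not specs.

def model_vram_gb(n_params: float, bits_per_weight: float, overhead_gb: float = 1.0) -> float:
    return n_params * bits_per_weight / 8 / 1e9 + overhead_gb

q4 = model_vram_gb(7e9, 4.5)    # ~4.9 GB -> fits on an 8 GB card
f16 = model_vram_gb(7e9, 16.0)  # ~15 GB -> does not fit
print(round(q4, 1), round(f16, 1))
```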

    If 16K isn’t enough for you, then that’s probably not something you can do locally. However, you can still run a larger model privately in the cloud. Hugging Face, for example, lets you rent GPUs by the minute and run inference on them; it should only cost you a few dollars. As far as I know this approach should still be compatible with Open WebUI.

  • Very useful in some contexts, but it doesn’t “learn” the way a neural network can. When you’re feeding corrections into, say, ChatGPT, you’re making small, temporary, cached adjustments to its data model, but you’re not actually teaching it anything, because by its nature, it can’t learn.

    But that’s true of all (most?) neural networks? Are you saying neural networks are not AI and that they can’t learn?

    NNs don’t retrain while they are being used: they are trained once, and after that they cannot learn new behaviour or correct existing behaviour. If you want to make them better, you need to run them a bunch of times, collect and annotate good/bad runs, then re-train them from scratch (or fine-tune them) with this new data. Just like LLMs, because LLMs are neural networks.
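
    The train-once / deploy-frozen / fine-tune-later loop described above can be sketched with a one-parameter linear model, so it runs without any ML library. This is a toy illustration of the workflow, not how any production system is implemented: the point is that nothing updates at inference time, and improvement only comes from a separate training pass on newly collected data.

```python
# Toy sketch: a model is trained, deployed frozen, then improved only by
# a later fine-tuning pass on newly annotated data.

def train(w: float, data: list[tuple[float, float]], lr: float = 0.01, epochs: int = 200) -> float:
    """Gradient descent on squared error for the model pred = w * x."""
    for _ in range(epochs):
        for x, y in data:
            pred = w * x
            w -= lr * 2 * (pred - y) * x  # d/dw of (w*x - y)^2
    return w

# Initial training run: learns roughly y = 2x.
w = train(0.0, [(1.0, 2.0), (2.0, 4.0)])

# Inference: the weight is frozen; wrong outputs don't change it.
prediction = w * 3.0  # ~6.0, and stays that way no matter what we "tell" it

# "Learning" a correction means collecting annotated examples and running
# another training pass -- here, fine-tuning the existing weight toward y = 3x.
w = train(w, [(1.0, 3.0), (2.0, 6.0)])
print(round(w, 2))
```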