Or something that goes against the general opinions of the community? Vibes are the only benchmark that counts after all.
I tend to go with the flow on most things, but here are some thoughts of mine that I’d consider going against the grain:
- QwQ was think-slop and was never that good
- Qwen3-32B is still SOTA for 32GB and under. I cannot get anything to reliably beat it despite shiny benchmarks
- Deepseek is still open-weight SotA. I’ve really tried Kimi, GLM, and Qwen3’s larger variants but asking Deepseek still feels like asking the adult in the room. Caveat is GLM codes better
- (proprietary bonus): Grok 4 handles news data better than GPT-5 or Gemini 2.5 and will always win if you ask it about something that happened that day.
Thinking is an awful paradigm
Models would do better to backtrack and revisit other token branches, but top-p/top-k sampling blocks that. Thinking tokens are a waste.
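To make the pruning concrete, here’s a minimal sketch (plain numpy, made-up numbers) of standard top-k/top-p truncation; anything cut here is a branch the sampler can never revisit:

```python
import numpy as np

def sample_truncated(logits, top_k=40, top_p=0.95, temperature=1.0):
    """Toy top-k + nucleus (top-p) sampling over a tiny vocab.

    Tokens outside the top-k, or beyond the cumulative top-p mass,
    get probability zero -- those branches become unreachable."""
    probs = np.exp(logits / temperature)
    probs /= probs.sum()

    order = np.argsort(probs)[::-1]                # most likely first
    keep = order[:top_k]                           # top-k cut
    cum = np.cumsum(probs[keep])
    keep = keep[:np.searchsorted(cum, top_p) + 1]  # top-p cut
    kept = probs[keep] / probs[keep].sum()         # renormalize survivors
    return np.random.choice(keep, p=kept)

logits = np.array([3.0, 2.5, 0.1, -1.0, -3.0])   # made-up logits, 5-token vocab
print(sample_truncated(logits, top_k=3, top_p=0.9))  # only tokens 0 and 1 survive
```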
One of the reasons thinking makes models good is just reinforcement learning, but it tends to be very narrow.
Take math: you can reinforcement-learn it up to grad level. That’s fine. But it doesn’t actually improve general problem solving.
Everyone is massively underestimating what’s going on with neural networks. The real significance is abstract. You need to stitch together a bunch of high-level STEM concepts to even see the full picture.
Right now, the applications are basic. It’s just surface-level corporate automation. Profitable, sure, but boring and intellectually uninspired. It’s being led by corpo teams playing with a black box, copying each other, throwing shit at the wall to see what sticks, overtraining their models into one-trick-pony agentic utility assistants instead of exploring other paths for potential. They aren’t bringing the right minds together to actually crack open the core question: what the hell is this thing? What happened that turned my 10-year-old GPU into a conversational assistant? How is it actually coherent and sometimes useful?
The big thing people miss is what’s actually happening inside the machine. Or rather, how the inside of the machine encodes and interacts with the structure of informational paths within a phase space on the abstraction layer of reality.
It’s not just matrix math and hidden layers and transistors firing. It’s about the structural geometry of concepts, created by distinct relationships between areas of the embeddings that the matrix math produces within a high-dimensional manifold. It’s about how facts and relationships form a literal, topographical landscape inside the network’s activation space.
At its heart, this is about the physics of information. It’s a dynamical system. We’re watching entropy crystallize into order, as the model traces paths through the topological phase space of all possible conversations.
The “reasoning” CoT patterns are about finding patterns that help lead the model towards truthy outcomes more often. It’s searching for the computationally efficient paths of least action that lead to meaningfully novel and factually correct outputs. Those are the valuable attractor basins in that vast possibility space we’re trying to navigate towards.
This is the powerful part. This constellation of ideas, tying together topology, dynamics, and information theory, is the real frontier. What used to be pure philosophy is now a feasible problem for engineers and physicists to chip away at.
I still use my favorite older models more than anything else. I can’t think of any of the thinking models that really impressed me for what I do. The MoEs made out of tons of tiny models didn’t seem that great, either. Despite how many are put together, they still seem basically dumb.
Sometimes speaking to an older model feels way more human and natural; newer ones seem to be trained too much on “helpful assistant” material, and especially on previous AI dialogues, to the point where some of them will occasionally claim to be ChatGPT because that’s what’s in their training data.
Datasets should be cleaned, and everything newer than the release of ChatGPT should be carefully vetted to make sure the models are not just regurgitating generated output to the point where they all blend into the same style of speech.
Also, it seems like models should be rewarded more for saying “I’m not sure” or “I don’t know” for things that are not in their training data and context, because every one of them still has a huge tendency to be confidently wrong.
GPT-OSS:120b is really good.
Tools are powerful and make local inference on cheap hardware good enough for most people.
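As a sketch of why tools carry so much weight (assuming a llama.cpp-style llama-server exposing the OpenAI-compatible API on localhost:8080; the port, model name, and weather tool are all placeholders for illustration):

```python
from openai import OpenAI

# Assumption: a local llama.cpp llama-server (or similar) exposing the
# OpenAI-compatible chat API.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool, implemented by your own code
        "description": "Current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="local-model",
    messages=[{"role": "user", "content": "What's the weather in Oslo?"}],
    tools=tools,
)
# The small local model doesn't need to know the weather, only to emit a
# well-formed tool call; the tool does the heavy lifting.
print(resp.choices[0].message.tool_calls)
```

A small model that can call a search or calculator tool covers a lot of what people actually ask for.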
DSPy is pretty cool.
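A minimal taste of it, as a sketch (assuming DSPy pointed at a local OpenAI-compatible endpoint; the URL and model string are placeholders):

```python
import dspy

# Placeholders: point DSPy at whatever OpenAI-compatible endpoint you run.
lm = dspy.LM("openai/local-model",
             api_base="http://localhost:8080/v1", api_key="none")
dspy.configure(lm=lm)

# Declare *what* you want as a signature; DSPy builds the prompt for you.
qa = dspy.ChainOfThought("question -> answer")
print(qa(question="Why is the sky blue?").answer)
```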
Intel caught my attention at the beginning of 2025, but they seem to have given up on their software stack. I regret buying cheap Arcs for inference.
Inference on AMD is good enough for production.
The broader generative AI economy is a steaming pile of shit and we’re somehow part of it? I mean, it’s nice technology and I’m glad I can tinker around with it, but boy is it unethical. From how datasets contain a good amount of pirated stuff, to the environmental impact and the fracking and coal-burning we’ll do for the datacenters, to how it’s mostly an unsustainable investment hype and trillion-dollar merry-go-round. And then I’m not okay with the impact on society either: I can’t wait for even more slop and misinformation everywhere, and even worse customer support.
We’re somewhere low on the food chain, certainly not the main culprit. But I don’t think we’re disconnected from the reality out there either. My main take is, it depends on what we do with AI… Do we do the same unhealthy stuff with it, or do we help even out the playing field so it’s not just the mega-corporations in control of AI? That’d be badly needed for some balance.
Second controversial take: I think AI isn’t very intelligent. It regularly fails me once I give it real-world tasks, like coding; it really doesn’t do a good job with the programming issues I have. I need to double-check everything and correct it 30 times until it finally gets the maths and memory handling somewhat right (by chance), and that’s just more effort than coding it myself. And I’m willing to believe that transformer models are going to plateau, so I’m not sure if that’s ever going to change.
Edit: Judging by the votes, seems I’m the one with the controversial comment here. Care to discuss it? Too close to the truth? Or not factual? Or not a hot take and just the usual AI naysayer argument?
From how datasets contain a good amount of pirated stuff
Personally, I do not care if datasets contain pirated stuff because the copyright laws are broken anyway. If the entirety of Disney movies and Harry Potter books are somewhere inside those datasets, I can play them a song on the world’s smallest violin.
Smaller artists/writers are the ones I empathize with. I get their concern about large corporations using their stuff and making money off of it. Not entirely something that applies to local AI since most people here do this for themselves and do not make any money out of it.
to the environmental impact
That’s actually the saddest part. Those models could be easily trained with renewables alone but you know, capitalism.
Do we do the same unhealthy stuff with it, or do we help even out the playing field so it’s not just the mega-corporations in control of AI?
The thing is, those models are already out there and the people training them do not gain anything when people download their open weights/open source models for free for local use.
There’s so much cool stuff you can do with generative AI fully locally that I appreciate that they are available for everyone.
Second controversial take: I think AI isn’t very intelligent.
If we are talking about LLMs here, I don’t think that’s much of a controversial take.
Most people here will be aware that generative AI hallucinates all the time. Sometimes that’s good, like when writing stories or generating abstract images but when you’re trying to get accurate information, it’s really bad.
LLMs become much more useful when they do not have to completely rely on their training data and instead get all the information they need provided to them (e.g. RAG).
I’m a huge fan of RAG because it cites where it got the information from, meaning you can ask it a question and then continue reading in the source to confirm. Like fuzzy search but you don’t have to know the right terms.
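A minimal sketch of that retrieve-then-cite pattern (assuming sentence-transformers for embeddings; the corpus, source tags, and prompt wording are illustrative):

```python
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")

# Toy corpus of (source, text) pairs -- in practice, chunks of real documents.
docs = [
    ("manual.pdf#p3", "The pump must be primed before first use."),
    ("faq.md#12", "Warranty claims require the original receipt."),
]
doc_vecs = embedder.encode([t for _, t in docs], normalize_embeddings=True)

def retrieve(question, k=1):
    """Return the k chunks most similar to the question (cosine, since normalized)."""
    q = embedder.encode([question], normalize_embeddings=True)[0]
    best = np.argsort(doc_vecs @ q)[::-1][:k]
    return [docs[i] for i in best]

question = "Do I need a receipt for warranty?"
context = "\n".join(f"[{src}] {text}" for src, text in retrieve(question))
# Asking the model to cite the [source] tags is what lets you jump back
# into the original document and keep reading to verify.
prompt = (f"Answer using ONLY the context below, citing sources in brackets.\n"
          f"{context}\n\nQ: {question}")
print(prompt)
```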
Agreed.
Those models could be easily trained with renewables alone but you know, capitalism.
It’s really sad to read articles about how they’re planning to bulldoze Texas and do fracking and all these massively invasive things, and then we also run a lot of the compute on coal and want more nuclear plants as well. That doesn’t really sound all that progressive and sophisticated to me.
The thing is, those models are already out there and the people training them do not gain anything when people download their open weights/open source models for free for local use.
You’re right. Though the argument doesn’t translate into anything absolute. I can’t buy salami in the supermarket and justify it by saying the cow is dead anyways and someone already sliced it up. It’s down to demand and that’s really complex. Does Mark Zuckerberg really gift an open-weights model to me out of pure altruism? Is it ethical if I get some profit out of some waste or by-product of an AI war/competition? It is certainly correct that we here don’t invest money in that form. However that’s not the entire story either, we still buy the graphics cards from Nvidia and we also set free some CO2 when doing inference, even if we didn’t pay for the training process. And they spend some extra compute to prepare those public models, so the extra footprint isn’t zero, but it’s comparatively small.
I’m not perfect, though. I’ll still eat salami from time to time. And I’ll also use my computer for things I like. Sometimes it serves a purpose and then it’s justified. Sometimes I’ll also do it for fun. And that in itself isn’t something that makes it wrong.
I’m a huge fan of RAG because it cites where it got the information from
Yeah, that’s really great and very welcome. Though I think it still needs some improvement on picking sources. If I use some research mode from one of the big AI services, it’ll randomly google things, but some weird blog post or a wrong reddit comment will show up on the same level as a reputable source. So it’s not really fit for those use-cases. It’s awesome to sift through documentation, though. Or a company’s knowledgebase. And I think those are the real use-cases for RAG.
I can’t buy salami in the supermarket and justify it by saying the cow is dead anyways
That’s not comparable. You can’t compare software or even research with a physical object like that. You need a dead cow for salami; if demand increases, they have to kill more cows. For these models the training already happened, and how many people use it does not matter. It could influence whether or how much they train new models, but there is no direct relation. You can use a model forever in its current state without any further training being necessary. I’d rather compare it with Nazi experiments on human beings: their human guinea pigs already suffered and died whether or not you use the research derived from that. Doing new and proper training/research to get to a point the improper ones already reached is somewhat pointless in this case; you just spend more resources.
Though it makes sense to train new models on public domain and CC0 materials if you want end results that protect you better from getting sued over copyright violations. There are platforms that have banned AI-generated graphics because of that.
we still buy the graphics cards from Nvidia and we also set free some CO2 when doing inference
But you don’t have to. I can run small models on my NITRO+ RX 580 with 8 GB of VRAM, which I bought 7 years ago. It’s maybe not the best experience, but it certainly “works”. The last time our house drew external electricity was 34 hours ago.
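For reference, a sketch of what “works” can look like with llama-cpp-python (the model path and GPU layer count are placeholders; on an RX 580 you’d use a Vulkan/OpenCL build, or just run on CPU):

```python
from llama_cpp import Llama

# Placeholders: any small quantized GGUF that fits in 8 GB of VRAM.
llm = Llama(
    model_path="models/some-7b.Q4_K_M.gguf",
    n_gpu_layers=20,  # offload what fits; 0 = pure CPU
    n_ctx=4096,
)
out = llm("Q: What is the capital of Norway?\nA:", max_tokens=32)
print(out["choices"][0]["text"])
```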
Regarding RAG, I just hope it increases the pressure to make content machine-readable, which is also useful for non-AI applications.
I can’t buy salami in the supermarket and justify it by saying the cow is dead anyways and someone already sliced it up. It’s down to demand and that’s really complex.
You pay for the salami and thus entice them to make more. There is monetary value for them in making more salami.
Does Mark Zuckerberg really gift an open-weights model to me out of pure altruism?
I don’t really know why they initially released their models but at least they kicked off a pissing contest in the open weight space on who can create the best open model.
Meta has not released anything worthwhile in quite a while. It’s pretty much Chinese models flexing on American models nowadays.
Still, their main incentive to train those models lies with businesses subscribing to their paid plans.
However that’s not the entire story either, we still buy the graphics cards from Nvidia and we also set free some CO2 when doing inference, even if we didn’t pay for the training process.
True, I exclusively run inference on AMD hardware (I recently got a Strix Halo board) so at least I feel a little bit less bad and my inference runs almost purely on solar power. I expect that is not the norm in the local AI community though.
If I use some research mode from one of the big AI services, it’ll randomly google things, but some weird blog post or a wrong reddit comment will show up on the same level as a reputable source.
I rarely use the commercial AI services, but even locally hosted, the web search feature is not really that great.
It’s awesome to sift through documentation, though. Or a company’s knowledgebase. And I think those are the real use-cases for RAG.
Yes, I prefer to use RAG with information I provide. For example, asking a question about Godot and providing the full Godot 4 documentation along with it.
Still working on getting this automated though. I would love to have a RAG knowledge base of Wikipedia, Stackoverflow, C documentation, etc. that you can query an LLM against.
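The indexing half of that automation could look something like this sketch (the docs directory, file glob, and fixed-size chunking are placeholders; querying is the same embed-and-dot-product as any RAG setup):

```python
from pathlib import Path
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")

def index_docs(root="godot-docs/", out="kb.npz", chunk_chars=1000):
    """Walk a docs tree, split files into fixed-size chunks, embed, persist."""
    chunks, sources = [], []
    for path in Path(root).rglob("*.txt"):  # adjust the glob to your format
        text = path.read_text(errors="ignore")
        for i in range(0, len(text), chunk_chars):
            chunks.append(text[i:i + chunk_chars])
            sources.append(f"{path}#{i}")
    vecs = embedder.encode(chunks, normalize_embeddings=True)
    np.savez(out, vecs=vecs, chunks=np.array(chunks), sources=np.array(sources))

# index_docs()  # run once; then embed each question and dot-product against vecs
```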
I’m flip-flopping between running local models on my PC with solar power and using OpenAI’s free ChatGPT to drive them into ruin, which most of the time ends with me having a stupid argument with an AI.
impact on society
Local AI will likely have a long lasting impact as it won’t just go away. The companies who released them can go bankrupt, but the models stay. The hardware which runs them will get faster and cheaper over time.
I have some hope with accessibility and making FLOSS development easier/faster. Generative AI can at least quickly generate mockup code or placeholder graphics. There are game projects that release with generated assets, just like for a long time there were game projects that shipped assets which were modifications or redistributions of assets they didn’t have the rights to. They’re probably less likely to get sued over AI-generated material. It’s ethically questionable, but they can replace it with something self-made once the rest is finished.
Theoretically, every user could even generate their own assets locally, which would be very inefficient and also ethically questionable, but legally fine since they don’t redistribute them.
I like how Tesseract already uses AI for OCR and Firefox for realtime website translations on your device. Though I dunno how much they benefit from advancements in generative AI?
Though a different point/question: At what point is generative AI ethically and legally fine?
- If I manage to draw an original style myself and it just transfers it? But I’m so slow and inefficient with it that I can’t create a large amount of assets that way
- When I create the input images myself? But in a minimalist and fast manner
It still learned that style transfer from somewhere and will fill in the gaps I leave. But I created the style and decided what the image depicts. At what point is it fine?
Like coding
I actually use it often to generate shell scripts or small, simple Python tools. But does it make sense? Sometimes it does work; for very simple logic they tend to get it right. Though writing it myself would probably have been faster the last time I used it; in that moment I was just too lazy to write it myself. I don’t think I’ve ever really created something usable with it aside from practical shell scripts. Even with ChatGPT it can be an absolute waste of time to explain why the code is broken. It didn’t get at all why its implementation led to a doubled file extension and a scoping error in one function, and when I fixed them, it actually tried to revert my fixes.
Your experience with AI coding seems to align with mine. I think it’s awesome for generating boilerplate code, placeholders including images, and quick mockups. Or asking questions about some documentation. The more complicated it gets, the more it fails me. I’ve measured the time once or twice and I’m fairly sure it takes longer than doing it by hand, though I didn’t do any proper scientific study; it was just similar tasks and me running a timer. I believe with the more complicated maths and trigonometry I mentioned, I spent 90 or 120 minutes yelling at the AI until it was close, then kept the surrounding code, deleted the maths part and wrote that myself. Maybe AI is going to become more “intelligent” in the future. I think a lot of people hope that’s going to happen. But as of today we need to pay close attention to whether it fools us and is a big time and energy waster, or whether it’s actually a good fit for a given task.
Local AI will likely have a long lasting impact as it won’t just go away.
I like to believe that as well, but I don’t think there’s any guarantee they’ll continue to release new models. Sure, they can’t ever take Mistral-Nemo from us. But that’s going to be old and obsolete tech in the world of 2030 and dwarfed by any new tech then. So I think the question is more, are they going to continue? And I think we’re kind of picking up what the big companies dumped when battling and outcompeting each other. I’d imagine this could change once China and the USA settle their battle. Or multiple competitors can’t afford it any more. And they’d all like to become profitable one day. Their motivation is going to change with that as well. Or the AI bubble pops and that’s also going to have a dramatic effect. So I’m really not sure if this is going to continue indefinitely. Ultimately, it’s all speculation. A lot of things could possibly happen in the future.
At what point is generative AI ethically and legally fine?
If that’s a question about development of AI in general, it’s an entire can of worms. And I suppose also difficult to answer for your or my individual use. What part of the overall environment footprint gets attributed to a single user? Even more difficult to answer with local models. Do the copyright violations the companies did translate to the product and then to the user? Then what impact do you have on society as a single person using AI for something? Does what you achieve with it outweigh all the cost?
Firefox for realtime website translations
Yes, I think that and text to speech and speech to text are massively underrated. Firefox Translate is something I use quite often and I can do crazy stuff with it like casually browse Japanese websites.
But that’s going to be old and obsolete tech in the world of 2030 and dwarfed by any new tech then.
My point was more that, in the context of “impact on society”, the people these models replace now will stay replaced indefinitely.
a question about development of AI in general, it’s an entire can of worms
and
So I think the question is more, are they going to continue?
I just ran into https://huggingface.co/briaai/FIBO, which looks interesting in many ways. (At first glance.)
trained exclusively on licensed data
It also only works with JSON inputs. The more we split AIs into modules that can be exchanged, the more we can update pipelines module by module, tweak them…
It’s unlikely that there’ll never be new releases. It’s always interesting for newcomers to gain market penetration and show off.
What part of the overall environment footprint gets attributed to a single user?
It’s possible that there’ll be companies at some point who proudly train their models with renewable energy etc. like it’s already common in other products. It just has to be cheap/accessible enough for them to do that. Though I don’t see that for GPU production anytime soon.
Thanks.
FIBO, which looks interesting in many ways.
Indeed. Seems it has good performance, licensed training material… That’s all looking great. I wonder who has to come up with the JSON but I guess that’d be another AI and not my task. Guess I’ll put it on my list of things to try.
It’s possible that there’ll be companies at some point who proudly train their models with renewable energy
I said it in another comment: I think that’s a bit hypothetical. It’s possible, and I think we should do it. But in reality we ramp up natural gas and coal, US companies hype small nuclear reactors, and some people voiced concerns China might want to take advantage of Russia’s situation for their insatiable demand for (fossil-fuel) energy. I mean, they also invest massively in solar. It just looks to me like we’re currently headed in the other direction overall, and we’d need substantial change to maybe turn that around some time in the future. So I categorize it more towards wishful thinking.