

GOG’S AI take
What happened, exactly?
All I can find is someone used an AI image for some kind of marketing.


I mean, yes.
Steam is a scary monopoly, getting scarier.
It’s not their fault the industry (minus GOG) committed mass seppuku.
Both can be true. One can worry about Valve, and use them hesitantly, while laughing at everything else like it’s a cartoon.


It completely depends on what you use your computer for.
For example, do you game? DRM-free or not, and where are they installed? On a separate drive?
What about work stuff? Media? The larger question I’m getting at is “how much of what you do is portable, and easy to just plop on a USB stick, reinstall from the internet, or just leave on a second drive already in your desktop?”


I configured some parts via an LLM, but please don’t crucify me for that.
Slap in a spare GPU, and self-host one!
The 30B-class models are unbelievably good now, for being so small. They’re kinda where Claude was like a year ago, if not less. And (with the right backend) they aren’t expensive to host.
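If you’re curious what that looks like in practice, here’s a rough sketch using llama-cpp-python and a quantized GGUF. The filename is made up; any ~30B instruct model at ~4-bit quantization should fit in a 24GB GPU:

```python
# Rough sketch: self-hosting a ~30B model on one spare GPU with llama-cpp-python.
# The GGUF path below is illustrative; point it at whatever model you downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="models/qwen-30b-instruct-Q4_K_M.gguf",  # hypothetical filename
    n_gpu_layers=-1,  # offload every layer to the GPU
    n_ctx=8192,       # context window
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain MoE models in two sentences."}]
)
print(out["choices"][0]["message"]["content"])
```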


Better yet, download Qwen 3.5/3.6, with a “raw” notepad like Mikupad. Try it yourself:
https://huggingface.co/ubergarm/Qwen3.6-27B-GGUF
https://github.com/lmg-anon/mikupad
One might observe:
Chat formatting, and how janky the “thinking” block is.
How words are broken up into tokens, not characters.
How particularly funky that gets with numbers.
Precisely how sampling “randomizes” the answers by visualizing “all possible answers” with the logprobs display.
And, thus, precisely how and why carb counting in ChatGPT fails, yet a measly local LLM on a desktop/phone could get it right with a little tooling or adjustment.
This is exactly what OpenAI/Anthropic don’t want you to do. They want users dumb and tethered, like a cloud subscription or social media platform. Not cognizant of how tools they are peddling as magic lamps actually work. And why, and how, they’re often stupid.
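If you’d rather poke at the tokenization part without a UI, a few lines of Python show the same thing. The model name is just an example checkpoint with a public tokenizer:

```python
# Minimal sketch: see how words and numbers get chopped into tokens.
# Any Qwen (or other) tokenizer works here; this one is just an example.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

for text in ["strawberry", "A slice of bread has 13.5g of carbs."]:
    ids = tok(text, add_special_tokens=False)["input_ids"]
    print(text, "->", [tok.decode([i]) for i in ids])
```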


See, my Windows partition starts instantly. TBH it’s faster than Linux, which takes an extra second to initialize SDDM and then network connectivity.
…Perhaps because it’s so neutered. It’s not really a fair comparison, as Windows is a narrow-focus OS for me, a tool for running things, to the point that I don’t trust it for anything security-sensitive.


Perhaps they are talking handhelds, specifically?
Look. I am the biggest, most shameless CachyOS fanboy you will find. It’s like 90% of my desktop time, has been for years.
But I’ve benchmarked a few games on Windows and Linux, Proton and native, here and there, and Windows still has an advantage sometimes. Cyberpunk 2077 was the biggest outlier for Proton (e.g., faster on Windows, enough to visibly affect which settings I can manage on my 3090).
And many native ports are still truly awful, often where performance equates to simulation time, like modded Stellaris or Rimworld.
Mind you, that’s not always the case. Proton is faster in many games, and (for example) anything Java, like Minecraft or Starsector, is just hilariously faster on Linux.
The caveats:
My Windows 11 is neutered to hell. It’s a barren wasteland. Even Defender is disabled.
I’m running Nvidia.
Some of my testing is aging now.
Still, I am a Linux shill, and think the headline is a bit dramatic. Stripped Windows is still faster in plenty of realistic scenarios.
Since they’re referencing SteamOS, they’re probably talking about stock mobile systems, where the overhead from that mountain of background junk in Windows is much more painful.


I was there early in CachyOS’s history, and it was still great. The only huge issues I can remember (that weren’t totally self-inflicted) are upstream Nvidia problems, and some ambiguous manual package installs/uninstalls when some stuff was shuffled and renamed. But the latter just taught me to watch the update log, as I should.
That, and I keep an LTS kernel around for whenever something minor breaks, which their setup makes totally painless.
I agree with others. CachyOS is the most stable Linux distro I’ve ever used, to the point where my laptop and desktop installs are years old now. It helps that the Linux desktop, overall, is in a good place, but still.


I think the problem is at the other end: the ads.
And platforms.
Some AI ad of Tom Hanks peddling a supplement, or a sexy ad of AI Taylor Swift, shouldn’t be distributed en masse in the first place just because an algorithm or ad engine picked it up as engagement bait. It’s insane! There is nothing normal about it, and it’s about time we stop pretending the screwed-up platforms profiting off this stuff are “free speech” and acceptable.
…Because scammers are always gonna scam. But they can only do this because the platforms are pouring fuel on the fire.


It is! 143GB last I checked. I’m on 128GB RAM + 3090, 1 NUMA node, so I think it’s juuust barely too tight. But it should be perfect with a few of the “sparsest” MoEs quantized.
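Back-of-the-envelope, assuming ~24GB on the 3090 and a rough guess at overhead:

```python
# Rough fit check; the overhead figure is a guess (OS + desktop + KV cache),
# not a measurement.
weights_gb  = 143
ram_gb      = 128
vram_gb     = 24   # RTX 3090
overhead_gb = 12
usable_gb = ram_gb + vram_gb - overhead_gb
print(usable_gb, usable_gb - weights_gb)  # ~140GB usable, a few GB short of 143
```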
If KTransformers supports something like that, I may have to finally check it out, since v4 won’t need many esoteric features.


Actually… I have quite a negative perception of GIMP. I’m primarily a Linux user, but I just remember it as something that either felt obtuse to use, was missing something I needed, or was sluggish for the narrower processing I was trying to do.
AFAIK that perception is more pronounced outside Linux.
I don’t care about a brand either way. But if the GIMP project is ready, I think a “fresh start” to draw in users without any preconceived notions is a good thing.


Not to speak of the new attention scheme and the (IIRC) MLP changes.
I’m very much looking forward to ik_llama.cpp implementing it. I don’t think I can quite fit Flash on my rig (hence no KTransformers for me), but a little quantization of the sparse layers and it’d be perfect.


This is commonly cited, but not strictly true.
Prompt processing is completely compute-limited. And at high batch sizes, where the weights are read once for many tokens generated in parallel, token generation is also quite compute-limited. Obviously you want enough bandwidth to match the compute, but it’s very compute-heavy.
You can see this for yourself. Try ~10 prompts in parallel on a CPU in llama.cpp, and it will slow to a crawl, while a GPU with a narrow bus won’t slow down much.
Training is a bit more complicated, but that’s not doable on CPUs anyway.
Now, local inference (aka a batch size of 1), past prompt processing, is heavily bandwidth limited. This is why hybrid inference works alright on CPUs. But this doesn’t really apply to servers, which process many users in parallel with each “pass”.
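If you want to reproduce that parallel-prompt test, something like this works against a locally running llama.cpp server. It assumes llama-server is already up on port 8080 with its OpenAI-compatible endpoint:

```python
# Rough sketch: fire N identical completion requests in parallel at a local
# llama.cpp server and time the total. On CPU, throughput collapses as N grows;
# a GPU holds up much better, because batched generation is compute-bound.
import time
import requests
from concurrent.futures import ThreadPoolExecutor

URL = "http://localhost:8080/v1/completions"  # assumed llama-server address

def one_request(_):
    r = requests.post(URL, json={"prompt": "Write a haiku about GPUs.",
                                 "max_tokens": 64})
    return r.json()

for n in (1, 10):
    start = time.time()
    with ThreadPoolExecutor(max_workers=n) as pool:
        list(pool.map(one_request, range(n)))
    print(f"{n} parallel prompts: {time.time() - start:.1f}s total")
```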


I’ve heard hints they sometimes give GPU farms busywork, to meet utilization targets apparently stipulated in sales contracts.


No. Not even close. Non-US models are trained (and run) on peanuts compared to big US models, because they don’t have mega GPU farms and have no other option. Deepseek in particular went all-in on software architecture efficiency.
…Ironically, the Nvidia GPU embargo was the best thing that ever happened to the Chinese devs. It made them thrifty.
Many tried to warn US regulators of this, but they had AI Bros whispering in their ears. The US tech system is just too screwed up, I guess.


…I mean, yeah? It’s obviously developed for the Chinese market.
But that’s theoretical, for now. No CPU backend I can find supports DSV4, and DeepSeek hasn’t contributed anything yet.


Just not power/cost efficiently on CPU only, is what I meant. CPUs don’t have the compute for batching (running generation requests in parallel). You need an accelerator, like Huawei’s, to be economical.
It’s fine for local inference, of course.


I just meant for mass inference serving.
Yeah, I haven’t seen much in the way of bitnet training savings yet, like regular old QAT. It does appear that Deepseek is finetuning their MoEs in a 4-bit format now, though.


What’s left unsaid is the software architecture is extremely interesting, and efficient.
Ironically, the Nvidia embargo was the best thing to ever happen to the Chinese labs (which Nvidia tried to tell the US govt). It forced them to get thrifty, unlike US labs which (allegedly) fill some GPU farms with busywork for the appearance of high utilization.


My guess: the “primitive” go-back-to-manly-cavemen manosphere stuff.
It sounds like some fantasy of what “should” happen to a pre-civilization woman.
Mix it up with ovulation, and I bet there’s a distant grain of truth there. If I were an isolated guy listening to podcast bros all day, I might even buy it.