I’ve read some of Ed Zitron’s long posts on why the AI industry is a bubble that will never be profitable (and will bring down a lot of companies and investors). One of his recurring themes is that the AI companies are trying to capture growing market share in an industry where their marginal profits are still negative, so any increase in revenue necessarily increases their cost of providing the service.
But some of the comments in various Hacker News threads are dismissive, saying that each new generation of models lowers the cost of inference, so that with sufficient customer volume, the companies running the models can make enough profit on inference to recoup the staggering up-front capital expenditures it took to build out the data centers, train their models, etc.
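As I understand it, the whole disagreement comes down to the sign of the per-token margin. A toy sketch in Python, with every number invented just to show the shape of the argument:

```python
# Every number here is made up, purely to illustrate the argument's shape.
capex = 50e9            # up-front spend: data centers, training runs, etc. ($)
price_per_mtok = 10.0   # revenue per million tokens served ($)
cost_per_mtok = 6.0     # marginal cost to serve a million tokens ($)

margin = price_per_mtok - cost_per_mtok
if margin > 0:
    breakeven_mtok = capex / margin
    print(f"break-even volume: {breakeven_mtok:,.0f} million tokens")
else:
    # Zitron's claim, roughly: if the margin is negative, more volume
    # just digs the hole deeper, and no amount of scale recoups the capex.
    print("no break-even volume exists")
```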
It’s all pretty confusing to me. So for those of you who are familiar with the industry, I have several questions:
- Is the cost of running a given pretrained model going down? That is, are there hardware and software improvements that make the same model cheaper to serve over time, even though the model itself doesn’t change?
- Is the cost of performing a particular task at a particular quality level going down across model releases (i.e., a smaller model of the current generation performing similarly to a bigger model of the previous generation, so that the same task now costs less)?
- Is the cost of running the largest flagship frontier models going down for any given task? Or does the cutting-edge, show-off work keep getting more expensive, with the companies arguing that the improvement in performance justifies the cost increase?
I suspect the discussion around this is so muddled online because the answers differ depending on which of these three questions is meant by “is running an AI model getting cheaper over time?” And the data isn’t easy to synthesize, because each model has different token prices and consumes a different number of tokens per query.
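To make that concrete, here’s the kind of normalization I have in mind, with entirely made-up prices and token counts: the per-token price alone doesn’t answer any of the three questions, because the tokens consumed per task differ between models too.

```python
# Toy per-task cost comparison. Prices and token counts are placeholders,
# not real figures from any provider.
models = {
    # name: ($/M input tokens, $/M output tokens, input toks/task, output toks/task)
    "big-previous-gen":  (15.0, 60.0, 2_000,   800),
    "small-current-gen": ( 3.0, 15.0, 2_000, 1_500),  # cheaper per token, but chattier
}

for name, (in_price, out_price, in_tok, out_tok) in models.items():
    cost = (in_tok * in_price + out_tok * out_price) / 1e6
    print(f"{name}: ${cost:.4f} per task")
```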
But I wanted to hear from people who are knowledgeable about these topics.


Well, I wonder if the frontier ends up looking like supersonic commercial flight: technology that continues to exist but never really gets used, because at the actual cost of providing the service there isn’t enough of a consumer market, and the alternatives that aren’t as good are still much, much cheaper.
Not everyone needs a Lamborghini or Concorde to get where they are going.
Work is pushing us to use cloud models, and I haven’t had time to experiment beyond a few limited tests. Qwen 3.6 ~30B at Q4 runs pretty well on 36 GB of RAM. It’s a very capable model. It did choke when I tried to connect Cline to it for Java dev, but when I just conversationally ask it to write Python scripts, it works pretty well.
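For anyone who wants to try the same thing, a minimal sketch of pointing a script at a local model through an OpenAI-compatible endpoint (Ollama and llama.cpp’s llama-server both expose one; the URL, port, and model tag below are placeholders, adjust for your setup):

```python
# Minimal sketch of talking to a locally served model through an
# OpenAI-compatible endpoint. The base_url, port, and model tag are
# assumptions -- change them to match your local server.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")
resp = client.chat.completions.create(
    model="qwen3:30b",  # whatever tag your local server exposes
    messages=[{"role": "user", "content": "Write a Python script that prints the date."}],
)
print(resp.choices[0].message.content)
```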
I can see a future where a goodly amount of RAM and an AI chip can produce the results we are currently getting only from cloud models.
I agree with that. Note, though, that Lamborghinis are still being built, operated, and maintained, while Concordes are not.
I’m wondering whether the future of AI looks like the last 50 years of aviation, where there weren’t many generational advances because the cost of developing new things became prohibitively expensive, but where the commoditization of what had already been invented meant the average person’s experience really isn’t that different between 1976 and 2026. In other words, the sweet spot for cost effectiveness isn’t at the bleeding edge at all.
And to satisfy my own curiosity along this line of thinking, I wanted to know whether the day-to-day cost of running these models is actually going down, and in which contexts.