I have no idea how this is set up to work technically, but most of the heavy lifting is gonna be on the GPU. I’m not sure that it matters much whether the browser is what’s pushing data to the GPU or some other package.
Most people probably don’t have a dedicated GPU, and an iGPU is probably not powerful enough to run an LLM at a decent speed. Also, a decent model needs something like 20GB of RAM, which most people don’t have.
It doesn’t just require 20GB of RAM, it requires that in VRAM, which is a much higher barrier to entry.
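As a rough back-of-the-envelope check on figures like "20GB", a model's memory footprint is roughly parameter count times bytes per weight, plus some overhead for activations and KV cache. A minimal sketch (the 20% overhead factor and the example model sizes are assumptions for illustration, not from this thread):

```python
def model_memory_gb(params_billion: float, bytes_per_param: float,
                    overhead: float = 1.2) -> float:
    """Rough VRAM estimate in GB: weights x precision, scaled by an
    assumed ~20% overhead for activations and KV cache."""
    return params_billion * 1e9 * bytes_per_param * overhead / 1e9

# A 13B-parameter model at fp16 (2 bytes per weight) lands around 31 GB,
# well beyond typical consumer VRAM.
print(round(model_memory_gb(13, 2), 1))

# The same model quantized to 4-bit (0.5 bytes per weight) drops to
# roughly 8 GB, within reach of a midrange dedicated GPU.
print(round(model_memory_gb(13, 0.5), 1))
```

This is why quantization matters so much for local inference: the precision of the weights, not just the parameter count, determines whether a model fits in VRAM at all.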