I write things on my blog sometimes https://fasterandworse.com/

  • 16 Posts
  • 267 Comments
Joined 1 year ago
cake
Cake day: June 29th, 2023

help-circle






  • I might be wrong but this sounds like a quick way to make the web worse by putting a huge computational load on your machine for the purpose of privacy inside customer service chat bots that nobody wants. Please correct me if I’m wrong

    WebLLM is a high-performance in-browser LLM inference engine that brings language model inference directly onto web browsers with hardware acceleration. Everything runs inside the browser with no server support and is accelerated with WebGPU.

    WebLLM is fully compatible with OpenAI API. That is, you can use the same OpenAI API on any open source models locally, with functionalities including streaming, JSON-mode, function-calling (WIP), etc.

    We can bring a lot of fun opportunities to build AI assistants for everyone and enable privacy while enjoying GPU acceleration.

    https://github.com/mlc-ai/web-llm