About This Tool

How to Use
  1. Select an AI model from the dropdown (smaller = faster download)
  2. Click 'Load Model' to download and cache the model in your browser
  3. Wait for the download to complete (700MB–2GB depending on model)
  4. Type your message in the input box and press Enter or click Send
  5. Optionally expand 'System Prompt' to customize the assistant's behavior
  6. Use 'Clear Chat' to start a fresh conversation
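The chat flow in steps 4–6 can be sketched as a small state holder. This is an illustrative sketch only; the class and method names (`ChatSession`, `buildMessages`, `record`, `clear`) are hypothetical and the tool's actual internals may differ.

```javascript
// Hypothetical sketch of the state behind steps 4-6. The system prompt,
// if set, is prepended to every request; 'Clear Chat' resets the history
// but keeps the system prompt.
class ChatSession {
  constructor(systemPrompt = "") {
    this.systemPrompt = systemPrompt;
    this.history = [];
  }

  // Build the messages array sent to the model for one turn.
  buildMessages(userText) {
    const messages = [];
    if (this.systemPrompt) {
      messages.push({ role: "system", content: this.systemPrompt });
    }
    messages.push(...this.history, { role: "user", content: userText });
    return messages;
  }

  // After the model replies, append both sides to the history.
  record(userText, reply) {
    this.history.push(
      { role: "user", content: userText },
      { role: "assistant", content: reply },
    );
  }

  // 'Clear Chat': fresh conversation, same system prompt.
  clear() {
    this.history = [];
  }
}
```

Because everything above runs in the page, the history never leaves the browser.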

Common Use Cases
  • Private AI conversations with no data sent to any server
  • Offline AI assistance when internet is unavailable
  • Testing LLM behavior without API costs
  • Experimenting with different system prompts
  • Sensitive questions you don't want sent to cloud AI services

Tips & Tricks
  • The model is cached after the first download — subsequent visits load instantly
  • Smaller models (Llama 3.2 1B) are faster but less capable
  • Larger models (Phi 3.5 Mini) give better answers but need more RAM/VRAM
  • Requires Chrome 113+ or Edge 113+ with WebGPU support
  • Close other GPU-heavy tabs if the model fails to load
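The WebGPU requirement in the last two tips can be checked up front. A minimal sketch, assuming nothing about this tool's code (the function names are illustrative); the `navigator.gpu` and `requestAdapter()` calls are the standard WebGPU API:

```javascript
// Quick synchronous check: does this browser expose the WebGPU API at all?
// Chrome 113+ and Edge 113+ do.
function supportsWebGPU(nav = globalThis.navigator) {
  return !!nav && "gpu" in nav;
}

// Deeper async check: request an actual adapter. Getting null here even
// though "gpu" exists often means the GPU is unavailable or busy -- the
// case where closing other GPU-heavy tabs can help.
async function getWebGPUAdapter(nav = globalThis.navigator) {
  if (!supportsWebGPU(nav)) return null;
  try {
    return await nav.gpu.requestAdapter();
  } catch {
    return null;
  }
}
```

Running a check like this before 'Load Model' gives a clearer error than letting the download fail partway through.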

Related Tools