Guide: Choosing Your LLM Models
Nanobrowser's multi-agent system allows you to configure different Large Language Models (LLMs) for each agent, enabling you to balance performance, cost, and privacy.
Understanding the Agents
Nanobrowser primarily uses two types of agents:
- Planner: The high-level strategist. It analyzes your task, observes the current state of the webpage, and decides on the next steps. A more powerful and intelligent model is recommended for the Planner.
- Navigator: The executor. It takes instructions from the Planner and translates them into specific browser actions, such as clicking an element or typing text. A faster, more cost-effective model often works well for the Navigator.
Recommended Configurations
Here are some recommended model setups to get you started.
High-Performance Configuration
This setup prioritizes accuracy and advanced reasoning, making it ideal for complex, multi-step tasks.
- Planner:
Claude 3.5 Sonnet
orGPT-4o
- These models excel at reasoning, planning, and understanding complex instructions.
- Navigator:
Claude 3.5 Haiku
- Offers an excellent balance of speed, cost, and capability for executing navigation tasks efficiently.
Cost-Effective Configuration
This setup is designed to minimize API costs while still providing reasonable performance for simpler tasks.
- Planner:
Claude 3 Haiku
orGPT-4o
- These models offer good performance at a lower price point but may require more iterations for complex tasks.
- Navigator:
Gemini 1.5 Flash
orGPT-4o-mini
- These models are lightweight, fast, and highly cost-effective, making them suitable for basic navigation.
Note: Cost-effective configurations may produce less stable results and might require more attempts to complete complex tasks successfully.
Using Local Models with Ollama
For maximum privacy and zero API costs, you can run models locally using Ollama or another OpenAI-compatible provider.
-
Recommended Local Models:
Qwen2:7b
Llama3:8b
Mistral
-
Prompt Engineering for Local Models: Local models often require more specific and clearly defined prompts to perform well. When using them, it's best to:
- Avoid vague or high-level commands.
- Break down complex tasks into smaller, detailed steps.
- Provide clear context and specific constraints for the task.
Tip: Experiment with different model combinations! If you find a setup that works great, share it with the community on our Discord.