WebHarbor - We "dock" real websites into local environments for web agents! [R]
Mirrored from r/MachineLearning for archival readability. Support the source by reading on the original site.
Hello! We're excited to share our latest community-driven research project, WebHarbor: Docking Real Websites for Evolving GUI Agent Environments!
TL;DR: 15 popular websites (Amazon, GitHub, BBC News, arXiv, Booking, Hugging Face, etc.) packaged as self-contained Flask + SQLite apps in a single Docker image, with a control plane that resets each site to a byte-identical state in <1 second. The mirrors are built by human-in-the-loop coding agents (e.g., Claude Code or Codex). We support all 643 WebVoyager tasks out of the box.
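To make the "byte-identical reset" idea concrete, here is a minimal sketch of one way such a reset could work for a single SQLite-backed site: keep a pristine snapshot file, copy it over the live database, and compare hashes. This is an assumption for illustration only, not WebHarbor's actual control plane; the names `reset_site`, `sha256_of`, `snapshot.db`, and `live.db` are all hypothetical.

```python
import hashlib
import shutil
import sqlite3
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def reset_site(snapshot: Path, live_db: Path) -> str:
    """Overwrite the live database with the pristine snapshot (hypothetical helper).

    The snapshot is copied to a temporary file and then renamed over the
    live file, so the live path never points at a half-written database.
    Returns the digest of the restored file.
    """
    tmp = live_db.with_name(live_db.name + ".tmp")
    shutil.copyfile(snapshot, tmp)
    tmp.replace(live_db)
    digest = sha256_of(live_db)
    if digest != sha256_of(snapshot):
        raise RuntimeError("reset did not reproduce the snapshot")
    return digest


def demo() -> bool:
    """Build a snapshot, mutate the live copy, reset it, and compare hashes."""
    with tempfile.TemporaryDirectory() as d:
        root = Path(d)
        snapshot, live = root / "snapshot.db", root / "live.db"

        # Create the pristine snapshot with an empty shopping cart table.
        conn = sqlite3.connect(snapshot)
        conn.execute("CREATE TABLE cart (item TEXT)")
        conn.commit()
        conn.close()

        baseline = reset_site(snapshot, live)

        # Simulate an agent changing site state during an episode.
        conn = sqlite3.connect(live)
        conn.execute("INSERT INTO cart VALUES ('book')")
        conn.commit()
        conn.close()
        mutated = sha256_of(live)

        restored = reset_site(snapshot, live)
        return mutated != baseline and restored == baseline
```

Calling `demo()` checks that the agent's write changes the file hash and that the reset brings it back to the baseline. A real deployment would also have to deal with open connections held by the running Flask app, which this sketch ignores.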
Call for contributions: Our next goal is 100+ popular websites, covering all of Online-Mind2Web (147 sites) and beyond. Two tracks:
- Contribute a new mirror site (use the coding-agent pipeline → human verify → open PR) → co-author on the final paper
- Review submitted PRs (5 reviews → co-author)
We also released useful skills for you (or your coding agent) to work with! Typically you can create a new mirror within 1 day! See more contribution details in the Contribution Guide.
Why WebHarbor? Running web agent benchmarks on the live web is a nightmare: reCAPTCHA, geo-blocks, content drift, network flakiness, and tasks that go stale within months. On top of that, you can't reset the live web, which rules out heavy RL training. Web agents need a lightweight, easy-to-reset, task-driven, evolving environment for both evaluation and training!
Related Resources:
| Name | Link |
|---|---|
| 🏠 WebHarbor Project Page | WebHarbor |
| 🤗 HuggingFace Dataset | ChilleD/WebHarbor |
| 💻 WebHarbor GitHub | Code Repo |
| 📊 Contribution Guide | Guide Details |
| 📝 Contribution Request Form | Google Form |
Suggestions and discussion are welcome!