agent-browser Snapshot & Screenshots: Make Your AI Understand the Page
Accessibility Tree extraction, interactive element filtering, annotated screenshots, and diff-based page change detection.
Snapshot: The AI's Eyes
Traditional scrapers locate elements by CSS selectors. AI agents need semantic understanding — what is this page doing, what elements can I interact with, and how are they related?
The snapshot command outputs the browser's Accessibility Tree, the semantic structure normally used by assistive technology such as screen readers. An AI agent can read this output directly:
# Basic snapshot
agent-browser snapshot
# @e1 heading "Product List"
# @e2 link "Item A - $29" url=/product/a
# @e3 link "Item B - $39" url=/product/b
# @e4 button "Load More"
# @e5 textbox "Search products"
# @e6 button "Search"
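Because each line in the sample output follows a `@ref role "name"` shape, standard text tools can summarize it before it reaches the model. The sketch below is a minimal illustration that assumes that line shape holds; `role_counts` is a hypothetical helper, not part of agent-browser.

```shell
# Count elements per role in snapshot text read from stdin.
# Assumes each line looks like: @e1 role "name" [attrs]
role_counts() {
  awk '/^@e[0-9]+ / { count[$2]++ } END { for (r in count) print r, count[r] }' | sort
}

# Sample input copied from the output shown above
role_counts <<'EOF'
@e1 heading "Product List"
@e2 link "Item A - $29" url=/product/a
@e3 link "Item B - $39" url=/product/b
@e4 button "Load More"
@e5 textbox "Search products"
@e6 button "Search"
EOF
```

In practice the helper would read live output, for example `agent-browser snapshot | role_counts`.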
Interactive Elements Only
On content-heavy pages, snapshot -i shows only actionable elements:
agent-browser snapshot -i
# @e1 link "Item A - $29"
# @e2 link "Item B - $39"
# @e3 button "Load More"
# @e4 textbox "Search products"
# @e5 button "Search"
| Option | Effect | Use Case |
|---|---|---|
| -i / --interactive | Interactive elements only | Reduce noise for AI decisions |
| -c / --compact | Remove empty structural elements | Smaller output |
| -d <n> / --depth <n> | Limit tree depth | Deep page structures |
| -s <sel> / --selector <sel> | Scope to CSS selector | Focus on specific area |
| -u / --urls | Include link URLs | Link extraction |
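To make the idea of interactive filtering concrete, the sketch below post-filters a saved full snapshot by role. It is only a conceptual approximation: the role list is an assumption, `interactive_only` is a hypothetical helper, and the CLI's own `-i` logic may differ (for instance, the sample output above shows refs being renumbered, which this filter does not do).

```shell
# Keep only lines whose role is commonly interactive.
# The role list here is an assumption, not the CLI's actual rule set.
interactive_only() {
  awk '$2 == "link" || $2 == "button" || $2 == "textbox" || $2 == "checkbox" || $2 == "combobox"'
}

# Sample input in the line shape used by this article
interactive_only <<'EOF'
@e1 heading "Product List"
@e2 link "Item A - $29" url=/product/a
@e4 button "Load More"
@e5 textbox "Search products"
EOF
```

When the browser session is live, prefer `snapshot -i` itself; a filter like this is mainly useful on snapshots that were already saved to disk.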
Combined Options
agent-browser snapshot -i -d 5 -s "#main"
agent-browser snapshot -i -c -u
Annotated Screenshots
--annotate overlays numbered labels on the screenshot. Each number matches a ref:
agent-browser screenshot --annotate
# [1] @e1 button "Submit"
# [2] @e2 link "Home"
# [3] @e3 textbox "Email"
This gives AI agents both visual and text-based page understanding simultaneously.
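When a vision model answers with a label number from the image, that number has to be mapped back to a ref before acting on it. A minimal sketch of that lookup, assuming the legend lines have the `[n] @ref role "name"` shape shown above; `ref_for_label` is a hypothetical helper.

```shell
# Print the ref for a given label number from an annotated-screenshot legend.
# Assumes legend lines look like: [1] @e1 button "Submit"
ref_for_label() {
  # $1 = label number; legend text is read from stdin
  awk -v n="[$1]" '$1 == n { print $2; exit }'
}

# Look up label 3 in the sample legend from above
ref_for_label 3 <<'EOF'
[1] @e1 button "Submit"
[2] @e2 link "Home"
[3] @e3 textbox "Email"
EOF
```

With the sample legend, label 3 resolves to `@e3`, the Email textbox.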
Screenshot Options
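# Capture the full page rather than only the viewport
agent-browser screenshot --full
# Write screenshots to a specific directory
agent-browser screenshot --screenshot-dir ./shots
# Save as JPEG at quality 80
agent-browser screenshot --screenshot-format jpeg --screenshot-quality 80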
Diff: Page Change Detection
Snapshot Diff
# Compare against last snapshot
agent-browser diff snapshot
# Compare against saved baseline
agent-browser diff snapshot --baseline before.txt
# Scoped comparison
agent-browser diff snapshot --selector "#pricing"
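Since snapshots are plain text, the underlying idea can be illustrated with a line-based comparison of two saved files. The sketch below uses POSIX `diff` as a stand-in; it is not agent-browser's diff algorithm, and `snapshot_changes` is a hypothetical helper.

```shell
# Line-based comparison of two saved snapshot files using POSIX diff.
# This is a stand-in to show the idea; it is not agent-browser's diff algorithm.
snapshot_changes() {
  # $1 = baseline file, $2 = current file
  # Lines starting with "<" were removed; lines starting with ">" were added.
  diff "$1" "$2" | grep '^[<>]'
}

# Two tiny sample snapshots that differ in one price
before=$(mktemp) && after=$(mktemp)
printf '@e1 link "Item A - $29"\n@e2 link "Item B - $39"\n' > "$before"
printf '@e1 link "Item A - $25"\n@e2 link "Item B - $39"\n' > "$after"
snapshot_changes "$before" "$after"
rm -f "$before" "$after"
```

Only the Item A line appears in the output, once as removed and once as added, which is the kind of signal a monitoring job cares about.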
Visual Diff
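# Pixel comparison against a baseline image
agent-browser diff screenshot --baseline before.png
# Save the diff image to a chosen path
agent-browser diff screenshot --baseline before.png -o diff.png
# Adjust the color threshold (0-1)
agent-browser diff screenshot --baseline before.png -t 0.2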
URL Comparison
# Compare the snapshots of two URLs
agent-browser diff url https://v1.example.com https://v2.example.com
# Also compare them visually
agent-browser diff url https://v1.example.com https://v2.example.com --screenshot
Practical Use Cases
Price Monitoring
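agent-browser open https://store.example.com/products
# Compare against an earlier saved baseline while the page is still open
agent-browser diff snapshot --baseline prices-20260629.txt
# Save today's snapshot as the next baseline
agent-browser snapshot -i -c > prices-$(date +%Y%m%d).txt
agent-browser close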
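For price tracking it often helps to reduce each saved snapshot to name and price pairs before comparing days. A minimal sketch, assuming link names end with ` - $<amount>` as in the sample output earlier in this article; `extract_prices` is a hypothetical helper.

```shell
# Turn snapshot link lines into "name,price" rows.
# Assumes names end with " - $<amount>", e.g. @e1 link "Item A - $29"
extract_prices() {
  sed -n 's/^@e[0-9]* link "\(.*\) - \$\([0-9.]*\)".*$/\1,\2/p'
}

# Sample input in the line shape used by this article
extract_prices <<'EOF'
@e1 link "Item A - $29"
@e2 link "Item B - $39"
@e3 button "Load More"
EOF
```

Lines that are not priced links, such as the Load More button, are dropped, so the result can be appended to a CSV or compared across days.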
Page Change Alerts
# Establish baseline
agent-browser open https://example.com/pricing
agent-browser screenshot pricing-baseline.png
agent-browser snapshot > pricing-baseline.txt
agent-browser close
# Periodic check
agent-browser open https://example.com/pricing
agent-browser diff snapshot --baseline pricing-baseline.txt
agent-browser close
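To turn the periodic check into an alert, a scheduled job needs a yes/no answer it can branch on. The sketch below gets one by comparing two saved snapshot files byte for byte with `cmp`; it is a simple stand-in that does not rely on any particular exit code from `diff snapshot`, and `page_unchanged` is a hypothetical helper.

```shell
# Report whether the current snapshot text differs from a saved baseline.
# Returns 0 when unchanged, 1 when changed, so it can gate an alert step.
page_unchanged() {
  # $1 = baseline file, $2 = current snapshot file
  cmp -s "$1" "$2"
}

# Two sample snapshots that differ in one price
base=$(mktemp) && now=$(mktemp)
printf '@e1 heading "Pricing"\n@e2 link "Pro - $20"\n' > "$base"
printf '@e1 heading "Pricing"\n@e2 link "Pro - $25"\n' > "$now"
if page_unchanged "$base" "$now"; then
  echo "no change"
else
  echo "pricing page changed"   # replace with a mail or webhook call
fi
rm -f "$base" "$now"
```

A byte-for-byte check is strict: any cosmetic change in the snapshot text triggers it, so scoping the snapshot with `-s` before saving keeps the noise down.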
Data Annotation
agent-browser open https://example.com/data-table
agent-browser screenshot --annotate --full
# Team can discuss by ref: @e12 price data looks wrong
agent-browser close
Using with Proxies
export HTTP_PROXY=http://user:pass@proxy.16yun.cn:8888
export HTTPS_PROXY=http://user:pass@proxy.16yun.cn:8888
agent-browser open https://example.com
agent-browser screenshot --full --annotate
16Yun's Crawler Proxy routes requests through residential IPs, which reduces blocking and makes it more likely that snapshots capture the real page content.
Summary
| Feature | Command | Use |
|---|---|---|
| Full page structure | snapshot | AI page understanding |
| Interactive only | snapshot -i | Agent decision making |
| Labeled screenshot | screenshot --annotate | Visual + text dual channel |
| Structure diff | diff snapshot | Content monitoring |
| Visual diff | diff screenshot | UI monitoring |