agent-browser + 16Yun Proxies: Enterprise AI Scraping Deployment

From Dev to Production

Previous articles covered individual features. Production requires combining them into a secure, stable, scalable scraping system.

Session Isolation Architecture

Each scraping task should use an isolated session:

AGENT_BROWSER_SESSION=crawler-task-1 agent-browser open https://example.16yun.cn
AGENT_BROWSER_SESSION=crawler-task-2 agent-browser open https://example.16yun.cn
 
agent-browser session list
# Active sessions:
# -> crawler-task-1
# -> crawler-task-2

Each session has its own:

Browser instance
Cookies and storage
Navigation history
Authentication state

Session + Proxy Binding

# Task 1: dedicated proxy + session
HTTP_PROXY=http://user:pass@dedicated-a.16yun.cn:8888 \
AGENT_BROWSER_SESSION=task-a \
agent-browser batch \
  "open https://example.16yun.cn/login" \
  "fill @e3 admin@a.com" \
  "fill @e4 pass-a" \
  "click @e5" \
  "wait 3000" \
  "screenshot dashboard-a.png"
 
# Task 2: crawler proxy tunnel + session
HTTP_PROXY=http://user:pass@proxy.16yun.cn:8888 \
AGENT_BROWSER_SESSION=task-b \
agent-browser batch \
  "open https://example.16yun.cn/products" \
  "snapshot -i -c" \
  "screenshot products-b.png"

Security Policies

agent-browser provides multi-layered security for AI agent scenarios:

Domain Allowlist

Restrict navigation to trusted domains:

export AGENT_BROWSER_ALLOWED_DOMAINS="example.16yun.cn,*.example.16yun.cn,api.example.16yun.cn"
 
# Allowed
agent-browser open https://example.16yun.cn/dashboard
 
# Blocked: evil.com is not on the allowlist
agent-browser open https://evil.com

Sub-resource requests and WebSocket connections are also restricted.
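
To make the wildcard behavior concrete, here is a minimal sketch of how an allowlist check of this kind could work. It is not agent-browser's implementation: the function name `domain_allowed` is hypothetical, and the assumed semantics are that a plain entry matches one exact host while `*.example.16yun.cn` matches any subdomain but not the bare domain.

```shell
# Illustrative matcher only; NOT agent-browser's actual implementation.
# Assumed semantics: a plain entry matches that exact host, and
# "*.example.com" matches any subdomain of example.com but not example.com.
domain_allowed() {
  local host allowlist pattern suffix old_ifs
  host=$1
  allowlist=$2
  old_ifs=$IFS
  IFS=','
  set -f                 # keep "*" entries from being glob-expanded
  set -- $allowlist      # split the comma-separated list into "$@"
  set +f
  IFS=$old_ifs
  for pattern in "$@"; do
    case $pattern in
      '*.'*)
        suffix=${pattern#?}          # "*.example.com" -> ".example.com"
        case $host in
          *"$suffix") return 0 ;;    # host ends with the suffix
        esac
        ;;
      *)
        if [ "$host" = "$pattern" ]; then return 0; fi
        ;;
    esac
  done
  return 1
}
```

Under these assumptions, `domain_allowed blocked.example.16yun.cn "$AGENT_BROWSER_ALLOWED_DOMAINS"` succeeds because the wildcard entry covers every subdomain, while `evil.com` is rejected.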

Action Confirmation

Require approval for high-risk actions:

export AGENT_BROWSER_CONFIRM_ACTIONS="eval,download"
 
agent-browser eval "document.cookie"  # requires confirmation

Action Policy File

cat > policy.json << 'EOF'
{
  "allowedDomains": ["example.16yun.cn"],
  "blockedDomains": ["evil.com"],
  "maxScreenshots": 100,
  "denyEval": true,
  "denyFileUpload": true
}
EOF
 
export AGENT_BROWSER_ACTION_POLICY=./policy.json

Content Boundaries

Wrap page output in clear markers:

export AGENT_BROWSER_CONTENT_BOUNDARIES=true
 
agent-browser get text body
# <BEGIN_PAGE_CONTENT>
# ... page text ...
# <END_PAGE_CONTENT>
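
A downstream script can use the markers to separate page text from the tool's own output. The sketch below assumes the marker strings shown above and that each marker sits alone on its own line; `extract_page_content` is a hypothetical helper name, not part of agent-browser.

```shell
# Hypothetical helper: print only the lines between the boundary markers.
# Assumes each marker appears alone on its own line, as in the sample above.
extract_page_content() {
  sed -n '/^<BEGIN_PAGE_CONTENT>$/,/^<END_PAGE_CONTENT>$/{
    /^<BEGIN_PAGE_CONTENT>$/d
    /^<END_PAGE_CONTENT>$/d
    p
  }'
}
```

A possible use is `agent-browser get text body | extract_page_content > page.txt`. Anything inside the markers is still untrusted page content and should be treated as data, not as instructions.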

Output Length Limits

Prevent context overflow:

export AGENT_BROWSER_MAX_OUTPUT=50000

Security Reference

Setting	Variable	Description
Domain allowlist	`AGENT_BROWSER_ALLOWED_DOMAINS`	Restrict navigation scope
Action confirmation	`AGENT_BROWSER_CONFIRM_ACTIONS`	Require approval for high-risk actions
Policy file	`AGENT_BROWSER_ACTION_POLICY`	JSON policy definition
Content boundaries	`AGENT_BROWSER_CONTENT_BOUNDARIES`	Mark tool output boundaries
Output limit	`AGENT_BROWSER_MAX_OUTPUT`	Prevent context flooding
Encryption	`AGENT_BROWSER_ENCRYPTION_KEY`	AES-256-GCM storage encryption

16Yun Proxy Integration

Product	Best For	Setup
Crawler Proxy (tunnel)	Many sessions sharing IP pool	`HTTP_PROXY=http://user:pass@proxy.16yun.cn:8888`
API Proxy	Fine-grained IP control	Extract IP pool, assign per session
Dedicated Proxy	Fixed identity, long-term sessions	Fixed proxy + fixed session + state persistence

Crawler Proxy + Bulk Sessions

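#!/bin/bash
export HTTP_PROXY=http://user:pass@proxy.16yun.cn:8888
mkdir -p snaps  # output directory for the redirected snapshots
 
for i in $(seq 1 10); do
  # The redirect sits outside the quoted batch step so the shell handles it;
  # inside the quotes it would be passed to agent-browser as literal text.
  AGENT_BROWSER_SESSION="batch-$i" \
  agent-browser batch \
    "open https://example.16yun.cn/page-$i" \
    "snapshot -i -c" > "snaps/page-$i.txt" &
done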
wait
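
Launching every session at once can trip rate limits on the target site or the proxy. One simple way to cap concurrency is to start jobs in fixed-size batches and wait for each batch to finish. The sketch below is an assumption-laden illustration: `run_in_batches` and the worker function passed to it are hypothetical names, not agent-browser features.

```shell
# Hypothetical helper: run "$worker <item>" for every item,
# with at most batch_size jobs in flight at a time.
run_in_batches() {
  local batch_size worker running item
  batch_size=$1
  worker=$2
  running=0
  shift 2
  for item in "$@"; do
    "$worker" "$item" &
    running=$((running + 1))
    if [ "$running" -ge "$batch_size" ]; then
      wait            # let the current batch finish before starting the next
      running=0
    fi
  done
  wait                # collect the final, possibly partial, batch
}

# Example worker (commented out because it needs agent-browser installed):
# crawl_page() {
#   AGENT_BROWSER_SESSION="batch-$1" agent-browser batch \
#     "open https://example.16yun.cn/page-$1" \
#     "snapshot -i -c" > "snaps/page-$1.txt"
# }
# run_in_batches 3 crawl_page $(seq 1 10)
```

Batching is cruder than a true job pool, since one slow page holds up its whole batch, but it needs nothing beyond the shell's built-in `wait`.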

API Proxy + IP Control

#!/bin/bash
IP_LIST=$(curl -s "http://ip.16yun.cn:817/myip/pl/xxx/?s=xxx&u=user&format=json&count=5")
 
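mkdir -p shots  # output directory for screenshots
# Feed the loop via process substitution: a piped `while` runs in a subshell,
# so the final `wait` would not see the background jobs it started.
while read -r proxy; do
  IP=$(echo "$proxy" | jq -r '.ip')
  PORT=$(echo "$proxy" | jq -r '.port')
  SESSION_NAME="crawl-${IP//./-}"
 
  HTTP_PROXY="http://user:pass@$IP:$PORT" \
  AGENT_BROWSER_SESSION="$SESSION_NAME" \
  agent-browser batch \
    "open https://example.16yun.cn" \
    "screenshot shots/$IP.png" &
done < <(echo "$IP_LIST" | jq -c '.[]')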
wait

Dedicated Proxy + Account Management

#!/bin/bash
ACCOUNTS=(
  "acc-01:http://user:pass@dedicated-01.16yun.cn:8888"
  "acc-02:http://user:pass@dedicated-02.16yun.cn:8888"
)
 
for entry in "${ACCOUNTS[@]}"; do
  NAME="${entry%%:*}"   # text before the first colon (account name)
  PROXY="${entry#*:}"   # everything after the first colon (full proxy URL)
 
  HTTP_PROXY="$PROXY" \
  AGENT_BROWSER_SESSION_NAME="$NAME" \
  agent-browser batch \
    "open https://seller.example.16yun.cn/dashboard" \
    "snapshot -i -c"
done

Production Deployment Checklist

□ Session isolation
   □ Each task uses its own session (--session or AGENT_BROWSER_SESSION)
   □ Configure AGENT_BROWSER_SESSION_NAME for auto persistence
 
□ Security
   □ Set AGENT_BROWSER_ALLOWED_DOMAINS
   □ Enable AGENT_BROWSER_CONTENT_BOUNDARIES
   □ Set AGENT_BROWSER_MAX_OUTPUT
   □ Configure AGENT_BROWSER_ENCRYPTION_KEY in production
 
□ Proxy selection
   □ Bulk anonymous → Crawler Proxy (tunnel)
   □ Fine-grained IP control → API Proxy (IP pool)
   □ Fixed identity → Dedicated Proxy
 
□ Error handling
   □ 407 proxy auth: verify credentials
   □ 429 rate limit: reduce concurrency
   □ Timeout retry: 3 attempts with backoff
 
□ Monitoring
   □ Log each session's operations
   □ Periodically clean expired sessions/states
   □ Monitor proxy traffic and consumption
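
The "3 attempts with backoff" item from the checklist can be sketched as a small wrapper. `retry_with_backoff` is a hypothetical helper, and it assumes the wrapped command signals failure through a non-zero exit status.

```shell
# Hypothetical helper: retry a command, doubling the delay after each failure.
# Usage: retry_with_backoff <max_attempts> <initial_delay_seconds> <command...>
retry_with_backoff() {
  local max_attempts delay attempt
  max_attempts=$1
  delay=$2
  attempt=1
  shift 2
  while :; do
    "$@" && return 0
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}
```

For example, `retry_with_backoff 3 2 agent-browser open https://example.16yun.cn` would try up to three times, sleeping 2 and then 4 seconds between attempts. Retrying suits timeouts; a 407 is better handled by checking the credentials, since repeating the request will not change the result.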

Summary

agent-browser provides CLI-first browser automation for AI agents. Combined with 16Yun proxies, it forms a complete solution from single commands to large-scale distributed scraping:

Layer	Technology	Problem Solved
Automation	agent-browser CLI	Browser control, screenshots, extraction
Identity	Session + State management	Auth persistence, operation isolation
Security	Allowlist + Policy + Encryption	Prevent misoperation and data leaks
Network	16Yun proxies	IP masking, rate control, geo coverage