agent-browser CI/CD Testing: Version Locking, Headless Differences, and Flaky Tests
Chrome auto-update breaks CI overnight. Headless A11y tree differs from headed rendering. Concurrent tests hijack each other's sessions — AI browser testing in CI/CD has hidden traps.
16Yun Engineering Team | May 4, 2026 | 1 min read
Problem 1: Chrome Auto-Update
The agent-browser M138 AutoDeElevate incident is the textbook case. A single Chrome security update broke every agent-browser daemon connection, and every CI pipeline went red. No code had changed; Chrome had simply auto-updated.
Solution: Version Locking
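The core of version locking is a comparison: the version the browser reports must equal the version you pinned, and the build should fail loudly when it does not. Below is a minimal Python sketch of that comparison. The names `PINNED_CHROME_VERSION`, `parse_chrome_version`, and `assert_pinned` are hypothetical helpers for illustration, not part of agent-browser.

```python
import re
from typing import Optional

# Hypothetical pin for illustration; it mirrors the version used in this article.
PINNED_CHROME_VERSION = "137.0.7150.0"

_VERSION_RE = re.compile(r"\d+\.\d+\.\d+\.\d+")


def parse_chrome_version(version_output: str) -> Optional[str]:
    """Extract a four-part version from text such as 'Google Chrome 137.0.7150.0'."""
    match = _VERSION_RE.search(version_output)
    return match.group(0) if match else None


def assert_pinned(version_output: str, pinned: str = PINNED_CHROME_VERSION) -> None:
    """Raise RuntimeError when the reported version differs from the pin."""
    found = parse_chrome_version(version_output)
    if found != pinned:
        raise RuntimeError(f"Chrome version drift: expected {pinned}, found {found}")
```

In CI, the input string would come from running the browser binary with `--version`; the binary name and path depend on the image. The Dockerfile that follows applies the same check at image build time.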
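    # Dockerfile.ci
    FROM node:22-slim
    # Pin the Chrome build so a browser auto-update cannot change CI behavior
    ENV CHROME_VERSION=137.0.7150.0
    RUN npx agent-browser install --version "$CHROME_VERSION"
    # Fail the image build if the installed version differs from the pin
    RUN test "$(google-chrome --version | grep -oP '\d+\.\d+\.\d+\.\d+')" = "$CHROME_VERSION"

Problem 2: Headless vs Headful Differences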
- A11y tree: element order differs because invisible elements aren't rendered in headless mode
- Font rendering: fonts in the CI container differ from the fonts on target users' machines
- GPU: no GPU is available, so WebGL pages behave differently in headless mode
- Media queries: prefers-color-scheme falls back to its default value in headless mode
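Because element order is one of the things that differs, a comparison keyed by element ref is more stable than one that walks both trees by position. The sketch below assumes each tree is a list of dicts with `ref` and `text` keys, the same shape the checker class in this section uses. The function name `diff_a11y_trees` and the sample data are hypothetical.

```python
from typing import Dict, List


def diff_a11y_trees(headless_tree: List[dict], headed_tree: List[dict]) -> Dict[str, List[str]]:
    """Compare two trees by element ref, ignoring element order."""
    headless = {e["ref"]: e["text"] for e in headless_tree}
    headed = {e["ref"]: e["text"] for e in headed_tree}
    return {
        # Present when headed, absent when headless
        "missing_in_headless": sorted(headed.keys() - headless.keys()),
        # Present when headless, absent when headed
        "only_in_headless": sorted(headless.keys() - headed.keys()),
        # Present in both, but with different text
        "text_changed": sorted(
            ref for ref in headed.keys() & headless.keys() if headed[ref] != headless[ref]
        ),
    }


# Sample data: same page, different order, one element missing, one text change
headed = [
    {"ref": "e1", "text": "Login"},
    {"ref": "e2", "text": "Menu"},
    {"ref": "e3", "text": "Help"},
]
headless = [
    {"ref": "e3", "text": "Help"},
    {"ref": "e1", "text": "Log in"},
]
report = diff_a11y_trees(headless, headed)
```

Reporting all three categories separately makes it easier to decide which differences are expected in CI and which point to a real regression.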
    class HeadlessCompatibilityChecker:
        def __init__(self):
            # Human-readable differences collected by compare_a11y_tree
            self.diffs = []

        def compare_a11y_tree(self, headless_tree, headed_tree):
            # Index both trees by element ref so the comparison ignores order
            headless_refs = {e["ref"]: e["text"] for e in headless_tree}
            headed_refs = {e["ref"]: e["text"] for e in headed_tree}
            for ref in headed_refs:
                if ref not in headless_refs:
                    self.diffs.append(f"Element {ref} missing in headless mode")
            return self.diffs

Problem 3: Concurrent Session Isolation
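Isolation has two parts: every test gets its own browser, and that browser is closed even when the test fails. A minimal sketch of both, using an async context manager, is shown here. `FakeBrowser` is a hypothetical stand-in that only records state so the example runs without a real browser; in practice the `launch` argument would be whatever launcher your tooling provides, and it would likely need to be awaited.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeBrowser:
    """Hypothetical stand-in for a real browser handle; it only tracks state."""

    def __init__(self) -> None:
        self.pages = []
        self.closed = False

    async def new_page(self, url: str) -> str:
        self.pages.append(url)
        return url

    async def close(self) -> None:
        self.closed = True


@asynccontextmanager
async def isolated_browser(launch=FakeBrowser):
    """Yield a fresh browser and always close it, even if the test body raises."""
    browser = launch()
    try:
        yield browser
    finally:
        await browser.close()


async def run_test(url: str) -> FakeBrowser:
    async with isolated_browser() as browser:
        await browser.new_page(url)
        await asyncio.sleep(0)  # yield control so the other test can interleave
    return browser


async def main():
    # Two tests run concurrently, each with its own browser
    return await asyncio.gather(
        run_test("https://example.com/a"),
        run_test("https://example.com/b"),
    )


browsers = asyncio.run(main())
```

Each returned browser holds only the page its own test opened, and both are closed. The trade-off is startup cost: launching a browser per test is slower than sharing one.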
    import pytest

    # Unsafe: every test opens pages in one shared browser
    @pytest.mark.parametrize("url", test_urls)
    async def test_extraction(url, shared_browser):
        page = await shared_browser.new_page()

    # Safe: each test launches and closes its own browser
    @pytest.mark.parametrize("url", test_urls)
    async def test_extraction(url):
        browser = await launch_browser()
        try:
            page = await browser.new_page()
        finally:
            # Close even if the test body raises, so no browser leaks
            await browser.close()

Problem 4: Flaky Tests
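Retrying every failure hides real bugs, so a retry wrapper is most useful when it retries only errors that are plausibly transient and surfaces everything else immediately. The following is a minimal sketch using only asyncio. Which exception types count as retryable is an assumption made for illustration, as are the names `RETRYABLE` and `retry_transient`.

```python
import asyncio

# Assumption: these exception types stand for transient failures in this sketch.
RETRYABLE = (TimeoutError, ConnectionError)


async def retry_transient(operation, attempts: int = 3, base_delay: float = 0.01):
    """Retry only RETRYABLE errors with exponential backoff; re-raise anything else at once."""
    for attempt in range(1, attempts + 1):
        try:
            return await operation()
        except RETRYABLE:
            if attempt == attempts:
                raise  # out of attempts: surface the last transient error
            # Delay doubles each attempt: base_delay, 2 * base_delay, ...
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))


calls = {"flaky": 0, "broken": 0}


async def flaky():
    """Fails twice with a transient error, then succeeds."""
    calls["flaky"] += 1
    if calls["flaky"] < 3:
        raise TimeoutError("page load timed out")
    return "ok"


async def broken():
    """Fails with a non-retryable error that signals a real defect."""
    calls["broken"] += 1
    raise AssertionError("selector matched the wrong element")


result = asyncio.run(retry_transient(flaky))
```

The tiny `base_delay` keeps the demo fast; a real pipeline would likely use a delay on the order of a second.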
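    from tenacity import retry, stop_after_attempt, wait_exponential

    class FlakyTestHandler:
        # tenacity is used here because retrying's @retry does not await coroutines
        @retry(stop=stop_after_attempt(3), wait=wait_exponential(multiplier=1))
        async def flaky_operation(self, page, operation_fn):
            return await operation_fn(page)

Summary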
- Version locking: pin Chrome so an auto-update cannot break CI
- Headless vs headful differences: check the A11y tree, fonts, and GPU behavior
- Session isolation: use an independent browser instance per test
- Flaky test handling: retry with backoff and distinguish retryable errors from real failures