Summary
Stagehand currently relies heavily on XPaths for element targeting and caching. While effective, XPaths can be brittle—minor DOM changes (like a new wrapper div or reordered elements) often invalidate cached actions, causing unnecessary retries or failures even when the target element itself hasn't changed.
This proposal introduces an opt-in Stable Selectors feature that prioritizes resilient attributes (data-testid, id, unique classes) over structural XPaths, improving cache durability and reducing LLM re-inference.
Current Behavior
When observe or act runs, Stagehand generates XPaths for elements. If the page structure changes slightly (e.g., an advertisement inserts a div at the top of the body), the positional indices in absolute XPaths shift (for example, /html/body/div[1]/button becomes /html/body/div[2]/button), which invalidates ActCache entries and forces re-inference from the LLM.
There is currently no option to prefer stable attributes like data-testid, id, or unique class names.
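To make the brittleness concrete, here is a minimal sketch that models a page as a plain tree and computes index-based absolute XPaths. The `SimpleNode` type and the `el` and `absoluteXPath` helpers are hypothetical stand-ins for illustration, not Stagehand's internal snapshot code.

```typescript
// Minimal stand-in for a DOM node; not Stagehand's internal representation.
interface SimpleNode {
  tag: string;
  children: SimpleNode[];
  parent?: SimpleNode;
}

// Creates a node and wires up parent links for its children.
function el(tag: string, ...children: SimpleNode[]): SimpleNode {
  const node: SimpleNode = { tag, children };
  for (const child of children) child.parent = node;
  return node;
}

// Builds an absolute, index-based XPath by walking up to the root.
function absoluteXPath(node: SimpleNode): string {
  const steps: string[] = [];
  let current: SimpleNode | undefined = node;
  while (current !== undefined) {
    const here: SimpleNode = current;
    const sameTag = here.parent
      ? here.parent.children.filter((c) => c.tag === here.tag)
      : [here];
    steps.unshift(`${here.tag}[${sameTag.indexOf(here) + 1}]`);
    current = here.parent;
  }
  return "/" + steps.join("/");
}

const button = el("button");
const content = el("div", button);
const body = el("body", content);
el("html", body);

const before = absoluteXPath(button); // "/html[1]/body[1]/div[1]/button[1]"

// An ad script prepends a wrapper div at the top of <body>.
const ad = el("div");
ad.parent = body;
body.children.unshift(ad);

const after = absoluteXPath(button); // "/html[1]/body[1]/div[2]/button[1]"
```

The button itself never changed, yet a cache keyed on `before` no longer resolves to it.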
Proposed Behavior
Users can configure a preferredSelectorType option. When enabled:
- The observe phase captures stable attributes (id, data-testid, class) alongside XPaths.
- act prioritizes these stable selectors over XPaths when executing actions.
- The ActCache validates against stable selectors, allowing cached actions to persist across minor layout shifts.
API Example
import { Stagehand } from "@browserbasehq/stagehand";
const stagehand = new Stagehand({
  env: "LOCAL",
  model: "openai/gpt-4.1-mini",
  preferredSelectorType: "css", // NEW: 'xpath' (default) | 'id' | 'css'
});
await stagehand.init();
// observe() now returns stable selectors alongside xpath
const actions = await stagehand.observe("click the login button");
console.log(actions[0]);
// {
// selector: '/html/body/div[1]/button', // existing xpath
// id: 'login-btn', // NEW
// cssSelector: '[data-testid="login-btn"]', // NEW
// attributes: { 'data-testid': 'login-btn' }, // NEW
// ...
// }
// act() uses cssSelector first, falls back to xpath
await stagehand.act(actions[0]);
Fallback Behavior
When preferredSelectorType is set to 'css' or 'id':
- First attempt: Use the preferred selector (cssSelector or id).
- Fallback: If the preferred selector is unavailable or fails to match, fall back to XPath.
A selector is considered "unavailable" when:
- No id or data-testid attribute exists on the element
- The generated CSS selector is not unique on the page
- The element lacks any distinguishing class names
Backward Compatibility: The default behavior remains unchanged (preferredSelectorType: 'xpath'). Stable selectors are only used when explicitly opted in.
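The fallback order described above could be sketched as a pure function that returns the selectors to try, with XPath always last. This is only an illustration: `selectorsToTry` and `SelectorAttempt` are hypothetical names, and `ActionCandidate` here is a trimmed copy of the shape proposed in this issue.

```typescript
type SelectorType = "xpath" | "id" | "css";

// Subset of the ActionCandidate fields proposed in this issue.
interface ActionCandidate {
  selector: string; // existing XPath
  id?: string;
  cssSelector?: string;
}

interface SelectorAttempt {
  type: SelectorType;
  selector: string;
}

// Returns the selectors to try, in order. XPath is always the final entry,
// so the default ('xpath') yields exactly the current behavior.
function selectorsToTry(
  candidate: ActionCandidate,
  preferred: SelectorType = "xpath",
): SelectorAttempt[] {
  const attempts: SelectorAttempt[] = [];
  if (preferred === "css" && candidate.cssSelector) {
    attempts.push({ type: "css", selector: candidate.cssSelector });
  } else if (preferred === "id" && candidate.id) {
    // Assumes the id is a valid CSS identifier; real code would escape it.
    attempts.push({ type: "id", selector: `#${candidate.id}` });
  }
  attempts.push({ type: "xpath", selector: candidate.selector });
  return attempts;
}

const loginButton: ActionCandidate = {
  selector: "/html/body/div[1]/button",
  id: "login-btn",
  cssSelector: '[data-testid="login-btn"]',
};

const order = selectorsToTry(loginButton, "css").map((a) => a.type);
// ["css", "xpath"]
```

When the preferred selector is unavailable (for example, no id on the element), the list collapses to the XPath entry alone, which matches the backward-compatible default.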
Root Cause Analysis
The current domMapsForSession and snapshot logic primarily capture the DOM tree structure for XPath generation but do not explicitly extract or index attributes for selector generation. Additionally, the ActHandler lacks logic to utilize alternate selector strategies during execution.
Proposed Implementation
1. Type Updates
Add to StagehandOptions:
interface StagehandOptions {
  // ... existing options
  preferredSelectorType?: 'xpath' | 'id' | 'css';
}
Extend ActionCandidate:
interface ActionCandidate {
  selector: string; // existing xpath
  id?: string; // NEW
  cssSelector?: string; // NEW
  attributes?: Record<string, string>; // NEW
}
2. DOM Snapshot Enhancement
Update the snapshot module to capture id, data-testid, and class attributes during DOM traversal.
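As a rough sketch of what this step might look like, the traversal below records id, data-testid, and class for each node in the same depth-first pass that assigns XPaths. `RawNode`, `pickStableAttributes`, and `snapshot` are hypothetical names over a simplified tree; the real snapshot module works on different structures.

```typescript
// Simplified node shape for illustration only.
interface RawNode {
  tag: string;
  attributes: Record<string, string>;
  children: RawNode[];
}

const STABLE_ATTRIBUTE_NAMES = ["id", "data-testid", "class"] as const;

// Keeps only the stable attributes that are present and non-empty.
function pickStableAttributes(node: RawNode): Record<string, string> {
  const picked: Record<string, string> = {};
  for (const name of STABLE_ATTRIBUTE_NAMES) {
    const value = node.attributes[name];
    if (value !== undefined && value.trim() !== "") picked[name] = value;
  }
  return picked;
}

// One depth-first pass: each visit records the node's XPath and its
// stable attributes, so no second tree walk is needed.
function snapshot(root: RawNode): Map<string, Record<string, string>> {
  const result = new Map<string, Record<string, string>>();
  const visit = (node: RawNode, path: string): void => {
    result.set(path, pickStableAttributes(node));
    const seen = new Map<string, number>();
    for (const child of node.children) {
      const index = (seen.get(child.tag) ?? 0) + 1;
      seen.set(child.tag, index);
      visit(child, `${path}/${child.tag}[${index}]`);
    }
  };
  visit(root, `/${root.tag}[1]`);
  return result;
}

const page: RawNode = {
  tag: "html",
  attributes: {},
  children: [{
    tag: "body",
    attributes: {},
    children: [{
      tag: "button",
      attributes: {
        id: "login-btn",
        "data-testid": "login-btn",
        class: "btn primary",
        onclick: "submit()",
      },
      children: [],
    }],
  }],
};

const snap = snapshot(page);
const buttonAttrs = snap.get("/html[1]/body[1]/button[1]");
```

Here `buttonAttrs` holds the three stable attributes and omits `onclick`.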
3. Selector Generation Utility
Add a generateCssSelector utility with the following priority:
- [data-testid="value"]: highest stability
- #id: if unique
- .class1.class2: if the combination is unique
- Fall back to null (use XPath)
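A minimal sketch of this priority order follows. The signature is an assumption: uniqueness is checked through an injected `countMatches` callback (in a browser this might wrap `document.querySelectorAll(selector).length`), and the uniqueness check is also applied to data-testid, which goes slightly beyond the list above but matches the "not unique on the page" rule in Fallback Behavior.

```typescript
interface ElementAttributes {
  id?: string;
  "data-testid"?: string;
  class?: string;
}

// Hypothetical callback: how many elements on the page match the selector.
type CountMatches = (selector: string) => number;

function generateCssSelector(
  attrs: ElementAttributes,
  countMatches: CountMatches,
): string | null {
  const candidates: string[] = [];

  // 1. data-testid (highest stability).
  // JSON.stringify is used for quoting and assumes simple values;
  // production code would apply proper CSS escaping.
  const testId = attrs["data-testid"];
  if (testId) candidates.push(`[data-testid=${JSON.stringify(testId)}]`);

  // 2. id. Assumes the id is a valid CSS identifier.
  if (attrs.id) candidates.push(`#${attrs.id}`);

  // 3. Combined class names, e.g. "btn primary" -> ".btn.primary".
  const classes = (attrs.class ?? "").split(/\s+/).filter((c) => c !== "");
  if (classes.length > 0) candidates.push("." + classes.join("."));

  // Return the first candidate that matches exactly one element.
  for (const selector of candidates) {
    if (countMatches(selector) === 1) return selector;
  }

  // 4. Nothing stable and unique: the caller falls back to XPath.
  return null;
}
```

Injecting `countMatches` keeps the function testable without a browser and leaves the choice of query mechanism to the caller.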
4. ActHandler Update
Modify execution logic to:
- Check preferredSelectorType from options
- Attempt preferred selector first
- Fall back to XPath on failure
- Log which selector type succeeded (for debugging)
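The execution steps above might look roughly like the sketch below. It is not the actual ActHandler: `actWithFallback`, `SelectorAttempt`, and the `perform` callback are hypothetical, with `perform` standing in for whatever locates the element and performs the action, resolving to false when the selector matches nothing.

```typescript
interface SelectorAttempt {
  type: "xpath" | "id" | "css";
  selector: string;
}

// Hypothetical action runner: resolves true if the action succeeded.
type PerformAction = (selector: string) => Promise<boolean>;

// Tries each selector in order (preferred first, XPath last) and logs
// which selector type succeeded. Resolves to null if every attempt fails.
async function actWithFallback(
  attempts: SelectorAttempt[],
  perform: PerformAction,
  log: (message: string) => void = console.log,
): Promise<SelectorAttempt["type"] | null> {
  for (const attempt of attempts) {
    let succeeded = false;
    try {
      succeeded = await perform(attempt.selector);
    } catch {
      // Treat a thrown error like a failed match and move on.
      succeeded = false;
    }
    if (succeeded) {
      log(`act succeeded via ${attempt.type}: ${attempt.selector}`);
      return attempt.type;
    }
    log(`act failed via ${attempt.type}: ${attempt.selector}`);
  }
  return null;
}
```

Returning the selector type that succeeded, in addition to logging it, would let a caller record which strategy a cached action should try first next time.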
Performance Considerations
Capturing additional attributes adds minimal overhead:
- Attribute extraction occurs during existing DOM traversal (no additional tree walks)
- Building a CSS selector string is O(1) per element, though verifying uniqueness requires one page query per candidate selector
- Memory impact is negligible (3-4 additional string fields per candidate)
Environment
- Stagehand version: 3.0.7
- Environment: All (Local/Browserbase)
References
- [Playwright Best Practices: Locators] — advocates for user-facing attributes over XPaths
- Issue #1434, "stagehand action occasionally return two xpath which locates the same object but one of them successfully click the button while the other does not": documents XPath brittleness causing action failures
- I can submit a PR for this feature
- Draft PR