Interaction
Interaction is where a script stops describing the page and starts changing it. WRC's action surface is deliberately small — eight verbs do everything from "click that button" to "drag a slider 120 px right". The point of this guide is to give you a feel for which verb fits what, what each one does for you behind the scenes, and the handful of options that come up often enough to be worth memorising.
- Every action takes a Locator (see Targeting elements) and dispatches exactly once; there is no implicit retry.
- Pre-flight: the server scrolls the element into view, then animates the cursor along a human-like path before the actual gesture fires.
- Go splits the bare and the customised variant into two methods (Click / ClickWith). TypeScript keeps one method and an optional opts argument.
- Three keyboard-level escape hatches sit alongside the element verbs: InsertText (paste at caret), PressKey / ReleaseKey (raw down/up events).
The shared shape
Every element action follows the same pattern, so once you know one you know all of them:
- You pass a Locator, built with CSS(...) / JS(...) / Node(...) / At(...). The locator's modifiers (.InFrame(...), .InAllFrames()) are honoured.
- The server resolves the element, scrolls it into view if needed (walking nested scroll containers and out-of-process iframe chains), then animates the mouse cursor along a Perlin-noise path to a random point inside the element's bounding rect.
- The gesture (click, fill, drag…) fires once. No retry, no implicit wait: if the element wasn't there, you get ELEMENT_NOT_FOUND straight away.
- You get back an ElementResult (or a DragResult / a SelectOptionResult) carrying the resolved frameId, backendNodeId, post-scroll isVisible, the element's bounds and the root-viewport coordinates (rootX / rootY) where the gesture actually landed.
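As a rough picture of what comes back, the fields listed above could be modelled like this in TypeScript. The field types and the shape of bounds are assumptions made for illustration, so check the API reference for the real definitions. describeResult is a hypothetical helper, not part of WRC.

```typescript
// Assumed shape: field names come from the list above, the types are guesses.
interface Bounds {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface ElementResult {
  frameId: string;
  backendNodeId: number;
  isVisible: boolean; // visibility after the pre-flight scroll
  bounds: Bounds;
  rootX: number; // root-viewport point where the gesture landed
  rootY: number;
}

// Hypothetical helper: one-line summary of a result, handy for logs.
function describeResult(r: ElementResult): string {
  const visibility = r.isVisible ? "visible" : "not visible";
  return `node ${r.backendNodeId} in frame ${r.frameId} (${visibility}), gesture at (${r.rootX}, ${r.rootY})`;
}
```

Logging a summary like this after each action is a cheap way to confirm that the gesture landed on the element you meant to target.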
The naming pattern differs across the two SDKs. Whenever an action
has options, Go ships two methods — the bare one and a
…With(opts) variant — so you don't pay for an empty struct in the
common case. TypeScript keeps the single method and tucks the options
behind a trailing optional argument.
// Bare call, no options.
_, _ = browser.Click(ctx, wrc.CSS("button.submit"))
// Customised call: a right double-click on a different target.
_, _ = browser.ClickWith(ctx, wrc.CSS("li.menu"), wrc.ClickOpts{
    Button:     "right",
    ClickCount: 2,
})

// Bare call, no options.
await browser.click(css("button.submit"));
// Customised call: a right double-click on a different target.
await browser.click(css("li.menu"), { button: "right", clickCount: 2 });

Click
The workhorse. By default: scroll into view, human-like mouse path,
full mouseDown + mouseUp at a randomised point inside the
element's bounding rect.
ClickOpts covers the three things you might want to vary:
- Button: "left" (default), "right", "middle".
- ClickCount: 1 (default) or 2 for a double-click.
- Action: "click" (default, full press + release), "press" (mouseDown only), "release" (mouseUp only; no mouse movement, fires at the current cursor position).
The press / release pair is the escape hatch for long-press
gestures, custom drag implementations, or any flow where you need to
hold the button down across other actions:
// Long-press: down, hold while doing something else, up.
_, _ = browser.ClickWith(ctx, wrc.CSS(".thumb"), wrc.ClickOpts{Action: "press"})
// ... do other things while the button is held ...
_, _ = browser.ClickWith(ctx, wrc.At(0, 0), wrc.ClickOpts{Action: "release"})

// Long-press: down, hold while doing something else, up.
await browser.click(css(".thumb"), { action: "press" });
// ... do other things while the button is held ...
await browser.click(at(0, 0), { action: "release" });

Note that release ignores any coordinates you pass and fires
at wherever the cursor currently is, which is exactly the behaviour
you need to close out a press started elsewhere on the page.
Click and MoveTo are the only actions that accept At(x, y).
Coordinate clicks are how you reach canvas elements,
captcha tiles, and HTML5-game targets that aren't addressable via
the DOM.
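For grid-like canvas targets, the coordinates usually have to be computed from the canvas's own rectangle. The sketch below assumes you already know the canvas's position and size in root-viewport pixels (for instance from an earlier result's bounds, if those are expressed in root-viewport coordinates). Rect and tileCenter are hypothetical names, not part of WRC.

```typescript
// Assumed rectangle shape, in root-viewport pixels.
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Hypothetical helper: centre of the tile at (row, col) in a rows x cols grid
// laid over the canvas rectangle. Rows and columns are zero-based.
function tileCenter(
  canvas: Rect,
  rows: number,
  cols: number,
  row: number,
  col: number,
): { x: number; y: number } {
  if (row < 0 || row >= rows || col < 0 || col >= cols) {
    throw new RangeError(`tile (${row}, ${col}) is outside a ${rows}x${cols} grid`);
  }
  const tileWidth = canvas.width / cols;
  const tileHeight = canvas.height / rows;
  return {
    x: Math.round(canvas.x + (col + 0.5) * tileWidth),
    y: Math.round(canvas.y + (row + 0.5) * tileHeight),
  };
}

// Usage with the click / at calls shown above:
//   const p = tileCenter({ x: 100, y: 200, width: 300, height: 300 }, 3, 3, 1, 2);
//   await browser.click(at(p.x, p.y));
```

Aiming at the tile centre keeps the click away from tile borders, where a rounding error of a pixel could land on a neighbour.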
Fill
Anywhere you'd type into an input: <input>, <textarea>,
contenteditable divs. Under the hood: scroll into view, mouse path,
click to focus, optionally clear (Ctrl+A then Delete), then
character-by-character key events with QWERTZ keyboard simulation
and human-like timing.
The single option worth knowing is NoClear / noClear. The default
is to clear, which is what you want 95 % of the time — append-only
is the surprise:
// Default: clear the field first, then type.
_, _ = browser.Fill(ctx, wrc.CSS("input[name=email]"), "user@example.com")
// Keep the existing value, append to it.
_, _ = browser.FillWith(ctx, wrc.CSS("textarea"), " — appended", wrc.FillOpts{
    NoClear: true,
})

// Default: clear the field first, then type.
await browser.fill(css("input[name=email]"), "user@example.com");
// Keep the existing value, append to it.
await browser.fill(css("textarea"), " — appended", { noClear: true });

Fill only accepts element locators; At(x, y) is rejected because
typing needs an actual focus target.
Because every keystroke goes through a real keydown / keypress /
keyup cycle, any JS handler the page has on the input (autocomplete,
live validation, inline search…) fires exactly as if a person typed
the value. That's the whole point — but it also means Fill is the
slowest action in WRC. For pasting large blocks of text without firing
per-key events, jump straight to InsertText (further down).
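One way to act on that trade-off is a small wrapper that types short values key by key and pastes long ones. This is a sketch only: the TextBrowser interface just mirrors the fill, click, and insertText calls used in this guide (with return types simplified to unknown), and enterText with its 200-character threshold is made up for illustration.

```typescript
// Minimal stand-in for the parts of the browser object used here (assumed shape).
interface TextBrowser<L> {
  fill(target: L, value: string): Promise<unknown>;
  click(target: L): Promise<unknown>;
  insertText(text: string): Promise<unknown>;
}

// Hypothetical helper: per-key typing for short values, one-shot paste for long ones.
async function enterText<L>(
  browser: TextBrowser<L>,
  target: L,
  text: string,
  pasteThreshold = 200,
): Promise<"fill" | "insertText"> {
  if (text.length <= pasteThreshold) {
    // Fires keydown / keypress / keyup per character, so page handlers run.
    await browser.fill(target, text);
    return "fill";
  }
  // InsertText goes to whatever has focus, so focus the field first.
  await browser.click(target);
  // Inserted at the caret in one shot; no per-key events, nothing is cleared first.
  await browser.insertText(text);
  return "insertText";
}
```

Note the behavioural difference between the two branches: the paste branch does not clear the field and does not trigger key handlers, so it only suits inputs where neither matters.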
MoveTo
Move the mouse cursor over an element (or to coordinates) without
clicking. The same scroll-into-view + Perlin-noise animation as
Click, just stopping at the hover.
Useful for triggering CSS / JS hover states (dropdown menus,
tooltips), warming up a page that only reveals an action on
mouseover, or staging a cursor before a Click(at(...)):
// Hover to reveal a dropdown.
_, _ = browser.MoveTo(ctx, wrc.CSS("nav .menu-trigger"))
_, _ = browser.Wait(ctx, wrc.CSS("nav .submenu"))
_, _ = browser.Click(ctx, wrc.CSS("nav .submenu li:first-child"))

// Hover to reveal a dropdown.
await browser.moveTo(css("nav .menu-trigger"));
await browser.wait(css("nav .submenu"));
await browser.click(css("nav .submenu li:first-child"));

Like Click, MoveTo also accepts At(x, y).
ScrollTo
Bring an element into the viewport without clicking it. WRC actions
already scroll into view as part of their pre-flight, so ScrollTo
exists for the two cases that don't:
- You want the element visible for a GetDOM / GetObservation / Evaluate pass that won't trigger a scroll on its own.
- You're chaining multiple At(x, y) coordinate clicks and need to position the page relative to known coordinates first.
_, _ = browser.ScrollTo(ctx, wrc.CSS("#footer"))
// Footer is now in view: go inspect, evaluate, or click coordinates.

await browser.scrollTo(css("#footer"));
// Footer is now in view: go inspect, evaluate, or click coordinates.

The server walks nested scroll containers (the inner div that
actually scrolls inside a paginated UI) and out-of-process iframe
chains (so an element deep inside an OOPIF still gets brought into
the root viewport) automatically. You don't have to think about it.
At(x, y) is rejected here — scrolling needs an actual element to
target.
Drag
Two variants depending on whether you want to drop relative to the pickup (slider handles, range inputs) or at a fixed page position (reordering cards, file-tree nodes).
// Drag by an offset — slider 120 px to the right.
_, _ = browser.DragBy(ctx, wrc.CSS(".slider .handle"), 120, 0)
// Drag to absolute root-viewport coordinates — reorder a card.
_, _ = browser.DragTo(ctx, wrc.CSS(".card.draggable"), 800, 400)

// Drag by an offset: slider 120 px to the right.
await browser.dragBy(css(".slider .handle"), 120, 0);
// Drag to absolute root-viewport coordinates: reorder a card.
await browser.dragTo(css(".card.draggable"), 800, 400);

Both variants go through the same gesture: mouse-move to the pickup
point inside the element, mouseDown, drag along a Perlin-noise path
to the target, mouseUp. The DragResult carries both endpoints
(startX / startY for the pickup, endX / endY for the drop), so
you can verify where the gesture actually started and ended.
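Drag only accepts element locators. To drag between specific
coordinate pairs, fall back to a Click(At(...)) in press mode at the
pickup point, a MoveTo(At(...)) to the drop point, and a Click in
release mode. The move in between matters because release fires at
the current cursor position without moving the mouse.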
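Such a coordinate-to-coordinate drag might look like the sketch below. It assumes the press is delivered at the coordinates passed to it, and it puts a MoveTo between press and release because, as described under Click, release fires at the current cursor position without moving the mouse. Point, PointerBrowser, and dragBetween are hypothetical names; with the real SDK you would pass at(x, y) locators rather than plain points.

```typescript
type Point = { x: number; y: number };

// Stand-in for the click / moveTo calls used in this guide (assumed, simplified shape).
interface PointerBrowser {
  click(target: Point, opts?: { action?: "click" | "press" | "release" }): Promise<unknown>;
  moveTo(target: Point): Promise<unknown>;
}

// Hypothetical helper: press at `from`, carry the held button to `to`, release there.
async function dragBetween(browser: PointerBrowser, from: Point, to: Point): Promise<void> {
  await browser.click(from, { action: "press" }); // mouseDown at the pickup point
  await browser.moveTo(to); // move the cursor while the button is held
  await browser.click(to, { action: "release" }); // mouseUp at the current cursor position
}
```

The target passed to the release call is only there for readability, since release ignores it.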
Select (for <select> elements)
Three variants depending on what you know about the option to pick.
Same <select> Locator, different option key:
// Zero-based index.
_, _ = browser.SelectByIndex(ctx, wrc.CSS("select#country"), 2)
// Match the option's value attribute.
_, _ = browser.SelectByValue(ctx, wrc.CSS("select#country"), "DE")
// Match the option's visible text (trimmed, exact match).
_, _ = browser.SelectByText(ctx, wrc.CSS("select#country"), "Germany")

// Zero-based index.
await browser.selectByIndex(css("select#country"), 2);
// Match the option's value attribute.
await browser.selectByValue(css("select#country"), "DE");
// Match the option's visible text (trimmed, exact match).
await browser.selectByText(css("select#country"), "Germany");

Same pre-flight as every other action: the <select> is scrolled
into view and the cursor animates over to it along a human-like path,
so picking an option goes through the same motions a real user would.
Once the option is chosen, the standard input and change events
fire so the page's listeners run exactly as if a person had picked
it.
Opt out of the events for the (rare) case where you want to mutate
state silently — note that Go's flag is NoEvents=true (negative
polarity), while TS uses fireEvents=false:
// Silent — no input/change events.
_, _ = browser.SelectByIndexWith(ctx, wrc.CSS("select#hidden"), 0, wrc.SelectOpts{
    NoEvents: true,
})

// Silent: no input/change events.
await browser.selectByIndex(css("select#hidden"), 0, { fireEvents: false });

All three variants return a SelectOptionResult with the resolved
SelectedIndex, SelectedValue and SelectedText, useful as a
sanity check that the right option was actually chosen.
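A sanity check along those lines could be as small as the following. The lower-camel-case field names are an assumption about the TypeScript SDK (the text above uses the Go spelling), and expectSelected is a hypothetical helper.

```typescript
// Assumed TypeScript shape of the result; the Go SDK spells these with a capital letter.
interface SelectOptionResult {
  selectedIndex: number;
  selectedValue: string;
  selectedText: string;
}

// Hypothetical helper: throw if any expected field differs from what was selected.
function expectSelected(
  result: SelectOptionResult,
  expected: Partial<SelectOptionResult>,
): void {
  const keys = Object.keys(expected) as (keyof SelectOptionResult)[];
  for (const key of keys) {
    if (expected[key] !== undefined && result[key] !== expected[key]) {
      throw new Error(
        `select mismatch on ${key}: got ${String(result[key])}, wanted ${String(expected[key])}`,
      );
    }
  }
}

// Usage with the calls shown above:
//   const res = await browser.selectByValue(css("select#country"), "DE");
//   expectSelected(res, { selectedText: "Germany" });
```

Checking a different field from the one you selected by (text after selecting by value, for example) catches option lists that changed underneath the script.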
Keyboard fallbacks
For the cases that don't fit into a single element action, three keyboard-level primitives target whatever currently has focus:
InsertText — fast paste
Commits the entire string at once via the browser's IME path. No per-character key events fire — which is what you want for big text blocks where you don't need to trigger autocomplete or live validation:
// Focus the field first…
_, _ = browser.Click(ctx, wrc.CSS("textarea.bio"))
// …then paste the whole block in one shot.
_ = browser.InsertText(ctx, "Lorem ipsum dolor sit amet, …")

// Focus the field first…
await browser.click(css("textarea.bio"));
// …then paste the whole block in one shot.
await browser.insertText("Lorem ipsum dolor sit amet, …");

Because no keyboard events fire, anything the page listens for on
keydown / keyup will not see this input. That's the trade-off:
fast and silent, or slow and observable.
PressKey / ReleaseKey — raw key events
For keyboard shortcuts (Ctrl+A, Ctrl+S, Enter to submit) and any flow where you need a key event to be the actual signal. Both fire a single event — pair them for a full press cycle.
The modifier bitmask is Alt=1, Ctrl=2, Meta=4, Shift=8. Combine
with | for multi-modifier shortcuts.
// Ctrl+A on the focused element.
_ = browser.PressKey(ctx, "a", "KeyA", 2, 0)
_ = browser.ReleaseKey(ctx, "a", "KeyA", 2, 0)
// Just an Enter — to submit a form without clicking the button.
_ = browser.PressKey(ctx, "Enter", "Enter", 0, 0)
_ = browser.ReleaseKey(ctx, "Enter", "Enter", 0, 0)

// Ctrl+A on the focused element.
await browser.pressKey("a", { code: "KeyA", modifiers: 2 });
await browser.releaseKey("a", { code: "KeyA", modifiers: 2 });
// Just an Enter, to submit a form without clicking the button.
await browser.pressKey("Enter");
await browser.releaseKey("Enter");

The key value follows the DOM KeyboardEvent.key standard
("Enter", "ArrowLeft", "a", …). code follows
KeyboardEvent.code ("Enter", "KeyA", "ArrowLeft", …) and
falls back to key when omitted.
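To avoid magic numbers for the modifier bitmask, the four values given earlier in this section can be wrapped in named constants. Modifier and modifierMask are hypothetical helpers, not SDK exports; only the numeric values come from this guide.

```typescript
// Bit values as listed in this section: Alt=1, Ctrl=2, Meta=4, Shift=8.
const Modifier = { Alt: 1, Ctrl: 2, Meta: 4, Shift: 8 } as const;
type ModifierName = keyof typeof Modifier;

// Hypothetical helper: OR the named modifiers together into one bitmask.
function modifierMask(...names: ModifierName[]): number {
  return names.reduce((mask, name) => mask | Modifier[name], 0);
}

// Usage with the pressKey / releaseKey calls shown above (Ctrl+Shift+S):
//   const mods = modifierMask("Ctrl", "Shift"); // 2 | 8 = 10
//   await browser.pressKey("s", { code: "KeyS", modifiers: mods });
//   await browser.releaseKey("s", { code: "KeyS", modifiers: mods });
```

Passing the same mask to both the press and the release keeps the two events consistent.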
Auto-behaviours, at a glance
A quick recap of what the server does for you so you don't have to:
| Action | Scrolls into view | Animates cursor | Fires events |
|---|---|---|---|
| Click, Fill | yes | yes (human-like path) | full DOM event stream |
| MoveTo | yes | yes (human-like path) | mouseover, mouseenter |
| Drag* | yes (pickup) | yes (pickup → drop) | mousedown / mousemove / mouseup |
| ScrollTo | yes (the whole point) | no | scroll events on the container |
| Select* | yes | yes (human-like path) | input + change (unless suppressed) |
| InsertText | no | no | input (no key events) |
| PressKey / ReleaseKey | no | no | keydown / keyup |
If you find yourself reaching for a manual ScrollTo before every
Click, you don't need to — Click already does it.
Gotchas
- Actions don't wait. Pair every action with a Wait for the element you're about to touch. See Waiting for the full discussion.
- Validation is client-side, not server-side. Drag / Fill / Select* / ScrollTo reject At(x, y) before the request ever leaves your machine, so you get an immediate, descriptive error rather than a confusing server-side failure.
- Polarity surprises on the option flags. Go uses negative polarity throughout (NoClear, NoEvents; true opts out). TS splits: noClear is negative (true opts out), but fireEvents is positive (false opts out). The defaults are identical across both SDKs; only the spelling differs.
- InsertText and PressKey need focus. Neither targets an element; they go to whatever currently has focus on the page. Do a Click first if you need a specific input to receive them.
- release ignores its target. A Click with Action: "release" fires at the current cursor position regardless of the locator you pass. That's the whole point, so a long-press started elsewhere can be closed cleanly.
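The first gotcha is easy to bake into a helper. A minimal sketch, assuming only the wait and click calls shown earlier in this guide; WaitingBrowser and waitAndClick are hypothetical names, and the real wait may take options or return a richer result.

```typescript
// Stand-in for the wait / click calls used in this guide (assumed, simplified shape).
interface WaitingBrowser<L> {
  wait(target: L): Promise<unknown>;
  click(target: L): Promise<unknown>;
}

// Hypothetical helper: wait for the element, then click it.
async function waitAndClick<L>(browser: WaitingBrowser<L>, target: L): Promise<void> {
  await browser.wait(target); // explicit pause: actions do not wait on their own
  await browser.click(target); // dispatched exactly once, no retry
}
```

If the wait rejects, the click never fires, so a missing element surfaces as a wait failure rather than as ELEMENT_NOT_FOUND from the action.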
See also
- Targeting elements: the constructors and modifiers every action accepts.
- Waiting: the explicit pause every action needs in front of it.
- API reference: Go interaction methods · TS interaction methods.