Record tests with Capture
shaft-capture defines the provider-neutral intermediate representation used
by SHAFT Capture and the managed-browser recorder that produces it. It records
browser intent for review, replay, migration, and future Java/JUnit/Cucumber
consumers. It generates Java 25 SHAFT/TestNG tests without depending on
shaft-ai; provider adapters remain optional.
Capture creation, validation, serialization, and privacy enforcement work with
pilot.ai.enabled=false. Optional AI consumers may receive only the already
redacted representation after the separate Pilot approval checks succeed.
Managed browser recording
The recorder launches a fresh SHAFT-managed Chrome, Chromium, or Edge session. Firefox and WebKit are rejected with explicit unsupported-browser messages until equivalent event coverage is available. WebDriver BiDi supplies navigation, browsing-context, prompt, and preload-script signals when available. A JavaScript listener drained through ordinary WebDriver provides deterministic interaction capture and remains the compatibility fallback.
Run the recorder with the Capture commands documented on the Connect shaft-mcp page. That page owns the runnable MCP command reference and classpath notes.
Use --runtime-dir <path> on every command to isolate control files. stop
also accepts --discard. Only one recorder may own a runtime directory at a
time. The daemon control endpoint is bound to loopback, requires a generated
bearer token, and removes its token and descriptor at shutdown. Browser
profiles are temporary and removed after normal stop or interruption unless
--user-data-dir <path> is supplied.
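The single-owner rule and the generated token can be pictured with a small sketch. This is not the daemon's code: the class name RuntimeDirClaim and the file name recorder.token are hypothetical, and the real descriptor layout is not documented on this page. It only shows one way to claim a directory atomically and to clean up at shutdown using standard Java APIs.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.SecureRandom;
import java.util.Base64;

/** Hypothetical illustration of single ownership; not SHAFT's implementation. */
final class RuntimeDirClaim implements AutoCloseable {
    // Assumed file name, chosen only for this sketch.
    private static final String TOKEN_FILE = "recorder.token";

    private final Path tokenPath;
    private final String token;

    private RuntimeDirClaim(Path tokenPath, String token) {
        this.tokenPath = tokenPath;
        this.token = token;
    }

    /** Claims the directory or fails if another owner already holds it. */
    static RuntimeDirClaim claim(Path runtimeDir) {
        try {
            Files.createDirectories(runtimeDir);
            Path tokenPath = runtimeDir.resolve(TOKEN_FILE);
            try {
                // createFile checks for existence and creates the file in one atomic step.
                Files.createFile(tokenPath);
            } catch (FileAlreadyExistsException e) {
                throw new IllegalStateException("Runtime directory already owned: " + runtimeDir, e);
            }
            byte[] raw = new byte[32];
            new SecureRandom().nextBytes(raw);
            String token = Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
            Files.writeString(tokenPath, token);
            return new RuntimeDirClaim(tokenPath, token);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    String token() {
        return token;
    }

    /** Removes the token file so that a later recorder can claim the directory. */
    @Override
    public void close() {
        try {
            Files.deleteIfExists(tokenPath);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller would hold the claim in a try-with-resources block for the lifetime of the recorder. A production version would also restrict the token file's permissions and handle crash recovery, which this sketch leaves out.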
capture start also accepts Playwright-codegen-shaped options where SHAFT can
map them safely: --viewport-size, --device, --color-scheme,
--geolocation, --timezone, --block-service-workers, --load-storage,
--save-storage, --save-har, --test-id-attribute, --lang,
--user-agent, --user-data-dir, --proxy-server, --proxy-bypass,
--ignore-https-errors, and --timeout. Device presets use bundled
Chrome/Edge mobile-emulation profiles when available; color scheme,
geolocation, timezone, and service-worker bypass use browser protocol support;
storage state uses SHAFT's browser storage-state JSON; and HAR output uses the
same redacted observability entries as SHAFT failure traces. Unsupported
drivers or unmapped device names produce deterministic warnings instead of raw
protocol failures. --save-har-glob is accepted with a warning but is not
applied as a filter; Capture writes all observed network entries. Use
capture features to list the current Playwright codegen feature map.
capture start \
  --url https://example.test \
  --device "Pixel 7" \
  --geolocation "30.0444,31.2357" \
  --timezone Africa/Cairo \
  --color-scheme dark \
  --load-storage target/auth-state.json \
  --save-storage target/auth-state-out.json \
  --save-har target/capture.har
The same lifecycle is exposed by the capture_start, capture_start_codegen,
capture_status, and capture_stop MCP tools. Generation is exposed by
capture_generate; capture_codegen_features returns the feature map. Status
contains safe metadata, counts, readiness, and warnings, never typed values.
Readiness is READY, RISKY, or BLOCKED; risky steps keep recording, while
blocked readiness means generated replay will need user action such as a stable
locator or required secret input.
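The three readiness levels behave like a worst-finding-wins reduction. The sketch below assumes a simplified finding shape (a code plus a flag for whether replay needs user action). The type names Readiness, ReadinessFinding, and ReadinessSketch are hypothetical, and the real recorder may weigh its evidence differently.

```java
import java.util.List;

/** Hypothetical mirror of the documented readiness levels. */
enum Readiness { READY, RISKY, BLOCKED }

/** Assumed finding shape: a short code and whether replay needs user action. */
record ReadinessFinding(String code, boolean needsUserAction) {}

final class ReadinessSketch {
    private ReadinessSketch() {}

    /** Any finding that needs user action blocks; any other finding is only risky. */
    static Readiness evaluate(List<ReadinessFinding> findings) {
        if (findings.stream().anyMatch(ReadinessFinding::needsUserAction)) {
            return Readiness.BLOCKED;
        }
        return findings.isEmpty() ? Readiness.READY : Readiness.RISKY;
    }
}
```

In this sketch a positional locator would be a finding with needsUserAction set to false, while a redacted required input would set it to true.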
WebDriver code generation remains the default through capture_generate,
capture_generate_replay, and capture_code_blocks. Use
playwright_capture_generate_replay or playwright_capture_code_blocks only
when the target repository uses SHAFT.GUI.Playwright or the user asks for
Playwright output. Those Playwright tools read the same Capture session format
but emit SHAFT.GUI.Playwright setup, actions, waits, and assertions.
MCP Playwright action recordings created with playwright_record_start remain
available through the same Playwright tool names; playwright_recording_code_blocks
now adapts supported recorded actions into a Capture session, returns the
generated source/test-data/report/review paths, and keeps the direct replay
method for actions that do not yet have a Capture equivalent.
When recording in a visible browser, SHAFT injects a compact Capture panel into
the managed Chrome/Edge session. The panel lists captured actions in plain
English while the user clicks, types, selects, uploads, or navigates. Its
controls pause or resume action capture, add a user checkpoint, edit the visible
action text by recording an edit checkpoint, and stop the recording. Pressing
stop from the panel requests a normal SHAFT stop, closes the managed browser,
and leaves the session in COMPLETED status for generation. The browser panel
and generated capture workbench follow the same SHAFT report visual language as
Allure-attached HTML reports, including status chips and wrapping layouts that
avoid horizontal scrolling during review.
The panel also shows a live readiness chip next to the event count. It is computed from deterministic recorder evidence such as unsupported actions, missing locator candidates, positional or multi-match locators, missing post-navigation or post-submit assertions, redacted required inputs, and collector warnings. The chip reports issues only; it does not block recording.
Use the assertion control to toggle assertion mode, then click an element and choose
one of the deterministic verification types: visible, enabled, selected, text
equals or contains, attribute equals, URL equals or contains, or title equals.
The recorder stores these as VerificationEvent records. Expected text, URL,
title, and attribute values are externalized through the same privacy classifier
used for typed data, so generated assertions do not embed captured secrets.

Use the locator picker control to inspect the target before recording or
verifying it. Picker mode highlights the hovered element, opens a ranked
candidate list on click, and shows each candidate's strategy, expression,
uniqueness count, stability, score rationale, and live probe result (unique,
multi-match, no match, or failed). Pin a candidate when the deterministic
default is not the locator intent; the next captured event for that logical
element stores the selected candidate with the USER_PROVIDED locator signal,
so generation can prefer it without editing generated source.

For agent-driven MCP flows, the intended handoff is: call capture_start or
capture_start_codegen, let the user interact with the visible browser, wait
for either capture_stop or a browser-panel stop to complete, then call
capture_code_blocks for WebDriver or playwright_capture_code_blocks for
Playwright. The agent should show the generated result and ask whether the user
wants the complete Java snippet or wants the agent to insert the code into the
current repository. Snippet mode uses the returned Java full-class block,
including imports, setup, inline SHAFT.GUI.Locator.* locators, SHAFT
actions/assertions, and teardown.
Insertion mode should inspect the repository and move locators and actions
into existing Page Object classes when that pattern already exists, or create
the smallest matching page/test classes when it does not.
The returned code blocks include deterministic Page Object insertion guidance
for WebDriver and Playwright captures, including SHAFT locator inventory, action
sequence, and fallback manual-mapping warnings when the generated source has no
extractable candidates.
Use FLOW_START and FLOW_END checkpoints to mark an explicit reusable flow
inside a recording. The checkpoint description becomes the generated helper
method name, so a segment marked as login as admin generates a
loginAsAdmin() method and the replay test calls that method at the original
point in the journey:
capture checkpoint --kind FLOW_START --description "login as admin"
# perform the login steps in the managed browser
capture checkpoint --kind FLOW_END --description "login as admin"

@Test
public void replayCheckout() throws Exception {
    driver.browser().navigateToURL("https://shop.example/login");
    loginAsAdmin();
    driver.element().click(SHAFT.GUI.Locator.clickableField("Checkout"));
}

private void loginAsAdmin() throws Exception {
    driver.element().click(SHAFT.GUI.Locator.inputField("Username"));
    driver.element().type(SHAFT.GUI.Locator.inputField("Username"), requiredData("username"));
}
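The mapping from a checkpoint description to a helper method name can be approximated with a lower camel case conversion. The sketch below is an assumption about how such a conversion might work, not the generator's actual rule. FlowNameSketch and the flow prefix for names that start with a digit are hypothetical.

```java
import java.util.Locale;

/** Hypothetical description-to-method-name conversion; the real generator may differ. */
final class FlowNameSketch {
    private FlowNameSketch() {}

    static String toMethodName(String description) {
        StringBuilder name = new StringBuilder();
        // Split on anything that is not an ASCII letter or digit.
        for (String word : description.trim().split("[^A-Za-z0-9]+")) {
            if (word.isEmpty()) {
                continue;
            }
            String lower = word.toLowerCase(Locale.ROOT);
            if (name.length() == 0) {
                name.append(lower);
            } else {
                name.append(Character.toUpperCase(lower.charAt(0))).append(lower.substring(1));
            }
        }
        if (name.length() == 0) {
            throw new IllegalArgumentException("Description has no usable words: " + description);
        }
        // Java identifiers cannot start with a digit, so prefix those names (assumed convention).
        if (Character.isDigit(name.charAt(0))) {
            name.insert(0, "flow");
        }
        return name.toString();
    }
}
```

With this sketch, toMethodName("login as admin") returns loginAsAdmin. It does not guard against Java keywords or duplicate names, which a real generator would need to handle.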
For a record-at-target flow, provide the existing Java source and insertion
anchor when generating snippets. The CLI accepts
--target-source src/test/java/.../CheckoutTest.java --insert-after replayCheckout
and returns the normal generation result plus a focused insertion plan. MCP
agents can call capture_record_at_target_code_blocks to receive separate
blocks for SHAFT locator inventory/imports, action lines, and a no-edit
insertion guide.
SHAFT validates that the requested anchor is present when possible, but it never
edits the source file itself. Any change to the file is made by the calling
agent as a separately approved repository change.
All process arguments and filesystem paths are built with Java APIs
(ProcessBuilder, Path, and Files). Capture assumes no Windows or POSIX shell
and no particular path separator. Restrictive POSIX permissions are applied
when the filesystem supports them; otherwise the host filesystem's inherited
permissions are used.
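The permission fallback can be sketched with the standard java.nio.file APIs. OwnerOnlyFiles is a hypothetical helper name, and the rw------- mode is an assumption about what "restrictive" means here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermissions;

/** Hypothetical helper showing a POSIX-when-supported permission policy. */
final class OwnerOnlyFiles {
    private OwnerOnlyFiles() {}

    /** True when the path's filesystem exposes the POSIX attribute view. */
    static boolean posixSupported(Path path) {
        return path.getFileSystem().supportedFileAttributeViews().contains("posix");
    }

    /** Creates the file as owner read/write only where possible, else with inherited permissions. */
    static Path createPrivateFile(Path file) {
        try {
            if (posixSupported(file)) {
                return Files.createFile(file,
                        PosixFilePermissions.asFileAttribute(PosixFilePermissions.fromString("rw-------")));
            }
            // Non-POSIX filesystems (for example NTFS) keep whatever the parent directory grants.
            return Files.createFile(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Checking the attribute view on the path's own filesystem, rather than inspecting the operating system name, keeps the decision tied to what the storage actually supports.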
Format
Every session has a schemaVersion, safe session and browser metadata, ordered
events and checkpoints, external test-data references, a redaction summary, and
explicit extension maps. The current version is 1.0; readers migrate the
synthetic 0.9 format and reject unsupported versions with an actionable
message.
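A reader's version handling can be reduced to a three-way decision. The sketch below uses hypothetical names (VersionDecision, SchemaVersionGate) and an invented error message; only the version numbers 1.0 and 0.9 come from this page.

```java
/** Hypothetical outcome of inspecting a session's schemaVersion. */
enum VersionDecision { READ_AS_IS, MIGRATE }

final class SchemaVersionGate {
    static final String CURRENT = "1.0";
    static final String LEGACY = "0.9";

    private SchemaVersionGate() {}

    /** Accepts the current version, flags 0.9 for migration, and rejects everything else. */
    static VersionDecision check(String schemaVersion) {
        if (CURRENT.equals(schemaVersion)) {
            return VersionDecision.READ_AS_IS;
        }
        if (LEGACY.equals(schemaVersion)) {
            return VersionDecision.MIGRATE;
        }
        // Illustrative wording only; the real message text is not documented here.
        throw new IllegalArgumentException("Unsupported schemaVersion '" + schemaVersion
                + "'. Supported versions: " + CURRENT + " (current) and " + LEGACY
                + " (migrated on read).");
    }
}
```

Rejecting unknown versions up front, instead of attempting a best-effort parse, keeps a newer recording from being silently misread by an older reader.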
The event hierarchy covers:
- navigation, click, type, clear, select, check/uncheck, and upload;
- keyboard, window/tab, frame, alert, and explicit wait operations;
- explicit verification events and replay status.
ElementSnapshot retains sanitized role, accessible name, label, normalized
attributes, visibility state, and LocatorCandidate evidence. Candidate scores
are deterministic inputs based on strategy, uniqueness, visibility, stability,
and recorded signals. No model inference is used to rank locators.
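Deterministic ranking means the same evidence always yields the same order, regardless of the order in which candidates were collected. The sketch below illustrates that property with invented weights and a simplified record. CandidateSketch, CandidateRanking, and the numeric scores are assumptions, not the LocatorCandidate model or its real scoring.

```java
import java.util.Comparator;
import java.util.List;

/** Simplified, hypothetical stand-in for locator candidate evidence. */
record CandidateSketch(String strategy, String expression, int matchCount, boolean visible, boolean stable) {

    /** Invented weights: uniqueness counts most, then visibility and stability. */
    int score() {
        int score = 0;
        if (matchCount == 1) {
            score += 50;
        }
        if (visible) {
            score += 20;
        }
        if (stable) {
            score += 20;
        }
        return score;
    }
}

final class CandidateRanking {
    private CandidateRanking() {}

    /** Highest score first; ties break on strategy, then expression, so order never depends on input order. */
    static List<CandidateSketch> rank(List<CandidateSketch> candidates) {
        return candidates.stream()
                .sorted(Comparator.comparingInt(CandidateSketch::score).reversed()
                        .thenComparing(CandidateSketch::strategy)
                        .thenComparing(CandidateSketch::expression))
                .toList();
    }
}
```

The explicit tie-breakers are the important part: without them, two candidates with equal scores could swap places between runs.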
The bundled schema is:
shaft-capture/src/main/resources/schema/shaft-capture-session-1.0.schema.json
CaptureJsonCodec validates before read and write, emits stable human-readable
JSON, preserves explicit extension fields, and never publishes a partially
validated recording.
Privacy boundary
CapturePrivacyClassifier runs before values enter a CaptureSession.
Passwords, tokens, configured sensitive fields/selectors/attributes/URL
parameters, and configured value patterns produce named environment or secret
references with no original value. Ordinary typed data is externalized to
capture-data.json by default through ExternalTestDataWriter.
Upload events store a logical fixture reference, sanitized basename, media type, and size. They never retain an arbitrary absolute user path or file contents. Evidence references accept only safe relative paths. Cookies, storage, headers, page source, screenshots, and other evidence are absent unless a later collector explicitly enables a documented category.
The persisted redaction summary contains only counts and rule names. It does not contain removed values.
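The upload and evidence path rules can be illustrated with two small checks. SafePathSketch is a hypothetical helper. The allowed character set and the string-based absolute-path test are assumptions chosen so the checks behave the same on every operating system.

```java
import java.nio.file.Path;

/** Hypothetical path hygiene checks; not the classifier's real rules. */
final class SafePathSketch {
    private SafePathSketch() {}

    /** Keeps only the file name and replaces characters outside a conservative ASCII set. */
    static String sanitizedBasename(Path uploaded) {
        Path name = uploaded.getFileName();
        if (name == null) {
            throw new IllegalArgumentException("Upload path has no file name");
        }
        return name.toString().replaceAll("[^A-Za-z0-9._-]", "_");
    }

    /** Accepts references with no root, no drive letter, and no parent-directory segments. */
    static boolean isSafeRelative(String reference) {
        if (reference == null || reference.isBlank()) {
            return false;
        }
        // Checked as text so that a Windows-style path is rejected on POSIX hosts too.
        if (reference.startsWith("/") || reference.startsWith("\\") || reference.matches("^[A-Za-z]:.*")) {
            return false;
        }
        for (String segment : reference.split("[/\\\\]")) {
            if (segment.equals("..")) {
                return false;
            }
        }
        return true;
    }
}
```

For example, an upload chosen from a user's home directory is reduced to its sanitized file name, and an evidence reference such as ../secrets.txt is rejected before it can be stored.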
Lifecycle
CaptureSessionStore provides thread-safe start, append, checkpoint,
interruption, stop, and read operations. Each update serializes and validates a
complete snapshot before an atomic replacement. In-progress or crashed sessions
remain readable with status INCOMPLETE; a normal stop records COMPLETED and
an end timestamp.
Example:
var session = CaptureSession.start(
    "checkout-recording",
    Instant.now(),
    browserMetadata);
var store = new CaptureSessionStore(Path.of("recordings/checkout.json"));
store.start(session);
store.append(captureEvent);
store.stop(Instant.now());
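The "complete snapshot, then atomic replacement" step is commonly done by writing a temporary file in the same directory and moving it over the target. The sketch below shows that pattern with standard Java APIs. AtomicSnapshotWriter is a hypothetical name, and nothing here is taken from CaptureSessionStore's source.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical write-then-rename helper; assumes the JSON was validated by the caller. */
final class AtomicSnapshotWriter {
    private AtomicSnapshotWriter() {}

    static void replace(Path target, String validatedJson) {
        Path dir = target.toAbsolutePath().getParent();
        try {
            Files.createDirectories(dir);
            // Same directory as the target, so the move stays on one filesystem.
            Path temp = Files.createTempFile(dir, "capture-", ".tmp");
            try {
                Files.writeString(temp, validatedJson, StandardCharsets.UTF_8);
                try {
                    // Whether an existing target is replaced is implementation specific;
                    // common platforms replace it.
                    Files.move(temp, target, StandardCopyOption.ATOMIC_MOVE);
                } catch (AtomicMoveNotSupportedException e) {
                    // Fallback is not atomic; shown only so that the sketch still completes.
                    Files.move(temp, target, StandardCopyOption.REPLACE_EXISTING);
                }
            } finally {
                // No-op after a successful move; cleans up if writing or moving failed.
                Files.deleteIfExists(temp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because readers only ever see the old file or the fully written new one, a crash in the middle of an update leaves the previous snapshot readable.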
Deterministic TestNG generation
Generate a test, SHAFT JSON test data, and a deterministic report with the Capture generation command documented on the Connect shaft-mcp page.
The default output layout is:
generated-tests/
  src/test/java/generated/capture/<SessionName>Test.java
  src/test/resources/testDataFiles/<session-name>-test.json
  target/shaft-capture/generation-report.json
  target/shaft-capture/capture-review.json
  target/shaft-capture/capture-workbench.html
  target/shaft-capture/control-flow-preview.json
Generation prefers locator families in this order: accessibility, label,
test ID, stable ID/name, CSS, then XPath. The report records the score
contribution from uniqueness, visibility, interactability, semantic match,
volatility, frame/shadow context, and replay evidence, plus ranked fallbacks.
Stable user-provided locators pinned from the recorder overlay can outrank the
deterministic default. The review file summarizes deterministic readiness,
blockers, risks, typed findings, and next suggestions; MCP generation results
expose the same path as reviewPath and return the deterministic review
warnings in the tool result. Static review
findings cover brittle absolute or index-heavy locators, missing post-navigation
or post-submit assertions, fixed-duration waits, and sensitive JSON-backed test
data. When replay fails and a SHAFT trace exists, Capture maps the failure back
to the generated step and failed trace action, and flags failing network/API
calls as candidates for HTTP contract replay. The workbench HTML is a local
review UI for building record/checkpoint commands, editing generated source
through the browser file picker or download fallback, and reviewing the
Playwright codegen feature map beside the generated code.
Add --enable-fallback-locators when generating WebDriver replay code to make
the generated test try ranked captured alternatives before failing a target
lookup. The selected locator remains the first candidate; fallback candidates
are accepted only when they resolve to one matching element with the recorded
tag and accessible name, and interaction fallbacks must be visible and enabled.
When a fallback is used, the generated helper writes a SHAFT report log entry:
Capture fallback locator used for username-input: By.cssSelector: #username -> By.cssSelector: [name="username"]
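The acceptance rule for fallbacks can be sketched independently of any driver. In the sketch below the probe function stands in for a live lookup, and ProbedElement and FallbackLocatorSketch are hypothetical names. The generated helper's real signature is not shown on this page.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

/** Assumed minimal view of an element returned by a live lookup. */
record ProbedElement(String tag, String accessibleName, boolean visible, boolean enabled) {}

final class FallbackLocatorSketch {
    private FallbackLocatorSketch() {}

    /**
     * Walks the ranked locators (selected locator first) and returns the first one that
     * resolves to exactly one visible, enabled element with the recorded tag and name.
     */
    static Optional<String> resolve(List<String> rankedLocators,
                                    Function<String, List<ProbedElement>> probe,
                                    String recordedTag,
                                    String recordedName) {
        for (String locator : rankedLocators) {
            List<ProbedElement> matches = probe.apply(locator);
            if (matches.size() != 1) {
                continue; // no match or multi-match: never guess
            }
            ProbedElement only = matches.get(0);
            boolean sameTarget = only.tag().equals(recordedTag)
                    && only.accessibleName().equals(recordedName);
            if (sameTarget && only.visible() && only.enabled()) {
                return Optional.of(locator);
            }
        }
        return Optional.empty();
    }
}
```

An empty result corresponds to failing the target lookup. The sketch applies the visibility and enabled checks to every candidate, which matches the rule for interaction fallbacks.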
Use --control-flow-preview to write deterministic suggestions for common
non-linear browser journeys without changing the generated replay. Capture
flags adjacent identical action groups, likely optional modal or banner close
actions, and recovery-like steps after failed or skipped recorded interactions.
The same suggestions appear in generation-report.json under
controlFlowSuggestions and in MCP warning results as review/CONTROL_FLOW
findings.
capture generate --session recordings/checkout.json --control-flow-preview

capture generate --session recordings/checkout.json \
  --apply-control-flow-preview generated-tests/target/shaft-capture/control-flow-preview.json
Default generation stays linear. Applying a reviewed preview only changes approved optional close actions by adding an if-displayed guard:
if (isCaptureElementDisplayed(COOKIE_CLOSE_BUTTON_LOCATOR)) {
    driver.element().click(COOKIE_CLOSE_BUTTON_LOCATOR);
}
Repeated groups and recovery paths remain review suggestions until the user marks explicit flow checkpoints or edits the generated test.
Example review finding:
{
  "category": "LOCATOR",
  "severity": "WARNING",
  "summary": "Brittle XPATH locator selected for pay-button: /html/body/div[3]/form/button[2].",
  "evidenceIds": ["event-4"],
  "recommendation": "Prefer semantic locator text \"Pay now\" when unique."
}
Example status payload:
{
  "state": "ACTIVE",
  "eventCount": 12,
  "readiness": "RISKY",
  "warnings": ["Step 7 uses generated positional CSS locator for pay-button."]
}
Ordinary values are copied from the recording's external JSON into the
generated test-data file. Secret and sensitive references become required
environment variables and are typed securely. Uploads remain relative fixture
references under src/test/resources. Missing data is reported as required
user input. Captured credentials, cookies, authorization values, and absolute
personal paths fail the privacy scan instead of entering generated artifacts.
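A required-input lookup might behave like the sketch below. The generated requiredData helper appears in the replay example on this page, but its body is not shown, so everything here is an assumption: the CAPTURE_ prefix, the upper snake case conversion, the extra environment parameter (added so the lookup can be exercised without real variables), and the error text.

```java
import java.util.Locale;
import java.util.Map;

/** Hypothetical lookup for secret references; not the generated helper's real body. */
final class RequiredDataSketch {
    private RequiredDataSketch() {}

    /** Assumed naming convention: CAPTURE_ prefix plus the key in upper snake case. */
    static String envName(String key) {
        return "CAPTURE_" + key.trim().replaceAll("[^A-Za-z0-9]+", "_").toUpperCase(Locale.ROOT);
    }

    /** Returns the value or fails with a message that names the variable to set. */
    static String requiredData(String key, Map<String, String> environment) {
        String name = envName(key);
        String value = environment.get(name);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException("Missing required input: set environment variable " + name);
        }
        return value;
    }
}
```

A test would call requiredData("username", System.getenv()). Failing fast with the variable name keeps the secret itself out of both the generated source and the error output.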
Every generated class creates a fresh driver in @BeforeMethod and calls
driver.quit() from @AfterMethod(alwaysRun = true). Only explicit
VerificationEvent records become assertions. An ASSERTION checkpoint must
point at a verification event; unsupported steps fail generation with their
event IDs and remediation. A matched FLOW_START/FLOW_END pair does not
infer abstractions from similar steps; only the explicitly marked events move
into a reusable helper method.
Compilation is enabled by default. Add --replay to run the compiled TestNG
class in an isolated process and require populated, passing Allure result
files. Existing source, data, report, or preview files are never replaced
unless --overwrite is supplied. MCP replay code follows the selected backend:
WebDriver tools keep SHAFT.GUI.WebDriver and Playwright tools generate
SHAFT.GUI.Playwright.
AI enrichment is optional and uses two phases for native CLI users: preview with explicit processing approval, then apply the reviewed fingerprinted preview. Use the canonical Capture command reference on Connect shaft-mcp when running it.
The provider may suggest Java names and captured-state assertions only. It cannot replace deterministic locators. Preview output is schema-validated and privacy-scanned; apply rejects stale fingerprints, invalid identifiers, unknown events, and assertions that contradict captured state. Accepted changes are compiled and replayed again.
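Stale-fingerprint rejection can be pictured as hashing the session at preview time and comparing at apply time. The sketch assumes a SHA-256 digest over the serialized session. The class name PreviewFingerprintSketch, the hash choice, and the message are illustrative, not the documented format.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

/** Hypothetical fingerprint check between the preview and apply phases. */
final class PreviewFingerprintSketch {
    private PreviewFingerprintSketch() {}

    /** Hex-encoded SHA-256 of the serialized session. */
    static String fingerprint(String sessionJson) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(sessionJson.getBytes(StandardCharsets.UTF_8));
            return HexFormat.of().formatHex(digest);
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }

    /** Fails when the session changed after the preview was produced. */
    static void requireFresh(String previewFingerprint, String currentSessionJson) {
        if (!fingerprint(currentSessionJson).equals(previewFingerprint)) {
            throw new IllegalStateException(
                    "Stale preview: the session changed after the preview was generated. Re-run the preview.");
        }
    }
}
```

This relies on stable serialization: the same session must always produce the same bytes, which is one reason a codec that emits stable JSON matters.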
When an MCP client calls Capture or Doctor AI-enabled tools, SHAFT treats that
tool call as the agent approval boundary for sharing the already redacted local
evidence with the calling agent. If no configured provider/API key is available,
MCP results still include agent handoff blocks so the MCP client can use its own
LLM and repository context. Native terminal commands keep the explicit provider
and --allow-local-ai or --allow-remote-ai approval requirements.
Run the focused suite with:
mvn -pl shaft-capture -am test
The real-browser recording and generated-replay suites are opt-in:
mvn -pl shaft-capture -am test \
  -DincludeCaptureBrowserE2E=true \
  -Dtest=ManagedCaptureRecorderBrowserTest,CaptureGeneratedReplayBrowserTest