The fastest, most stable way to use the TypeScript SDK is deterministic-first, AI-fallback: every step tries a native Appium block first, and only if that block throws does the SDK fall back to running a natural-language instruction with AI.
The pattern combines two properties:
- Speed and determinism when selectors are stable: native Appium runs with no AI round-trip.
- Resilience when the UI shifts, a popup appears, or a selector breaks: the step heals itself by falling back to AI instead of failing the test.
The appiumHandler parameter
aiExecute accepts an optional appiumHandler: a function that receives the live driver and runs your native Appium / WebdriverIO code.
await gptDriver.aiExecute(
  "Tap the Sign In button", // AI fallback instruction
  {
    appiumHandler: async (driver) => { // native code, tried first
      await driver.$("~sign-in-button").click();
    },
  }
);
How it runs
When you pass an appiumHandler, each call to aiExecute does the following:
- Runs the handler first against the live driver. Fast, deterministic, no AI call.
- Logs the step to the GPT Driver dashboard (a screenshot plus the handler’s source) so the native step is visible alongside your AI steps for debugging.
- On success, returns immediately. The AI is never called.
- On failure (the handler throws), the SDK logs a warning and runs the natural-language instruction with AI to recover:
Custom Appium handler failed, falling back to AI execution
That message is greppable in your run logs, so you can see exactly which steps fell back to AI.
The appiumHandler and the instruction should express the same intent. The handler is the fast path; the instruction is the recovery path that must achieve the same result when the handler can’t.
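The control flow described above can be modelled in a few lines. This is only an illustrative sketch, not the SDK's source: `executeWithFallback`, `StepOutcome`, and the `runAi` callback are hypothetical names, the dashboard logging step is omitted, and letting an AI failure propagate to the caller is an assumption.

```typescript
// Hypothetical model of the handler-first flow; NOT the SDK's actual implementation.
type Handler<D> = (driver: D) => Promise<void>;

interface StepOutcome {
  path: "native" | "ai";   // which path completed the step
  handlerError?: unknown;  // set only when the handler threw
}

async function executeWithFallback<D>(
  driver: D,
  handler: Handler<D> | undefined,
  runAi: () => Promise<void>, // stands in for the natural-language execution
  warn: (message: string) => void = console.warn,
): Promise<StepOutcome> {
  if (handler) {
    try {
      await handler(driver); // fast path: no AI call
      return { path: "native" };
    } catch (error) {
      // Same warning text the SDK is documented to log on fallback.
      warn("Custom Appium handler failed, falling back to AI execution");
      await runAi(); // assumption: an error here propagates to the caller
      return { path: "ai", handlerError: error };
    }
  }
  // No handler supplied: the instruction runs with AI directly.
  await runAi();
  return { path: "ai" };
}
```

The point of the model is that the AI path is reached only through a thrown error, so a handler that swallows its own exceptions would never trigger the fallback.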
A reusable step helper
In practice, wrap the pattern in a small helper so every step reads as “native code, with an AI description to fall back to”:
async function step(title, appiumHandler, aiPrompt) {
  console.log(`▶ ${title}`);
  await gptDriver.aiExecute(aiPrompt, { appiumHandler });
}
Now each step is one line of intent, a deterministic block, and a natural-language fallback:
// Native tap runs first; if the selector is gone, AI taps the button instead.
await step(
  "tap Play lesson",
  async () => { await $("~Play lesson").click(); },
  `In the action sheet, tap "Play lesson".`
);
// Native assertion runs first; AI verifies the same thing if it throws.
await step(
  "verify lesson launched",
  async () => { await $("~Saved lines").waitForDisplayed({ reverse: true, timeout: 25000 }); },
  `Verify a lesson launches full-screen and replaces the list. Fail if no lesson appears.`
);
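In a TypeScript spec, the same helper could be typed and built from a factory so it does not depend on a module-level `gptDriver` variable. This is a sketch under assumptions: `AiExecutor`, `AppiumHandler`, and `createStep` are hypothetical names, and the interface covers only the `aiExecute(prompt, { appiumHandler })` call shape used on this page, not the SDK's full type definitions.

```typescript
// Assumed minimal shape of the object that runs steps; the real SDK types may differ.
type AppiumHandler = (driver: unknown) => Promise<void>;

interface AiExecutor {
  aiExecute(
    prompt: string,
    options?: { appiumHandler?: AppiumHandler },
  ): Promise<unknown>;
}

// Returns a step() function bound to one executor and one logger.
function createStep(
  executor: AiExecutor,
  log: (line: string) => void = console.log,
) {
  return async function step(
    title: string,
    appiumHandler: AppiumHandler,
    aiPrompt: string,
  ): Promise<void> {
    log(`▶ ${title}`); // same marker line as the plain helper
    await executor.aiExecute(aiPrompt, { appiumHandler });
  };
}
```

Assuming the real driver instance is compatible with this shape, usage would be `const step = createStep(gptDriver);` followed by the same `await step(title, handler, prompt)` calls. Passing the executor in also makes the helper easy to exercise with a fake in unit tests.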
Worked example
Here is a full flow built this way. Native Appium drives every step; AI only engages when a native block throws (a moved selector, a slow load, an unexpected popup):
test/specs/saved-lines.spec.js
import GptDriver from "gpt-driver-node";

const BUNDLE_ID = "com.example.app";

describe("Saved Lines", () => {
  let gptDriver;

  beforeEach(async function () {
    const { protocol, hostname, port, path } = browser.options;
    gptDriver = new GptDriver({
      apiKey: browser.options.gptDriverApiKey,
      driver: browser,
      serverConfig: { url: `${protocol}://${hostname}:${port}${path}` },
      cachingMode: "INTERACTION_REGION",
      testId: this.currentTest?.fullTitle(),
      appId: BUNDLE_ID,
    });
  });

  // Deterministic-first, AI-fallback. The handler runs first (fast, no AI);
  // on failure the SDK runs the natural-language prompt with AI.
  async function step(title, appiumHandler, aiPrompt) {
    console.log(`▶ ${title}`);
    await gptDriver.aiExecute(aiPrompt, { appiumHandler });
  }

  it("plays a lesson and unsaves the line", async () => {
    await step(
      "open the action sheet on the first saved line",
      async () => { await $("~ellipsis").click(); await $("~Unsave").waitForDisplayed({ timeout: 5000 }); },
      `Tap the three-dot "More" menu on the first saved line. An action sheet with "Play lesson" and "Unsave" should appear.`
    );
    await step(
      "tap Play lesson",
      async () => { await $("~Play lesson").click(); },
      `In the action sheet, tap "Play lesson".`
    );
    await step(
      "verify the lesson launched",
      async () => { await $("~Saved lines").waitForDisplayed({ reverse: true, timeout: 25000 }); },
      `Verify a lesson launches full-screen and replaces the Saved Lines list. Fail if no lesson appears.`
    );
    await step(
      "exit back to Saved Lines",
      async () => { await $("~Back").click(); await $("~Saved lines").waitForDisplayed({ timeout: 6000 }); },
      `Exit the lesson (tap the back, close, or X control, confirming to quit if asked) to return to the Saved Lines screen.`
    );
    await step(
      "unsave the line",
      async () => { await $("~ellipsis").click(); await $("~Unsave").click(); },
      `Open the "More" menu on the first saved line again and tap "Unsave".`
    );
    await gptDriver.setSessionSucceeded();
  });
});
Falling back to AI for reads, too
The same try-native-first idea works for data you read off the screen. Try a native selector, and fall back to extract (or assert) when the native read fails:
let firstLine = "";
try {
  firstLine = await $("~saved-line-text").getText();
} catch (e) {
  const extracted = await gptDriver.extract(["firstSavedLineText"]);
  firstLine = extracted.firstSavedLineText ?? "";
}
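If several reads follow this shape, the try/catch can be factored into a small generic helper. The sketch below is one possible form, not part of the SDK: `readWithFallback` and its `onFallback` callback are hypothetical names introduced for illustration.

```typescript
// Hypothetical helper: try a native read first, use the AI-backed read only if it throws.
async function readWithFallback<T>(
  nativeRead: () => Promise<T>,
  aiRead: () => Promise<T>,
  onFallback: (error: unknown) => void = () => {}, // optional hook, e.g. for logging
): Promise<T> {
  try {
    return await nativeRead();
  } catch (error) {
    onFallback(error);
    return aiRead();
  }
}

// Usage sketch with the calls from the snippet above (not executed here):
// const firstLine = await readWithFallback(
//   () => $("~saved-line-text").getText(),
//   async () =>
//     (await gptDriver.extract(["firstSavedLineText"])).firstSavedLineText ?? "",
// );
```

Keeping the fallback hook explicit lets a suite record which reads needed AI, in the same spirit as the warning logged for `aiExecute` fallbacks.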
When to reach for AI directly
Skip the handler and call aiExecute(prompt) with no native code when:
- The target is a WebView, canvas, map, or other non-native element that selectors can’t reach reliably. See non-native elements.
- The step is inherently visual or fuzzy (“dismiss whatever popup is on screen”, “tap the cheapest room option”).
- You are prototyping and don’t have stable selectors yet. Add appiumHandler blocks later to speed up the steps that run often.
Tips
- Keep handlers fast and let them throw early. Use short waitFor* timeouts inside handlers so a missing element falls back to AI quickly instead of burning the full default timeout.
- Pair the handler and the prompt. They should do the same thing; the prompt is what runs when the handler can’t.
- Combine with caching. Set cachingMode so the AI fallback steps that do run also get cached. See the reference.
- Review fallbacks on the dashboard. Every native step is logged with a screenshot, and every fallback emits the greppable warning above, so you can see where the suite is leaning on AI and tighten those selectors.
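To see where a suite leans on AI without opening the dashboard, the warning can be counted per step from a captured run log. The following is a minimal sketch that assumes the log is available as plain text, that each step prints its `▶ title` line (as the `step` helper on this page does) before any warning it causes, and that the warning text appears verbatim on its own line. `fallbacksByStep` is a hypothetical name.

```typescript
// Warning text quoted from the SDK behaviour described on this page.
const FALLBACK_WARNING =
  "Custom Appium handler failed, falling back to AI execution";

// Counts fallback warnings, attributing each one to the most recent "▶ title" line.
function fallbacksByStep(logText: string): Record<string, number> {
  const counts: Record<string, number> = {};
  let currentStep = "(unknown step)"; // used if a warning precedes any step marker
  for (const line of logText.split(/\r?\n/)) {
    const marker = line.indexOf("▶ ");
    if (marker !== -1) {
      currentStep = line.slice(marker + 2).trim();
    } else if (line.indexOf(FALLBACK_WARNING) !== -1) {
      counts[currentStep] = (counts[currentStep] || 0) + 1;
    }
  }
  return counts;
}
```

Steps with the highest counts are the first candidates for tighter selectors or shorter handler timeouts.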