The TypeScript SDK exposes a single GptDriver class. Import the default export and construct it with your WebdriverIO browser:
import GptDriver from "gpt-driver-node";

const gptDriver = new GptDriver({
    apiKey: "YOUR_API_KEY",
    driver: browser,
    serverConfig: { url: "http://127.0.0.1:4723/" },
});

Constructor options

new GptDriver(config: GptDriverConfig)
Option                      Type                 Description
apiKey                      string               Required. Your GPT Driver API key.
driver                      WebDriver | Browser  An existing WebdriverIO / Appium session. The SDK runs on this session.
serverConfig.url            URL | string         The Appium server URL. Required whenever you pass a driver.
serverConfig.device         object               { platform, deviceName, platformVersion }. Required only when you do not pass a driver and want the SDK to start its own session.
cachingMode                 CachingMode          Default caching for AI steps. One of "NONE", "FULL_SCREEN", "INTERACTION_REGION". Defaults to "NONE".
testId                      string               A label for this run, shown on the dashboard.
appId                       string               App identifier (appPackage on Android / bundleId on iOS). Auto-read from the session capabilities when omitted; set it to override.
additionalUserContext       string               Free-text guidance passed to the AI on every AI step.
maxWaitForStableScreenSecs  number               How long to wait for the screen to settle before an AI step.
organisationId              string               Organisation identifier used for cache scoping.
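
The options above imply a few rules about how driver and serverConfig combine. The check below is a minimal sketch of those rules as a pre-flight helper. ConfigSketch and configProblems are hypothetical names local to this example, not SDK exports, and the SDK may perform its own validation.

```typescript
// Hypothetical local mirror of a few documented options; not the SDK's GptDriverConfig type.
// The documented url type is URL | string; only string is modelled here to keep the sketch small.
interface ConfigSketch {
    apiKey?: string;
    driver?: unknown;
    serverConfig?: {
        url?: string;
        device?: { platform: string; deviceName: string; platformVersion: string };
    };
}

// Returns one human-readable problem per documented rule the config breaks.
function configProblems(config: ConfigSketch): string[] {
    const problems: string[] = [];
    if (!config.apiKey) {
        problems.push("apiKey is required");
    }
    if (config.driver !== undefined && !config.serverConfig?.url) {
        problems.push("serverConfig.url is required when a driver is passed");
    }
    if (config.driver === undefined && !config.serverConfig?.device) {
        problems.push("serverConfig.device is required when no driver is passed");
    }
    return problems;
}
```

An empty array means the config satisfies the rules listed in the table; anything else can be surfaced before the first AI command runs.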

AI commands

aiExecute

Execute a natural-language instruction on the current screen. Optionally pass an appiumHandler to run native Appium code first and only fall back to AI if it throws (see Deterministic execution).
aiExecute(command: string, options?: {
    appiumHandler?: (driver) => Promise<any>;
    cachingMode?: CachingMode;
}): Promise<void>

// Options-object form
aiExecute(options: {
    command: string;
    appiumHandler?: (driver) => Promise<any>;
    cachingMode?: CachingMode;
}): Promise<void>

// Natural language only
await gptDriver.aiExecute("Tap the login button and wait for the home screen");

// Native-first, AI fallback
await gptDriver.aiExecute("Tap the Sign In button", {
    appiumHandler: async (driver) => { await driver.$("~sign-in-button").click(); },
});

// Override caching for this step
await gptDriver.aiExecute("Navigate to the settings page", { cachingMode: "FULL_SCREEN" });
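
The native-first behaviour described for appiumHandler amounts to a try/catch around the handler. The sketch below illustrates that control flow only. It is not the SDK's implementation, and nativeFirst and aiFallback are hypothetical names used for illustration.

```typescript
// Conceptual sketch only: run the deterministic handler first,
// and take the AI path only if the handler throws.
async function nativeFirst<D>(
    driver: D,
    appiumHandler: (driver: D) => Promise<unknown>,
    aiFallback: () => Promise<void>,
): Promise<"native" | "ai"> {
    try {
        await appiumHandler(driver);
        return "native";
    } catch {
        // The handler failed (for example, the element was not found), so fall back.
        await aiFallback();
        return "ai";
    }
}
```

The practical consequence is that a handler which fails without throwing never triggers the fallback, so handlers should throw when the native path does not succeed.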

assert

Verify a condition without taking any action. Resolves if the condition holds, throws otherwise.
assert(assertion: string, options?: { cachingMode?: CachingMode }): Promise<void>

await gptDriver.assert(`The total price is ${price}`);

assertBulk

Check multiple conditions in one call. Throws if any condition fails.
assertBulk(assertions: string[]): Promise<void>

await gptDriver.assertBulk([
    `The total price is ${price}`,
    "The VAT is calculated correctly",
    "The delivery date is shown in the format dd.mm.YYYY",
]);

checkBulk

Check multiple conditions in one call. Returns an object with a true / false result per condition instead of throwing.
checkBulk(conditions: string[]): Promise<Record<string, boolean>>

const results = await gptDriver.checkBulk([
    "The total price is shown",
    "The VAT is calculated correctly",
]);
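
Because checkBulk returns results instead of throwing, the caller decides what to do with failures. The helper below is a small sketch that lists the conditions that did not hold. failedConditions is a hypothetical name, and the sketch assumes the result object is keyed by the condition text, which the signature above does not spell out.

```typescript
// Collects the keys whose result is false, in the order the result object lists them.
// Assumption: each key of the result object identifies one checked condition.
function failedConditions(results: Record<string, boolean>): string[] {
    return Object.entries(results)
        .filter(([, passed]) => !passed)
        .map(([condition]) => condition);
}
```

Passing the results value from the example above to failedConditions gives a list that can be logged or attached to a test report.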

extract

Extract information from the current screen. Returns an object keyed by the names you requested.
extract(keys: string[]): Promise<Record<string, any>>

const flight = await gptDriver.extract([
    "departureAirportCode",
    "destinationAirportCode",
    "departureTime",
]);

await gptDriver.assertBulk([
    `departure airport is ${flight.departureAirportCode}`,
    `destination airport is ${flight.destinationAirportCode}`,
]);
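
Since extract resolves to Record<string, any>, it can be worth checking that every requested key came back before using the values. The sketch below does that. missingKeys is a hypothetical helper, and treating undefined, null, and the empty string as missing is an assumption, because this page does not describe how a value that cannot be found is represented.

```typescript
// Returns the requested keys that have no usable value in the extracted object.
// Assumption: undefined, null, and "" all mean the value was not found.
function missingKeys(extracted: Record<string, unknown>, keys: readonly string[]): string[] {
    return keys.filter((key) => {
        const value = extracted[key];
        return value === undefined || value === null || value === "";
    });
}
```

Calling missingKeys(flight, ["departureAirportCode", "destinationAirportCode", "departureTime"]) after the extract call above would return an empty array when all three values are present.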

openDeepLinkUrl

Open a deep link in the app under test.
openDeepLinkUrl(params: { url: string; package?: string; bundleId?: string }): Promise<void>

// Android
await gptDriver.openDeepLinkUrl({ url: "myapp://product/123", package: "com.example.app" });

// iOS
await gptDriver.openDeepLinkUrl({ url: "myapp://product/123", bundleId: "com.example.app" });
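
In a suite that runs on both platforms, the parameter object can be built from a single app identifier. This is a minimal sketch. Platform, DeepLinkParams, and deepLinkParams are hypothetical names for this example, and the parameter shape is copied from the signature above.

```typescript
type Platform = "Android" | "iOS";

// Same shape as the params object in the openDeepLinkUrl signature.
type DeepLinkParams = { url: string; package?: string; bundleId?: string };

// Puts the app identifier in the field that matches the platform.
function deepLinkParams(platform: Platform, url: string, appId: string): DeepLinkParams {
    return platform === "Android"
        ? { url, package: appId }
        : { url, bundleId: appId };
}
```

The result can be passed straight through, for example gptDriver.openDeepLinkUrl(deepLinkParams("iOS", "myapp://product/123", "com.example.app")).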

Session lifecycle

The session starts automatically on the first AI command. Report the outcome at the end of a test so it is recorded on the dashboard.
setSessionSucceeded(): Promise<void>
setSessionFailed(): Promise<void>
setSessionStatus(status: "succeeded" | "failed"): Promise<void>

await gptDriver.setSessionSucceeded();
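
To make sure a failing test is also reported, the test body can be wrapped so that either outcome reaches the dashboard. The wrapper below is one possible sketch. SessionReporter and withSessionReporting are hypothetical names, and the interface lists only the two documented methods the helper calls.

```typescript
// Structural type covering only the two documented methods this helper uses.
interface SessionReporter {
    setSessionSucceeded(): Promise<void>;
    setSessionFailed(): Promise<void>;
}

// Runs the test body, reports the outcome, and rethrows any failure
// so the test runner still sees the original error.
async function withSessionReporting(
    reporter: SessionReporter,
    testBody: () => Promise<void>,
): Promise<void> {
    try {
        await testBody();
    } catch (error) {
        await reporter.setSessionFailed();
        throw error;
    }
    await reporter.setSessionSucceeded();
}
```

A GptDriver instance satisfies SessionReporter structurally, so a test could call withSessionReporting(gptDriver, async () => { /* AI commands */ }). Reporting success outside the try block means an error raised while reporting success is not then reported as a test failure.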

Caching

Caching reduces the number of AI calls when the same steps run repeatedly. Set a default mode in the constructor, or override it per call.
Mode                Description
NONE                No caching (default).
FULL_SCREEN         Cache hit only if the full screen matches a previous successful execution.
INTERACTION_REGION  Cache hit if the interaction region matches (for example, the area around a tapped element). Recommended.

// Default for the whole session
const gptDriver = new GptDriver({
    // ...apiKey, driver and serverConfig as shown earlier
    cachingMode: "INTERACTION_REGION",
});

// Override for a single step
await gptDriver.aiExecute("Navigate to settings", { cachingMode: "FULL_SCREEN" });