# Setup

> Add the GPT Driver SDK to an existing WebdriverIO + Appium test suite (TypeScript / JavaScript)

The TypeScript SDK (`gpt-driver-node`) wraps your existing **WebdriverIO + Appium** session. You keep writing normal Appium code, and the SDK adds AI commands (`aiExecute`, `assert`, `extract`, ...) on top of the same `browser` instance. Nothing about your existing setup has to change.

<Note>
  This SDK is for engineers who already run Appium tests in code. If you are looking for the no-code recorder and cloud runner, see [QA Studio](/qa-studio/overview) instead.
</Note>

## Prerequisites

A working WebdriverIO + Appium project. If you are starting from scratch, install Appium and the platform drivers:

```bash
# Install Appium
npm install -g appium

# Install drivers
appium driver install xcuitest      # iOS
appium driver install uiautomator2  # Android
```

A typical WebdriverIO project also has these dev dependencies:

```bash
npm install -D @wdio/cli @wdio/local-runner @wdio/mocha-framework @wdio/spec-reporter
```

## Installation

Add the SDK to your project:

```bash
npm install gpt-driver-node
```

## Configure WebdriverIO

Your `wdio.conf.js` (or `.ts`) stays a standard WebdriverIO config. Two things matter for the SDK:

1. Your Appium **capabilities** (the app and device under test).
2. Where you keep your **GPT Driver API key**. A convenient pattern is a custom top-level field so every spec can read it from `browser.options`:

```js title="wdio.conf.js"
export const config = {
    runner: 'local',
    protocol: 'http',
    hostname: '127.0.0.1',
    port: 4723,
    path: '/',

    // Your GPT Driver API key, read from the GPT_DRIVER_API_KEY environment variable
    gptDriverApiKey: process.env.GPT_DRIVER_API_KEY,

    specs: ['./test/**/*.spec.js'],
    maxInstances: 1,

    capabilities: [
        {
            platformName: 'Android',
            'appium:automationName': 'UiAutomator2',
            // 'appium:app': '/path/to/app.apk',
            // 'appium:deviceName': 'Pixel 7',
        },
    ],

    framework: 'mocha',
    mochaOpts: { ui: 'bdd', timeout: 60000 * 10 },
};
```

<Note>
  The long Mocha `timeout` (`60000 * 10` ms, which is 10 minutes per test) matters: AI fallback steps can take several seconds, so keep the per-test timeout generous.
</Note>
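
If `GPT_DRIVER_API_KEY` is not exported, `process.env.GPT_DRIVER_API_KEY` is `undefined` and so is `gptDriverApiKey`. A minimal sketch of a fail-fast check follows; `requireApiKey` is a hypothetical helper written for this page, not part of `gpt-driver-node` or WebdriverIO:

```javascript
// Hypothetical helper (not part of gpt-driver-node or WebdriverIO).
// Returns the trimmed API key from an env-like object, or throws a clear error.
function requireApiKey(env = process.env) {
    const key = (env.GPT_DRIVER_API_KEY ?? "").trim();
    if (key === "") {
        throw new Error("GPT_DRIVER_API_KEY is not set; export it before running wdio");
    }
    return key;
}
```

In `wdio.conf.js`, `gptDriverApiKey: requireApiKey()` could then replace the direct `process.env` read, so a missing key stops the run at config load time.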

## Wire the SDK into a spec

Construct a `GptDriver` and hand it the live WebdriverIO `browser`. The SDK attaches to that session and reuses it. A `beforeEach` keeps one instance per test:

```js title="test/specs/login.spec.js"
import GptDriver from "gpt-driver-node";

describe("Login", () => {
    let gptDriver;

    beforeEach(async function () {
        const { protocol, hostname, port, path } = browser.options;
        const baseUrl = `${protocol}://${hostname}:${port}${path}`;

        gptDriver = new GptDriver({
            apiKey: browser.options.gptDriverApiKey,
            driver: browser,                       // the existing WebdriverIO session
            serverConfig: { url: baseUrl },        // the Appium server URL
            cachingMode: "INTERACTION_REGION",     // "NONE" | "FULL_SCREEN" | "INTERACTION_REGION"
            testId: this.currentTest?.fullTitle(), // shows up on the dashboard
            appId: "com.example.app",              // appPackage (Android) / bundleId (iOS)
        });
    });

    it("logs in", async () => {
        await gptDriver.aiExecute("Tap the login button and wait for the home screen");
        await gptDriver.assert("The home screen is displayed with a welcome message");
        await gptDriver.setSessionSucceeded();
    });
});
```
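
The `baseUrl` template string in `beforeEach` assumes that `path` starts with a slash and that every field is set. As a sketch under those assumptions, a small helper could normalize the value before it is passed as `serverConfig.url`. `appiumUrlFrom` is a hypothetical name, and its defaults simply mirror the config shown earlier:

```javascript
// Hypothetical helper: builds the Appium server URL from WebdriverIO-style options.
// The defaults mirror the wdio.conf.js on this page; they are assumptions, not SDK behavior.
function appiumUrlFrom({ protocol = "http", hostname = "127.0.0.1", port = 4723, path = "/" } = {}) {
    // Make sure the path is absolute so it cannot fuse with the port.
    const normalizedPath = path.startsWith("/") ? path : `/${path}`;
    return new URL(normalizedPath, `${protocol}://${hostname}:${port}`).toString();
}
```

Inside the hook, `serverConfig: { url: appiumUrlFrom(browser.options) }` would take the place of the manual string concatenation.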

### What each option does

| Option                  | Required | Description                                                                                                               |
| ----------------------- | -------- | ------------------------------------------------------------------------------------------------------------------------- |
| `apiKey`                | yes      | Your GPT Driver API key.                                                                                                  |
| `driver`                | yes\*    | An existing WebdriverIO / Appium `browser`. The SDK runs on this session.                                                 |
| `serverConfig.url`      | yes      | The Appium server URL. Required whether or not you pass a `driver`.                                                       |
| `cachingMode`           | no       | Default caching for AI steps. Defaults to `"NONE"`.                                                                       |
| `testId`                | no       | A label for this run, shown on the dashboard.                                                                             |
| `appId`                 | no       | App identifier (`appPackage` / `bundleId`); auto-read from the session when omitted.                                      |
| `additionalUserContext` | no       | Free-text guidance passed to the AI on every AI step (for example, `"When asked about Location Permissions, grant it."`). |

<Note>
  \*If you do not pass an existing `driver`, provide `serverConfig.url` **and** `serverConfig.device.platform` and the SDK will start its own Appium session. Attaching to your existing `browser` is the common case and keeps the SDK in your normal WebdriverIO lifecycle.
</Note>
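
For the standalone case described in the note, the constructor options might be assembled as sketched below. `standaloneOptions` is a hypothetical helper, and the exact platform string the SDK expects is not stated on this page, so the value in the usage comment is an assumption:

```javascript
// Hypothetical helper: options for letting the SDK start its own Appium session.
// Only the fields named on this page are used; there is deliberately no `driver` field.
function standaloneOptions(apiKey, appiumUrl, platform) {
    return {
        apiKey,
        serverConfig: {
            url: appiumUrl,       // the Appium server URL
            device: { platform }, // assumption: the accepted platform values are not listed here
        },
    };
}

// Usage inside a spec (the platform value is illustrative only):
// const gptDriver = new GptDriver(
//     standaloneOptions(process.env.GPT_DRIVER_API_KEY, "http://127.0.0.1:4723/", "Android")
// );
```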

## Session lifecycle

You do **not** start the session manually. It starts automatically on the first AI command (`aiExecute`, `assert`, `extract`, ...). If an AI step fails the test, the test status on the dashboard is set to failed automatically. For passing tests, report the outcome yourself so it is recorded on the dashboard:

```js
await gptDriver.setSessionSucceeded();
```
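
Rather than repeating that call at the end of every test, the report could live in an `afterEach` hook. The sketch below relies on Mocha exposing `this.currentTest.state` (`"passed"` or `"failed"`) inside `afterEach`. `reportIfPassed` is a hypothetical helper, not an SDK function:

```javascript
// Hypothetical helper: report success only when Mocha marked the test as passed.
// Returns true when the success report was sent, false otherwise.
async function reportIfPassed(currentTest, gptDriver) {
    if (currentTest?.state !== "passed") {
        return false;
    }
    await gptDriver.setSessionSucceeded();
    return true;
}

// In a spec (assumes the `gptDriver` variable created in `beforeEach`):
// afterEach(async function () {
//     await reportIfPassed(this.currentTest, gptDriver);
// });
```

The hook must use a `function` expression rather than an arrow function, for the same reason as the `beforeEach` in the spec: it needs Mocha's `this`.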

## Run your tests

Start the Appium server (or use the `appium` WebdriverIO service) and run WebdriverIO as usual:

```bash
appium                       # if you are not using the @wdio/appium-service
npx wdio run wdio.conf.js
```
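
If you prefer the service route, WebdriverIO can launch Appium for you. The fragment below is a sketch that assumes `@wdio/appium-service` is installed as a dev dependency; only the added field is shown, and `appiumServiceFragment` is an illustrative name:

```javascript
// Fragment of wdio.conf.js: merge this field into the existing `config` object.
// Assumes @wdio/appium-service is installed as a dev dependency.
const appiumServiceFragment = {
    services: ["appium"], // WebdriverIO starts and stops the Appium server for the run
};
```

With the service enabled, the separate `appium` command is no longer needed before `npx wdio run wdio.conf.js`.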

## Next steps

<Columns cols={2}>
  <Card title="Deterministic execution" icon="bolt" href="/engineer-tooling/appium-ts-deterministic-execution">
    Run native Appium code first and only fall back to AI when it fails. The fastest, most stable way to use the SDK.
  </Card>

  <Card title="Reference" icon="book" href="/engineer-tooling/appium-ts-reference">
    Every constructor option and command.
  </Card>
</Columns>
