Build AI agents using Scrapybara's TypeScript SDK for remote desktop automation with computer use tools, browser control, and structured outputs.
Expert guidance for building AI agents with the Scrapybara TypeScript SDK, which provides remote desktop instances for computer use automation.
You are working with Scrapybara, a TypeScript SDK for deploying and managing remote desktop instances for AI agents. Follow these principles:
1. **Always use proper type imports** from the SDK
2. **Stop instances after use** to prevent unnecessary billing
3. **Use async/await** for all operations (they are asynchronous)
4. **Handle errors properly** with try/catch blocks
5. **Prefer bash commands over GUI interactions** for launching applications
```typescript
import { ScrapybaraClient } from "scrapybara";
const client = new ScrapybaraClient({ apiKey: "KEY" });
```
```typescript
// Ubuntu instance - supports bash, computer, edit, browser tools
const ubuntuInstance = await client.startUbuntu({ timeoutHours: 1 });
// Browser instance - supports computer, browser tools
const browserInstance = await client.startBrowser();
// Windows instance - supports computer tool
const windowsInstance = await client.startWindows();
```
```typescript
await instance.pause(); // Pause to save resources
await instance.resume({ timeoutHours: 1 }); // Resume work
await instance.stop(); // Terminate and clean up
```
Note: Instances auto-terminate after 1 hour by default.
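Since an instance that is never stopped keeps running until the timeout elapses, one way to guarantee cleanup is to wrap the work in `try`/`finally`. The sketch below is a minimal illustration, not part of the SDK: `withInstance` and the `Stoppable` interface are hypothetical names, and the only assumption about the instance is that it exposes an async `stop()` method as shown above.

```typescript
// Hypothetical structural type: anything with an async stop() method.
// It is not exported by the Scrapybara SDK.
interface Stoppable {
  stop(): Promise<unknown>;
}

// Starts an instance, runs `task` against it, and always calls stop()
// afterwards, whether the task resolves or throws.
async function withInstance<I extends Stoppable, R>(
  start: () => Promise<I>,
  task: (instance: I) => Promise<R>,
): Promise<R> {
  const instance = await start();
  try {
    return await task(instance);
  } finally {
    await instance.stop();
  }
}
```

With a real client this could be called as `await withInstance(() => client.startUbuntu(), (instance) => instance.bash({ command: "ls -la" }))`, so the instance is stopped even if the command fails.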
Import the correct types from the SDK:
```typescript
// Core types
import { ScrapybaraClient, UbuntuInstance, BrowserInstance, WindowsInstance } from "scrapybara";
// Tool types
import { bashTool, computerTool, editTool } from "scrapybara/tools";
// Model types
import { anthropic } from "scrapybara/anthropic";
// Prompts
import { UBUNTU_SYSTEM_PROMPT, BROWSER_SYSTEM_PROMPT, WINDOWS_SYSTEM_PROMPT } from "scrapybara/prompts";
// Schema definitions for structured output
import { z } from "zod";
// Error types
import { ApiError } from "scrapybara/core";
// Request/Response types
import { Scrapybara } from "scrapybara";
```
```typescript
// Await the call first, then read the field from the resolved response
const base64Image = (await instance.screenshot()).base64Image;
```
```typescript
await instance.bash({ command: "ls -la" });
```
```typescript
// Mouse movement
await instance.computer({ action: "move_mouse", coordinates: [x, y] });
// Click actions
await instance.computer({ action: "click_mouse", button: "right", coordinates: [x, y] });
// Drag actions
await instance.computer({ action: "drag_mouse", path: [[x1, y1], [x2, y2]] });
// Scroll actions
await instance.computer({ action: "scroll", coordinates: [x, y], deltaX: 0, deltaY: 0 });
// Key presses
await instance.computer({ action: "press_key", keys: ["a", "b", "c"] });
// Type text
await instance.computer({ action: "type_text", text: "Hello world" });
// Wait
await instance.computer({ action: "wait", duration: 3 });
// Get cursor position
await instance.computer({ action: "get_cursor_position" });
```
```typescript
// Read file
const content = (await instance.file.read({ path: "/path/file" })).content;
// Write file
await instance.file.write({ path: "/path/file", content: "data" });
```
```typescript
// Set variables
await instance.env.set({ API_KEY: "value" });
// Get variables
const vars = (await instance.env.get()).variables;
// Delete variables
await instance.env.delete(["VAR_NAME"]);
```
The Act SDK enables building computer use agents with unified tools and model interfaces. An agent is assembled from three components:
1. **Model**: Handles LLM integration (currently Anthropic)
2. **Tools**: Interface for computer interactions (bashTool, computerTool, editTool)
3. **Prompt**: System prompt, user prompt, or message history
```typescript
import { ScrapybaraClient } from "scrapybara";
import { anthropic } from "scrapybara/anthropic";
import { UBUNTU_SYSTEM_PROMPT } from "scrapybara/prompts";
import { bashTool, computerTool, editTool } from "scrapybara/tools";
const client = new ScrapybaraClient();
const instance = await client.startUbuntu();
const tools = [
bashTool(instance),
computerTool(instance),
editTool(instance),
];
const { messages, steps, text, output, usage } = await client.act({
model: anthropic(), // Or anthropic({ apiKey: "KEY" }) for own key
tools,
system: UBUNTU_SYSTEM_PROMPT,
prompt: "Task description",
onStep: handleStep
});
await instance.stop();
```
Use either `prompt` (simple string) OR `messages` (list of messages), not both:
```typescript
// Option 1: Simple prompt
{ prompt: "Do something" }
// Option 2: Message history
{ messages: [...] }
// System prompt (recommended to use constants)
{ system: UBUNTU_SYSTEM_PROMPT } // or BROWSER_SYSTEM_PROMPT, WINDOWS_SYSTEM_PROMPT
```
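The "one or the other" rule can also be encoded in types so that passing both fields fails at compile time. The following is a sketch under stated assumptions: `Message`, `PromptInput`, `toMessages`, and `assertSingleInput` are hypothetical local names for illustration, and the SDK's own request types may be shaped differently.

```typescript
// Hypothetical local types; not the SDK's exported types.
type Message = { role: "user" | "assistant" | "tool"; content: unknown[] };

// Exactly one of `prompt` or `messages` may be present.
type PromptInput =
  | { prompt: string; messages?: never }
  | { messages: Message[]; prompt?: never };

// Normalizes either form into a message list.
function toMessages(input: PromptInput): Message[] {
  if (input.messages !== undefined) return input.messages;
  return [{ role: "user", content: [{ type: "text", text: input.prompt }] }];
}

// Runtime guard for inputs from untyped sources (e.g. parsed JSON).
function assertSingleInput(input: { prompt?: unknown; messages?: unknown }): void {
  const hasPrompt = input.prompt !== undefined;
  const hasMessages = input.messages !== undefined;
  if (hasPrompt === hasMessages) {
    throw new Error("Provide exactly one of `prompt` or `messages`");
  }
}
```

The union type catches mistakes in typed code, while the runtime guard covers values whose shape is only known at run time.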
Messages are structured with roles (user/assistant/tool) and typed content.
```typescript
// TextPart
{ type: "text", text: "content" }
// ImagePart
{ type: "image", image: "base64...", mimeType: "image/png" }
// ReasoningPart
{
type: "reasoning",
id: "id",
reasoning: "reasoning",
signature: "signature",
instructions: "instructions"
}
// ToolCallPart
{
type: "tool-call",
toolCallId: "id",
toolName: "bash",
args: { command: "ls" }
}
// ToolResultPart
{
type: "tool-result",
toolCallId: "id",
toolName: "bash",
result: "output",
isError: false
}
```
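Because every part carries a `type` field, a `switch` on that field lets TypeScript narrow each case. The sketch below mirrors the shapes listed above with locally declared types; which fields are optional is an assumption, and `summarizePart` is a hypothetical helper, not an SDK function.

```typescript
// Local mirrors of the part shapes above; optionality is assumed.
type TextPart = { type: "text"; text: string };
type ImagePart = { type: "image"; image: string; mimeType?: string };
type ReasoningPart = { type: "reasoning"; reasoning: string; id?: string; signature?: string; instructions?: string };
type ToolCallPart = { type: "tool-call"; toolCallId: string; toolName: string; args: Record<string, unknown> };
type ToolResultPart = { type: "tool-result"; toolCallId: string; toolName: string; result: unknown; isError?: boolean };
type Part = TextPart | ImagePart | ReasoningPart | ToolCallPart | ToolResultPart;

// Produces a one-line, log-friendly description of any message part.
function summarizePart(part: Part): string {
  switch (part.type) {
    case "text":
      return part.text;
    case "image":
      return `[image ${part.mimeType ?? "unknown type"}]`;
    case "reasoning":
      return `[reasoning] ${part.reasoning}`;
    case "tool-call":
      return `call ${part.toolName}(${JSON.stringify(part.args)})`;
    case "tool-result":
      return `${part.isError ? "error" : "result"} from ${part.toolName}: ${String(part.result)}`;
  }
}
```

A helper like this can be mapped over the content of each message to print a compact transcript of an agent run.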
Monitor agent execution with the `onStep` callback:
```typescript
const handleStep = (step: Step) => {
console.log(`Text: ${step.text}`);
if (step.toolCalls) {
for (const call of step.toolCalls) {
console.log(`Tool: ${call.toolName}`);
}
}
if (step.toolResults) {
for (const result of step.toolResults) {
console.log(`Result: ${result.result}`);
}
}
console.log(`Tokens: ${step.usage?.totalTokens ?? 'N/A'}`);
};
```
Use Zod schemas to define structured output. The `output` field will contain validated typed data:
```typescript
import { z } from "zod";
const schema = z.object({
posts: z.array(z.object({
title: z.string(),
url: z.string(),
points: z.number(),
})),
});
const { output } = await client.act({
model: anthropic(),
tools,
schema,
system: UBUNTU_SYSTEM_PROMPT,
prompt: "Get the top 10 posts on Hacker News",
});
const posts = output.posts; // Fully typed
```
```typescript
const instance = await client.startUbuntu();
// Start the browser once and read the CDP URL from the resolved response
const cdpUrl = (await instance.browser.start()).cdpUrl;
// Save authentication state
const authStateId = (await instance.browser.saveAuth({ name: "default" })).authStateId;
// Reuse authentication
await instance.browser.authenticate({ authStateId });
// Use browser in agent
const { messages } = await client.act({
model: anthropic(),
tools: [bashTool(instance), computerTool(instance)],
system: BROWSER_SYSTEM_PROMPT,
prompt: "Navigate to example.com",
});
// Stop browser
await instance.browser.stop();
await instance.stop();
```
**Important**: Always start the browser before using `browserTool`.
Track token usage through `TokenUsage` objects with fields: `promptTokens`, `completionTokens`, `totalTokens`.
```typescript
const { usage } = await client.act({...});
console.log(`Total tokens: ${usage.totalTokens}`);
// Also available per step
const handleStep = (step: Step) => {
console.log(`Step tokens: ${step.usage?.totalTokens}`);
};
```
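When only per-step usage is at hand (for example, when steps are collected inside `onStep`), the totals can be accumulated manually. This is a minimal sketch: the `TokenUsage` fields follow the names given above, while `StepLike` and `sumUsage` are hypothetical names, and steps without usage data are assumed to count as zero.

```typescript
// Field names follow the TokenUsage description above.
interface TokenUsage {
  promptTokens: number;
  completionTokens: number;
  totalTokens: number;
}

// Hypothetical minimal view of a step: only the optional usage field matters here.
interface StepLike {
  usage?: TokenUsage;
}

// Sums usage across steps, treating a missing `usage` as zero tokens.
function sumUsage(steps: StepLike[]): TokenUsage {
  return steps.reduce<TokenUsage>(
    (acc, step) => ({
      promptTokens: acc.promptTokens + (step.usage?.promptTokens ?? 0),
      completionTokens: acc.completionTokens + (step.usage?.completionTokens ?? 0),
      totalTokens: acc.totalTokens + (step.usage?.totalTokens ?? 0),
    }),
    { promptTokens: 0, completionTokens: 0, totalTokens: 0 },
  );
}
```

Comparing the accumulated total against the `usage` returned by `client.act` is a quick sanity check that no step was missed.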
```typescript
import { ApiError } from "scrapybara/core";
try {
await client.startUbuntu();
} catch (e) {
if (e instanceof ApiError) {
console.error(`Error ${e.statusCode}: ${e.body}`);
}
}
```
```typescript
import { ScrapybaraClient } from "scrapybara";
import { anthropic } from "scrapybara/anthropic";
import { UBUNTU_SYSTEM_PROMPT } from "scrapybara/prompts";
import { bashTool, computerTool, editTool } from "scrapybara/tools";
const client = new ScrapybaraClient();
const instance = await client.startUbuntu();
await instance.browser.start();
const { messages, steps, text, output, usage } = await client.act({
model: anthropic(),
tools: [
bashTool(instance),
computerTool(instance),
editTool(instance),
],
system: UBUNTU_SYSTEM_PROMPT,
prompt: "Go to the YC website and fetch the HTML",
onStep: (step) => console.log(`${step.text}\n`), // log the step's text; `${step}` would print "[object Object]"
});
await instance.browser.stop();
await instance.stop();
```
1. **Always stop instances** after use to prevent unnecessary billing
2. **Use async/await** for all operations as they are asynchronous
3. **Handle API errors** with try/catch blocks
4. **Default timeout** is 60s; customize with `timeout` parameter or `requestOptions`
5. **Instance auto-terminates** after 1 hour by default
6. **Start browser** before browserTool usage
7. **Prefer bash commands** over GUI interactions for launching applications
8. **Use proper type imports** from the SDK for type safety
9. **Monitor token usage** to track costs
10. **Use structured output** with Zod schemas when you need typed data