JavaScript SDK

Learn how to use the AI Studio JavaScript SDK.

We provide a JavaScript SDK for seamless interaction with the Oxylabs AI Studio API services, including AI-Scraper, AI-Crawler, AI-Browser-Agent, and other data extraction tools.

Installation

Install the SDK:

npm install oxylabs-ai-studio

You can add OXYLABS_AI_STUDIO_API_URL and OXYLABS_AI_STUDIO_API_KEY to a .env file, or set them as environment variables:

export OXYLABS_AI_STUDIO_API_KEY=your_api_key_here
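
As a minimal sketch of using the environment variable instead of a hard-coded key, the helper below reads OXYLABS_AI_STUDIO_API_KEY and fails early when it is missing. The name readApiKey is hypothetical and not part of the SDK.

```javascript
// Hypothetical helper (not part of the SDK): read the API key from an
// environment-like object and fail early if it is missing or blank.
function readApiKey(env = process.env) {
  const apiKey = env.OXYLABS_AI_STUDIO_API_KEY;
  if (typeof apiKey !== 'string' || apiKey.trim() === '') {
    throw new Error('OXYLABS_AI_STUDIO_API_KEY is not set');
  }
  return apiKey.trim();
}
```

The returned value could then be passed as the apiKey option of the OxylabsAIStudioSDK constructor, for example `new OxylabsAIStudioSDK({ apiKey: readApiKey() })`.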

Usage

AI-Scraper

import { 
  OxylabsAIStudioSDK
} from 'oxylabs-ai-studio';

const sdk = new OxylabsAIStudioSDK({
  apiKey: 'your_api_key_here',
  timeout: 120000,
  retryAttempts: 3,
});

async function testGenerateSchema() {
  try {
    console.log('Testing schema generation...');
    const schema = await sdk.aiScraper.generateSchema({
      user_prompt: 'Extract the title of the page'
    });
    console.log('Schema:', schema);
  } catch (error) {
    console.error('Schema generation error:', error.message);
  }
}

testGenerateSchema();

Basic usage

import { 
  OxylabsAIStudioSDK, 
  OutputFormat
} from 'oxylabs-ai-studio';

const sdk = new OxylabsAIStudioSDK({
  apiKey: 'your_api_key_here',
  timeout: 120000,
  retryAttempts: 3,
});

async function testScrapeOutputJson() {
  try {
    console.log('Testing synchronous scraping with JSON output...');
    
    const options = {
      url: 'https://www.freelancer.com',
      user_prompt: 'Extract all links',
      output_format: OutputFormat.JSON,
      geo_location: "US",
      schema: {
        type: 'object',
        properties: {
          links: { type: 'array', items: { type: 'string' } }
        }
      }
    };
    
    const results = await sdk.aiScraper.scrape(options);
    console.log('Sync scraping results:', results);
  } catch (error) {
    console.error('Sync scraping error:', error.message);
  }
}

testScrapeOutputJson();

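Input parameters

url (string): The target URL to process.
user_prompt (string): Instructions describing what data to extract. It is used to automatically generate the openapi_schema when using the scrapeWithAutoSchema method.
output_format (string): The desired output format. Can be markdown or json. Defaults to markdown.
render_html (boolean): Specifies whether to render JavaScript on the page before extraction. Defaults to false.
openapi_schema (Record<string, any>): A JSON Schema object that defines the structure of the output data. Used when output_format is set to json.
geo_location (string): Specifies the geographic location (ISO2 format) from which the request should be simulated.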
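
The defaults listed above can be sketched as a small options builder. This is an illustration only: buildScrapeOptions is a hypothetical helper, not an SDK function. The example code passes the schema under the key schema, while the parameter list names it openapi_schema. The source does not say which key the SDK expects, so the sketch accepts either. Treating a schema as mandatory for json output is also an assumption.

```javascript
// Hypothetical helper (not part of the SDK): fill in the documented
// AI-Scraper defaults and check that json output comes with a schema.
function buildScrapeOptions(input) {
  if (!input || typeof input.url !== 'string' || input.url === '') {
    throw new Error('url is required');
  }
  const options = {
    output_format: 'markdown', // documented default
    render_html: false,        // documented default
    ...input,                  // caller-supplied values win
  };
  // Assumption: json output needs a schema, under either key name.
  const schema = options.openapi_schema ?? options.schema;
  if (options.output_format === 'json' && !schema) {
    throw new Error('a schema is expected when output_format is json');
  }
  return options;
}
```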

AI-Crawler

Basic usage

import { 
  OxylabsAIStudioSDK, 
  OutputFormat
} from 'oxylabs-ai-studio';

const sdk = new OxylabsAIStudioSDK({
  apiKey: 'your_api_key_here',
  timeout: 120000,
  retryAttempts: 3,
});

async function testCrawlOutputJson() {
  try {
    console.log('Testing crawling with JSON output...');
    
    const options = {
      url: 'https://www.freelancer.com',
      output_format: OutputFormat.JSON,
      user_prompt: 'Get job ad pages',
      return_sources_limit: 3,
      geo_location: "DE",
      schema: {
        type: "object",
        properties: {
          jobAd: {
            type: "object",
            properties: {
              position_title: {
                type: "string"
              },
              salary: {
                type: "string"
              }
            }
          }
        }
      }
    };
    
    const results = await sdk.aiCrawler.crawl(options);
    console.log('Crawling results:', JSON.stringify(results, null, 2));      
  } catch (error) {
    console.error('Crawling error:', error.message);
  }
}

testCrawlOutputJson();

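Input parameters

url (string): The starting URL for the crawl.
crawl_prompt (string): Instructions defining the types of pages to find and crawl.
parse_prompt (string): Instructions describing what data to extract from the crawled pages. It is used to automatically generate the openapi_schema when using the crawlWithAutoSchema method.
output_format (string): The desired output format. Can be markdown or json. Defaults to markdown.
max_pages (integer): The maximum number of pages or sources to return. Defaults to 25.
render_html (boolean): Specifies whether to render JavaScript on the page before extraction. Defaults to false.
openapi_schema (Record<string, any>): A JSON Schema object that defines the structure of the output data. Used when output_format is set to json.
geo_location (string): Specifies the geographic location (ISO2 format) from which the request should be simulated.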
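
Writing nested JSON Schema objects by hand gets verbose. The sketch below shows one way to build the flat object parts from a field map. objectSchema is a hypothetical helper, not part of the SDK, and it only covers fields with a single primitive type.

```javascript
// Hypothetical helper (not part of the SDK): build a flat JSON Schema
// object from a map of field names to JSON Schema type names.
function objectSchema(fields) {
  const properties = {};
  for (const [name, type] of Object.entries(fields)) {
    properties[name] = { type };
  }
  return { type: 'object', properties };
}

// Rebuilds the jobAd schema used in the crawl example above.
const jobAdSchema = {
  type: 'object',
  properties: {
    jobAd: objectSchema({ position_title: 'string', salary: 'string' }),
  },
};
```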

Browser-Agent

Basic usage

import { 
  OxylabsAIStudioSDK, 
  OutputFormat
} from 'oxylabs-ai-studio';

const sdk = new OxylabsAIStudioSDK({
  apiKey: 'your_api_key_here',
  timeout: 120000,
  retryAttempts: 3,
});

async function testBrowseOutputJson() {
  try {
    console.log('Testing synchronous browsing with JSON output...');
    
    const options = {
      url: 'https://www.freelancer.com',
      output_format: OutputFormat.JSON,
      user_prompt: 'Navigate to the first job ad you can find.',
      geo_location: "US",
      schema: {
        type: 'object',
        properties: {
          job_title: { type: 'string' }
        }
      }
    };
    
    const results = await sdk.browserAgent.browse(options);
    console.log('Sync browsing results:', JSON.stringify(results, null, 2));
  } catch (error) {
    console.error('Sync browsing error:', error.message);
  }
}

testBrowseOutputJson();

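Input parameters

url (string): The target URL where the browser agent starts.
browse_prompt (string): Instructions defining the actions the browser agent should perform.
parse_prompt (string): Instructions describing what data to extract after the browsing actions are performed. It is used to automatically generate the openapi_schema when using the browseWithAutoSchema method.
output_format (string): The desired output format. Can be markdown, html, json, or screenshot. Defaults to markdown.
render_html (boolean): Specifies whether to render JavaScript on the page. Although this is a browser agent, this flag may affect certain behaviors. Defaults to false.
openapi_schema (Record<string, any>): A JSON Schema object that defines the structure of the output data. Used when output_format is set to json.
geo_location (string): Specifies the geographic location (ISO2 format) from which the request should be simulated.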
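
Browser-Agent accepts more output formats than the other tools, so it can help to check the value before sending a request. A minimal sketch, assuming the four values listed above are passed as plain strings. resolveBrowserOutputFormat is a hypothetical helper, not part of the SDK.

```javascript
// Output formats listed for Browser-Agent in the parameter list above.
const BROWSER_OUTPUT_FORMATS = ['markdown', 'html', 'json', 'screenshot'];

// Hypothetical helper (not part of the SDK): return a listed format,
// falling back to the documented default when none is given.
function resolveBrowserOutputFormat(format) {
  if (format === undefined) return 'markdown';
  if (!BROWSER_OUTPUT_FORMATS.includes(format)) {
    throw new Error(`Unsupported output_format: ${format}`);
  }
  return format;
}
```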

AI-Search

Basic usage

import {
  OxylabsAIStudioSDK,
} from 'oxylabs-ai-studio';

const sdk = new OxylabsAIStudioSDK({
  apiKey: 'your_api_key_here',
  timeout: 120000,
  retryAttempts: 3,
});

async function testSearch() {
  try {
    console.log('Testing search...');

    const options = {
      query: 'weather in London',
      limit: 3,
      return_content: true,
      render_javascript: false,
      geo_location: "IT",
    };

    const results = await sdk.aiSearch.search(options);
    console.log('Search results:', JSON.stringify(results, null, 2));
  } catch (error) {
    console.error('Search error:', error.message);
  }
}

testSearch();

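Input parameters

query (string): The search query.
limit (integer): The maximum number of search results to return. Maximum: 50.
render_javascript (boolean): Whether to render JavaScript on the page. Defaults to false.
return_content (boolean): Whether to return the markdown content of each search result. Defaults to true.
geo_location (string): Specifies the geographic location (ISO2 format) from which the request should be simulated.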
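
The limit cap and the two defaults above can be applied on the client before calling the search method. The sketch below is one possible approach. buildSearchOptions is a hypothetical helper, and clamping an oversized limit to 50 (rather than rejecting it) is a design choice made here, not documented SDK behavior.

```javascript
const MAX_SEARCH_LIMIT = 50; // documented maximum for limit

// Hypothetical helper (not part of the SDK): apply the documented
// AI-Search defaults and clamp limit to the documented maximum.
function buildSearchOptions({ query, limit, ...rest }) {
  if (typeof query !== 'string' || query.trim() === '') {
    throw new Error('query is required');
  }
  const options = {
    query,
    render_javascript: false, // documented default
    return_content: true,     // documented default
    ...rest,                  // caller-supplied values win
  };
  if (limit !== undefined) {
    if (!Number.isInteger(limit) || limit < 1) {
      throw new Error('limit must be a positive integer');
    }
    options.limit = Math.min(limit, MAX_SEARCH_LIMIT);
  }
  return options;
}
```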

AI-Map

Basic usage

import { 
  OxylabsAIStudioSDK
} from 'oxylabs-ai-studio';

const sdk = new OxylabsAIStudioSDK({
  apiKey: 'your_api_key_here',
  timeout: 120000,
  retryAttempts: 3,
});

async function testMap() {
  try {
    console.log('Testing map...');
    
    const options = {
      url: 'https://www.freelancer.com/jobs',
      user_prompt: 'Extract tech job ads',
      return_sources_limit: 10,
      geo_location: 'US',
      render_javascript: false
    };
    
    const results = await sdk.aiMap.map(options);
    console.log('Map results:', JSON.stringify(results, null, 2));
  } catch (error) {
    console.error('Map error:', error.message);
  }
}

testMap();

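Input parameters

url (string): The target URL to map and extract data from.
user_prompt (string): Instructions describing what data to extract from the mapped pages.
return_sources_limit (integer): The maximum number of sources/pages to return during mapping.
geo_location (string): The geographic location to use for the mapping request (e.g., 'US', 'UK').
render_javascript (boolean): Specifies whether to render JavaScript on the page before mapping. Defaults to false.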
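
Since geo_location is a two-letter code across all tools, a small shape check can catch typos such as 'USA' before a request is sent. This is a sketch under that assumption. normalizeGeoLocation is a hypothetical helper, and it checks only the shape of the value, not whether the code is assigned or accepted by the API.

```javascript
// Hypothetical helper (not part of the SDK): upper-case a geo_location
// value and check that it looks like a two-letter code.
function normalizeGeoLocation(value) {
  const code = String(value).trim().toUpperCase();
  if (!/^[A-Z]{2}$/.test(code)) {
    throw new Error(`geo_location must be a two-letter code, got: ${value}`);
  }
  return code;
}
```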

Usage examples

You can find more examples for each application here:
