The Batch API allows you to efficiently process multiple DeepResearch tasks in parallel. This is ideal for running large-scale research operations where you need to process many queries simultaneously.

Overview

The Batch API provides a way to:
  • Create a batch container for multiple research tasks
  • Add tasks to a batch (can be done incrementally)
  • Monitor batch progress and individual task status
  • Wait for batch completion with progress callbacks
  • Cancel batches and their pending tasks
  • List all batches and their tasks

How It Works

The Batch API follows a simple workflow:
  1. Create a Batch - Initialize a new batch with optional configuration
  2. Add Tasks - Add one or more research tasks to the batch
  3. Monitor Progress - Check batch status and task counts
  4. Wait for Completion - Optionally wait for all tasks to complete
  5. Retrieve Results - Access individual task results through the DeepResearch API

Batch Lifecycle

A batch progresses through the following statuses:
  • open - Batch is created but no tasks are running yet
  • processing - At least one task is queued, running, or completed
  • completed - All tasks have finished successfully
  • completed_with_errors - All tasks finished, but some failed
  • cancelled - Batch was cancelled before completion
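
The lifecycle above can be captured in a small type plus two predicates. This is a minimal sketch: `isTerminalStatus` and `canAddTasks` are hypothetical helper names, not SDK functions. The terminal set follows the states that `waitForCompletion` treats as final, and the "can add" rule follows the `addTasks` constraint that a batch must be open or processing.

```typescript
// Status values taken from the lifecycle list above.
type BatchStatus =
  | "open"
  | "processing"
  | "completed"
  | "completed_with_errors"
  | "cancelled";

// Statuses after which a batch no longer changes.
const TERMINAL_STATUSES: ReadonlySet<BatchStatus> = new Set<BatchStatus>([
  "completed",
  "completed_with_errors",
  "cancelled",
]);

// Hypothetical helper (not part of the SDK): true once the batch has finished.
function isTerminalStatus(status: BatchStatus): boolean {
  return TERMINAL_STATUSES.has(status);
}

// Hypothetical helper: true while the batch still accepts new tasks.
function canAddTasks(status: BatchStatus): boolean {
  return status === "open" || status === "processing";
}
```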

API Methods

batch.create(options?)

Creates a new batch container. Returns a batch ID that you’ll use for subsequent operations.
Parameters:
interface CreateBatchOptions {
  name?: string;                                    // Optional batch name
  mode?: "fast" | "standard" | "heavy";             // Research mode: "standard" (default), "heavy" (comprehensive), or "fast" (faster completion)
  outputFormats?: ("markdown" | "pdf" | "toon" | Record<string, any>)[];  // Output formats: ["markdown"], ["pdf"], ["toon"], or JSON schema. Cannot mix JSON schema with "markdown"/"pdf". "toon" requires JSON schema.
  search?: {                                        // Search configuration (type, sources, dates, category). See Search Configuration section for details.
    searchType?: "all" | "web" | "proprietary";     // Default: "all"
    includedSources?: string[];                     // e.g., ["web", "academic", "finance", "patent", "transportation", "politics", "legal"]
    excludedSources?: string[];                     // Array of source types to exclude
    startDate?: string;                             // YYYY-MM-DD format
    endDate?: string;                               // YYYY-MM-DD format
    category?: string;                              // Category filter (source-dependent)
  };
  webhookUrl?: string;                              // HTTPS URL for completion notification
  metadata?: Record<string, string | number | boolean>; // Custom metadata
}
Response:
interface CreateBatchResponse {
  success: boolean;
  batch_id?: string;                                // Use this for subsequent operations
  name?: string;                                     // Batch name
  status?: BatchStatus;
  mode?: "fast" | "standard" | "heavy";             // Research mode
  output_formats?: ("markdown" | "pdf" | "toon" | Record<string, any>)[];
  search_params?: {
    search_type?: "all" | "web" | "proprietary";
    included_sources?: string[];
  };
  counts?: BatchCounts;                              // Task counts
  cost?: number;                                     // Total cost in dollars
  created_at?: string;                               // ISO 8601 timestamp string
  completed_at?: string;                              // ISO 8601 timestamp string (if completed)
  webhook_secret?: string;                           // Secret for webhook verification (only on creation)
  error?: string;
}
Example:
const batch = await client.batch.create({
  name: "Q4 Research Batch",
  mode: "fast",
  outputFormats: ["markdown"],
  metadata: {
    project: "Q4-Research",
    user: "analyst-1"
  }
});

if (batch.success) {
  console.log(`Created batch: ${batch.batch_id}`);
}

batch.addTasks(batchId, options)

Adds one or more tasks to an existing batch. You can call this multiple times to add tasks incrementally.
Constraints:
  • Maximum tasks per request: 100
  • Minimum tasks per request: 1
  • Batch must be in "open" or "processing" status
Parameters:
interface AddBatchTasksOptions {
  tasks: BatchTaskInput[]; // Array of 1-100 tasks
}

interface BatchTaskInput {
  id?: string;                                      // Optional custom task identifier (for tracking)
  query: string;                                    // Research query or task description (required)
  strategy?: string;                                // Custom research strategy instructions
  urls?: string[];                                  // Array of URLs to extract content from
  metadata?: Record<string, string | number | boolean>; // Custom metadata
}

Inherited Settings:
Tasks automatically inherit from the batch:
  • mode - Research mode (cannot be overridden per task)
  • output_formats - Output formats (cannot be overridden per task)
  • search_params - Search parameters (cannot be overridden per task)

Not Yet Supported in Batch API:
The following features are not yet supported for batch tasks:
  • search - Tasks cannot override batch-level search configuration
  • files - Cannot attach files to batch tasks
  • deliverables - Cannot specify deliverables (CSV, XLSX, PPTX, DOCX) for batch tasks
  • mcpServers - Cannot configure MCP servers for batch tasks
  • codeExecution - Always enabled (cannot disable per batch)
  • previousReports - Cannot reference previous reports in batch tasks
  • brandCollectionId - Cannot apply branding to batch tasks
Workaround: Use individual task creation (client.deepresearch.create()) if you need these features.

Response:
interface AddBatchTasksResponse {
  success: boolean;
  batch_id?: string;                                // Batch ID
  added?: number;                                    // Number of tasks successfully added
  tasks?: BatchTaskCreated[];                       // Array of created task objects
  counts?: BatchCounts;                             // Updated task counts for the batch
  error?: string;
}

interface BatchTaskCreated {
  task_id?: string;                                 // User-provided task identifier (if specified)
  deepresearch_id: string;                          // DeepResearch task ID
  status: string;                                    // Task status
}
Example:
const addResult = await client.batch.addTasks(batchId, {
  tasks: [
    {
      query: "What are the latest developments in quantum computing?"
    },
    {
      query: "Analyze the impact of AI on healthcare in 2024"
    },
    {
      query: "Compare renewable energy trends across Europe",
      urls: ["https://example.com/report.pdf"]
    }
  ]
});

if (addResult.success) {
  console.log(`Added ${addResult.added} tasks`);
}
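
Because each addTasks request accepts at most 100 tasks, a longer list has to be split across several calls. The sketch below shows one way to do the splitting; `chunkTasks` is a hypothetical helper, not an SDK function, and the task interface is reduced to the fields the example needs.

```typescript
// Minimal task shape for this sketch; only `query` is required.
interface BatchTaskInput {
  id?: string;
  query: string;
}

const MAX_TASKS_PER_REQUEST = 100; // limit stated in the constraints above

// Hypothetical helper: split a task list into request-sized chunks.
function chunkTasks<T>(items: T[], size: number = MAX_TASKS_PER_REQUEST): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Example: 250 tasks become three requests of 100, 100, and 50 tasks.
const manyTasks: BatchTaskInput[] = Array.from({ length: 250 }, (_, i) => ({
  id: `task-${i + 1}`,
  query: `Research question ${i + 1}`,
}));
const chunkSizes = chunkTasks(manyTasks).map((chunk) => chunk.length);
console.log(chunkSizes.join(", ")); // 100, 100, 50
```

Each chunk can then be passed to client.batch.addTasks(batchId, { tasks: chunk }) in a loop, checking the success flag after every call.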

batch.status(batchId)

Gets the current status of a batch, including task counts and cost information.
Response:
interface BatchStatusResponse {
  success: boolean;
  batch?: DeepResearchBatch;
  error?: string;
}

interface DeepResearchBatch {
  batch_id: string;
  organisation_id?: string;
  api_key_id?: string;
  credit_id?: string;
  name?: string;
  status: BatchStatus;                              // "open" | "processing" | "completed" | "completed_with_errors" | "cancelled"
  mode: "fast" | "standard" | "heavy";             // Research mode
  output_formats?: ("markdown" | "pdf" | "toon" | Record<string, any>)[];
  search_params?: {
    search_type?: "all" | "web" | "proprietary";
    included_sources?: string[];
  };
  counts: BatchCounts;
  cost: number;                                     // Total cost in dollars (replaces 'usage' object)
  webhook_url?: string;
  webhook_secret?: string;                         // Only returned on batch creation
  created_at: string;                               // ISO 8601 timestamp string
  completed_at?: string;                            // ISO 8601 timestamp string (only present when batch is completed)
  metadata?: Record<string, string | number | boolean>;
}

interface BatchCounts {
  total: number;                                    // Total tasks in batch
  queued: number;                                    // Tasks waiting to start
  running: number;                                  // Tasks currently running
  completed: number;                                // Successfully completed tasks
  failed: number;                                   // Failed tasks
  cancelled: number;                                // Cancelled tasks
}
Example:
const status = await client.batch.status(batchId);

if (status.success && status.batch) {
  console.log(`Status: ${status.batch.status}`);
  console.log(
    `Progress: ${status.batch.counts.completed}/${status.batch.counts.total}`
  );
  console.log(`Cost: $${status.batch.cost}`);
}
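
The counts object can also be reduced to a single progress figure. The sketch below treats completed, failed, and cancelled tasks as finished; that definition and the `percentFinished` name are assumptions made for illustration, not part of the API.

```typescript
// Same fields as the BatchCounts interface above.
interface BatchCounts {
  total: number;
  queued: number;
  running: number;
  completed: number;
  failed: number;
  cancelled: number;
}

// Hypothetical helper: share of tasks that reached a final state, as 0-100.
function percentFinished(counts: BatchCounts): number {
  if (counts.total === 0) return 0; // empty batch: nothing to report
  const finished = counts.completed + counts.failed + counts.cancelled;
  return Math.round((finished / counts.total) * 100);
}

const sampleCounts: BatchCounts = {
  total: 8,
  queued: 2,
  running: 2,
  completed: 3,
  failed: 1,
  cancelled: 0,
};
console.log(`${percentFinished(sampleCounts)}% finished`); // 50% finished
```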

batch.listTasks(batchId)

Lists all tasks in a batch with their individual statuses.
Response:
interface ListBatchTasksResponse {
  success: boolean;
  batch_id?: string;                                // Batch ID
  tasks?: BatchTaskListItem[];
  pagination?: BatchPagination;                     // Pagination information
  error?: string;
}

interface BatchPagination {
  count: number;                                    // Number of tasks returned in this response
  last_key?: string;                                // Pagination key for fetching next page (if has_more is true)
  has_more: boolean;                                // Whether there are more tasks to fetch
}

interface BatchTaskListItem {
  task_id?: string;                                 // User-provided task identifier
  deepresearch_id: string;                          // Task ID (use with deepresearch.status())
  query: string;                                    // The research query
  status: DeepResearchStatus;                       // Task status
  created_at: string;                               // ISO 8601 timestamp string
  completed_at?: string;                            // ISO 8601 timestamp string
}
Example:
const tasksList = await client.batch.listTasks(batchId);

if (tasksList.success && tasksList.tasks) {
  // for...of (not forEach) so that await is valid inside the loop body
  for (const [i, task] of tasksList.tasks.entries()) {
    console.log(`${i + 1}. ${task.status}: ${task.query.substring(0, 50)}...`);

    // Get detailed task results
    if (task.status === "completed") {
      const taskResult = await client.deepresearch.status(task.deepresearch_id);
      console.log(`   Output: ${taskResult.output?.substring(0, 100)}...`);
    }
  }
}
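
The response includes a pagination object, so a large batch may need several requests to list every task. This page does not show how last_key is passed back to listTasks, so the sketch below takes a hypothetical `fetchPage` callback that stands in for that call. Only the has_more and last_key handling is illustrated.

```typescript
interface BatchPagination {
  count: number;
  last_key?: string;
  has_more: boolean;
}

interface TaskPage<T> {
  success: boolean;
  tasks?: T[];
  pagination?: BatchPagination;
  error?: string;
}

// Hypothetical helper: keep requesting pages until has_more is false.
// ASSUMPTION: `fetchPage` wraps whatever call accepts the previous page's
// last_key; that parameter is not documented on this page.
async function collectAllTasks<T>(
  fetchPage: (lastKey?: string) => Promise<TaskPage<T>>,
  maxPages: number = 1000 // safety cap in case paging never ends
): Promise<T[]> {
  const all: T[] = [];
  let lastKey: string | undefined;
  for (let page = 0; page < maxPages; page++) {
    const res = await fetchPage(lastKey);
    if (!res.success) {
      throw new Error(res.error ?? "listing tasks failed");
    }
    all.push(...(res.tasks ?? []));
    // Stop when the server reports no further pages or gives no key.
    if (!res.pagination?.has_more || !res.pagination.last_key) {
      return all;
    }
    lastKey = res.pagination.last_key;
  }
  throw new Error(`Stopped after ${maxPages} pages`);
}
```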

batch.waitForCompletion(batchId, options?)

Waits for a batch to complete by polling its status at regular intervals. This is useful for long-running batches where you want to be notified when all tasks finish.
Parameters:
interface BatchWaitOptions {
  pollInterval?: number;                            // Polling interval in ms (default: 10000)
  maxWaitTime?: number;                             // Maximum wait time in ms (default: 7200000 = 2 hours)
  onProgress?: (batch: DeepResearchBatch) => void; // Progress callback
}
Response: Returns the final DeepResearchBatch object when the batch reaches a terminal state (completed, completed_with_errors, or cancelled).
Example:
try {
  const finalBatch = await client.batch.waitForCompletion(batchId, {
    pollInterval: 10000,                            // Check every 10 seconds
    maxWaitTime: 600000,                            // Wait up to 10 minutes
    onProgress: (batch) => {
      console.log(
        `Progress: ${batch.counts.completed}/${batch.counts.total} completed`
      );
      console.log(
        `Running: ${batch.counts.running}, Queued: ${batch.counts.queued}`
      );
    }
  });

  console.log("Batch completed!");
  console.log(`Final status: ${finalBatch.status}`);
} catch (error) {
  console.error(`Wait interrupted: ${error.message}`);
}
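
To make the polling behavior concrete, here is a rough sketch of a poll-until-terminal loop with the same three options. It is not the SDK's implementation: `pollUntilDone` and the injected `getBatch` callback are hypothetical, and the exact timeout rule is an assumption.

```typescript
type BatchStatus =
  | "open"
  | "processing"
  | "completed"
  | "completed_with_errors"
  | "cancelled";

interface BatchSnapshot {
  status: BatchStatus;
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Sketch only: `getBatch` stands in for a status call such as batch.status().
async function pollUntilDone<B extends BatchSnapshot>(
  getBatch: () => Promise<B>,
  pollInterval: number = 10000,
  maxWaitTime: number = 7200000,
  onProgress?: (batch: B) => void
): Promise<B> {
  const deadline = Date.now() + maxWaitTime;
  while (true) {
    const batch = await getBatch();
    onProgress?.(batch);
    if (
      batch.status === "completed" ||
      batch.status === "completed_with_errors" ||
      batch.status === "cancelled"
    ) {
      return batch; // terminal state reached
    }
    // Give up if the next poll would land after the deadline.
    if (Date.now() + pollInterval > deadline) {
      throw new Error(`Batch did not finish within ${maxWaitTime} ms`);
    }
    await sleep(pollInterval);
  }
}
```

Injecting the status call keeps the loop testable without network access; in real use the callback would unwrap the batch from the status response.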

batch.cancel(batchId)

Cancels a batch and all its pending tasks. Tasks that are already running will continue, but queued tasks will be cancelled.
Response:
interface CancelBatchResponse {
  success: boolean;
  message?: string;
  batch_id?: string;
  error?: string;
}
Example:
const result = await client.batch.cancel(batchId);

if (result.success) {
  console.log(`Batch ${batchId} cancelled successfully`);
} else {
  console.log(`Failed to cancel: ${result.error}`);
}

batch.list()

Lists all batches associated with your API key.
Response:
interface ListBatchesResponse {
  success: boolean;
  batches?: DeepResearchBatch[];                    // Array of batches (direct array, no wrapper)
  error?: string;
}
Example:
const result = await client.batch.list();

if (result.success && result.batches) {
  console.log(`Found ${result.batches.length} batches:`);

  result.batches.forEach((batch, i) => {
    console.log(`${i + 1}. ${batch.batch_id}`);
    console.log(`   Name: ${batch.name || "Unnamed"}`);
    console.log(`   Status: ${batch.status}`);
    console.log(`   Tasks: ${batch.counts.total}`);
  });
}

Complete Example

Here’s a complete example that demonstrates the full batch workflow:
const { Valyu } = require('valyu-js');

const client = new Valyu(process.env.VALYU_API_KEY);

async function runBatchExample() {
  try {
    // 1. Create a batch
    console.log('Creating batch...');
    const batch = await client.batch.create({
      name: 'Research Questions Batch',
      mode: 'fast',
      outputFormats: ['markdown'],
      search: {
        searchType: 'all',
        includedSources: ['valyu/valyu-arxiv']
      },
      metadata: {
        project: 'Q4-Research',
        user: 'analyst-1'
      }
    });

    if (!batch.success) {
      throw new Error(`Failed to create batch: ${batch.error}`);
    }

    const batchId = batch.batch_id;
    console.log(`✓ Created batch: ${batchId}\n`);

    // 2. Add tasks to the batch
    console.log('Adding tasks...');
    const addResult = await client.batch.addTasks(batchId, {
      tasks: [
        {
          query: 'What are the latest developments in quantum computing?'
        },
        {
          query: 'Analyze the impact of AI on healthcare in 2024'
        },
        {
          query: 'Compare renewable energy trends across Europe'
        }
      ]
    });

    if (!addResult.success) {
      throw new Error(`Failed to add tasks: ${addResult.error}`);
    }
    console.log(`✓ Added ${addResult.added} tasks\n`);

    // 3. Monitor progress
    console.log('Waiting for completion...');
    const finalBatch = await client.batch.waitForCompletion(batchId, {
      pollInterval: 10000,
      maxWaitTime: 600000,
      onProgress: (batch) => {
        console.log(
          `  Progress: ${batch.counts.completed}/${batch.counts.total} completed ` +
            `(${batch.counts.running} running, ${batch.counts.queued} queued)`
        );
      }
    });

    console.log('\n✓ Batch completed!');
    console.log(`Final status: ${finalBatch.status}`);
    console.log(`Completed: ${finalBatch.counts.completed}`);
    console.log(`Failed: ${finalBatch.counts.failed}`);

    // 4. Get individual task results
    console.log('\nFetching task results...');
    const tasksList = await client.batch.listTasks(batchId);

    if (tasksList.success && tasksList.tasks) {
      for (const task of tasksList.tasks) {
        if (task.status === 'completed') {
          const taskResult = await client.deepresearch.status(
            task.deepresearch_id
          );
          console.log(`\nTask: ${task.query.substring(0, 50)}...`);
          console.log(
            `Output length: ${taskResult.output?.length || 0} characters`
          );
        } else if (task.status === 'failed') {
          console.log(`\nTask failed: ${task.query.substring(0, 50)}...`);
        }
      }
    }

  } catch (error) {
    console.error('Error:', error.message);
  }
}

runBatchExample();

Search Configuration

Search parameters control which data sources are queried, what content is included/excluded, and how results are filtered by date or category. When set at the batch level, these parameters are applied to all tasks in the batch and cannot be overridden by individual tasks.

Search Type

Controls which backend search systems are queried for all tasks in the batch:
  • "all" (default): Searches both web and proprietary data sources
  • "web": Searches only web sources (general web search, news, articles)
  • "proprietary": Searches only proprietary data sources (academic papers, finance data, patents, etc.)
When set at the batch level, this parameter cannot be overridden by individual tasks.
const batch = await client.batch.create({
  name: "Academic Research Batch",
  search: { searchType: "proprietary" }
});

Included Sources

Restricts search to only the specified source types for all tasks in the batch. When specified, only these sources will be searched. Tasks inherit this setting and cannot override it. Available source types:
  • "web": General web search results (news, articles, websites)
  • "academic": Academic papers and research databases (ArXiv, PubMed, BioRxiv/MedRxiv, Clinical trials, FDA drug labels, WHO health data, NIH grants, Wikipedia)
  • "finance": Financial and economic data (Stock/crypto/FX prices, SEC filings, Company financial statements, Economic indicators, Prediction markets)
  • "patent": Patent and intellectual property data (USPTO patent database, Patent abstracts, claims, descriptions)
  • "transportation": Transit and transportation data (UK National Rail schedules, Maritime vessel tracking)
  • "politics": Government and parliamentary data (UK Parliament members, bills, votes)
  • "legal": Case law and legal data (UK court judgments, Legislation text)
const batch = await client.batch.create({
  name: "Academic Research Batch",
  search: {
    searchType: "proprietary",
    includedSources: ["academic", "web"]
  }
});

Excluded Sources

Excludes specific source types from search results for all tasks in the batch. Uses the same source type values as includedSources. Cannot be used simultaneously with includedSources (use one or the other).
const batch = await client.batch.create({
  name: "Research Batch",
  search: {
    searchType: "proprietary",
    excludedSources: ["web", "patent"]
  }
});
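
Since includedSources and excludedSources are mutually exclusive, it can help to catch the conflict before sending the request. The check below is a client-side sketch; `checkSearchConfig` is a hypothetical name, and the API may report this conflict differently.

```typescript
// Subset of the search options relevant to this check.
interface SearchConfig {
  searchType?: "all" | "web" | "proprietary";
  includedSources?: string[];
  excludedSources?: string[];
}

// Hypothetical helper: returns a list of problems (empty when none found).
function checkSearchConfig(search: SearchConfig): string[] {
  const problems: string[] = [];
  const included = search.includedSources ?? [];
  const excluded = search.excludedSources ?? [];
  if (included.length > 0 && excluded.length > 0) {
    problems.push("includedSources and excludedSources cannot be combined");
  }
  return problems;
}

const conflicts = checkSearchConfig({
  includedSources: ["academic"],
  excludedSources: ["patent"],
});
console.log(conflicts.length); // 1
```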

Start Date

Format: ISO date format (YYYY-MM-DD). Filters search results to only include content published or dated on or after this date for all tasks in the batch. Applied to both publication dates and event dates when available. Works across all source types.
const batch = await client.batch.create({
  name: "2024 Research",
  search: { startDate: "2024-01-01" }
});

End Date

Format: ISO date format (YYYY-MM-DD). Filters search results to only include content published or dated on or before this date for all tasks in the batch. Applied to both publication dates and event dates when available. Works across all source types.
const batch = await client.batch.create({
  name: "Q4 2024 Analysis",
  search: {
    startDate: "2024-10-01",
    endDate: "2024-12-31"
  }
});

Category

Filters results by a specific category for all tasks in the batch. The exact categories available depend on the data source. Category values are source-dependent and may not be applicable to all source types.
const batch = await client.batch.create({
  name: "Technology Research",
  search: { category: "technology" }
});

How Batch Search Parameters Work

Tasks inherit the batch’s search configuration automatically when they are added to the batch. Individual tasks cannot override these batch-level search parameters.

Important Notes

Parameter Enforcement

Batch-level parameters are enforced and cannot be overridden by individual tasks. This ensures consistent search behavior across all tasks in the batch. Tool-level source specifications are ignored if batch-level sources are specified.

Date Filtering

Dates are applied to both publication dates and event dates when available. ISO format (YYYY-MM-DD) is required. Date filtering works across all source types. If only startDate is provided, results include all content from that date forward. If only endDate is provided, results include all content up to that date. Both dates can be combined for a specific date range.
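
Because the YYYY-MM-DD format is required, validating dates on the client can catch mistakes early. The helpers below are a sketch with hypothetical names. Rejecting a start date later than the end date is an assumption about what callers want, not documented API behavior.

```typescript
const ISO_DATE = /^\d{4}-\d{2}-\d{2}$/;

// Hypothetical helper: true if `value` is a real calendar date in YYYY-MM-DD form.
function isIsoDate(value: string): boolean {
  if (!ISO_DATE.test(value)) return false;
  const parsed = new Date(`${value}T00:00:00Z`);
  // Impossible dates such as 2024-02-30 either fail to parse or roll over
  // into the next month, so require an exact round trip.
  return (
    !Number.isNaN(parsed.getTime()) &&
    parsed.toISOString().slice(0, 10) === value
  );
}

// Hypothetical helper: throws if either date is malformed or the range is reversed.
function assertDateRange(startDate?: string, endDate?: string): void {
  if (startDate !== undefined && !isIsoDate(startDate)) {
    throw new Error(`startDate must be YYYY-MM-DD, got "${startDate}"`);
  }
  if (endDate !== undefined && !isIsoDate(endDate)) {
    throw new Error(`endDate must be YYYY-MM-DD, got "${endDate}"`);
  }
  // Dates in the same ISO format compare correctly as plain strings.
  if (startDate !== undefined && endDate !== undefined && startDate > endDate) {
    throw new Error("startDate must not be after endDate");
  }
}

assertDateRange("2024-10-01", "2024-12-31"); // passes silently
```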

Best Practices

1. Batch Configuration

  • Set batch-level search configuration when all tasks use the same search parameters to avoid repetition
  • Use metadata to tag batches and tasks for easier organization and filtering
  • Choose the right mode: fast for quick results, standard for balanced quality/speed, heavy for thorough research

2. Task Management

  • Add tasks incrementally - You can add tasks to a batch at any time before it’s completed (1-100 tasks per request)
  • Use task metadata to track source, category, or other custom attributes
  • Tasks inherit batch settings - All tasks inherit mode, output_formats, and search_params from the batch
  • Limited task-level options - Tasks can only specify: id, query, strategy, urls, and metadata
  • Not all DeepResearch features available - Batch tasks do not support files, deliverables, MCP servers, or other advanced features
  • Monitor task status individually if you need fine-grained control

3. Usage Management

  • Monitor cost through the cost field in batch status
  • Set appropriate modes based on your research needs
  • Use search filters to focus on relevant data

4. Error Handling

  • Check success flags on all API responses
  • Handle completed_with_errors status - Some tasks may fail while others succeed
  • Use try-catch around waitForCompletion to handle timeouts
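
For a batch that ends in completed_with_errors, a task list can be split into successes and failures before deciding what to retry. This is a sketch with hypothetical helper names; it relies only on the "completed" and "failed" task statuses used in the examples on this page.

```typescript
// Fields used from each item returned by listTasks.
interface TaskItem {
  deepresearch_id: string;
  query: string;
  status: string;
}

// Hypothetical helper: separate tasks that succeeded from tasks that failed.
function splitByOutcome<T extends TaskItem>(
  items: T[]
): { completed: T[]; failed: T[] } {
  return {
    completed: items.filter((t) => t.status === "completed"),
    failed: items.filter((t) => t.status === "failed"),
  };
}

// Hypothetical helper: task inputs for re-running the failed queries.
function retryInputs(items: TaskItem[]): { query: string }[] {
  return splitByOutcome(items).failed.map((t) => ({ query: t.query }));
}

const sampleTasks: TaskItem[] = [
  { deepresearch_id: "dr-1", query: "Topic A", status: "completed" },
  { deepresearch_id: "dr-2", query: "Topic B", status: "failed" },
];
console.log(retryInputs(sampleTasks)); // [ { query: 'Topic B' } ]
```

A finished batch no longer accepts tasks, so retried queries would need to go into a new batch.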

5. Webhooks

  • Set up webhooks for production use to avoid polling
  • Verify webhook signatures using the webhook_secret returned on batch creation
  • Handle webhook retries in your webhook endpoint
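
This page does not describe how webhook signatures are computed, so the following is only an illustration of the general pattern. It ASSUMES a hex-encoded HMAC-SHA256 of the raw request body keyed with webhook_secret, which is a common convention but may not match the actual scheme. Consult the webhook documentation for the real algorithm and header name.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// ASSUMPTION: signature = hex(HMAC-SHA256(webhook_secret, raw body)).
// The real signing scheme is not documented on this page.
function isValidSignature(
  rawBody: string,
  signatureHex: string,
  webhookSecret: string
): boolean {
  const expected = createHmac("sha256", webhookSecret)
    .update(rawBody, "utf8")
    .digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}
```

Whatever the scheme turns out to be, verify against the raw request body (before JSON parsing) and use a constant-time comparison as shown.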

6. Performance

  • Use appropriate poll intervals - Don’t poll too frequently (10 seconds is reasonable)
  • Set reasonable timeouts - Use maxWaitTime to prevent indefinite waiting
  • Process results asynchronously - Don’t block on batch completion if you can process results incrementally

Batch-Level vs Task-Level Parameters

The Batch API supports parameters at two levels:

Batch-Level Parameters (set when creating the batch)

  • mode - Applied to all tasks in the batch
  • outputFormats - Applied to all tasks
  • search - Search configuration applied to all tasks (cannot be overridden per task)
  • webhookUrl - Batch completion webhook
  • metadata - Batch-level metadata

Task-Level Parameters (set when adding tasks)

  • id - Optional custom task identifier (for tracking)
  • query - The research query (required)
  • strategy - Task-specific research strategy instructions
  • urls - URLs to analyze for this task
  • metadata - Task-specific metadata
Important: Tasks inherit batch-level settings and cannot override:
  • mode - Always inherited from batch
  • output_formats - Always inherited from batch
  • search_params - Always inherited from batch
Not Yet Supported: The following features are not yet available for batch tasks:
  • search - Tasks cannot override batch-level search configuration
  • files - File attachments
  • deliverables - Requested deliverables (CSV, XLSX, PPTX, DOCX)
  • mcpServers - MCP server configurations
  • codeExecution - Always enabled (cannot disable)
  • previousReports - Previous report references
  • brandCollectionId - Branding configuration
If you need these features, use individual task creation (client.deepresearch.create()) instead of batch processing.
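
When moving existing single-task code to batches, it can be useful to see which options would be lost. The sketch below copies only the task-level fields listed above and reports any unsupported ones. `toBatchTaskInput` and `SingleTaskOptions` are hypothetical names, and the unsupported fields are typed as unknown because their shapes are not described here.

```typescript
interface BatchTaskInput {
  id?: string;
  query: string;
  strategy?: string;
  urls?: string[];
  metadata?: Record<string, string | number | boolean>;
}

// Hypothetical shape: a task plus the options listed above as unsupported.
interface SingleTaskOptions extends BatchTaskInput {
  search?: unknown;
  files?: unknown;
  deliverables?: unknown;
  mcpServers?: unknown;
  codeExecution?: unknown;
  previousReports?: unknown;
  brandCollectionId?: unknown;
}

const UNSUPPORTED_FIELDS = [
  "search",
  "files",
  "deliverables",
  "mcpServers",
  "codeExecution",
  "previousReports",
  "brandCollectionId",
] as const;

// Hypothetical helper: keep the fields a batch task accepts, list the rest.
function toBatchTaskInput(options: SingleTaskOptions): {
  task: BatchTaskInput;
  dropped: string[];
} {
  const dropped = UNSUPPORTED_FIELDS.filter(
    (field) => options[field] !== undefined
  );
  const task: BatchTaskInput = { query: options.query };
  if (options.id !== undefined) task.id = options.id;
  if (options.strategy !== undefined) task.strategy = options.strategy;
  if (options.urls !== undefined) task.urls = options.urls;
  if (options.metadata !== undefined) task.metadata = options.metadata;
  return { task, dropped };
}
```

A non-empty dropped list is a signal that the task may be better served by client.deepresearch.create().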

Integration with DeepResearch API

Individual tasks in a batch are DeepResearch tasks. You can:
  • Get task status: client.deepresearch.status(task.deepresearch_id)
  • Stream task updates: client.deepresearch.stream(task.deepresearch_id, callbacks)
  • Update running tasks: client.deepresearch.update(task.deepresearch_id, instruction)
  • Cancel individual tasks: client.deepresearch.cancel(task.deepresearch_id)
Note: Batch tasks have limited parameters compared to individual DeepResearch tasks. Tasks inherit mode, output_formats, and search_params from the batch and cannot override them. Advanced features like files, deliverables, and MCP servers are not available in batch mode. Use client.deepresearch.create() for full feature support.

Error Codes

Common error scenarios:
  • Batch not found: Invalid batchId - check that the batch exists
  • Tasks array empty: Must provide at least one task when adding
  • Too many tasks: Maximum 100 tasks per request
  • Invalid task input: Task query field is required and cannot be empty
  • Batch already completed: Cannot add tasks to a completed batch (must be "open" or "processing")
  • Timeout: waitForCompletion exceeded maxWaitTime
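
Since every response carries a success flag and an optional error string, a small wrapper can turn failed responses into exceptions so that one try-catch covers the scenarios above. `unwrap` is a hypothetical helper, not part of the SDK.

```typescript
// Fields shared by all response interfaces on this page.
interface ApiResponse {
  success: boolean;
  error?: string;
}

// Hypothetical helper: return the response if it succeeded; otherwise throw
// an Error naming the operation and carrying the API's error message.
function unwrap<T extends ApiResponse>(operation: string, response: T): T {
  if (!response.success) {
    throw new Error(`${operation} failed: ${response.error ?? "unknown error"}`);
  }
  return response;
}

// Usage with a stand-in response object:
const okResponse = unwrap("batch.create", { success: true, batch_id: "b1" });
console.log(okResponse.batch_id); // b1
```

In real code the second argument would be the awaited result of a call such as client.batch.create(...).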

Rate Limits

  • Batch creation: Standard API rate limits apply
  • Adding tasks: Can add tasks multiple times, but rate limits apply per request
  • Status polling: Use reasonable poll intervals (10+ seconds recommended)
