# Batch Documentation

> Process multiple research tasks efficiently with shared configuration and unified monitoring

The Batch API runs many research tasks under one shared configuration, with unified monitoring and aggregated cost tracking. It is intended for bulk research operations where many queries need to be processed in parallel.

<Note>
  New to DeepResearch? Start with the [DeepResearch Guide](/guides/deepresearch) to understand the core concepts before using batch processing.
</Note>

## Features

<CardGroup cols={2}>
  <Card title="Parallel Processing" icon="layer-group">
    Run 1-100 research tasks simultaneously with shared configuration.
  </Card>

  <Card title="Unified Monitoring" icon="chart-line">
    Track progress, costs, and status across all tasks in one place.
  </Card>

  <Card title="Shared Configuration" icon="sliders">
    Apply the same mode, output formats, and search settings to all tasks.
  </Card>

  <Card title="Webhook Notifications" icon="bell">
    Get notified when batches complete instead of polling.
  </Card>
</CardGroup>

## When to Use Batching

Use batch processing when you need to:

* **Process multiple queries** - Run 1-100 research tasks in parallel
* **Share configuration** - Apply the same mode, output formats, and search settings to all tasks
* **Unified monitoring** - Track progress and costs across all tasks in one place
* **Efficient bulk operations** - Reduce API calls and simplify task management

For individual tasks with unique configurations or advanced features (files, deliverables, MCP servers), use the standard [DeepResearch API](/guides/deepresearch) instead.

## Key Concepts

### Batch Lifecycle

A batch progresses through the following statuses:

```mermaid theme={null}
stateDiagram-v2
    [*] --> open: Create Batch
    open --> processing: Add Tasks
    open --> cancelled: Cancel
    processing --> completed: All Tasks Succeed
    processing --> completed_with_errors: Some Tasks Failed
    processing --> cancelled: Cancel
```

| Status                  | Description                                        |
| ----------------------- | -------------------------------------------------- |
| `open`                  | Batch is created but no tasks are running yet      |
| `processing`            | At least one task is queued, running, or completed |
| `completed`             | All tasks have finished successfully               |
| `completed_with_errors` | All tasks finished, but some failed                |
| `cancelled`             | Batch was cancelled before completion              |
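
The lifecycle above can be encoded as a small transition table, which is handy for client-side checks such as deciding whether to keep polling. This is a minimal sketch built only from the diagram; the helper names are hypothetical and not part of the Valyu SDK.

```python
# Edges copied from the state diagram above. This table is illustrative,
# not an object exposed by the SDK.
ALLOWED_TRANSITIONS = {
    "open": {"processing", "cancelled"},
    "processing": {"completed", "completed_with_errors", "cancelled"},
    "completed": set(),
    "completed_with_errors": set(),
    "cancelled": set(),
}


def is_terminal(status: str) -> bool:
    """Return True when the diagram has no edge leaving this status."""
    return not ALLOWED_TRANSITIONS[status]


def can_transition(current: str, new: str) -> bool:
    """Return True when the diagram has an edge from current to new."""
    return new in ALLOWED_TRANSITIONS.get(current, set())


print(is_terminal("processing"))            # False
print(can_transition("open", "completed"))  # False
```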

### Task States

Individual tasks within a batch can be in these states:

| Status      | Description                 |
| ----------- | --------------------------- |
| `queued`    | Task is waiting to start    |
| `running`   | Task is currently executing |
| `completed` | Task finished successfully  |
| `failed`    | Task encountered an error   |
| `cancelled` | Task was cancelled          |
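
As a rough illustration of how these states roll up into progress figures, the sketch below tallies a plain list of task states. Treating `cancelled` as a finished state is an assumption made for the example, and `summarize` is a hypothetical helper rather than an SDK function.

```python
from collections import Counter

# Assumption: a task in any of these states will not run again.
FINISHED_STATES = {"completed", "failed", "cancelled"}


def summarize(task_states: list[str]) -> dict:
    """Tally task states into a small progress summary."""
    counts = Counter(task_states)
    return {
        "total": len(task_states),
        "finished": sum(counts[state] for state in FINISHED_STATES),
        "in_flight": counts["queued"] + counts["running"],
        "failed": counts["failed"],
    }


print(summarize(["queued", "running", "completed", "failed", "completed"]))
```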

### Shared Configuration

Tasks in a batch inherit these settings from the batch:

* **`mode`** - Research mode (fast, standard, heavy, max)
* **`output_formats`** - Output formats (markdown, pdf, toon, or JSON schema)
* **`search`** - Search configuration (type, sources, dates, category)

Each task can set its own values for:

* **`research_strategy`** - Natural language strategy to guide the research phase
* **`report_format`** - Natural language instructions for output format (highest priority)
* **`strategy`** - Deprecated, use `research_strategy` instead
* **`urls`** - URLs to analyze
* **`metadata`** - Custom metadata
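
The split between batch-level and task-level settings can be checked before sending tasks. The sketch below is a hypothetical pre-flight check, not SDK behavior: it assumes tasks are plain dictionaries, that `query` is required (as in the examples in this guide), and it accepts both `search` and `search_params` as names for the batch-level search settings.

```python
# Settings owned by the batch; tasks inherit them.
BATCH_LEVEL_KEYS = {"mode", "output_formats", "search", "search_params"}
# Settings a task may carry, per the list above (plus the query itself).
TASK_LEVEL_KEYS = {
    "query", "research_strategy", "report_format",
    "strategy", "urls", "metadata",
}


def check_task(task: dict) -> list[str]:
    """Return a list of problems found in a task definition (empty if none)."""
    problems = []
    if not task.get("query"):
        problems.append("missing query")
    for key in sorted(task):
        if key in BATCH_LEVEL_KEYS:
            problems.append(f"'{key}' is inherited from the batch")
        elif key not in TASK_LEVEL_KEYS:
            problems.append(f"unknown key '{key}'")
    return problems


print(check_task({"query": "Review renewable energy", "mode": "heavy"}))
```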

## Basic Workflow

<CodeGroup>
  ```python Python theme={null}
  from valyu import Valyu

  valyu = Valyu()

  # 1. Create a batch with default settings
  batch = valyu.batch.create(
      name="Market Research Q4 2024",
      mode="standard",
      output_formats=["markdown"],
      search={
          "search_type": "all",
          "included_sources": ["web", "academic"],
          "start_date": "2024-01-01",
          "end_date": "2024-12-31"
      },
      metadata={"project": "Q4-2024", "team": "research"}
  )

  if batch.success:
      batch_id = batch.batch_id
      print(f"Created batch: {batch_id}")
      
      # 2. Add tasks to the batch
      tasks = [
          {"query": "Analyze technology sector performance in Q4 2024"},
          {"query": "Research healthcare sector trends and key players"},
          {"query": "Review renewable energy market developments"}
      ]
      
      add_result = valyu.batch.add_tasks(batch_id, tasks)
      
      if add_result.success:
          print(f"Added {add_result.added} tasks")
          
          # 3. Monitor progress
          status = valyu.batch.status(batch_id)
          if status.success and status.batch:
              print(f"Progress: {status.batch.counts.completed}/{status.batch.counts.total}")
              print(f"Total cost: ${status.batch.cost}")
  ```

  ```typescript TypeScript theme={null}
  import { Valyu } from "valyu-js";

  const valyu = new Valyu();

  // 1. Create a batch with default settings
  const batch = await valyu.batch.create({
    name: "Market Research Q4 2024",
    mode: "standard",
    outputFormats: ["markdown"],
    search: {
      searchType: "all",
      includedSources: ["web", "academic"],
      startDate: "2024-01-01",
      endDate: "2024-12-31"
    },
    metadata: { project: "Q4-2024", team: "research" }
  });

  if (batch.success) {
    const batchId = batch.batch_id;
    console.log(`Created batch: ${batchId}`);
    
    // 2. Add tasks to the batch
    const addResult = await valyu.batch.addTasks(batchId, {
      tasks: [
        { query: "Analyze technology sector performance in Q4 2024" },
        { query: "Research healthcare sector trends and key players" },
        { query: "Review renewable energy market developments" }
      ]
    });
    
    if (addResult.success) {
      console.log(`Added ${addResult.added} tasks`);
      
      // 3. Monitor progress
      const status = await valyu.batch.status(batchId);
      if (status.success && status.batch) {
        console.log(`Progress: ${status.batch.counts.completed}/${status.batch.counts.total}`);
        console.log(`Total cost: $${status.batch.cost}`);
      }
    }
  }
  ```

  ```bash cURL theme={null}
  # Create a batch
  curl -X POST "https://api.valyu.ai/v1/deepresearch/batches" \
    -H "Content-Type: application/json" \
    -H "x-api-key: YOUR_API_KEY" \
    -d '{
      "name": "Market Research Q4 2024",
      "mode": "standard",
      "output_formats": ["markdown"],
      "search": {
        "search_type": "all",
        "included_sources": ["web", "academic"],
        "start_date": "2024-01-01",
        "end_date": "2024-12-31"
      }
    }'

  # Add tasks to the batch
  curl -X POST "https://api.valyu.ai/v1/deepresearch/batches/BATCH_ID/tasks" \
    -H "Content-Type: application/json" \
    -H "x-api-key: YOUR_API_KEY" \
    -d '{
      "tasks": [
        {"query": "Analyze technology sector performance in Q4 2024"},
        {"query": "Research healthcare sector trends and key players"},
        {"query": "Review renewable energy market developments"}
      ]
    }'
  ```
</CodeGroup>

## Waiting for Completion

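Use `wait_for_completion()` (Python) or `waitForCompletion()` (TypeScript) to poll automatically until the batch finishes. In the examples below, the Python SDK takes intervals in seconds and the TypeScript SDK takes them in milliseconds: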

<CodeGroup>
  ```python Python theme={null}
  def on_progress(status):
      if status.success and status.batch:
          counts = status.batch.counts
          print(
              f"Progress: {counts.completed + counts.failed}/{counts.total} "
              f"(Running: {counts.running}, Queued: {counts.queued})"
          )

  try:
      final_status = valyu.batch.wait_for_completion(
          batch_id,
          poll_interval=10,
          max_wait_time=3600,  # 1 hour
          on_progress=on_progress
      )
      
      if final_status.success and final_status.batch:
          print("Batch completed!")
          print(f"Status: {final_status.batch.status}")
          print(f"Total cost: ${final_status.batch.cost}")
  except TimeoutError as e:
      print(f"Timeout: {e}")
  ```

  ```typescript TypeScript theme={null}
  try {
    const finalBatch = await valyu.batch.waitForCompletion(batchId, {
      pollInterval: 10000,  // 10 seconds
      maxWaitTime: 600000,  // 10 minutes
      onProgress: (batch) => {
        console.log(
          `Progress: ${batch.counts.completed}/${batch.counts.total} completed`
        );
        console.log(
          `Running: ${batch.counts.running}, Queued: ${batch.counts.queued}`
        );
      }
    });
    
    console.log("Batch completed!");
    console.log(`Final status: ${finalBatch.status}`);
    console.log(`Total cost: $${finalBatch.cost}`);
  } catch (error) {
    console.error(`Wait interrupted: ${error instanceof Error ? error.message : String(error)}`);
  }
  ```
</CodeGroup>
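
For readers who want to see what such a wait loop involves, here is a minimal polling sketch. It is not the SDK's implementation: `poll_until_done` and its `fetch_status` callable are hypothetical stand-ins, and the terminal statuses are taken from the lifecycle table in this guide.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "completed_with_errors", "cancelled"}


def poll_until_done(
    fetch_status: Callable[[], str],
    poll_interval: float = 10.0,
    max_wait_time: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status until it returns a terminal status or time runs out."""
    deadline = time.monotonic() + max_wait_time
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        # Stop early if the next sleep would overshoot the deadline.
        if time.monotonic() + poll_interval > deadline:
            raise TimeoutError(f"batch still '{status}' after {max_wait_time}s")
        sleep(poll_interval)


# Usage with a fake status source standing in for a real status call.
fake_statuses = iter(["open", "processing", "processing", "completed_with_errors"])
print(poll_until_done(lambda: next(fake_statuses), poll_interval=0))
```

Injecting `fetch_status` and `sleep` keeps the loop testable without network access or real delays.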

## Retrieving Task Results

Use the `list_tasks()` / `listTasks()` method with `include_output=true` to get full task outputs in a single paginated request:

<CodeGroup>
  ```python Python theme={null}
  # Get all completed results with full output
  results = valyu.batch.list_tasks(batch_id, status="completed", include_output=True)

  for task in results.tasks:
      print(f"Task: {task.task_id or task.deepresearch_id}")
      print(f"Query: {task.query}")
      print(f"Output: {task.output[:200]}...")
      print(f"Sources: {len(task.sources)} cited")
      print(f"Cost: ${task.cost}")

  # Paginate through all results
  last_key = results.pagination.last_key
  while last_key:
      next_page = valyu.batch.list_tasks(batch_id, status="completed", include_output=True, last_key=last_key)
      for task in next_page.tasks:
          print(f"Task: {task.deepresearch_id} - {task.query}")
      last_key = next_page.pagination.last_key
  ```

  ```typescript TypeScript theme={null}
  // Get all completed results with full output
  const results = await valyu.batch.listTasks(batchId, {
    status: "completed",
    includeOutput: true,
  });

  for (const task of results.tasks) {
    console.log(`Task: ${task.task_id || task.deepresearch_id}`);
    console.log(`Query: ${task.query}`);
    console.log(`Output: ${task.output?.substring(0, 200)}...`);
    console.log(`Sources: ${task.sources?.length} cited`);
    console.log(`Cost: $${task.cost}`);
  }

  // Paginate through all results
  let lastKey = results.pagination.last_key;
  while (lastKey) {
    const nextPage = await valyu.batch.listTasks(batchId, {
      status: "completed",
      includeOutput: true,
      lastKey,
    });
    for (const task of nextPage.tasks) {
      console.log(`Task: ${task.deepresearch_id} - ${task.query}`);
    }
    lastKey = nextPage.pagination.last_key;
  }
  ```

  ```bash cURL theme={null}
  # Get completed tasks with full output
  curl -X GET "https://api.valyu.ai/v1/deepresearch/batches/${BATCH_ID}/tasks?status=completed&include_output=true&limit=25" \
    -H "X-API-Key: ${VALYU_API_KEY}"

  # Get next page
  curl -X GET "https://api.valyu.ai/v1/deepresearch/batches/${BATCH_ID}/tasks?status=completed&include_output=true&last_key=${LAST_KEY}" \
    -H "X-API-Key: ${VALYU_API_KEY}"
  ```
</CodeGroup>

<Tip>
  By default, `include_output` is `false`, returning a lightweight listing with task status only. Set `include_output=true` when you need the full output, sources, images, and cost for each task.
</Tip>
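
The `last_key` pagination pattern shown above can be wrapped in a generator so callers iterate over tasks without handling cursors. This sketch assumes each page is a dictionary with a `tasks` list and a `pagination.last_key` cursor, mirroring the attribute names used in the SDK examples; `iter_tasks` and `fetch_page` are hypothetical names, and the fake pages stand in for real API responses.

```python
from typing import Callable, Iterator, Optional


def iter_tasks(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield tasks from every page, following last_key until it is empty."""
    last_key = None
    while True:
        page = fetch_page(last_key)
        yield from page["tasks"]
        last_key = page.get("pagination", {}).get("last_key")
        if not last_key:
            break


# Fake two-page response set keyed by the cursor that requests it.
FAKE_PAGES = {
    None: {"tasks": [{"task_id": "t1"}, {"task_id": "t2"}],
           "pagination": {"last_key": "k1"}},
    "k1": {"tasks": [{"task_id": "t3"}],
           "pagination": {"last_key": None}},
}

print([task["task_id"] for task in iter_tasks(FAKE_PAGES.__getitem__)])
# ['t1', 't2', 't3']
```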

## Parameters Reference

### Mode Values

| Mode       | Description                                                         | Cost per Task |
| ---------- | ------------------------------------------------------------------- | ------------- |
| `fast`     | Fast research mode (quick answers, lightweight research)            | \$0.10        |
| `standard` | Standard research mode (default)                                    | \$0.50        |
| `heavy`    | Comprehensive research mode with fact verification                  | \$2.50        |
| `max`      | Exhaustive research mode with maximum quality and fact verification | \$15.00       |

<Note>
  The `lite` mode is deprecated and maps to `standard`.
</Note>
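
Because every task in a batch shares one mode, a rough budget can be computed before submitting. The sketch below treats the listed per-task prices as flat rates, which is an assumption (actual billing may differ, for example for failed tasks); `estimate_batch_cost` is a hypothetical helper.

```python
from decimal import Decimal

# Per-task prices from the table above, in US dollars.
COST_PER_TASK = {
    "fast": Decimal("0.10"),
    "standard": Decimal("0.50"),
    "heavy": Decimal("2.50"),
    "max": Decimal("15.00"),
}


def estimate_batch_cost(mode: str, task_count: int) -> Decimal:
    """Estimate the cost of task_count tasks run in the given mode."""
    if mode == "lite":  # deprecated alias, maps to standard
        mode = "standard"
    return COST_PER_TASK[mode] * task_count


print(estimate_batch_cost("heavy", 40))  # 100.00
```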

### Output Formats

* **`markdown`** (default) - Markdown text output
* **`pdf`** - PDF document output
* **`toon`** - TOON format (requires JSON schema)
* **JSON Schema Object** - Structured output matching the provided schema

<Warning>
  A JSON schema cannot be combined with `markdown` or `pdf`; use one or the other. The `toon` format requires a JSON schema.
</Warning>
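
These combination rules can be checked locally before creating a batch. The sketch below assumes the schema is passed as a dictionary inside the same `output_formats` list as the string formats; that representation, and the `check_output_formats` helper itself, are assumptions for illustration rather than documented API behavior.

```python
STRING_FORMATS = {"markdown", "pdf", "toon"}


def check_output_formats(formats: list) -> list[str]:
    """Return a list of rule violations for an output_formats value."""
    names = {item for item in formats if isinstance(item, str)}
    has_schema = any(isinstance(item, dict) for item in formats)
    problems = []
    unknown = names - STRING_FORMATS
    if unknown:
        problems.append(f"unknown format(s): {sorted(unknown)}")
    if has_schema and names & {"markdown", "pdf"}:
        problems.append("a JSON schema cannot be combined with markdown or pdf")
    if "toon" in names and not has_schema:
        problems.append("toon requires a JSON schema")
    return problems


print(check_output_formats(["markdown", {"type": "object"}]))
print(check_output_formats(["toon"]))
```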

### Search Parameters

Search parameters control which data sources are queried, what content is included/excluded, and how results are filtered by date or category. When set at the batch level, these parameters are applied to all tasks in the batch and cannot be overridden by individual tasks.

### Search Type

Controls which backend search systems are queried for all tasks in the batch:

* **`"all"`** (default): Searches both web and proprietary data sources
* **`"web"`**: Searches only web sources (general web search, news, articles)
* **`"proprietary"`**: Searches only proprietary data sources (academic papers, finance data, patents, etc.)

When set at the batch level, this parameter **cannot be overridden** by individual tasks.

<CodeGroup>
  ```python Python theme={null}
  batch = valyu.batch.create(
      name="Academic Research Batch",
      search={"search_type": "proprietary"}
  )
  ```

  ```typescript TypeScript theme={null}
  const batch = await valyu.batch.create({
    name: "Academic Research Batch",
    search: { searchType: "proprietary" }
  });
  ```
</CodeGroup>

### Included Sources

Restricts search to only the specified source types for all tasks in the batch. When specified, **only** these sources will be searched. Tasks inherit this setting and cannot override it.

**Available source types:**

* **`"web"`**: General web search results (news, articles, websites)
* **`"academic"`**: Academic papers and research databases (ArXiv, PubMed, BioRxiv/MedRxiv, Clinical trials, FDA drug labels, WHO health data, NIH grants, Wikipedia)
* **`"finance"`**: Financial and economic data (Stock/crypto/FX prices, SEC filings, Company financial statements, Economic indicators, Prediction markets)
* **`"patent"`**: Patent and intellectual property data (USPTO patent database, Patent abstracts, claims, descriptions)
* **`"transportation"`**: Transit and transportation data (UK National Rail schedules, Maritime vessel tracking)
* **`"politics"`**: Government and parliamentary data (UK Parliament members, bills, votes)
* **`"legal"`**: Case law and legal data (UK court judgments, Legislation text)

<CodeGroup>
  ```python Python theme={null}
  batch = valyu.batch.create(
      name="Academic Research Batch",
      search={
          "search_type": "proprietary",
          "included_sources": ["academic", "web"]
      }
  )
  ```

  ```typescript TypeScript theme={null}
  const batch = await valyu.batch.create({
    name: "Academic Research Batch",
    search: {
      searchType: "proprietary",
      includedSources: ["academic", "web"]
    }
  });
  ```
</CodeGroup>

### Excluded Sources

Excludes specific source types from search results for all tasks in the batch. It uses the same source type values as `included_sources` and cannot be combined with it; set one or the other.

<CodeGroup>
  ```python Python theme={null}
  batch = valyu.batch.create(
      name="Research Batch",
      search={
          "search_type": "proprietary",
          "excluded_sources": ["web", "patent"]
      }
  )
  ```

  ```typescript TypeScript theme={null}
  const batch = await valyu.batch.create({
    name: "Research Batch",
    search: {
      searchType: "proprietary",
      excludedSources: ["web", "patent"]
    }
  });
  ```
</CodeGroup>
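
Since `included_sources` and `excluded_sources` are mutually exclusive, a search configuration can be sanity-checked before it is sent. This is a minimal sketch using the snake_case keys from the Python examples; `check_search_config` is a hypothetical helper, and it deliberately does not validate individual source names.

```python
SEARCH_TYPES = {"all", "web", "proprietary"}


def check_search_config(search: dict) -> list[str]:
    """Return a list of problems found in a batch search configuration."""
    problems = []
    search_type = search.get("search_type", "all")  # "all" is the default
    if search_type not in SEARCH_TYPES:
        problems.append(f"unknown search_type '{search_type}'")
    if search.get("included_sources") and search.get("excluded_sources"):
        problems.append("use included_sources or excluded_sources, not both")
    return problems


print(check_search_config({
    "search_type": "proprietary",
    "included_sources": ["academic"],
    "excluded_sources": ["patent"],
}))
```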

### Start Date

**Format:** ISO date format (`YYYY-MM-DD`)

Filters search results to only include content published or dated on or after this date for all tasks in the batch. Applied to both publication dates and event dates when available. Works across all source types.

<CodeGroup>
  ```python Python theme={null}
  batch = valyu.batch.create(
      name="2024 Research",
      search={"start_date": "2024-01-01"}
  )
  ```

  ```typescript TypeScript theme={null}
  const batch = await valyu.batch.create({
    name: "2024 Research",
    search: { startDate: "2024-01-01" }
  });
  ```
</CodeGroup>

### End Date

**Format:** ISO date format (`YYYY-MM-DD`)

Filters search results to only include content published or dated on or before this date for all tasks in the batch. Applied to both publication dates and event dates when available. Works across all source types.

<CodeGroup>
  ```python Python theme={null}
  batch = valyu.batch.create(
      name="Q4 2024 Analysis",
      search={
          "start_date": "2024-10-01",
          "end_date": "2024-12-31"
      }
  )
  ```

  ```typescript TypeScript theme={null}
  const batch = await valyu.batch.create({
    name: "Q4 2024 Analysis",
    search: {
      startDate: "2024-10-01",
      endDate: "2024-12-31"
    }
  });
  ```
</CodeGroup>

### Category

Filters results by a specific category for all tasks in the batch. Available category values depend on the data source and may not apply to every source type.

<CodeGroup>
  ```python Python theme={null}
  batch = valyu.batch.create(
      name="Technology Research",
      search={"category": "technology"}
  )
  ```

  ```typescript TypeScript theme={null}
  const batch = await valyu.batch.create({
    name: "Technology Research",
    search: { category: "technology" }
  });
  ```
</CodeGroup>

### Important Notes

#### Parameter Enforcement

Batch-level parameters are enforced and cannot be overridden by individual tasks. This ensures consistent search behavior across all tasks in the batch. Tool-level source specifications are ignored if batch-level sources are specified.

#### Date Filtering

Dates are applied to both publication dates and event dates when available. ISO format (`YYYY-MM-DD`) is required. Date filtering works across all source types. If only `start_date` / `startDate` is provided, results include all content from that date forward. If only `end_date` / `endDate` is provided, results include all content up to that date. Both dates can be combined for a specific date range.
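
The inclusive, optionally open-ended window described here can be modeled in a few lines. This sketch only illustrates the filtering semantics on a single date string; it is not how the service filters results, and `in_date_window` is a hypothetical helper.

```python
from datetime import date
from typing import Optional


def in_date_window(
    published: str,
    start_date: Optional[str] = None,
    end_date: Optional[str] = None,
) -> bool:
    """Return True if published falls inside the inclusive date window."""
    day = date.fromisoformat(published)
    if start_date and day < date.fromisoformat(start_date):
        return False
    if end_date and day > date.fromisoformat(end_date):
        return False
    return True


print(in_date_window("2024-12-31", "2024-10-01", "2024-12-31"))  # True
print(in_date_window("2024-09-30", start_date="2024-10-01"))     # False
print(in_date_window("2019-05-01", end_date="2024-12-31"))       # True
```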

## Limitations

### Not Yet Supported in Batch API

The following features are **not yet** supported in the batch API:

* **`deliverables`** - Cannot specify deliverables (CSV, XLSX, PPTX, DOCX) for batch tasks
* **`files`** - Cannot attach files to batch tasks
* **`mcp_servers`** - Cannot configure MCP servers for batch tasks
* **`code_execution`** - Always enabled (cannot disable per batch)
* **`previous_reports`** - Cannot reference previous reports in batch tasks
* **`alert_email`** - Cannot set email alerts for batch tasks

**Workaround:** Use individual task creation (`POST /v1/deepresearch/tasks`) if you need these features.

### Task Constraints

* **Maximum tasks per request**: 100
* **Minimum tasks per request**: 1
* **Batch status**: Batch must be in `"open"` or `"processing"` status to add tasks
* **Inherited settings**: Tasks cannot override `mode`, `output_formats`, or `search` settings from the batch
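
When a workload exceeds the per-request maximum, the task list can be split into chunks and submitted one request at a time. The sketch below shows only the chunking step; `chunk_tasks` is a hypothetical helper, and whether a batch has an overall task cap beyond the per-request limit is not stated in this guide.

```python
from typing import Iterator

MAX_TASKS_PER_REQUEST = 100  # per-request limit listed above


def chunk_tasks(
    tasks: list[dict], size: int = MAX_TASKS_PER_REQUEST
) -> Iterator[list[dict]]:
    """Yield consecutive slices of tasks, each within the request limit."""
    if not 1 <= size <= MAX_TASKS_PER_REQUEST:
        raise ValueError(f"size must be between 1 and {MAX_TASKS_PER_REQUEST}")
    for start in range(0, len(tasks), size):
        yield tasks[start:start + size]


tasks = [{"query": f"Research topic {i}"} for i in range(250)]
print([len(chunk) for chunk in chunk_tasks(tasks)])  # [100, 100, 50]
```

Each chunk could then be passed to `valyu.batch.add_tasks(batch_id, chunk)` as in the basic workflow.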

## Best Practices

### When to Use Batches vs Individual Tasks

| Use Batches When                          | Use Individual Tasks When                         |
| ----------------------------------------- | ------------------------------------------------- |
| Processing 10+ queries with shared config | Each task needs unique configuration              |
| Need unified cost tracking                | Need advanced features (files, deliverables, MCP) |
| Bulk research operations                  | Single or few research queries                    |
| Shared search parameters                  | Different search settings per task                |

### Cost Tracking

Monitor batch costs through the `cost` field:

```python theme={null}
status = valyu.batch.status(batch_id)
if status.success and status.batch:
    cost = status.batch.cost
    print(f"Total cost: ${cost}")
```

### Error Handling

SDK responses carry a `success` flag and, on failure, an `error` message. Check the flag before reading any other field:

```python theme={null}
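response = valyu.batch.create(name="Research Batch", mode="standard")

# Stop early on failure instead of reading fields that may be unset
if not response.success:
    raise RuntimeError(f"Batch creation failed: {response.error}")

# Proceed with the successful response
batch_id = response.batch_id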
```

### Webhooks

Set up webhooks for production use to avoid polling:

```python theme={null}
batch = valyu.batch.create(
    name="Research Batch",
    mode="standard",
    webhook_url="https://your-domain.com/webhook"
)

# IMPORTANT: Save the webhook_secret immediately - it's only returned once
webhook_secret = batch.webhook_secret
```

The webhook will receive a POST request when the batch reaches a terminal state (`completed`, `completed_with_errors`, or `cancelled`).

## Next Steps

<CardGroup cols={2}>
  <Card title="DeepResearch Guide" icon="book" href="/guides/deepresearch">
    Learn about individual task features like files, deliverables, and MCP servers
  </Card>

  <Card title="Python SDK" icon="python" href="/sdk/python-sdk/deepresearch-batch">
    Python SDK batch methods and code examples
  </Card>

  <Card title="TypeScript SDK" icon="js" href="/sdk/typescript-sdk/deepresearch-batch">
    TypeScript SDK batch methods and code examples
  </Card>

  <Card title="API Reference" icon="code" href="/api-reference/endpoint/deepresearch-batch-create">
    Complete batch API endpoint documentation
  </Card>
</CardGroup>
