DeepResearch Batch Processing

Process multiple research tasks efficiently with shared configuration, unified monitoring, and aggregated cost tracking. The Batch API is ideal for bulk research operations where you need to process many queries simultaneously.

New to DeepResearch? Start with the DeepResearch Guide to understand the core concepts before using batch processing.

Features

Parallel Processing

Run 1-100 research tasks simultaneously with shared configuration.

Unified Monitoring

Track progress, costs, and status across all tasks in one place.

Shared Configuration

Apply the same mode, output formats, and search settings to all tasks.

Webhook Notifications

Get notified when batches complete instead of polling.

When to Use Batching

Use batch processing when you need to:

Process multiple queries - Run 1-100 research tasks in parallel
Share configuration - Apply the same mode, output formats, and search settings to all tasks
Unified monitoring - Track progress and costs across all tasks in one place
Efficient bulk operations - Reduce API calls and simplify task management

For individual tasks with unique configurations or advanced features (files, deliverables, MCP servers), use the standard DeepResearch API instead.

Key Concepts

Batch Lifecycle

A batch progresses through the following statuses:

Status	Description
`open`	Batch is created but no tasks are running yet
`processing`	At least one task is queued, running, or completed
`completed`	All tasks have finished successfully
`completed_with_errors`	All tasks finished, but some failed
`cancelled`	Batch was cancelled before completion
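
As a rough illustration of how client code might branch on these statuses, the sketch below groups them into "terminal" and "still accepting tasks". The helper names are hypothetical and not part of the SDK. The grouping follows this page: completed, completed_with_errors, and cancelled are terminal, and tasks can only be added while a batch is open or processing.

```python
# Batch statuses taken from the lifecycle table; the helper names are illustrative.
TERMINAL_BATCH_STATUSES = frozenset({"completed", "completed_with_errors", "cancelled"})
ACTIVE_BATCH_STATUSES = frozenset({"open", "processing"})


def is_terminal(batch_status: str) -> bool:
    """Return True once a batch has reached a final status."""
    return batch_status in TERMINAL_BATCH_STATUSES


def accepts_new_tasks(batch_status: str) -> bool:
    """Return True while tasks can still be added to the batch."""
    return batch_status in ACTIVE_BATCH_STATUSES
```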

Task States

Individual tasks within a batch can be in these states:

Status	Description
`queued`	Task is waiting to start
`running`	Task is currently executing
`completed`	Task finished successfully
`failed`	Task encountered an error
`cancelled`	Task was cancelled

Shared Configuration

Tasks in a batch inherit these settings from the batch:

mode - Research mode (fast, standard, heavy, max)
output_formats - Output formats (markdown, pdf, toon, or JSON schema)
search_params - Search configuration (type, sources, dates, category)

Tasks can override:

strategy - Custom research instructions
urls - URLs to analyze
metadata - Custom metadata
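
One way to keep per-task fields separate from batch-level settings is to build task payloads with a small helper. This is a minimal sketch: build_task is a hypothetical function, and it assumes that strategy, urls, and metadata are passed as keys next to query in each task dictionary.

```python
from typing import Any, Optional


def build_task(
    query: str,
    strategy: Optional[str] = None,
    urls: Optional[list[str]] = None,
    metadata: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Build one task payload, including only the per-task fields that are set."""
    task: dict[str, Any] = {"query": query}
    if strategy is not None:
        task["strategy"] = strategy
    if urls:
        task["urls"] = list(urls)
    if metadata:
        task["metadata"] = dict(metadata)
    return task


tasks = [
    build_task("Analyze technology sector performance in Q4 2024"),
    build_task(
        "Research healthcare sector trends and key players",
        strategy="Focus on publicly listed companies",  # example instruction
        urls=["https://example.com/healthcare-report"],  # placeholder URL
        metadata={"sector": "healthcare"},
    ),
]
```

Batch-level settings such as mode or output_formats are deliberately left out of the payload, since tasks inherit them from the batch.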

Basic Workflow

from valyu import Valyu

valyu = Valyu()

# 1. Create a batch with default settings
batch = valyu.batch.create(
    name="Market Research Q4 2024",
    mode="standard",
    output_formats=["markdown"],
    search={
        "search_type": "all",
        "included_sources": ["web", "academic"],
        "start_date": "2024-01-01",
        "end_date": "2024-12-31"
    },
    metadata={"project": "Q4-2024", "team": "research"}
)

if batch.success:
    batch_id = batch.batch_id
    print(f"Created batch: {batch_id}")
    
    # 2. Add tasks to the batch
    tasks = [
        {"query": "Analyze technology sector performance in Q4 2024"},
        {"query": "Research healthcare sector trends and key players"},
        {"query": "Review renewable energy market developments"}
    ]
    
    add_result = valyu.batch.add_tasks(batch_id, tasks)
    
    if add_result.success:
        print(f"Added {add_result.added} tasks")
        
        # 3. Monitor progress
        status = valyu.batch.status(batch_id)
        if status.success and status.batch:
            print(f"Progress: {status.batch.counts.completed}/{status.batch.counts.total}")
            print(f"Total cost: ${status.batch.cost}")

Waiting for Completion

Use wait_for_completion() to automatically poll until the batch finishes:

def on_progress(status):
    if status.success and status.batch:
        counts = status.batch.counts
        print(
            f"Progress: {counts.completed + counts.failed}/{counts.total} "
            f"(Running: {counts.running}, Queued: {counts.queued})"
        )

try:
    final_status = valyu.batch.wait_for_completion(
        batch_id,
        poll_interval=10,
        max_wait_time=3600,  # 1 hour
        on_progress=on_progress
    )
    
    if final_status.success and final_status.batch:
        print("Batch completed!")
        print(f"Status: {final_status.batch.status}")
        print(f"Total cost: ${final_status.batch.cost}")
except TimeoutError as e:
    print(f"Timeout: {e}")
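
To make the polling behaviour concrete, here is a generic sketch of the kind of loop such a helper might run. It is not the SDK's implementation: poll_until_terminal, get_status, sleep, and clock are illustrative names, and the status source is a plain callable so the loop can be tried without network access.

```python
import time
from typing import Callable

TERMINAL = {"completed", "completed_with_errors", "cancelled"}


def poll_until_terminal(
    get_status: Callable[[], str],
    poll_interval: float = 10,
    max_wait_time: float = 3600,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call get_status() until it returns a terminal status or time runs out."""
    deadline = clock() + max_wait_time
    while True:
        status = get_status()
        if status in TERMINAL:
            return status
        # Give up if the next poll would land after the deadline.
        if clock() + poll_interval > deadline:
            raise TimeoutError(f"batch still {status!r} after {max_wait_time}s")
        sleep(poll_interval)


# Simulated status sequence; sleep is stubbed out so the example runs instantly.
statuses = iter(["open", "processing", "completed"])
final = poll_until_terminal(lambda: next(statuses), sleep=lambda seconds: None)
```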

Retrieving Task Results

Use list_tasks() (Python) or listTasks() (TypeScript) with include_output set to true to get full task outputs, one page of results per request:

# Get all completed results with full output
results = valyu.batch.list_tasks(batch_id, status="completed", include_output=True)

for task in results.tasks:
    print(f"Task: {task.task_id or task.deepresearch_id}")
    print(f"Query: {task.query}")
    print(f"Output: {task.output[:200]}...")
    print(f"Sources: {len(task.sources)} cited")
    print(f"Cost: ${task.cost}")

# Paginate through all results
last_key = results.pagination.last_key
while last_key:
    next_page = valyu.batch.list_tasks(batch_id, status="completed", include_output=True, last_key=last_key)
    for task in next_page.tasks:
        print(f"Task: {task.deepresearch_id} - {task.query}")
    last_key = next_page.pagination.last_key

By default, include_output is false, returning a lightweight listing with task status only. Set include_output=true when you need the full output, sources, images, and cost for each task.
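
The pagination loop above can also be wrapped in a generator so that callers iterate over tasks without handling last_key themselves. The sketch below is one possible shape: iter_tasks and fetch_page are hypothetical names, and the fake pages only mimic the tasks and pagination.last_key fields used in the example above.

```python
from types import SimpleNamespace
from typing import Any, Callable, Iterator, Optional


def iter_tasks(fetch_page: Callable[[Optional[str]], Any]) -> Iterator[Any]:
    """Yield tasks from every page, following pagination.last_key until it is empty."""
    last_key: Optional[str] = None
    while True:
        page = fetch_page(last_key)
        yield from page.tasks
        last_key = page.pagination.last_key
        if not last_key:
            break


# Stand-in for the API: two fake pages shaped like the responses above.
_pages = {
    None: SimpleNamespace(tasks=["task-1", "task-2"], pagination=SimpleNamespace(last_key="k1")),
    "k1": SimpleNamespace(tasks=["task-3"], pagination=SimpleNamespace(last_key=None)),
}
all_tasks = list(iter_tasks(lambda key: _pages[key]))
```

With the real client, fetch_page would wrap valyu.batch.list_tasks(...) and pass last_key only when it is set. Whether the method accepts last_key=None for the first page is an assumption to verify against the SDK reference.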

Parameters Reference

Mode Values

Mode	Description	Cost per Task
`fast`	Fast research mode (quick answers, lightweight research)	$0.10
`standard`	Standard research mode (default)	$0.50
`heavy`	Comprehensive research mode with fact verification	$2.50
`max`	Exhaustive research mode with maximum quality and fact verification	$15.00

The lite mode is deprecated and maps to standard.
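
For budgeting before a batch is created, the per-task prices in the table can be turned into a quick estimate. This is a minimal sketch that treats the listed price as a flat per-task cost, which is an assumption. The cost field reported by the API remains the authoritative figure, and estimate_batch_cost is a hypothetical helper.

```python
from decimal import Decimal

# Per-task prices copied from the Mode Values table above.
COST_PER_TASK = {
    "fast": Decimal("0.10"),
    "standard": Decimal("0.50"),
    "heavy": Decimal("2.50"),
    "max": Decimal("15.00"),
}


def estimate_batch_cost(mode: str, task_count: int) -> Decimal:
    """Estimate total cost as (listed price per task) x (number of tasks)."""
    if mode not in COST_PER_TASK:
        raise ValueError(f"unknown mode: {mode!r}")
    if task_count < 0:
        raise ValueError("task_count must not be negative")
    return COST_PER_TASK[mode] * task_count


estimate = estimate_batch_cost("heavy", 40)  # 40 tasks at $2.50 each
print(f"Estimated cost: ${estimate}")
```

Decimal is used instead of float so that currency amounts are not affected by binary rounding.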

Output Formats

markdown (default) - Markdown text output
pdf - PDF document output
toon - TOON format (requires JSON schema)
JSON Schema Object - Structured output matching the provided schema

A JSON schema cannot be combined with markdown or pdf; use one or the other. The toon format requires a JSON schema.
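
These rules can be checked on the client before a batch is created. The sketch below is illustrative only: validate_output_formats is a hypothetical helper, and it assumes a JSON schema is passed as a dictionary inside the same output_formats list as the string format names.

```python
from typing import Any

KNOWN_FORMATS = {"markdown", "pdf", "toon"}


def validate_output_formats(output_formats: list[Any]) -> None:
    """Raise ValueError if the combination breaks the rules described above."""
    names = {item for item in output_formats if isinstance(item, str)}
    has_schema = any(isinstance(item, dict) for item in output_formats)

    unknown = names - KNOWN_FORMATS
    if unknown:
        raise ValueError(f"unknown output formats: {sorted(unknown)}")
    if has_schema and names & {"markdown", "pdf"}:
        raise ValueError("a JSON schema cannot be combined with markdown or pdf")
    if "toon" in names and not has_schema:
        raise ValueError("toon requires a JSON schema")


validate_output_formats(["markdown", "pdf"])  # passes
validate_output_formats(["toon", {"type": "object"}])  # passes
```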

Search Parameters

Search parameters control which data sources are queried, what content is included/excluded, and how results are filtered by date or category. When set at the batch level, these parameters are applied to all tasks in the batch and cannot be overridden by individual tasks.

Search Type

Controls which backend search systems are queried for all tasks in the batch:

"all" (default): Searches both web and proprietary data sources
"web": Searches only web sources (general web search, news, articles)
"proprietary": Searches only proprietary data sources (academic papers, finance data, patents, etc.)

When set at the batch level, this parameter cannot be overridden by individual tasks.

batch = valyu.batch.create(
    name="Academic Research Batch",
    search={"search_type": "proprietary"}
)

Included Sources

Restricts search to only the specified source types for all tasks in the batch. When specified, only these sources will be searched. Tasks inherit this setting and cannot override it. Available source types:

"web": General web search results (news, articles, websites)
"academic": Academic papers and research databases (ArXiv, PubMed, BioRxiv/MedRxiv, Clinical trials, FDA drug labels, WHO health data, NIH grants, Wikipedia)
"finance": Financial and economic data (Stock/crypto/FX prices, SEC filings, Company financial statements, Economic indicators, Prediction markets)
"patent": Patent and intellectual property data (USPTO patent database, Patent abstracts, claims, descriptions)
"transportation": Transit and transportation data (UK National Rail schedules, Maritime vessel tracking)
"politics": Government and parliamentary data (UK Parliament members, bills, votes)
"legal": Case law and legal data (UK court judgments, Legislation text)

batch = valyu.batch.create(
    name="Academic Research Batch",
    search={
        # "all" so that the "web" source below is actually searched
        "search_type": "all",
        "included_sources": ["academic", "web"]
    }
)

Excluded Sources

Excludes specific source types from search results for all tasks in the batch. Uses the same source type values as included_sources. Cannot be used simultaneously with included_sources (use one or the other).

batch = valyu.batch.create(
    name="Research Batch",
    search={
        "search_type": "proprietary",
        "excluded_sources": ["web", "patent"]
    }
)
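
Because included_sources and excluded_sources are mutually exclusive, it can help to validate the search dictionary before sending it. The following is a sketch under stated assumptions: validate_search is a hypothetical helper, and the allowed values are copied from the lists on this page rather than fetched from the API.

```python
from typing import Any

SEARCH_TYPES = {"all", "web", "proprietary"}
SOURCE_TYPES = {
    "web", "academic", "finance", "patent",
    "transportation", "politics", "legal",
}


def validate_search(search: dict[str, Any]) -> None:
    """Raise ValueError for unknown values or conflicting source filters."""
    search_type = search.get("search_type", "all")
    if search_type not in SEARCH_TYPES:
        raise ValueError(f"unknown search_type: {search_type!r}")

    included = search.get("included_sources") or []
    excluded = search.get("excluded_sources") or []
    if included and excluded:
        raise ValueError("use either included_sources or excluded_sources, not both")

    unknown = (set(included) | set(excluded)) - SOURCE_TYPES
    if unknown:
        raise ValueError(f"unknown source types: {sorted(unknown)}")


validate_search({"search_type": "proprietary", "excluded_sources": ["patent"]})  # passes
```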

Start Date

Format: ISO date format (YYYY-MM-DD)

Filters search results to only include content published or dated on or after this date for all tasks in the batch. Applied to both publication dates and event dates when available. Works across all source types.

batch = valyu.batch.create(
    name="2024 Research",
    search={"start_date": "2024-01-01"}
)

End Date

Format: ISO date format (YYYY-MM-DD)

Filters search results to only include content published or dated on or before this date for all tasks in the batch. Applied to both publication dates and event dates when available. Works across all source types.

batch = valyu.batch.create(
    name="Q4 2024 Analysis",
    search={
        "start_date": "2024-10-01",
        "end_date": "2024-12-31"
    }
)

Important Notes

Parameter Enforcement

Batch-level parameters are enforced and cannot be overridden by individual tasks. This ensures consistent search behavior across all tasks in the batch. Tool-level source specifications are ignored if batch-level sources are specified.

Date Filtering

Dates are applied to both publication dates and event dates when available. ISO format (YYYY-MM-DD) is required. Date filtering works across all source types. If only start_date / startDate is provided, results include all content from that date forward. If only end_date / endDate is provided, results include all content up to that date. Both dates can be combined for a specific date range.
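
Since dates must be in YYYY-MM-DD form, a small client-side check can catch malformed or reversed ranges early. This is a minimal sketch using only the standard library. parse_date_range is a hypothetical helper, and rejecting a start date later than the end date is an assumption about sensible input rather than documented API behaviour.

```python
from datetime import date
from typing import Optional


def _parse_iso_date(value: str, field: str) -> date:
    """Parse a strict YYYY-MM-DD string; raise ValueError otherwise."""
    parsed = date.fromisoformat(value)  # raises ValueError on malformed input
    if parsed.isoformat() != value:  # reject other ISO spellings such as 20240101
        raise ValueError(f"{field} must use YYYY-MM-DD, got {value!r}")
    return parsed


def parse_date_range(
    start_date: Optional[str] = None,
    end_date: Optional[str] = None,
) -> tuple[Optional[date], Optional[date]]:
    """Validate an optional start/end pair and return them as date objects."""
    start = _parse_iso_date(start_date, "start_date") if start_date else None
    end = _parse_iso_date(end_date, "end_date") if end_date else None
    if start and end and start > end:
        raise ValueError("start_date must not be after end_date")
    return start, end


q4_range = parse_date_range("2024-10-01", "2024-12-31")
```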

Limitations

Not Yet Supported in Batch API

The following features are not yet supported in the batch API:

deliverables - Cannot specify deliverables (CSV, XLSX, PPTX, DOCX) for batch tasks
files - Cannot attach files to batch tasks
mcp_servers - Cannot configure MCP servers for batch tasks
code_execution - Always enabled (cannot disable per batch)
previous_reports - Cannot reference previous reports in batch tasks
alert_email - Cannot set email alerts for batch tasks

Workaround: Use individual task creation (POST /v1/deepresearch/tasks) if you need these features.

Task Constraints

Maximum tasks per request: 100
Minimum tasks per request: 1
Batch status: Batch must be in "open" or "processing" status to add tasks
Inherited settings: Tasks cannot override mode, output_formats, or search_params from the batch
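
When a task list is longer than the per-request limit, it can be split into chunks before calling add_tasks. The sketch below shows one way to do this. chunk_tasks is a hypothetical helper, and whether a single batch may hold more than 100 tasks in total is not stated on this page, so treat that as an assumption to verify.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")
MAX_TASKS_PER_REQUEST = 100  # from the Task Constraints list above


def chunk_tasks(tasks: Sequence[T], size: int = MAX_TASKS_PER_REQUEST) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most `size` tasks."""
    if not 1 <= size <= MAX_TASKS_PER_REQUEST:
        raise ValueError(f"size must be between 1 and {MAX_TASKS_PER_REQUEST}")
    for start in range(0, len(tasks), size):
        yield tasks[start:start + size]


tasks = [{"query": f"Research topic {i}"} for i in range(250)]
chunks = list(chunk_tasks(tasks))
print([len(chunk) for chunk in chunks])
```

Each chunk could then be passed to valyu.batch.add_tasks(batch_id, chunk), checking the success field after every call.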

Best Practices

When to Use Batches vs Individual Tasks

Use Batches When	Use Individual Tasks When
Processing 10+ queries with shared config	Each task needs unique configuration
Need unified cost tracking	Need advanced features (files, deliverables, MCP)
Bulk research operations	Single or few research queries
Shared search parameters	Different search settings per task

Cost Tracking

Monitor batch costs through the cost field:

status = valyu.batch.status(batch_id)
if status.success and status.batch:
    cost = status.batch.cost
    print(f"Total cost: ${cost}")

Error Handling

Always check success fields and handle errors appropriately:

response = valyu.batch.create(...)

if not response.success:
    # Stop here: without a batch there is nothing to add tasks to
    raise SystemExit(f"Error: {response.error}")

# Proceed with successful response
batch_id = response.batch_id

Webhooks

Set up webhooks for production use to avoid polling:

batch = valyu.batch.create(
    name="Research Batch",
    mode="standard",
    webhook_url="https://your-domain.com/webhook"
)

# IMPORTANT: Save the webhook_secret immediately - it's only returned once
webhook_secret = batch.webhook_secret

The webhook will receive a POST request when the batch reaches a terminal state (completed, completed_with_errors, or cancelled).

Next Steps

DeepResearch Guide

Learn about individual task features like files, deliverables, and MCP servers

Python SDK

Python SDK batch methods and code examples

TypeScript SDK

TypeScript SDK batch methods and code examples

API Reference

Complete batch API endpoint documentation
