Process multiple research tasks efficiently with shared configuration, unified monitoring, and aggregated cost tracking. The Batch API is ideal for bulk research operations where you need to process many queries simultaneously.

When to Use Batching

Use batch processing when you need to:
  • Process multiple queries - Run 1-100 research tasks in parallel
  • Share configuration - Apply the same mode, output formats, and search settings to all tasks
  • Unified monitoring - Track progress and costs across all tasks in one place
  • Efficient bulk operations - Reduce API calls and simplify task management
For individual tasks with unique configurations or advanced features (files, deliverables, MCP servers), use the standard DeepResearch API instead.

Key Concepts

Batch Lifecycle

A batch progresses through the following statuses:
Status                  Description
open                    Batch is created but no tasks are running yet
processing              At least one task is queued, running, or completed
completed               All tasks have finished successfully
completed_with_errors   All tasks finished, but some failed
cancelled               Batch was cancelled before completion

Task States

Individual tasks within a batch can be in these states:
Status      Description
queued      Task is waiting to start
running     Task is currently executing
completed   Task finished successfully
failed      Task encountered an error
cancelled   Task was cancelled
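
The task states above can be tallied client-side to see how far a batch has progressed. The following is a minimal sketch, assuming the states arrive as plain strings; summarize_states is a hypothetical helper, not part of the SDK, and counting cancelled tasks as finished is an assumption.

```python
from collections import Counter

# States after which a task does no further work.
# Counting "cancelled" as finished is an assumption of this sketch.
FINISHED_STATES = {"completed", "failed", "cancelled"}


def summarize_states(states):
    """Count tasks per state and report how many have finished (hypothetical helper)."""
    counts = Counter(states)
    finished = sum(counts[state] for state in FINISHED_STATES)
    return {
        "counts": dict(counts),
        "finished": finished,
        "total": len(states),
        "all_finished": finished == len(states),
    }


summary = summarize_states(["completed", "running", "failed", "queued"])
print(summary["finished"], "of", summary["total"], "tasks finished")
```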

Shared Configuration

Tasks in a batch inherit these settings from the batch:
  • mode - Research mode (standard, heavy, fast)
  • output_formats - Output formats (markdown, pdf, toon, or JSON schema)
  • search_params - Search configuration (type, sources, dates, category)
Tasks can override:
  • strategy - Custom research instructions
  • urls - URLs to analyze
  • metadata - Custom metadata
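
The split between inherited and per-task settings can be made explicit when building task dictionaries. This is a sketch under assumptions: build_task is a hypothetical helper, the example URL and strategy text are placeholders, and the key names are taken from the two lists above.

```python
# Settings owned by the batch; per the lists above, tasks cannot set these.
BATCH_ONLY_KEYS = {"mode", "output_formats", "search_params"}


def build_task(query, strategy=None, urls=None, metadata=None, **extra):
    """Build one task dict, rejecting keys that belong to the batch (hypothetical helper)."""
    blocked = BATCH_ONLY_KEYS.intersection(extra)
    if blocked:
        raise ValueError(f"Set these on the batch, not the task: {sorted(blocked)}")
    task = {"query": query, "strategy": strategy, "urls": urls, "metadata": metadata}
    task.update(extra)
    # Drop unset fields so the task simply inherits the batch defaults.
    return {key: value for key, value in task.items() if value is not None}


tasks = [
    build_task("Analyze technology sector performance in Q4 2024"),
    build_task(
        "Research healthcare sector trends and key players",
        strategy="Focus on publicly listed companies",   # placeholder instructions
        urls=["https://example.com/healthcare-report"],  # placeholder URL
        metadata={"priority": "high"},
    ),
]
print(tasks)
```

The resulting list has the same shape as the tasks passed to valyu.batch.add_tasks in the Basic Workflow example.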

Basic Workflow

from valyu import Valyu

valyu = Valyu()

# 1. Create a batch with default settings
batch = valyu.batch.create(
    name="Market Research Q4 2024",
    mode="standard",
    output_formats=["markdown"],
    search={
        "search_type": "all",
        "included_sources": ["web", "academic"],
        "start_date": "2024-01-01",
        "end_date": "2024-12-31"
    },
    metadata={"project": "Q4-2024", "team": "research"}
)

if batch.success:
    batch_id = batch.batch_id
    print(f"Created batch: {batch_id}")
    
    # 2. Add tasks to the batch
    tasks = [
        {"query": "Analyze technology sector performance in Q4 2024"},
        {"query": "Research healthcare sector trends and key players"},
        {"query": "Review renewable energy market developments"}
    ]
    
    add_result = valyu.batch.add_tasks(batch_id, tasks)
    
    if add_result.success:
        print(f"Added {add_result.added} tasks")
        
        # 3. Monitor progress
        status = valyu.batch.status(batch_id)
        if status.success and status.batch:
            print(f"Progress: {status.batch.counts.completed}/{status.batch.counts.total}")
            print(f"Total cost: ${status.batch.cost}")

Waiting for Completion

Use wait_for_completion() to automatically poll until the batch finishes:
def on_progress(status):
    if status.success and status.batch:
        counts = status.batch.counts
        print(
            f"Progress: {counts.completed + counts.failed}/{counts.total} "
            f"(Running: {counts.running}, Queued: {counts.queued})"
        )

try:
    final_status = valyu.batch.wait_for_completion(
        batch_id,
        poll_interval=10,
        max_wait_time=3600,  # 1 hour
        on_progress=on_progress
    )
    
    if final_status.success and final_status.batch:
        print("Batch finished!")
        print(f"Status: {final_status.batch.status}")
        print(f"Total cost: ${final_status.batch.cost}")
except TimeoutError as e:
    print(f"Timeout: {e}")
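
For readers who prefer to control polling themselves, the loop below sketches roughly what a wait helper of this kind does. It is not the SDK's implementation: poll_until_terminal is a hypothetical function, and fetch_status stands in for any callable that returns the batch status string.

```python
import time

# Terminal batch statuses, per the Batch Lifecycle table.
TERMINAL_STATUSES = {"completed", "completed_with_errors", "cancelled"}


def poll_until_terminal(fetch_status, poll_interval=10, max_wait_time=3600,
                        sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until it returns a terminal status or time runs out.

    fetch_status is a zero-argument callable returning the status string,
    for example: lambda: valyu.batch.status(batch_id).batch.status
    """
    deadline = clock() + max_wait_time
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        # Stop before sleeping if the next poll would land past the deadline.
        if clock() + poll_interval > deadline:
            raise TimeoutError(f"Batch still '{status}' after {max_wait_time}s")
        sleep(poll_interval)


# Demonstration with canned statuses instead of real API calls.
fake_statuses = iter(["open", "processing", "completed"])
print(poll_until_terminal(lambda: next(fake_statuses), poll_interval=0))  # prints: completed
```

Passing sleep and clock as parameters keeps the loop testable without real waiting.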

Retrieving Task Results

List all tasks in a batch and retrieve individual results:
# List all tasks in the batch
tasks_response = valyu.batch.list_tasks(batch_id)

if tasks_response.success and tasks_response.tasks:
    for task in tasks_response.tasks:
        print(f"Task: {task.task_id or task.deepresearch_id}")
        print(f"Query: {task.query}")
        print(f"Status: {task.status}")
        
        # Get detailed results for completed tasks
        if task.status == "completed":
            result = valyu.deepresearch.status(task.deepresearch_id)
            # str() keeps the preview working if the output is structured rather than text
            print(f"Output: {str(result.output)[:200]}...")
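
When a batch ends as completed_with_errors, the same listing can be used to collect the failed queries for resubmission. A minimal sketch, assuming task objects expose the query and status attributes used above; queries_to_retry is a hypothetical helper and SimpleNamespace objects stand in for real task records.

```python
from types import SimpleNamespace


def queries_to_retry(tasks):
    """Return new task dicts for every task whose status is "failed" (hypothetical helper)."""
    return [{"query": task.query} for task in tasks if task.status == "failed"]


# Stand-in objects with the same attributes used in the listing above.
listed_tasks = [
    SimpleNamespace(query="Topic A", status="completed"),
    SimpleNamespace(query="Topic B", status="failed"),
]
retry_tasks = queries_to_retry(listed_tasks)
print(retry_tasks)  # prints: [{'query': 'Topic B'}]
```

Because tasks can only be added while a batch is open or processing, a finished batch would likely need the retry list submitted to a new batch.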

Parameters Reference

Mode Values

Mode       Description                                          Cost per Task
standard   Standard research mode (default)                     $0.50
heavy      Comprehensive research mode                          $1.50
fast       Fast research mode (lower cost, faster completion)   Lower
The lite mode has been replaced by fast.
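
The per-task prices above make a rough pre-flight estimate straightforward. This sketch assumes the cost scales linearly with task count; estimate_batch_cost is a hypothetical helper, and fast mode is left without a figure because the table gives none.

```python
from decimal import Decimal

# Per-task prices from the Mode Values table. "fast" has no listed figure,
# so it is deliberately left out rather than guessed.
COST_PER_TASK = {"standard": Decimal("0.50"), "heavy": Decimal("1.50")}


def estimate_batch_cost(mode, task_count):
    """Return the estimated cost in USD, or None when the mode has no listed price."""
    price = COST_PER_TASK.get(mode)
    return None if price is None else price * task_count


print(estimate_batch_cost("standard", 25))  # prints: 12.50
print(estimate_batch_cost("heavy", 10))     # prints: 15.00
print(estimate_batch_cost("fast", 10))      # prints: None
```

The cost field returned by the batch status call remains the authoritative number.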

Output Formats

  • markdown (default) - Markdown text output
  • pdf - PDF document output
  • toon - TOON format (requires JSON schema)
  • JSON Schema Object - Structured output matching the provided schema
A JSON schema cannot be combined with markdown or pdf in the same request; use one or the other. The toon format requires a JSON schema.
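
These combination rules can be checked client-side before a batch is created. The sketch below is an assumption-laden illustration: validate_output_formats is a hypothetical helper, and it assumes a JSON schema is passed as a dict alongside the format names in the same list.

```python
TEXT_FORMATS = {"markdown", "pdf"}


def validate_output_formats(formats):
    """Raise ValueError if the list breaks the combination rules (hypothetical helper)."""
    schemas = [item for item in formats if isinstance(item, dict)]
    names = {item for item in formats if isinstance(item, str)}
    unknown = names - TEXT_FORMATS - {"toon"}
    if unknown:
        raise ValueError(f"Unknown output formats: {sorted(unknown)}")
    if schemas and names & TEXT_FORMATS:
        raise ValueError("A JSON schema cannot be combined with markdown or pdf")
    if "toon" in names and not schemas:
        raise ValueError("toon requires a JSON schema")
    return formats


schema = {"type": "object", "properties": {"summary": {"type": "string"}}}
print(validate_output_formats(["markdown", "pdf"]))
print(validate_output_formats(["toon", schema]))
```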

Search Parameters

Search parameters control which data sources are queried, what content is included/excluded, and how results are filtered by date or category. When set at the batch level, these parameters are applied to all tasks in the batch and cannot be overridden by individual tasks.

Search Type

Controls which backend search systems are queried for all tasks in the batch:
  • "all" (default): Searches both web and proprietary data sources
  • "web": Searches only web sources (general web search, news, articles)
  • "proprietary": Searches only proprietary data sources (academic papers, finance data, patents, etc.)
When set at the batch level, this parameter cannot be overridden by individual tasks.
batch = valyu.batch.create(
    name="Academic Research Batch",
    search={"search_type": "proprietary"}
)

Included Sources

Restricts search to only the specified source types for all tasks in the batch. When specified, only these sources will be searched. Tasks inherit this setting and cannot override it. Available source types:
  • "web": General web search results (news, articles, websites)
  • "academic": Academic papers and research databases (ArXiv, PubMed, BioRxiv/MedRxiv, Clinical trials, FDA drug labels, WHO health data, NIH grants, Wikipedia)
  • "finance": Financial and economic data (Stock/crypto/FX prices, SEC filings, Company financial statements, Economic indicators, Prediction markets)
  • "patent": Patent and intellectual property data (USPTO patent database, Patent abstracts, claims, descriptions)
  • "transportation": Transit and transportation data (UK National Rail schedules, Maritime vessel tracking)
  • "politics": Government and parliamentary data (UK Parliament members, bills, votes)
  • "legal": Case law and legal data (UK court judgments, Legislation text)
batch = valyu.batch.create(
    name="Academic Research Batch",
    search={
        "search_type": "proprietary",
        "included_sources": ["academic", "web"]
    }
)

Excluded Sources

Excludes specific source types from search results for all tasks in the batch. Uses the same source type values as included_sources. Cannot be used simultaneously with included_sources (use one or the other).
batch = valyu.batch.create(
    name="Research Batch",
    search={
        "search_type": "proprietary",
        "excluded_sources": ["web", "patent"]
    }
)
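
Since included_sources and excluded_sources are mutually exclusive, a small guard can catch the mistake before the request is sent. A minimal sketch: check_source_filters is a hypothetical helper, and the set of source types is copied from the Included Sources list.

```python
SOURCE_TYPES = {"web", "academic", "finance", "patent", "transportation", "politics", "legal"}


def check_source_filters(search):
    """Validate the source filters in a search config dict (hypothetical helper)."""
    included = search.get("included_sources")
    excluded = search.get("excluded_sources")
    if included and excluded:
        raise ValueError("Use included_sources or excluded_sources, not both")
    unknown = (set(included or []) | set(excluded or [])) - SOURCE_TYPES
    if unknown:
        raise ValueError(f"Unknown source types: {sorted(unknown)}")
    return search


search = check_source_filters({"search_type": "all", "excluded_sources": ["patent"]})
print(search)
```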

Start Date

Format: ISO date format (YYYY-MM-DD)
Filters search results to only include content published or dated on or after this date for all tasks in the batch. Applied to both publication dates and event dates when available. Works across all source types.
batch = valyu.batch.create(
    name="2024 Research",
    search={"start_date": "2024-01-01"}
)

End Date

Format: ISO date format (YYYY-MM-DD)
Filters search results to only include content published or dated on or before this date for all tasks in the batch. Applied to both publication dates and event dates when available. Works across all source types.
batch = valyu.batch.create(
    name="Q4 2024 Analysis",
    search={
        "start_date": "2024-10-01",
        "end_date": "2024-12-31"
    }
)

Category

Filters results by a specific category for all tasks in the batch. The exact categories available depend on the data source. Category values are source-dependent and may not be applicable to all source types.
batch = valyu.batch.create(
    name="Technology Research",
    search={"category": "technology"}
)

Important Notes

Parameter Enforcement

Batch-level parameters are enforced and cannot be overridden by individual tasks. This ensures consistent search behavior across all tasks in the batch. Tool-level source specifications are ignored if batch-level sources are specified.

Date Filtering

Dates are applied to both publication dates and event dates when available. ISO format (YYYY-MM-DD) is required. Date filtering works across all source types. If only start_date / startDate is provided, results include all content from that date forward. If only end_date / endDate is provided, results include all content up to that date. Both dates can be combined for a specific date range.
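
The window semantics described above (inclusive bounds, either side optional) can be illustrated with a few lines of Python. This is only a model of the described behavior, not the service's filtering code; in_date_window is a hypothetical helper.

```python
from datetime import date


def in_date_window(content_date, start_date=None, end_date=None):
    """Return True if an ISO date string falls inside the inclusive window."""
    day = date.fromisoformat(content_date)  # raises ValueError on a malformed date
    if start_date is not None and day < date.fromisoformat(start_date):
        return False
    if end_date is not None and day > date.fromisoformat(end_date):
        return False
    return True


print(in_date_window("2024-10-01", start_date="2024-10-01", end_date="2024-12-31"))  # True
print(in_date_window("2025-01-01", end_date="2024-12-31"))                           # False
print(in_date_window("2023-05-05", end_date="2024-12-31"))                           # True
```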

Limitations

Not Yet Supported in Batch API

The following features are not yet supported in the batch API:
  • deliverables - Cannot specify deliverables (CSV, XLSX, PPTX, DOCX) for batch tasks
  • brand_collection_id - Cannot apply branding to batch tasks
  • files - Cannot attach files to batch tasks
  • mcp_servers - Cannot configure MCP servers for batch tasks
  • code_execution - Always enabled (cannot disable per batch)
  • previous_reports - Cannot reference previous reports in batch tasks
  • alert_email - Cannot set email alerts for batch tasks
Workaround: Use individual task creation (POST /v1/deepresearch/tasks) if you need these features.

Task Constraints

  • Maximum tasks per request: 100
  • Minimum tasks per request: 1
  • Batch status: Batch must be in "open" or "processing" status to add tasks
  • Inherited settings: Tasks cannot override mode, output_formats, or search_params from the batch
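
To stay within the per-request limit, a long task list can be split into chunks before each add_tasks call. A minimal sketch: chunk_tasks is a hypothetical helper, and whether a single batch accepts more than 100 tasks in total across several requests is not confirmed by this page.

```python
MAX_TASKS_PER_REQUEST = 100  # per-request limit stated above


def chunk_tasks(tasks, size=MAX_TASKS_PER_REQUEST):
    """Yield consecutive slices of tasks, each holding at most `size` items."""
    if not 1 <= size <= MAX_TASKS_PER_REQUEST:
        raise ValueError("size must be between 1 and 100")
    for start in range(0, len(tasks), size):
        yield tasks[start:start + size]


all_tasks = [{"query": f"Research topic {n}"} for n in range(250)]
print([len(chunk) for chunk in chunk_tasks(all_tasks)])  # prints: [100, 100, 50]
```

Each chunk could then be passed to valyu.batch.add_tasks(batch_id, chunk), checking add_result.success after every call.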

Best Practices

When to Use Batches vs Individual Tasks

Use Batches When                             Use Individual Tasks When
Processing 10+ queries with shared config    Each task needs unique configuration
Need unified cost tracking                   Need advanced features (files, deliverables, MCP)
Bulk research operations                     Single or few research queries
Shared search parameters                     Different search settings per task

Batch Size Recommendations

  • Small batches (1-10 tasks): Good for testing and quick research
  • Medium batches (11-50 tasks): Ideal for most production use cases
  • Large batches (51-100 tasks): Use for bulk operations, monitor closely

Cost Tracking

Monitor batch costs through the cost field:
status = valyu.batch.status(batch_id)
if status.success and status.batch:
    cost = status.batch.cost
    print(f"Total cost: ${cost}")

Error Handling

Always check success fields and handle errors appropriately:
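# create_batch is an illustrative wrapper, not part of the SDK
def create_batch(valyu, name):
    """Create a batch and return its ID, or None if creation failed."""
    response = valyu.batch.create(name=name)

    if not response.success:
        print(f"Error: {response.error}")
        return None

    # Proceed with the successful response
    return response.batch_id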

Webhooks

Set up webhooks for production use to avoid polling:
batch = valyu.batch.create(
    name="Research Batch",
    mode="standard",
    webhook_url="https://your-domain.com/webhook"
)

# IMPORTANT: Save the webhook_secret immediately - it's only returned once
webhook_secret = batch.webhook_secret
The webhook will receive a POST request when the batch reaches a terminal state (completed, completed_with_errors, or cancelled).
