# Contents Job Status

> Reference for polling async content extraction job status via `GET /v1/contents/jobs/{job_id}`.

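A single status check is an authenticated `GET` against the job path. The following is a minimal sketch using only the Python standard library, not the official SDK; the helper names `build_job_request` and `fetch_job_status` and the 30-second timeout are illustrative assumptions, while the base URL, path, and `X-API-Key` header come from the spec below.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.valyu.ai"  # production server listed in the spec


def build_job_request(job_id: str, api_key: str) -> urllib.request.Request:
    """Build the GET request for one job-status check (no network I/O)."""
    # Percent-encode the job ID so it is always a single path segment.
    path = "/v1/contents/jobs/" + urllib.parse.quote(job_id, safe="")
    return urllib.request.Request(
        BASE_URL + path,
        headers={"X-API-Key": api_key},
        method="GET",
    )


def fetch_job_status(job_id: str, api_key: str, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON body (a ContentsJobResponse).

    Non-2xx responses (401, 403, 404, 500) raise urllib.error.HTTPError.
    """
    request = build_job_request(job_id, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping request construction separate from the network call makes the URL and header logic easy to check without a live API key.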


## OpenAPI

````yaml GET /v1/contents/jobs/{job_id}
openapi: 3.0.3
info:
  title: Valyu API
  version: 2.2.0
  description: >
    The search API built for AI agents. One API that gives your AI unified
    access to web search, academic papers, financial data, SEC filings, clinical
    trials, patents, and 36+ proprietary data sources.


    ## Products


    | Product | Endpoint | Description |

    | --- | --- | --- |

    | **Search** | `POST /v1/search` | Multi-source search across web and
    proprietary datasets |

    | **Contents** | `POST /v1/contents` | Extract clean, structured content
    from URLs |

    | **Answer** | `POST /v1/answer` | AI-powered answers with real-time search
    |

    | **DeepResearch** | `POST /v1/deepresearch/tasks` | Async research agents
    for comprehensive analysis |

    | **Datasources** | `GET /v1/datasources` | Discover available data sources
    and their schemas |


    ## Authentication


    All endpoints require an API key passed via the `X-API-Key` header. Get your
    key at [platform.valyu.ai](https://platform.valyu.ai).


    ```

    X-API-Key: your_api_key_here

    ```


    ## Pricing


    Valyu uses transparent, pay-per-use pricing:

    - **Search**: CPM-based (cost per mille tokens retrieved)

    - **Contents**: $0.001 per URL extracted, +$0.001 with AI features

    - **Answer**: $0.10 per request + variable search and AI costs

    - **DeepResearch**: Fixed per-task pricing by mode ($0.10 - $15.00)


    See [docs.valyu.ai](https://docs.valyu.ai) for full pricing details.


    ## SDKs


    - [Python SDK](https://pypi.org/project/valyu/) - `pip install valyu`

    - [TypeScript SDK](https://www.npmjs.com/package/valyu) - `npm install
    valyu`
  contact:
    name: Valyu Support
    url: https://valyu.ai
    email: support@valyu.ai
  x-logo:
    url: https://valyu.ai/logo.png
    altText: Valyu
servers:
  - url: https://api.valyu.ai
    description: Production
security:
  - ApiKeyAuth: []
tags:
  - name: Search
    description: >-
      Search across web, academic, financial, and proprietary data sources.
      Returns results ready for RAG pipelines, AI agents, and applications.
    externalDocs:
      description: Search API Guide
      url: https://docs.valyu.ai/search/quickstart
  - name: Contents
    description: >-
      Extract clean, structured content from web pages at scale. Supports batch
      processing, AI-powered summaries, structured extraction via JSON schemas,
      and async jobs for large URL sets.
    externalDocs:
      description: Contents API Guide
      url: https://docs.valyu.ai/guides/content-extraction
  - name: Answer
    description: >-
      Get AI-powered answers grounded in real-time search results. The Answer
      API searches across web, academic, and financial sources, then uses AI to
      generate a readable response via Server-Sent Events streaming.
    externalDocs:
      description: Answer API Guide
      url: https://docs.valyu.ai/guides/answer-api
  - name: DeepResearch
    description: >-
      Async research agents that perform comprehensive, multi-step research.
      DeepResearch searches multiple sources, analyzes content, and generates
      detailed reports with citations. Tasks run in the background and can take
      minutes to complete.
    externalDocs:
      description: DeepResearch Guide
      url: https://docs.valyu.ai/guides/deepresearch
  - name: Batches
    description: >-
      Run multiple DeepResearch tasks in parallel with shared configuration,
      unified monitoring, and aggregated cost tracking.
    externalDocs:
      description: Batch API Guide
      url: https://docs.valyu.ai/guides/deepresearch-batch-quickstart
  - name: Datasources
    description: >-
      Discover available data sources and their schemas. A tool manifest for AI
      agents - instead of hardcoding knowledge of available datasets, agents can
      query this API to discover sources, filter by category, and use
      `example_queries` for few-shot prompting.
    externalDocs:
      description: Datasources Guide
      url: https://docs.valyu.ai/guides/datasources
externalDocs:
  description: Complete API Documentation
  url: https://docs.valyu.ai
paths:
  /v1/contents/jobs/{job_id}:
    get:
      tags:
        - Contents
      summary: Poll async content extraction job status
      description: >-
        Check the status of an async content extraction job and, once
        processing has finished, retrieve its results.
      operationId: getContentsJob
      parameters:
        - name: job_id
          in: path
          required: true
          schema:
            type: string
          description: The job ID returned by the async contents request.
          example: cj_a1b2c3d4e5f6g7h8
      responses:
        '200':
          description: Job status and results (when complete).
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ContentsJobResponse'
        '401':
          $ref: '#/components/responses/Unauthorized'
        '403':
          $ref: '#/components/responses/Forbidden'
        '404':
          $ref: '#/components/responses/NotFound'
        '500':
          $ref: '#/components/responses/InternalServerError'
components:
  schemas:
    ContentsJobResponse:
      type: object
      description: Async job status and results.
      required:
        - success
        - job_id
        - status
        - urls_total
        - urls_processed
        - urls_failed
      properties:
        success:
          type: boolean
          description: Whether the request succeeded.
          example: true
        job_id:
          type: string
          description: Unique identifier for the async job.
          example: cj_a1b2c3d4e5f6g7h8
        status:
          type: string
          description: |-
            Current job status.

            - `pending` - Job created, not yet started
            - `processing` - URLs are being processed
            - `completed` - All URLs processed successfully
            - `partial` - Processing finished with some failures
            - `failed` - All URLs failed
          enum:
            - pending
            - processing
            - completed
            - partial
            - failed
          example: partial
        urls_total:
          type: integer
          description: Total number of URLs to process.
          example: 25
        urls_processed:
          type: integer
          description: Number of URLs successfully processed so far.
          example: 23
        urls_failed:
          type: integer
          description: Number of URLs that failed processing.
          example: 2
        created_at:
          type: integer
          description: Job creation time as Unix timestamp in milliseconds.
          example: 1707000000000
        updated_at:
          type: integer
          description: Last update time as Unix timestamp in milliseconds.
          example: 1707000060000
        current_batch:
          type: integer
          description: >-
            Current batch being processed. Only present when `status` is
            `"processing"`.
          example: 3
        total_batches:
          type: integer
          description: >-
            Total number of batches. Only present when `status` is
            `"processing"`.
          example: 5
        results:
          type: array
          description: >-
            Extraction results. Only present when `status` is `"completed"` or
            `"partial"`.
          items:
            $ref: '#/components/schemas/ContentResult'
        actual_cost_dollars:
          type: number
          description: Total cost in USD. Only present once processing has finished.
          example: 0.023
        error:
          type: string
          description: >-
            Error message. Only present when `status` is `"failed"` or
            `"partial"`.
    ContentResult:
      type: object
      description: Extracted content from a single URL.
      required:
        - url
        - status
      properties:
        url:
          type: string
          format: uri
          description: The URL that was processed.
          example: https://en.wikipedia.org/wiki/Quantum_computing
        status:
          type: string
          description: Processing status for this URL.
          enum:
            - success
            - failed
          example: success
        title:
          type: string
          description: Page title. Present when `status` is `"success"`.
          example: Quantum computing - Wikipedia
        content:
          description: >-
            Extracted content. Markdown string when `data_type` is
            `"unstructured"`, JSON object when `data_type` is `"structured"`.
            Present when `status` is `"success"`.
          oneOf:
            - type: string
              description: Markdown-formatted content.
            - type: object
              description: Structured data matching the provided JSON schema.
          example: |-
            # Quantum computing

            Quantum computing is a type of computation...
        description:
          type: string
          description: Meta description of the page.
          example: >-
            Quantum computing is a type of computation that harnesses quantum
            mechanical phenomena.
        source:
          type: string
          description: Always `"web"` for content extraction.
          example: web
        price:
          type: number
          description: Cost in USD for extracting this URL.
          example: 0.001
        length:
          type: integer
          description: Character count of the `content` field.
          example: 24500
        data_type:
          type: string
          description: >-
            `"unstructured"` for text content, `"structured"` when JSON schema
            extraction succeeded.
          enum:
            - unstructured
            - structured
          example: unstructured
        source_type:
          type: string
          description: Classification of the content type.
          enum:
            - website
            - pdf
            - txt
            - json
            - xml
            - sitemap
            - csv
          example: website
        publication_date:
          type: string
          description: Publication date in `YYYY-MM-DD` format, or empty string if unknown.
          example: '2024-03-15'
        id:
          type: string
          description: Identifier for this result (same as `url`).
          example: https://en.wikipedia.org/wiki/Quantum_computing
        image_url:
          type: object
          description: Image URLs found on the page, keyed by index.
          additionalProperties:
            type: string
            format: uri
          example:
            '0': https://upload.wikimedia.org/wikipedia/commons/quantum.png
        screenshot_url:
          type: string
          format: uri
          nullable: true
          description: Screenshot URL if `screenshot` was enabled. `null` otherwise.
          example: null
        summary:
          description: >-
            AI-generated summary or structured extraction result. Only present
            when `summary` was enabled in the request.
          oneOf:
            - type: string
            - type: object
              additionalProperties: true
        summary_success:
          type: boolean
          description: >-
            Whether AI processing succeeded. Only present when `summary` was
            enabled.
        structured_metadata:
          type: object
          description: >-
            Metadata from structured data extraction. Only present when
            `data_type` is `"structured"`.
          additionalProperties: true
        error:
          type: string
          description: Error message. Only present when `status` is `"failed"`.
          example: 'Failed to fetch URL: connection timeout'
    Error:
      type: object
      description: Standard error response.
      required:
        - success
        - error
      properties:
        success:
          type: boolean
          description: Always `false` for error responses.
          example: false
        error:
          type: string
          description: Human-readable error message describing what went wrong.
          example: Invalid request parameters
  responses:
    Unauthorized:
      description: Missing or invalid API key.
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/Error'
          examples:
            example:
              summary: Invalid API key
              value:
                success: false
                error: Invalid API key
    Forbidden:
      description: The API key does not have permission for this operation.
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/Error'
          examples:
            example:
              summary: Permission denied
              value:
                success: false
                error: API key does not have inference permission
    NotFound:
      description: The requested resource was not found.
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/Error'
          examples:
            example:
              summary: Resource not found
              value:
                success: false
                error: Task not found
    InternalServerError:
      description: An unexpected error occurred on the server.
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/Error'
          examples:
            example:
              summary: Internal error
              value:
                success: false
                error: Internal server error. Please try again later.
  securitySchemes:
    ApiKeyAuth:
      type: apiKey
      in: header
      name: X-API-Key
      description: >-
        API key for authentication. Get yours at
        [platform.valyu.ai](https://platform.valyu.ai).

````
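The status values in the schema above suggest a simple polling loop: keep requesting the job until `status` leaves `pending` and `processing`, then read `results`. The sketch below is one way to do that, under stated assumptions: `wait_for_job` and `split_results` are hypothetical helper names, the 2-second interval and 300-second timeout are arbitrary defaults (the spec does not recommend values), and treating `completed`, `partial`, and `failed` as terminal is inferred from the status descriptions.

```python
import time
from typing import Callable

# Statuses after which the spec describes no further processing (an inference).
TERMINAL_STATUSES = {"completed", "partial", "failed"}


def wait_for_job(
    fetch: Callable[[], dict],
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() until the job reaches a terminal status or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch()
        if job["status"] in TERMINAL_STATUSES:
            return job
        # Give up if the next poll would land after the deadline.
        if time.monotonic() + interval_s > deadline:
            raise TimeoutError(f"job {job['job_id']} still {job['status']}")
        sleep(interval_s)


def split_results(job: dict) -> tuple[list[dict], list[dict]]:
    """Separate per-URL results into successes and failures.

    `results` is only present for `completed` and `partial` jobs, so a
    `failed` job yields two empty lists.
    """
    results = job.get("results", [])
    succeeded = [r for r in results if r["status"] == "success"]
    failed = [r for r in results if r["status"] == "failed"]
    return succeeded, failed
```

Here `fetch` is any zero-argument callable that performs the `GET` request and returns the decoded JSON body. Passing it in, along with `sleep`, keeps the loop independent of a particular HTTP client and lets it be exercised with canned responses.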