VirJenDB

Documentation

Last update: 2026-May-10 (VirJenDB v1.0)

VirJenDB API

VirJenDB provides a public API that allows you to search virus records, retrieve genome sequences, inspect metadata fields, and download datasets programmatically.

The API is intended for:

The API uses JSON request bodies and standard HTTP methods.


Getting Started

The VirJenDB API base URL is:

https://api2.virjendb.org

Interactive Swagger documentation is available at:

https://api2.virjendb.org/swagger

At the time of writing, the public API does not require authentication.


Available Endpoints

Endpoint               Method  Description
/v2/search             POST    Search VirJenDB records
/v2/download           POST    Download metadata, accessions, or sequences
/v2/sequence           POST    Retrieve sequences using VirJenDB accession IDs
/v2/datasets_download  GET     Download pre-generated datasets
/v2/metadata           POST    Query metadata field definitions
/v2/metadata_download  POST    Export metadata field definitions

Searching Records

One of the main features of the VirJenDB API is the /v2/search endpoint. It uses the same underlying search backend as the VirJenDB web interface.

Endpoint

POST /v2/search

Basic Search

The simplest possible search request looks like this:

{
  "items": [
    {
      "metadataField": "",
      "searchTerm": "Influenza A virus"
    }
  ],
  "filter": [],
  "pageSize": 20,
  "pageNumber": 1
}

This behaves similarly to entering "Influenza A virus" into the search field of the VirJenDB web interface.
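The request body above can be assembled programmatically. The following is a minimal Python sketch, assuming only the field names shown in the example; the helper name `basic_search_payload` is hypothetical and not part of the API.

```python
import json


def basic_search_payload(term, page_size=20, page_number=1):
    """Build a free-text search body; the empty metadataField mirrors the example above."""
    return {
        "items": [{"metadataField": "", "searchTerm": term}],
        "filter": [],
        "pageSize": page_size,
        "pageNumber": page_number,
    }


payload = basic_search_payload("Influenza A virus")
print(json.dumps(payload, indent=2))
```

The printed JSON can be used as the body of a POST request to /v2/search.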

Search Response

Example responses may vary depending on the selected metadata fields and database version.

{
  "total": 1244169,
  "results": [
    {
      "source": {
        "Organism Name": "Influenza A virus",
        "NCBI Species": "Alphainfluenzavirus influenzae",
        "Host Species": "Homo sapiens",
        "Country": "China",
        "Collection Date": "2017-06-30"
      }
    }
  ]
}
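A response of this shape can be reduced to the fields of interest. The sketch below assumes the layout shown above (a `total` count and a `results` list whose entries carry a `source` object); the function name `summarize_results` is hypothetical, and real responses may contain additional keys.

```python
def summarize_results(response, fields):
    """Return (total, rows), where each row holds the requested fields of one hit."""
    rows = []
    for hit in response.get("results", []):
        source = hit.get("source", {})
        # Fields missing from a record become None instead of raising KeyError.
        rows.append({field: source.get(field) for field in fields})
    return response.get("total", 0), rows


sample = {
    "total": 1244169,
    "results": [
        {
            "source": {
                "Organism Name": "Influenza A virus",
                "Host Species": "Homo sapiens",
                "Country": "China",
            }
        }
    ],
}
total, rows = summarize_results(sample, ["Organism Name", "Country", "Molecule Type"])
```

Because the returned fields may vary, looking values up with `dict.get` keeps the code tolerant of records that lack a field.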

Search Clauses

Search requests are built using a list of search clauses inside the items array.

Each item contains:

Field          Description
operator       Logical operator connecting clauses (and, or, not)
metadataField  Metadata field to search in
searchTerm     Search value

Example:

{
  "items": [
    {
      "operator": "and",
      "metadataField": "Host Species",
      "searchTerm": "Homo sapiens"
    }
  ]
}

This searches the Host Species field for the value Homo sapiens.


Search Multiple Conditions

Multiple clauses can be combined using logical operators.

Example:

{
  "items": [
    {
      "operator": "and",
      "metadataField": "Host Species",
      "searchTerm": "Homo sapiens"
    },
    {
      "operator": "and",
      "metadataField": "Molecule Type",
      "searchTerm": "ssRNA(+)"
    }
  ],
  "pageSize": 50,
  "pageNumber": 1
}

This returns entries that fulfill both conditions: Host Species is Homo sapiens and Molecule Type is ssRNA(+).


Search Operators

The API supports three logical operators:

Operator  Description
and       All connected conditions must match
or        Any connected condition may match
not       Excludes matching entries

Example:

{
  "items": [
    {
      "operator": "and",
      "metadataField": "Host Species",
      "searchTerm": "Homo sapiens"
    },
    {
      "operator": "not",
      "metadataField": "Submitter Country",
      "searchTerm": "Germany"
    }
  ]
}

This returns entries where Host Species is Homo sapiens and Submitter Country is not Germany.
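Clauses like these can be built with a small helper that rejects operators outside the three listed in this section. This is an illustrative Python sketch; the names `clause` and `ALLOWED_OPERATORS` are hypothetical, and the check is done client-side only.

```python
ALLOWED_OPERATORS = {"and", "or", "not"}


def clause(field, term, operator="and"):
    """Build one search clause, rejecting operators this section does not list."""
    if operator not in ALLOWED_OPERATORS:
        raise ValueError(f"unsupported operator: {operator!r}")
    return {"operator": operator, "metadataField": field, "searchTerm": term}


query = {
    "items": [
        clause("Host Species", "Homo sapiens"),
        clause("Submitter Country", "Germany", operator="not"),
    ]
}
```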


Pagination

Search results are paginated.

Field       Description
pageSize    Number of results returned per request
pageNumber  Result page number

Example:

{
  "pageSize": 100,
  "pageNumber": 2
}

This returns the second page of results containing up to 100 entries. For best performance, avoid excessively large page sizes.
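To walk through a full result set, the number of pages can be derived from the `total` value in a search response. The sketch below assumes pages are numbered from 1, as in the examples; `page_count` and `paged_payloads` are hypothetical helper names.

```python
import math


def page_count(total, page_size):
    """Number of pages needed to cover `total` results."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return math.ceil(total / page_size)


def paged_payloads(base_payload, total, page_size):
    """Yield one request body per page, numbering pages from 1."""
    for page_number in range(1, page_count(total, page_size) + 1):
        yield {**base_payload, "pageSize": page_size, "pageNumber": page_number}


base = {"items": [{"metadataField": "Host Species", "searchTerm": "Homo sapiens"}]}
payloads = list(paged_payloads(base, total=250, page_size=100))
```

Here 250 results at 100 per page produce three request bodies, the last of which would return the remaining 50 entries.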


Sorting Results

Search results can be sorted using sortColumn and sortOrder.

Example:

{
  "sortColumn": "Collection Date",
  "sortOrder": "desc"
}

Supported sort directions are asc (ascending) and desc (descending).


Example Search Request

curl -X POST "https://api2.virjendb.org/v2/search" \
  -H "Content-Type: application/json" \
  -d '{
    "items": [
      {
        "metadataField": "Host Species",
        "searchTerm": "Homo sapiens"
      }
    ],
    "filter": [],
    "pageSize": 20,
    "pageNumber": 1,
    "sortOrder": "asc"
  }'
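The same request can be prepared from Python using only the standard library. This is a sketch rather than an official client: the function name `prepare_search_request` is hypothetical, and the request is built but not sent, so the snippet runs without network access.

```python
import json
import urllib.request


def prepare_search_request(payload, base_url="https://api2.virjendb.org"):
    """Build (but do not send) the POST request that the curl command issues."""
    return urllib.request.Request(
        f"{base_url}/v2/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = {
    "items": [{"metadataField": "Host Species", "searchTerm": "Homo sapiens"}],
    "filter": [],
    "pageSize": 20,
    "pageNumber": 1,
    "sortOrder": "asc",
}
request = prepare_search_request(payload)

# Sending requires network access, so it is left commented out:
# with urllib.request.urlopen(request, timeout=30) as response:
#     result = json.load(response)
```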

Downloading Records

The /v2/download endpoint allows you to export metadata, accession IDs, or genome sequences.

Endpoint

POST /v2/download

Information Types

The information parameter controls what type of data is downloaded.

Type                 Description
metadata             Download metadata fields
virjendb_accessions  Download only VirJenDB accession IDs
sequences            Download sequence data

Output Formats

The fileType parameter controls the output format.

File Type  Description
csv        Comma-separated values
tsv        Tab-separated values
json       JavaScript Object Notation
xml        Extensible Markup Language
fasta      FASTA sequence format

FASTA output is only available when downloading sequences.


Download Scope

The scope parameter controls which records are included in the download.

Scope      Description
selection  Download explicitly selected accession IDs
max        Download records matching the provided search request
all        Present in the API schema

Download Matching Search Results

The following request downloads metadata records matching a search query.

{
  "download": {
    "scope": "max",
    "information": "metadata",
    "fileType": "csv",
    "metadata": [
      "Organism Name",
      "Host Species",
      "Collection Date"
    ]
  },
  "search": {
    "items": [
      {
        "metadataField": "Molecule Type",
        "searchTerm": "ssRNA(+)"
      }
    ],
    "pageSize": 1000,
    "pageNumber": 1,
    "sortOrder": "asc"
  }
}

This downloads metadata for all records matching the specified search query.
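A download body of this kind can be assembled with a helper that checks the documented parameter values before anything is sent. The following Python sketch is illustrative only; `download_payload` is a hypothetical name, and the validation reflects the tables in this section rather than any server-side behavior.

```python
INFORMATION_TYPES = {"metadata", "virjendb_accessions", "sequences"}
FILE_TYPES = {"csv", "tsv", "json", "xml", "fasta"}


def download_payload(search, information, file_type, metadata=None, scope="max"):
    """Build a /v2/download body for records matching `search`."""
    if information not in INFORMATION_TYPES:
        raise ValueError(f"unknown information type: {information!r}")
    if file_type not in FILE_TYPES:
        raise ValueError(f"unknown file type: {file_type!r}")
    # FASTA output is documented as available only for sequence downloads.
    if file_type == "fasta" and information != "sequences":
        raise ValueError("fasta is only valid with information='sequences'")
    download = {"scope": scope, "information": information, "fileType": file_type}
    if metadata:
        download["metadata"] = list(metadata)
    return {"download": download, "search": search}


search = {
    "items": [{"metadataField": "Molecule Type", "searchTerm": "ssRNA(+)"}],
    "pageSize": 1000,
    "pageNumber": 1,
    "sortOrder": "asc",
}
body = download_payload(search, "metadata", "csv", metadata=["Organism Name", "Host Species"])
```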


Download Selected Records

Instead of downloading all matching search results, you can explicitly provide accession IDs.

Example:

{
  "download": {
    "scope": "selection",
    "information": "metadata",
    "fileType": "csv",
    "selected_ids": [
      "vj000000000010",
      "vj000005747069"
    ]
  }
}

VirJenDB accession IDs follow the format:

vj000000000010

Selection downloads currently support up to 10,000 accession IDs per request.
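Longer ID lists therefore need to be split across several requests. A minimal Python sketch of that batching follows; the helper name `selection_payloads` is hypothetical, and the IDs generated at the end are synthetic values used only to exercise the code.

```python
MAX_IDS_PER_REQUEST = 10_000  # limit stated above


def selection_payloads(accession_ids, information="metadata", file_type="csv"):
    """Yield one selection download body per batch of at most 10,000 IDs."""
    for start in range(0, len(accession_ids), MAX_IDS_PER_REQUEST):
        batch = accession_ids[start:start + MAX_IDS_PER_REQUEST]
        yield {
            "download": {
                "scope": "selection",
                "information": information,
                "fileType": file_type,
                "selected_ids": batch,
            }
        }


# Synthetic IDs for illustration; they do not refer to real records.
ids = [f"vj{n:012d}" for n in range(1, 25_001)]
batches = list(selection_payloads(ids))
```

With 25,000 IDs this produces three request bodies; the downloaded files would then need to be concatenated by the caller.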


Downloading Sequences

Sequences can be downloaded using:

{
  "download": {
    "scope": "selection",
    "information": "sequences",
    "fileType": "fasta",
    "metadata": [
      "Host Species",
      "Molecule Type"
    ],
    "selected_ids": [
      "vj000000000010"
    ]
  }
}

When exporting sequences in FASTA format, the header line contains the accession ID followed by the requested metadata fields.

Example:

>vj000000000010|Host Species=Homo sapiens|Molecule Type=ssRNA(+)
AUGCUAGCUAGCUAGCUAGC
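A header in this form can be split back into its parts. The sketch below assumes the layout seen in the example, with `|` separating entries and `=` separating field name from value; that layout is inferred from the single example above, and `parse_fasta_header` is a hypothetical name.

```python
def parse_fasta_header(line):
    """Split '>accession|Field=Value|...' into (accession, {field: value})."""
    if not line.startswith(">"):
        raise ValueError("not a FASTA header line")
    accession, *pairs = line[1:].strip().split("|")
    metadata = {}
    for pair in pairs:
        # Split on the first '=' only, so values may themselves contain '='.
        field, _, value = pair.partition("=")
        metadata[field] = value
    return accession, metadata


accession, metadata = parse_fasta_header(
    ">vj000000000010|Host Species=Homo sapiens|Molecule Type=ssRNA(+)"
)
```

Metadata values that themselves contain `|` would break this simple split, so treat it as a starting point only.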

Retrieving Sequences

The /v2/sequence endpoint retrieves sequences directly from VirJenDB accession IDs.

Endpoint

POST /v2/sequence

Example Request

{
  "virjendb_accessions": [
    "vj000000000010",
    "vj000005747069"
  ]
}
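Before sending such a request, the accession list can be given a basic sanity check. In this Python sketch the pattern "vj followed by digits" is an assumption inferred from the example IDs only, and `sequence_payload` is a hypothetical helper name.

```python
import re

# Assumption: IDs are "vj" followed by digits, inferred from the examples only.
ACCESSION_PATTERN = re.compile(r"^vj\d+$")


def sequence_payload(accessions):
    """Build a /v2/sequence body after a basic format check on each ID."""
    invalid = [acc for acc in accessions if not ACCESSION_PATTERN.match(acc)]
    if invalid:
        raise ValueError(f"unexpected accession format: {invalid}")
    return {"virjendb_accessions": list(accessions)}


body = sequence_payload(["vj000000000010", "vj000005747069"])
```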

Notes


Downloading Precomputed Datasets

VirJenDB also provides several pre-generated datasets for bulk download.

Endpoint

GET /v2/datasets_download

Download a Dataset

Datasets are selected using the filename query parameter.

Example:

GET /v2/datasets_download?filename=virjendbv1_full_dataset_sequence.csv.gz
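The download URL can be composed with the standard library so that the filename is encoded correctly as a query parameter. This is a small illustrative sketch; `dataset_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

BASE_URL = "https://api2.virjendb.org"


def dataset_url(filename):
    """Return the GET URL for one pre-generated dataset file."""
    return f"{BASE_URL}/v2/datasets_download?{urlencode({'filename': filename})}"


url = dataset_url("virjendbv1_full_dataset_sequence.csv.gz")
print(url)
```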

Available Dataset Types

Metadata Datasets

Sequence Datasets

Cluster Comparison Metrics

Mapping Files

Checksum Files

SHA256 checksum files are available for downloadable datasets.

These files can be used to verify file integrity after download.
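Verification can be done locally with Python's `hashlib`. The sketch below assumes the expected digest has already been read from the checksum file as a hexadecimal string; the layout of the checksum files is not described here, and the function names are hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large datasets need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare the file's digest with the published value, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A mismatch usually indicates an incomplete or corrupted download, in which case the file should be fetched again.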


Query Metadata Fields

Endpoint

POST /v2/metadata

Example Request

{
  "field_name": [],
  "privacy": [],
  "tags": [],
  "submission_requiredness": [],
  "submission_fieldtype": [],
  "submission_validation": [],
  "match_mode": "and",
  "summary": false
}

Metadata Filters

Field                    Description
field_name               Filter by metadata field name
privacy                  Filter by privacy classification
tags                     Filter using metadata tags
submission_requiredness  Filter by required or optional fields
submission_fieldtype     Filter by field type
submission_validation    Filter by validation rules
match_mode               Combine filters using and or or
summary                  Return summarized output

Metadata definitions may evolve between database releases.
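A request body for this endpoint can be generated so that unused filters default to empty lists, as in the example request. This Python sketch is illustrative: `metadata_query` is a hypothetical helper, and passing a display name such as "Host Species" as a `field_name` value is an assumption.

```python
def metadata_query(match_mode="and", summary=False, **filters):
    """Build a /v2/metadata body; unspecified list filters stay empty."""
    list_fields = (
        "field_name", "privacy", "tags",
        "submission_requiredness", "submission_fieldtype", "submission_validation",
    )
    unknown = set(filters) - set(list_fields)
    if unknown:
        raise ValueError(f"unknown filter(s): {sorted(unknown)}")
    if match_mode not in ("and", "or"):
        raise ValueError("match_mode must be 'and' or 'or'")
    body = {name: list(filters.get(name, [])) for name in list_fields}
    body.update({"match_mode": match_mode, "summary": summary})
    return body


# Assumption: field_name accepts display names such as "Host Species".
query = metadata_query(field_name=["Host Species"], match_mode="or")
```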


Export Metadata Definitions

Metadata definitions can also be downloaded directly.

Endpoint

POST /v2/metadata_download

Example Request

{
  "field_name": [],
  "file_type": "csv"
}

Notes

Common export formats include:

This endpoint can be used to export metadata field documentation and schema information.


Validation Errors

Most validation errors return HTTP status code 422.

Example:

{
  "detail": [
    {
      "loc": ["body", "items", 0],
      "msg": "field required",
      "type": "value_error"
    }
  ]
}
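An error body of this shape can be turned into readable messages for logging. The sketch below assumes the `detail` layout shown in the example; `format_validation_errors` is a hypothetical name.

```python
def format_validation_errors(body):
    """Turn a 422 response body into one readable line per error."""
    lines = []
    for error in body.get("detail", []):
        # 'loc' mixes strings and list indices, e.g. ["body", "items", 0].
        location = ".".join(str(part) for part in error.get("loc", []))
        lines.append(f"{location}: {error.get('msg', '')} ({error.get('type', '')})")
    return lines


sample_error = {
    "detail": [
        {"loc": ["body", "items", 0], "msg": "field required", "type": "value_error"}
    ]
}
messages = format_validation_errors(sample_error)
```

For the example above, the location reads `body.items.0`, which points at the first entry of the `items` array in the request body.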

Additional Notes
