Last update: 2026-May-10
VirJenDB v1.0
VirJenDB API
VirJenDB provides a public API that allows you to search virus records, retrieve genome sequences, inspect metadata fields, and download datasets programmatically.
The API is intended for:
- automated analysis workflows
- external web applications
- bioinformatics pipelines
- bulk data retrieval
- metadata exploration
The API uses JSON request bodies and standard HTTP methods.
Getting Started
The VirJenDB API base URL is:
https://api2.virjendb.org
Interactive Swagger documentation is available at:
https://api2.virjendb.org/swagger
At the time of writing, the public API does not require authentication.
Available Endpoints
| Endpoint | Method | Description |
|---|---|---|
| /v2/search | POST | Search VirJenDB records |
| /v2/download | POST | Download metadata, accessions, or sequences |
| /v2/sequence | POST | Retrieve sequences using VirJenDB accession IDs |
| /v2/datasets_download | GET | Download pre-generated datasets |
| /v2/metadata | POST | Query metadata field definitions |
| /v2/metadata_download | POST | Export metadata field definitions |
Searching Records
One of the main features of the VirJenDB API is the /v2/search endpoint.
It uses the same underlying search backend as the VirJenDB web interface.
Endpoint
POST /v2/search
Basic Search
The simplest possible search request looks like this:
{
  "items": [
    {
      "metadataField": "",
      "searchTerm": "Influenza A virus"
    }
  ],
  "filter": [],
  "pageSize": 20,
  "pageNumber": 1
}
This behaves similarly to entering Influenza A virus into the search field of the VirJenDB web interface.
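The basic search request can also be sent from Python using only the standard library. The following is a minimal sketch rather than an official client: the helper names `build_basic_search` and `post_json` are illustrative, and the response is assumed to be a JSON document.

```python
import json
import urllib.request

BASE_URL = "https://api2.virjendb.org"


def build_basic_search(term, page_size=20, page_number=1):
    """Build the body of a free-text search (empty metadataField)."""
    return {
        "items": [{"metadataField": "", "searchTerm": term}],
        "filter": [],
        "pageSize": page_size,
        "pageNumber": page_number,
    }


def post_json(path, body, timeout=60):
    """POST a JSON body to the API and decode the JSON response."""
    request = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the body needs no network access.
body = build_basic_search("Influenza A virus")
```

Calling `post_json("/v2/search", body)` then performs the actual request.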
Search Response
The following response is an example. Actual responses vary with the selected metadata fields and the database version.
{
  "total": 1244169,
  "results": [
    {
      "source": {
        "Organism Name": "Influenza A virus",
        "NCBI Species": "Alphainfluenzavirus influenzae",
        "Host Species": "Homo sapiens",
        "Country": "China",
        "Collection Date": "2017-06-30"
      }
    }
  ]
}
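To work with a response like the one above, you usually only need the `source` object of each result. The sketch below assumes the response has exactly the shape shown in the example (`total`, `results`, `source`). The function name `extract_sources` is illustrative.

```python
def extract_sources(response):
    """Return the 'source' dict of every result, skipping malformed entries."""
    return [
        hit["source"]
        for hit in response.get("results", [])
        if isinstance(hit, dict) and isinstance(hit.get("source"), dict)
    ]


# The example response from this guide, as a Python dict.
example_response = {
    "total": 1244169,
    "results": [
        {
            "source": {
                "Organism Name": "Influenza A virus",
                "NCBI Species": "Alphainfluenzavirus influenzae",
                "Host Species": "Homo sapiens",
                "Country": "China",
                "Collection Date": "2017-06-30",
            }
        }
    ],
}

sources = extract_sources(example_response)
# Fields that were not requested may be absent, so use .get() when reading them.
hosts = [source.get("Host Species") for source in sources]
```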
Search Clauses
Search requests are built using a list of search clauses inside the items array.
Each item contains:
| Field | Description |
|---|---|
| operator | Logical operator connecting clauses (and, or, not) |
| metadataField | Metadata field to search in |
| searchTerm | Search value |
Example:
{
  "items": [
    {
      "operator": "and",
      "metadataField": "Host Species",
      "searchTerm": "Homo sapiens"
    }
  ]
}
This searches the Host Species field for the value Homo sapiens.
Search Multiple Conditions
Multiple clauses can be combined using logical operators.
Example:
{
  "items": [
    {
      "operator": "and",
      "metadataField": "Host Species",
      "searchTerm": "Homo sapiens"
    },
    {
      "operator": "and",
      "metadataField": "Molecule Type",
      "searchTerm": "ssRNA(+)"
    }
  ],
  "pageSize": 50,
  "pageNumber": 1
}
This returns entries fulfilling both conditions:
- the host species matches Homo sapiens
- the molecule type matches ssRNA(+)
Search Operators
The API supports three logical operators:
| Operator | Description |
|---|---|
| and | All connected conditions must match |
| or | Any connected condition may match |
| not | Excludes matching entries |
Example:
{
  "items": [
    {
      "operator": "and",
      "metadataField": "Host Species",
      "searchTerm": "Homo sapiens"
    },
    {
      "operator": "not",
      "metadataField": "Submitter Country",
      "searchTerm": "Germany"
    }
  ]
}
This returns entries where:
- the host species matches Homo sapiens
- the submitter country does not match Germany
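When requests are assembled in code, it can help to build clauses through a small helper that rejects unsupported operators before anything is sent. This is a sketch only: `make_clause` and `build_search` are hypothetical names, and the default page values simply mirror the examples in this guide.

```python
VALID_OPERATORS = {"and", "or", "not"}


def make_clause(field, term, operator="and"):
    """Return one search clause, validating the logical operator."""
    if operator not in VALID_OPERATORS:
        raise ValueError(f"unsupported operator: {operator!r}")
    return {"operator": operator, "metadataField": field, "searchTerm": term}


def build_search(clauses, page_size=20, page_number=1):
    """Wrap a list of clauses in a search request body."""
    return {
        "items": list(clauses),
        "filter": [],
        "pageSize": page_size,
        "pageNumber": page_number,
    }


# Human hosts, excluding records submitted from Germany.
body = build_search([
    make_clause("Host Species", "Homo sapiens"),
    make_clause("Submitter Country", "Germany", operator="not"),
])
```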
Pagination
Search results are paginated.
| Field | Description |
|---|---|
| pageSize | Number of results returned per request |
| pageNumber | Result page number |
Example:
{
  "pageSize": 100,
  "pageNumber": 2
}
This returns the second page of results containing up to 100 entries. For best performance, avoid excessively large page sizes.
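To walk through all pages of a result set, you can loop over `pageNumber` until a page comes back short. The sketch below assumes that pages are numbered from 1 (as in the examples) and that a page with fewer than `pageSize` entries is the last one. The second point is an assumption, not documented behavior. `fetch_page` is any callable you supply that sends a request body to /v2/search and returns the decoded response. Here it is replaced by an in-memory stand-in so the example runs offline.

```python
def iter_results(fetch_page, base_body, page_size=100, max_pages=None):
    """Yield results page by page until a short or empty page is returned."""
    page_number = 1
    while max_pages is None or page_number <= max_pages:
        body = dict(base_body, pageSize=page_size, pageNumber=page_number)
        results = fetch_page(body).get("results", [])
        yield from results
        if len(results) < page_size:
            break
        page_number += 1


# In-memory stand-in for the API: 250 fake records served in pages.
fake_records = [{"source": {"n": i}} for i in range(250)]


def fake_fetch(body):
    start = (body["pageNumber"] - 1) * body["pageSize"]
    page = fake_records[start:start + body["pageSize"]]
    return {"total": len(fake_records), "results": page}


all_records = list(iter_results(fake_fetch, {"items": [], "filter": []}))
```

The `max_pages` argument gives you a simple way to cap the number of requests while testing.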
Sorting Results
Search results can be sorted using sortColumn and sortOrder.
Example:
{
  "sortColumn": "Collection Date",
  "sortOrder": "desc"
}
Supported sort directions are:
- asc
- desc
Example Search Request
curl -X POST "https://api2.virjendb.org/v2/search" \
  -H "Content-Type: application/json" \
  -d '{
    "items": [
      {
        "metadataField": "Host Species",
        "searchTerm": "Homo sapiens"
      }
    ],
    "filter": [],
    "pageSize": 20,
    "pageNumber": 1,
    "sortOrder": "asc"
  }'
Downloading Records
The /v2/download endpoint allows you to export metadata, accession IDs, or genome sequences.
Endpoint
POST /v2/download
Information Types
The information parameter controls what type of data is downloaded.
| Type | Description |
|---|---|
| metadata | Download metadata fields |
| virjendb_accessions | Download only VirJenDB accession IDs |
| sequences | Download sequence data |
Output Formats
The fileType parameter controls the output format.
| File Type | Description |
|---|---|
| csv | Comma-separated values |
| tsv | Tab-separated values |
| json | JavaScript Object Notation |
| xml | Extensible Markup Language |
| fasta | FASTA sequence format |
FASTA output is only available when downloading sequences.
Download Scope
The scope parameter controls which records are included in the download.
| Scope | Description |
|---|---|
| selection | Download explicitly selected accession IDs |
| max | Download records matching the provided search request |
| all | Present in the API schema |
Download Matching Search Results
The following request downloads metadata records matching a search query.
{
  "download": {
    "scope": "max",
    "information": "metadata",
    "fileType": "csv",
    "metadata": [
      "Organism Name",
      "Host Species",
      "Collection Date"
    ]
  },
  "search": {
    "items": [
      {
        "metadataField": "Molecule Type",
        "searchTerm": "ssRNA(+)"
      }
    ],
    "pageSize": 1000,
    "pageNumber": 1,
    "sortOrder": "asc"
  }
}
This downloads metadata for all records matching the specified search query.
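A download request of this kind can be assembled and saved to disk from Python. The following is a sketch under stated assumptions: the helper names are illustrative, the `search` values mirror the example above, and `download_to_file` assumes that the response body is the exported file itself, which this guide does not state explicitly.

```python
import json
import shutil
import urllib.request

DOWNLOAD_URL = "https://api2.virjendb.org/v2/download"


def build_search_download(search_items, fields, file_type="csv",
                          information="metadata"):
    """Build a scope='max' download body for records matching a search."""
    if file_type == "fasta" and information != "sequences":
        # FASTA output is only available when downloading sequences.
        raise ValueError("fasta is only valid with information='sequences'")
    return {
        "download": {
            "scope": "max",
            "information": information,
            "fileType": file_type,
            "metadata": list(fields),
        },
        "search": {
            "items": list(search_items),
            "pageSize": 1000,
            "pageNumber": 1,
            "sortOrder": "asc",
        },
    }


def download_to_file(body, out_path, url=DOWNLOAD_URL):
    """POST the body and stream the raw response bytes to out_path."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        shutil.copyfileobj(response, out)


body = build_search_download(
    [{"metadataField": "Molecule Type", "searchTerm": "ssRNA(+)"}],
    ["Organism Name", "Host Species", "Collection Date"],
)
```

Streaming with `shutil.copyfileobj` avoids holding a large export in memory.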
Download Selected Records
Instead of downloading all matching search results, you can explicitly provide accession IDs.
Example:
{
  "download": {
    "scope": "selection",
    "information": "metadata",
    "fileType": "csv",
    "selected_ids": [
      "vj000000000010",
      "vj000005747069"
    ]
  }
}
VirJenDB accession IDs follow the format:
vj000000000010
Selection downloads currently support up to 10,000 accession IDs per request.
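If you have more than 10,000 accession IDs, the list has to be split across several requests. The sketch below shows one way to do that. The accession pattern (`vj` followed by 12 digits) is inferred from the two example IDs in this guide and is an assumption, as is the helper name `build_selection_requests`.

```python
import re

# Assumption: pattern inferred from the example IDs, not from a formal spec.
ACCESSION_RE = re.compile(r"vj\d{12}")
MAX_IDS_PER_REQUEST = 10_000


def build_selection_requests(accessions, information="metadata", file_type="csv"):
    """Return one download body per batch of at most 10,000 accession IDs."""
    ids = list(accessions)
    malformed = [a for a in ids if not ACCESSION_RE.fullmatch(a)]
    if malformed:
        raise ValueError(f"malformed accession IDs, e.g. {malformed[:5]}")
    return [
        {
            "download": {
                "scope": "selection",
                "information": information,
                "fileType": file_type,
                "selected_ids": ids[start:start + MAX_IDS_PER_REQUEST],
            }
        }
        for start in range(0, len(ids), MAX_IDS_PER_REQUEST)
    ]


# 25,000 synthetic IDs are split into three request bodies.
synthetic_ids = [f"vj{i:012d}" for i in range(25_000)]
bodies = build_selection_requests(synthetic_ids)
batch_sizes = [len(b["download"]["selected_ids"]) for b in bodies]
```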
Downloading Sequences
Sequences can be downloaded using:
{
  "download": {
    "scope": "selection",
    "information": "sequences",
    "fileType": "fasta",
    "metadata": [
      "Host Species",
      "Molecule Type"
    ],
    "selected_ids": [
      "vj000000000010"
    ]
  }
}
When exporting sequences in FASTA format:
- sequence data is returned in FASTA format
- metadata fields may be added to FASTA headers
- fields are separated using the | character
- the VirJenDB accession is always included in the header
Example:
>vj000000000010|Host Species=Homo sapiens|Molecule Type=ssRNA(+)
AUGCUAGCUAGCUAGCUAGC
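A header in this layout can be split back into the accession and its metadata fields. The parser below is a minimal sketch based only on the example header above. It assumes the accession comes first and that field values contain no `|` character. The function name `parse_fasta_header` is illustrative.

```python
def parse_fasta_header(line):
    """Split '>accession|Key=Value|...' into (accession, metadata dict)."""
    if not line.startswith(">"):
        raise ValueError("not a FASTA header line")
    accession, *pairs = line[1:].strip().split("|")
    metadata = {}
    for pair in pairs:
        # partition() splits on the first '=' only, so values may contain '='.
        key, _, value = pair.partition("=")
        metadata[key] = value
    return accession, metadata


accession, metadata = parse_fasta_header(
    ">vj000000000010|Host Species=Homo sapiens|Molecule Type=ssRNA(+)"
)
```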
Retrieving Sequences
The /v2/sequence endpoint retrieves sequences directly from VirJenDB accession IDs.
Endpoint
POST /v2/sequence
Example Request
{
  "virjendb_accessions": [
    "vj000000000010",
    "vj000005747069"
  ]
}
Notes
- accession IDs must follow the VirJenDB accession format
- sequence formatting may vary depending on the requested records
- response structures may evolve between database versions
Downloading Precomputed Datasets
VirJenDB also provides several pre-generated datasets for bulk download.
Endpoint
GET /v2/datasets_download
Download a Dataset
Datasets are selected using the filename query parameter.
Example:
GET /v2/datasets_download?filename=virjendbv1_full_dataset_sequence.csv.gz
Available Dataset Types
Metadata Datasets
- virjendbv1_full_dataset_metadata.csv.gz
- virjendbv1_unique_seqs_metadata.csv.gz
- virjendbv1_votu_clusters_metadata.csv.gz
Sequence Datasets
- virjendbv1_full_dataset_sequence.csv.gz
- virjendbv1_unique_seqs_sequence.csv.gz
- virjendbv1_votu_clusters_sequence.csv.gz
Cluster Comparison Metrics
- virjendbv1_votu_cluster_comparison_metrics.csv.gz
Mapping Files
- ncbi_to_gtdb_map.csv.gz
Checksum Files
SHA256 checksum files are available for downloadable datasets.
These files can be used to verify file integrity after download.
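Building the dataset URL and checking a downloaded file against its SHA256 digest can be done with the Python standard library. This is a sketch: the helper names are illustrative, and because the layout and location of the checksum files are not described here, the expected digest is passed in as a plain hex string.

```python
import hashlib
from urllib.parse import urlencode

BASE_URL = "https://api2.virjendb.org"


def dataset_url(filename):
    """Return the GET URL for a pre-generated dataset file."""
    return f"{BASE_URL}/v2/datasets_download?{urlencode({'filename': filename})}"


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Compute the SHA256 hex digest of a file without loading it whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's digest with the expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()


url = dataset_url("virjendbv1_full_dataset_sequence.csv.gz")
```

Reading in chunks keeps memory use constant, which matters for the full dataset files.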
Query Metadata Fields
Endpoint
POST /v2/metadata
Example Request
{
  "field_name": [],
  "privacy": [],
  "tags": [],
  "submission_requiredness": [],
  "submission_fieldtype": [],
  "submission_validation": [],
  "match_mode": "and",
  "summary": false
}
Metadata Filters
| Field | Description |
|---|---|
| field_name | Filter by metadata field name |
| privacy | Filter by privacy classification |
| tags | Filter using metadata tags |
| submission_requiredness | Filter by required or optional fields |
| submission_fieldtype | Filter by field type |
| submission_validation | Filter by validation rules |
| match_mode | Combine filters using "and" or "or" |
| summary | Return summarized output |
Metadata definitions may evolve between database releases.
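A small builder can keep metadata queries consistent with the request shape shown above. This is a sketch with a hypothetical helper name, `build_metadata_query`. The filter value used in the second call is illustrative, since the accepted values for each filter are not listed in this guide.

```python
LIST_FILTERS = (
    "field_name",
    "privacy",
    "tags",
    "submission_requiredness",
    "submission_fieldtype",
    "submission_validation",
)


def build_metadata_query(match_mode="and", summary=False, **filters):
    """Build a /v2/metadata body; filters that are not given stay empty lists."""
    unknown = sorted(set(filters) - set(LIST_FILTERS))
    if unknown:
        raise ValueError(f"unknown filter(s): {unknown}")
    if match_mode not in ("and", "or"):
        raise ValueError("match_mode must be 'and' or 'or'")
    body = {name: list(filters.get(name, [])) for name in LIST_FILTERS}
    body["match_mode"] = match_mode
    body["summary"] = summary
    return body


# Same body as the example request in this section.
default_query = build_metadata_query()
# Illustrative filter value; check the Swagger schema for accepted values.
by_name = build_metadata_query(field_name=["Host Species"], match_mode="or")
```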
Export Metadata Definitions
Metadata definitions can also be downloaded directly.
Endpoint
POST /v2/metadata_download
Example Request
{
  "field_name": [],
  "file_type": "csv"
}
Notes
Common export formats include:
- csv
- json
- xlsx
This endpoint can be used to export metadata field documentation and schema information.
Validation Errors
Most validation errors return HTTP status code 422.
Example:
{
  "detail": [
    {
      "loc": ["body", "items", 0],
      "msg": "field required",
      "type": "value_error"
    }
  ]
}
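When a request fails validation, the `detail` list can be turned into readable messages for logging. The sketch below assumes the error body has the shape shown in the example above. The function name `format_validation_errors` is illustrative.

```python
def format_validation_errors(payload):
    """Turn a 422 error body into one readable message per problem."""
    messages = []
    for error in payload.get("detail", []):
        # 'loc' is a path into the request, e.g. ["body", "items", 0].
        location = ".".join(str(part) for part in error.get("loc", []))
        messages.append(
            f"{location}: {error.get('msg', '')} ({error.get('type', '')})"
        )
    return messages


example_error = {
    "detail": [
        {
            "loc": ["body", "items", 0],
            "msg": "field required",
            "type": "value_error",
        }
    ]
}

messages = format_validation_errors(example_error)
```

For the example error, this yields the single message `body.items.0: field required (value_error)`.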
Additional Notes
- Very large downloads may take additional time to generate
- API response structures may evolve between database releases
- Checksum files can be used to verify dataset downloads
- The Swagger interface provides the complete OpenAPI schema