Last update: 2025-August-04
VirJenDB
v0.1
VirJenDB Application Programming Interface (API)
Available at https://api2.virjendb.org/swagger.
Download
Allows users to download data from the VirJenDB database based on specific search criteria or selected entries. This endpoint supports various data types (metadata, accessions, sequences) and file formats (TSV, CSV, JSON, XML, FASTA).
### Download Scope (`download.scope`):
Control what data is included in your download:
- `selection`: Downloads only the records explicitly identified by their ‘0dn’ IDs in the `download.selected_ids` list. This is ideal for downloading a specific subset of results, for example after a prior search and selection on the client side. A single `selection` download supports at most 10,000 IDs; if you provide more, only the first 10,000 are processed.
- `max`: Downloads data for records matching the criteria defined in the `search` object. This scope uses Elasticsearch’s `search_after` mechanism for efficient deep pagination, so large result sets can be retrieved in pages. Use `search.pageNumber` and `search.pageSize` to specify which portion of the total search hits to download. For example, `pageNumber=1` and `pageSize=10000` downloads the first 10,000 results, while `pageNumber=2` and `pageSize=10000` downloads records 10,001 to 20,000. There is no hard limit on the total number of records you can page through, as long as `pageSize` itself does not exceed the Elasticsearch limit of 10,000.
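As a rough illustration of the two scopes, the sketch below builds request bodies as plain Python dictionaries and works out which records a page covers. The nested layout (a `download` object next to a `search` object) is inferred from the dotted parameter names above and is an assumption, as are the helper names `selection_body`, `max_body`, and `page_record_range`; the Swagger page remains the authoritative schema.

```python
MAX_IDS = 10_000        # cap on IDs for a 'selection' download (stated above)
MAX_PAGE_SIZE = 10_000  # Elasticsearch page-size limit (stated above)


def selection_body(selected_ids):
    """Build a body for scope 'selection' (hypothetical helper, assumed layout)."""
    # Only the first 10,000 IDs are described as being processed,
    # so truncate on the client side to make that behaviour explicit.
    ids = list(selected_ids)[:MAX_IDS]
    return {"download": {"scope": "selection", "selected_ids": ids}}


def max_body(page_number, page_size, search=None):
    """Build a body for scope 'max' (hypothetical helper, assumed layout)."""
    if page_number < 1:
        raise ValueError("pageNumber starts at 1")
    if not 1 <= page_size <= MAX_PAGE_SIZE:
        raise ValueError("pageSize must be between 1 and 10,000")
    # Copy any caller-supplied search criteria, then set the paging fields.
    search_part = dict(search or {})
    search_part.update(pageNumber=page_number, pageSize=page_size)
    return {"download": {"scope": "max"}, "search": search_part}


def page_record_range(page_number, page_size):
    """Return the 1-based (first, last) record positions covered by a page."""
    first = (page_number - 1) * page_size + 1
    last = page_number * page_size
    return first, last
```

For instance, `page_record_range(2, 10000)` returns `(10001, 20000)`, which matches the second page in the example above.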
### Information Type (`download.information`):
Specify the type of information to be downloaded for the selected records:
- `metadata`: Downloads the metadata fields for each record. You can specify which fields to include using `download.metadata`. If `download.metadata` is empty, all public metadata fields are included by default.
- `accessions`: Downloads only the ‘0dn’ accession IDs for each record. This is a lightweight option for obtaining a list of identifiers.
- `sequences`: Downloads the sequence data along with the accession and other requested metadata fields (from `download.metadata`) for each record. Sequences are provided in FASTA format. This can be a large download depending on the number of records and sequence lengths.
### File Type (`download.fileType`):
Choose the desired output file format:
- `tsv`: Tab-Separated Values.
- `csv`: Comma-Separated Values.
- `xml`: Extensible Markup Language, a structured text format for hierarchical data.
- `json`: JavaScript Object Notation, a lightweight format for structured data, commonly used in APIs.
- `fasta`: FASTA format (only available for `information: "sequences"`).
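The information and file type options can be checked on the client before a request is sent. The following is a minimal sketch, assuming the three settings sit together inside the `download` object (inferred from the dotted names `download.information`, `download.metadata`, and `download.fileType`); the helper name `download_options` is hypothetical. It enforces only the one rule stated above, that `fasta` requires `information: "sequences"`.

```python
INFORMATION_TYPES = {"metadata", "accessions", "sequences"}
FILE_TYPES = {"tsv", "csv", "xml", "json", "fasta"}


def download_options(information, file_type, metadata=()):
    """Validate and assemble download settings (hypothetical helper)."""
    if information not in INFORMATION_TYPES:
        raise ValueError(f"unknown information type: {information!r}")
    if file_type not in FILE_TYPES:
        raise ValueError(f"unknown file type: {file_type!r}")
    # FASTA output is documented as available only for sequence downloads.
    if file_type == "fasta" and information != "sequences":
        raise ValueError("fasta requires information='sequences'")
    # An empty metadata list means "all public metadata fields" per the docs.
    return {
        "information": information,
        "fileType": file_type,
        "metadata": list(metadata),
    }
```

Calling `download_options("metadata", "fasta")` raises a `ValueError`, whereas `download_options("sequences", "fasta")` returns a dictionary with an empty `metadata` list.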
### Search Parameters (`search` object):
When `download.scope` is `max`, the `search` object’s parameters define the dataset to be downloaded. These parameters work identically to those of the `/v2/search` endpoint, allowing for complex queries, filtering, sorting, and pagination.
Important Notes:
- When downloading `sequences` in FASTA format, metadata fields (from `download.metadata`) are included in the FASTA header, separated by `|`. The VirJenID field is always included as the first identifier in the FASTA header.
- The `pageSize` for `search_request` in ‘max’ scope determines the number of records in the downloaded file.
- If no results are found for the given criteria or selections, a `404 Not Found` error is returned.
- Elasticsearch’s limit of 10,000 results per individual request is handled by the backend using `search_after` for deep pagination, but the `pageSize` you request directly dictates the number of results in your download chunk.
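The FASTA header layout described in the first note can be read back with a few lines of Python. This is a sketch under assumptions: the header starts with `>`, the VirJenID comes first, the remaining `|`-separated parts are the requested metadata values in order, and no value itself contains `|`. The function name `parse_fasta_header` is hypothetical.

```python
def parse_fasta_header(line):
    """Split a FASTA header into (virjen_id, [metadata values]).

    Assumes the form '>VirJenID|value1|value2...' described in the notes.
    """
    if not line.startswith(">"):
        raise ValueError("not a FASTA header line")
    # Drop the leading '>' and tolerate whitespace around each part.
    parts = [part.strip() for part in line[1:].split("|")]
    return parts[0], parts[1:]
```

A header with no metadata fields yields the ID and an empty list.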
What is Swagger UI?
Learn more at swagger.io.
What is a REST API?
Learn more at freecodecamp.org.