When creating a submission, the first step is to declare how your genomic sequence data is formatted. This choice is critical as it tells the AGARI platform how to correctly link your metadata records to their corresponding sequences. The platform supports two primary formats.
The first format uses one or more separate FASTA files, where each file contains the sequence for a single genome.
This approach is recommended for organisms with larger genomes, such as bacteria or parasites. Pathogens such as Vibrio cholerae (cholera), Klebsiella pneumoniae, and Plasmodium (malaria) typically use this format.
To link the metadata to the correct sequence file, your metadata TSV file must include a column named sequenceFileName. The value in this column for each row must exactly match the name of the corresponding FASTA file you are uploading.
If you upload two FASTA files named cholera_sample_A.fasta and cholera_sample_B.fasta, your metadata file would look like this:
| isolateId | sampleId | ... | sequenceFileName |
|---|---|---|---|
| ISO-001 | SAMP-001 | ... | cholera_sample_A.fasta |
| ISO-002 | SAMP-002 | ... | cholera_sample_B.fasta |
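
Before uploading, the exact-match rule can be checked locally. The following is a minimal sketch, not part of the AGARI platform: the function name `check_sequence_file_names` and the shape of the returned report are assumptions made for illustration. It compares the `sequenceFileName` column of a metadata TSV against the list of file names you intend to upload, using a case-sensitive comparison.

```python
import csv
import io


def check_sequence_file_names(metadata_tsv: str, uploaded_files: list[str]) -> dict[str, list[str]]:
    """Compare the sequenceFileName column with the names of the uploaded FASTA files.

    Hypothetical helper for illustration; the comparison is exact and case-sensitive.
    """
    reader = csv.DictReader(io.StringIO(metadata_tsv), delimiter="\t")
    if "sequenceFileName" not in (reader.fieldnames or []):
        raise ValueError("metadata is missing the sequenceFileName column")

    # Collect every file name that the metadata rows point to.
    referenced = {row["sequenceFileName"] or "" for row in reader}
    uploaded = set(uploaded_files)

    return {
        # Named in the metadata, but no such file is being uploaded.
        "missing_files": sorted(referenced - uploaded),
        # Being uploaded, but no metadata row refers to it.
        "unreferenced_files": sorted(uploaded - referenced),
    }


metadata = (
    "isolateId\tsampleId\tsequenceFileName\n"
    "ISO-001\tSAMP-001\tcholera_sample_A.fasta\n"
    "ISO-002\tSAMP-002\tcholera_sample_B.fasta\n"
)
report = check_sequence_file_names(
    metadata, ["cholera_sample_A.fasta", "cholera_sample_B.fasta"]
)
print(report)
```

Both lists are empty when every row points at an uploaded file and every uploaded file is referenced. A difference in case or extension (for example `.fa` instead of `.fasta`) shows up in both lists, which is usually the quickest way to spot a naming mismatch.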
The second format uses a single FASTA file that contains the sequences for many different isolates.
This is common for organisms with smaller genomes, such as viruses. Pathogens such as SARS-CoV-2 (COVID-19) and the mpox virus often use this format.
To link the metadata, the header line for each sequence record in your FASTA file must start with a > symbol followed immediately by an identifier that exactly matches the value in the isolateId column of your metadata TSV file. Any additional information in the FASTA header after the ID will be ignored.
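
The header-matching rule can also be checked locally before submission. The sketch below is illustrative only: the helper names `fasta_ids` and `compare_ids` are hypothetical, and it assumes the identifier ends at the first whitespace character in the header, which the platform documentation does not state explicitly.

```python
import csv
import io


def fasta_ids(fasta_text: str) -> list[str]:
    """Return the identifier from each FASTA header line.

    Assumption: the identifier runs from just after '>' to the first whitespace.
    A header with a space right after '>' yields an empty identifier.
    """
    ids = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            rest = line[1:]
            if rest and not rest[0].isspace():
                ids.append(rest.split(maxsplit=1)[0])
            else:
                ids.append("")
    return ids


def compare_ids(metadata_tsv: str, fasta_text: str) -> dict[str, list[str]]:
    """Report isolateId values and FASTA identifiers that have no counterpart."""
    reader = csv.DictReader(io.StringIO(metadata_tsv), delimiter="\t")
    metadata_ids = {row["isolateId"] for row in reader}
    sequence_ids = set(fasta_ids(fasta_text))
    return {
        "without_sequence": sorted(metadata_ids - sequence_ids),
        "without_metadata": sorted(sequence_ids - metadata_ids),
    }


metadata = (
    "isolateId\tsampleId\n"
    "SARSCOV2-001\tSAMP-A\n"
    "SARSCOV2-002\tSAMP-B\n"
)
fasta = (
    ">SARSCOV2-001 some other info\n"
    "ATGC\n"
    ">SARSCOV2-002 some other info\n"
    "ATGC\n"
)
result = compare_ids(metadata, fasta)
print(result)
```

Both lists are empty when every metadata row has a sequence record and every sequence record has a metadata row. Duplicate identifiers are not detected by this sketch, since both sides are reduced to sets.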
If you have two isolates, SARSCOV2-001 and SARSCOV2-002, your files would look like this:
Metadata File (e.g., covid_samples.tsv):
| isolateId | sampleId | ... |
|---|---|---|
| SARSCOV2-001 | SAMP-A | ... |
| SARSCOV2-002 | SAMP-B | ... |
Sequence File (e.g., sequences.fasta):
>SARSCOV2-001 some other info...
ATGC...GATTACA
>SARSCOV2-002 some other info...
ATGC...GATTACA

© 2025 AGARI. All rights reserved.