This file is the result of a preprocessing step in the classification. For each sequence, it shows the best-matching reference sequence.
The classification file shows the classification for each best match. It shows the following information: ID, reference ID, taxonomy, identification rank, cutoff and confidence. This file also shows which sequences were not classified.
This file shows the classification for each ID. It shows the following information: ID, predicted name, full classification, identification rank, cutoff, confidence, reference ID, BLAST score, BLAST sim and BLAST coverage. This file also shows which sequences were not classified.
This is a Krona chart which shows the taxonomy distribution of the classification.
This file contains the same information as the Krona chart, but in text format.
This file contains the cutoff calculation results.
The first level of the file shows the identification rank.
The second level lists the higher rank groups.
For each higher rank group, the cutoff, confidence and other extra information are given.
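
As a rough illustration of the two-level layout described above, the sketch below flattens such a file into rows. The key names ("cut-off", "confidence"), the group names and the values are assumptions made for illustration only; check an actual .cutoffs.json file for the real field names.

```python
import json

# Hypothetical sample mirroring the described layout:
# first level = identification rank, second level = higher rank group.
sample_text = """
{
  "species": {
    "Group A": {"cut-off": 0.97, "confidence": 0.91},
    "Group B": {"cut-off": 0.99, "confidence": 0.88}
  },
  "genus": {
    "Group C": {"cut-off": 0.94, "confidence": 0.85}
  }
}
"""

def flatten_cutoffs(cutoffs):
    """Return (rank, group, cutoff, confidence) rows from the nested dict."""
    rows = []
    for rank, groups in cutoffs.items():
        for group, info in groups.items():
            # .get() returns None if an assumed key is absent in a real file.
            rows.append((rank, group, info.get("cut-off"), info.get("confidence")))
    return rows

rows = flatten_cutoffs(json.loads(sample_text))
for row in rows:
    print("\t".join(str(value) for value in row))
```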
This is the file which needs to be used as input for classification.
This file contains the same information as the .cutoffs.json file, but written as a tab-delimited table instead.
The best cutoffs are calculated from the .cutoffs.json file. This file contains the best cutoffs for the taxonomic groups. If the cutoff for a taxonomic group has a low F-measure, the cutoff of a higher taxonomic group with a higher F-measure is chosen instead.
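
The fallback idea described above can be sketched as follows. This is not the tool's actual selection procedure: the fixed threshold MIN_F_MEASURE, the lineage list and the group names are all hypothetical and only show how a low-scoring group could defer to a higher group.

```python
# Hypothetical minimum F-measure below which a group's own cutoff is rejected.
MIN_F_MEASURE = 0.8

def pick_cutoff(lineage, cutoffs, min_f=MIN_F_MEASURE):
    """Pick a cutoff for a group, falling back to higher groups if needed.

    lineage: group names ordered from most specific to most general.
    cutoffs: mapping of group name -> (cutoff, f_measure).
    Returns (group, cutoff) for the first group whose F-measure is
    acceptable, or None if no group in the lineage qualifies.
    """
    for group in lineage:
        if group in cutoffs:
            cutoff, f_measure = cutoffs[group]
            if f_measure >= min_f:
                return group, cutoff
    return None

# Illustrative values: GenusX has a low F-measure, FamilyY a higher one.
cutoffs = {
    "GenusX": (0.985, 0.62),
    "FamilyY": (0.97, 0.93),
}
choice = pick_cutoff(["GenusX", "FamilyY", "OrderZ"], cutoffs)
print(choice)
```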
This file contains the same information as the .best.json file, but written as a tab-delimited table instead.
These files contain information about how the cutoffs changed from the original cutoffs in the .cutoffs.json file to the best cutoffs in the .best.json file.
Here all calculated F-measures for each possible cutoff are given. This file does not contain the final results, but can be used for verification purposes.
This is a Krona chart which shows the taxonomy distribution of the submitted reference dataset.
This file contains the same information as the Krona chart, but in text format.
A tab-delimited file which shows the number of sequences per length interval of the submitted dataset.
A tab-delimited file which contains the taxonomic distribution of the submitted reference dataset. The first column shows the taxon name and the second column the number of sequences with that taxon name. The last column shows the percentage of sequences with that taxon name out of the total number of sequences in the dataset.
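
To make the three quantities concrete, here is a minimal sketch that computes a name, count and percentage table from a list of taxon names. The function name and the sample names are illustrative assumptions, and the real file may order or format its rows differently.

```python
from collections import Counter

def taxon_distribution(taxon_names):
    """Return (taxon name, count, percentage of total) rows, most common first."""
    counts = Counter(taxon_names)
    total = sum(counts.values())
    return [(name, n, 100.0 * n / total) for name, n in counts.most_common()]

# Illustrative input: one taxon name per sequence.
names = ["Aspergillus", "Penicillium", "Aspergillus", "Candida"]
table = taxon_distribution(names)
for name, n, pct in table:
    print(f"{name}\t{n}\t{pct:.1f}")
```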
This is a similarity matrix file. This file consists of three columns. The first two show the IDs of the
sequences that are being compared. The third column shows the percentage to which these sequences match.
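
A file with this three-column layout could be read along the following lines. This is only a sketch: it assumes the columns are separated by whitespace and that the third column is numeric, and the sequence IDs and values are made up.

```python
import io

# Hypothetical sample content: ID, ID, similarity value.
sample = "seq1\tseq2\t98.5\nseq1\tseq3\t91.2\n"

def read_sim(handle):
    """Map (id_a, id_b) pairs to the value in the third column."""
    sims = {}
    for line in handle:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or incomplete lines
        sims[(parts[0], parts[1])] = float(parts[2])
    return sims

sims = read_sim(io.StringIO(sample))
print(sims[("seq1", "seq2")])
```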
A sequence length distribution of a FASTA file can be generated. The interval for this distribution is set to 100 by default, but can be changed.
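
The idea of counting sequences per length interval can be sketched as follows, with the interval defaulting to 100. The function names are hypothetical, and labelling each interval by its lower bound is an assumption; the tool's own output may label intervals differently.

```python
from collections import Counter

def read_fasta_lengths(lines):
    """Return the length of each sequence in FASTA-formatted lines."""
    lengths = []
    current = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths

def length_distribution(lengths, interval=100):
    """Count sequences per interval, keyed by the interval's lower bound."""
    bins = Counter((length // interval) * interval for length in lengths)
    return sorted(bins.items())

# Illustrative records with lengths 120, 260 and 160.
fasta = [">a", "ACGT" * 30, ">b", "ACGT" * 60, "ACGT" * 5, ">c", "ACGT" * 40]
lengths = read_fasta_lengths(fasta)
dist = length_distribution(lengths)
print(dist)
```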
The taxonomic distribution analysis generates the distribution of taxon names in the given dataset. This distribution is shown per taxonomic rank. The ranks for which this is done can be changed in the settings.
This analysis generates a similarity matrix (.sim) file of the given dataset. This file consists of three columns. The first two show the IDs of the
sequences that are being compared. The third column shows the percentage to which these sequences match.
This file can be used as input for the cutoff calculation and visualization.
A .sim file will be created automatically for these calculations if this file is not given.
The command line version of DNA barcoder also has analysis options for a general overview of the submitted file and variation of the sequences. These options might be implemented into this interface in future developments.
This file contains the coordinates of each point of the visualization.
A similarity matrix file. This file consists of three columns. The first two show the IDs of the sequences that are being compared. The third column shows the percentage to which these sequences match. This file will only be created if no similarity matrix file is given as input.