You can use any sequence data, but ideally you should use genes predicted from assembled sequences.
How should I format my gene prediction results?
You need to tell MyTaxa which gene was predicted on which contig. There are three options for that:
If you used the stand-alone MetaGeneMark, you can directly upload the GFF v2 file produced.
If you used software that produces standard GFF v3, including the field id in the last column
(with the Gene ID), you can directly upload this file.
If you don't have either file, you can generate a simple flat text file containing the ID of the
gene, the length of the gene, and the ID of the corresponding contig, separated by tabs.
If you select No prediction, MyTaxa will attempt to classify the query sequences directly. Use this,
for example, if you are working with reads directly (without assembly or gene prediction).
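
For the third option (the tab-separated flat text file), a minimal Python sketch might look like the following. It is only an illustration and not part of MyTaxa: the function name write_gene_table, the example gene and contig IDs, and the input layout (gene ID, start, end, contig ID with 1-based inclusive coordinates) are all assumptions. The column order follows the description above (gene ID, gene length, contig ID); whether the length should be in nucleotides or amino acids is not stated here, so check that before uploading.

```python
import csv
import io


def write_gene_table(genes, handle):
    """Write one line per gene: gene ID, gene length, contig ID (tab-separated).

    `genes` is an iterable of (gene_id, start, end, contig_id) tuples.
    Coordinates are assumed to be 1-based and inclusive (an assumption of
    this sketch, not something MyTaxa requires).
    """
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for gene_id, start, end, contig_id in genes:
        length = abs(end - start) + 1  # inclusive coordinates
        writer.writerow([gene_id, length, contig_id])


# Hypothetical example data: three genes on two contigs.
example_genes = [
    ("gene_1", 1, 300, "contig_1"),
    ("gene_2", 450, 1049, "contig_1"),
    ("gene_3", 10, 909, "contig_2"),
]

# Write to an in-memory buffer here; pass an open file handle instead
# to produce a file you can upload.
buffer = io.StringIO()
write_gene_table(example_genes, buffer)
table_text = buffer.getvalue()
print(table_text, end="")
```

To produce an actual file, open one with open("genes.txt", "w", newline="") and pass the handle to write_gene_table in place of the buffer (the file name is arbitrary).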