change organism of FASTA database

CPAS Forum (Inactive)
change organism of FASTA database mboehmer  2011-01-24 17:27
Status: Closed
 
I searched a huge number of mzXML files against a particular FASTA file using X!Tandem before realizing that LabKey associated no organism with this database. I have a UniProt file from this organism loaded on the same server and was hoping that LabKey could show me information from both the original FASTA and the UniProt file. Is there a way to change the organism from the current "Unknown unknown" to the correct organism?
 
 
Brian Connolly responded:  2011-01-26 11:40
Maik,

The answer to your question is yes and no. For a given FASTA file, you can change the organism for ALL NEW searches that use this FASTA file. For previously imported searches, you cannot change the organism; those imported search results will continue to use the "Unknown unknown" organism or the organism that LabKey guessed from the FASTA file.

To set the organism to be used for all future searches, you can do the following:

   1. Go to the Protein Database Admin console (go to the Admin Console and then click on the "Protein Databases" link)
   2. In the list of FASTA files in the "Fasta Files" web part, find the FASTA file you want to change.
   3. Select and copy the full file path specified for the file in the "Fasta File Name" column
   4. In the "Protein Annotations Loaded" web part, click on the "Load New Annot File" link
   5. This will open the "Load Protein Annotations" page
         1. Paste the full file path from step 3 into the "Full file path" field
         2. In the comment field, add a comment specifying that this is the reloaded FASTA file
         3. Type: choose fasta
         4. Enter the desired organism in the "Default Organism" field
         5. Hit the Load Annotation button
   6. This will bring you back to the Protein Database Admin console. Hit the refresh button to see the newly loaded FASTA file and annotations.

NOTE: You will now see 2 entries in the FASTA files and Protein Annotations web parts associated with the same FASTA file path. This is expected. One of the entries will be associated with previously searched results and the other with newly imported results.

All future searches using the FASTA file will now use this newly loaded version, which is associated with the organism you specified above.


Further Information
=======================
By default, when LabKey loads results that were searched against a new FASTA file, it loads the FASTA file, including all sequences and any annotations that can be parsed from the FASTA header line. Every annotation is associated with an organism and a sequence. LabKey uses several heuristics to guess the organism from the annotations. These heuristics work fairly well, but not perfectly.
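
To give a rough idea of what guessing the organism from a header line can look like, here is a minimal Python sketch. It is NOT LabKey's actual code; the function names (guess_organism, tally_organisms) are made up for illustration. It assumes only two common header conventions: the "OS=" field used in UniProt FASTA headers and the trailing "[Organism name]" often seen in NCBI headers. Anything else falls back to "Unknown unknown".

```python
import re
from collections import Counter

# UniProt-style headers carry the organism in an "OS=" field, which runs
# until the next two-letter field such as "OX=" or "GN=" (or end of line).
_OS_FIELD = re.compile(r"\bOS=(.+?)(?=\s+[A-Z]{2}=|$)")

# NCBI-style headers often end with the organism in square brackets.
_BRACKETED = re.compile(r"\[([^\[\]]+)\]\s*$")


def guess_organism(header):
    """Hypothetical helper: guess the organism named in one FASTA header line."""
    header = header.lstrip(">").strip()
    match = _OS_FIELD.search(header)
    if match:
        return match.group(1).strip()
    match = _BRACKETED.search(header)
    if match:
        return match.group(1).strip()
    # No recognizable organism information in the header.
    return "Unknown unknown"


def tally_organisms(lines):
    """Hypothetical helper: count guessed organisms over all header lines."""
    return Counter(guess_organism(line) for line in lines if line.startswith(">"))


if __name__ == "__main__":
    example = [
        ">sp|P69905|HBA_HUMAN Hemoglobin subunit alpha OS=Homo sapiens OX=9606 GN=HBA1 PE=1 SV=2",
        "MVLSPADKTNVKAAWGKVGA",
        ">gi|4504347|ref|NP_000549.1| hemoglobin subunit alpha [Homo sapiens]",
        "MVLSPADKTNVKAAWGKVGA",
        ">my_protein_1 custom sequence",
        "MKTAYIAKQR",
    ]
    for organism, count in tally_organisms(example).items():
        print(organism, count)
```

Running something like this over the header lines of your own FASTA file is a quick way to see whether the headers contain any organism information at all. If most of them come back as "Unknown unknown", setting the default organism by hand as described above is probably the safer route.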

If you know that your FASTA file comes from, say, Human or Mouse samples, you should load the FASTA file via the Protein Databases Admin console before performing any searches using that FASTA file. Currently, this is the only way you can specify the default organism to be used for a given FASTA file.



-Brian
 
mboehmer responded:  2011-01-27 16:30
Thanks Brian,
it worked. Nice to have all the UniProt and NCBI links available now as well.

-Maik