Hi,
We are using LabKey 9.3 on a Linux system.
When we run Mascot (2.1.03) remotely, some result imports fail when the matched proteins have special characters in their FASTA header lines.
I have installed cgi/labkeydbmgmt.pl on our Mascot server as per your instructions.
[BTW, there is no Mascot 2.1.3 that I know of; the current Linux version is 2.1.03 and the current Windows version is 2.2.x. Please correct your "mascot setup" documentation. I wasted half an hour on that alone.]
LabKey runs the Mascot queries very well -- thanks! But it sometimes fails while loading the results and gives this error:
11 Feb 2010 22:15:27,918 INFO : Starting to import spectra from /home/labkey/Masc_Larry/Sample3746_SCXf14_IMACe_lcmsms_1.mzXML
11 Feb 2010 22:15:28,024 INFO : Importing MS/MS results is 32% complete
11 Feb 2010 22:15:28,380 INFO : Importing MS/MS results is 33% complete
11 Feb 2010 22:15:28,889 INFO : Importing MS/MS results is 34% complete
11 Feb 2010 22:15:29,178 ERROR: XMLStreamException in hasNext()
com.ctc.wstx.exc.WstxParsingException: Unexpected close tag </search_hit>; expected </psi>.
at [row,col {unknown-source}]: [258980,12]
at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:605)
at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:461)
at com.ctc.wstx.sr.BasicStreamReader.reportWrongEndElem(BasicStreamReader.java:3256)
(and stops after a few of these error lines).
I have found this to be solely due to special characters in the FASTA files; replacing those characters immediately fixes the problem.
Here are the cases where LabKey fails to write the imported header text into XML properly:
IPI human:
>IPI:IPI00431197.3| .... intron 4&9 variant
>IPI:IPI00465120.3|TREMBL:P78550;Q6I955 Tax_Id=9606 Gene_Symbol=- 3<beta>-HSD <psi>1 protein
>IPI:IPI00816409.1|TREMBL:A0N5T0 Tax_Id=9606 Gene_Symbol=- V<gamma>1 protein (Fragment)
>IPI:IPI00816761.1|TREMBL:Q16366 Tax_Id=9606 Gene_Symbol=CREB1 <alpha>CREB-1 protein (Fragment)
The first fails because of the ampersand followed by a number.
The second (and, I expect, the others too) fails because <psi> is read as an opening XML tag that never gets a matching end tag.
They are unusual characters in the FASTA files, so I expect you just missed them when testing the otherwise excellent app.
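To illustrate the kind of fix I have in mind, here is a minimal sketch in Python. It is not LabKey's actual code: the function name escape_fasta_description is hypothetical, and wrapping the text in <search_hit> is only for the demonstration (the element name is taken from the error above). It assumes the header description is written into the XML as plain text, and escapes the &, < and > characters first using the standard library.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET


def escape_fasta_description(description):
    """Hypothetical helper: make a FASTA description safe to embed as XML text.

    xml.sax.saxutils.escape replaces & with &amp;, < with &lt; and > with &gt;.
    """
    return escape(description)


# Description fragments taken from the failing IPI human headers.
descriptions = [
    "intron 4&9 variant",
    "3<beta>-HSD <psi>1 protein",
    "V<gamma>1 protein (Fragment)",
    "<alpha>CREB-1 protein (Fragment)",
]

for text in descriptions:
    # Illustrative wrapper element only; the real writer may differ.
    fragment = "<search_hit>" + escape_fasta_description(text) + "</search_hit>"
    # ET.fromstring raises ParseError if the fragment is not well-formed XML.
    element = ET.fromstring(fragment)
    # The parser hands back the original, unescaped description.
    assert element.text == text
```

With escaping like this, all four descriptions parse cleanly and come back unchanged. If the description is written into an attribute value instead, quote characters would need escaping as well (xml.sax.saxutils.quoteattr does that).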
Can you fix this?
thanks,
Kutbuddin