Register a Sequence or Molecule

Identity Service

When a sequence or molecule is registered, the "identity" of the sequence is used to determine uniqueness. External tools can use the following API to get the identity of a sequence prior to registration. To get the identity of a sequence, use either identity/get.api or identity/ensure.api. The get.api action only looks up an existing identity, while ensure.api also creates a new identity if the sequence hasn't been added yet. The examples below call get.api; substitute ensure.api to create missing identities.

To get or ensure the identity of a single nucleotide sequence:

LABKEY.Ajax.request({
    url: LABKEY.ActionURL.buildURL("identity", "get.api"),
    jsonData: {
        items: [{
            type: "nucleotide", data: "GATTACA"
        }]
    }
});

To get or ensure the identity of a collection of sequences:

LABKEY.Ajax.request({
    url: LABKEY.ActionURL.buildURL("identity", "get.api"),
    jsonData: {
        items: [{
            type: "nucleotide", data: "GATTACA"
        },{
            type: "protein", data: "ELVISLIVES"
        }]
    }
});

To get or ensure the identity of a molecule containing multiple protein sequences:

LABKEY.Ajax.request({
    url: LABKEY.ActionURL.buildURL("identity", "get.api"),
    jsonData: {
        items: [{
            type: "molecule", items: [{
                type: "protein", data: "MAL", count: 2
            },{
                type: "protein", data: "SYE", count: 3
            }]
        }]
    }
});
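
The choice between get.api and ensure.api can be wrapped in a small helper. The following is a minimal sketch, not part of the LabKey API: buildIdentityRequest and fakeBuildURL are hypothetical names, and because the response format is not described in this topic, the callbacks only log what comes back.

```javascript
// Hypothetical helper: builds the config object passed to LABKEY.Ajax.request.
// createIfMissing selects ensure.api (creates a new identity) over get.api (lookup only).
function buildIdentityRequest(items, createIfMissing, buildURL) {
    if (!Array.isArray(items) || items.length === 0) {
        throw new Error("items must be a non-empty array");
    }
    var action = createIfMissing ? "ensure.api" : "get.api";
    return {
        url: buildURL("identity", action),
        jsonData: { items: items }
    };
}

// Stand-in URL builder (an assumption) so the sketch also runs outside a LabKey page.
function fakeBuildURL(controller, action) {
    return "/" + controller + "-" + action;
}

var items = [{ type: "nucleotide", data: "GATTACA" }];
var config = buildIdentityRequest(items, true, fakeBuildURL);
console.log(config.url); // prints "/identity-ensure.api"

// On a LabKey page, use the real URL builder and send the request.
if (typeof LABKEY !== "undefined") {
    var live = buildIdentityRequest(items, true, function (controller, action) {
        return LABKEY.ActionURL.buildURL(controller, action);
    });
    live.success = function (response) {
        // The response format is not documented here, so log the raw text.
        console.log(response.responseText);
    };
    live.failure = function (response) {
        console.error("Identity request failed with status " + response.status);
    };
    LABKEY.Ajax.request(live);
}
```

Passing the URL builder as an argument keeps the helper independent of the page it runs on, which makes it easier to try out before pointing it at a server.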

Manual Entry

You can enter molecules, sequences, cell lines, etc., using the registration wizards. For an example of using a wizard, see Register Nucleotide Sequences.

Cut-and-Paste or Import from a File

The LabKey import data page can also be used to register new entities. For example, to register new nucleotide sequences:

  • Go to the list of all entity types (the DataClasses web part).
  • Click NucSequence.
  • In the grid, select > Import Bulk Data.
  • Upload an Excel file or paste in text (select tsv or csv) in the format:
Description    protSeqId    translationFrame    translationStart    translationEnd    sequence
Anti_IGF-1     PS-7         0                   0                   0                 caggtg...

When importing a nucleotide sequence with a related protSeqId using the protein sequence's name, you will need to click the Import Lookups By Alternate Key checkbox on the Import Data page. The Name column may be provided, but will be auto-generated if it isn't. The Ident column will be auto-generated based upon the sequence.
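
If the rows to import already exist as objects in a script, the tab-separated text for pasting can be generated rather than typed. This is a minimal sketch: toTsv is a hypothetical helper, the column names are taken from the format above, and the sequence value is a short made-up placeholder.

```javascript
// Hypothetical helper: turns an array of row objects into tab-separated text.
function toTsv(columns, rows) {
    var lines = [columns.join("\t")];
    rows.forEach(function (row) {
        var cells = columns.map(function (col) {
            var value = row[col];
            // Missing values become empty cells; tabs and newlines inside a
            // value would break the column layout, so collapse them to a space.
            return value === undefined || value === null
                ? ""
                : String(value).replace(/[\t\r\n]+/g, " ");
        });
        lines.push(cells.join("\t"));
    });
    return lines.join("\n");
}

var tsv = toTsv(
    ["Description", "protSeqId", "sequence"],
    [{ Description: "Anti_IGF-1", protSeqId: "PS-7", sequence: "caggtg" }]
);
console.log(tsv);
```

The Name and Ident columns are left out here because, as noted above, they are auto-generated when not provided.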

To register new protein sequences:

  • Go to the list of all entity types.
  • Click ProtSequence.
  • In the grid, select > Import Bulk Data.
  • Paste in a TSV file or upload an Excel file in the format below.
Name    Description    chainFormatId    organisms            sequence
PS-1    PS1038-1       1                ["human","mouse"]    MALWMRL...

To register new molecules:

  • Go to the list of all entity types.
  • Click Molecule.
  • In the grid, select > Import Bulk Data.
  • Paste in data of the format:
Description    structureFormatId    seqType    components
description    1                    1          [{type: "nucleotide", name: "NS-1", stoichiometry: 3}]
description    1                    3          [{type: "protein", name: "PS-1", stoichiometry: 2}, {type: "protein", ident: "ips:1234"}]
description    1                    3          [{type: "chemical", name: "CH-1"}, {type: "molecule", name: "M-1"}]

Note that the set of components is provided as a JSON array containing one or more sequences, chemistry linkers, or other molecules. Each JSON object can refer to an existing entity by name (e.g., "NS-1" or "PS-1") or by the identity of the previously registered entity (e.g., "ips:1234" or "m:7890"). If the entity isn't found in the database, an error is thrown. For now, all components must be registered before registering a molecule.
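
The rule that each component needs a type plus either a name or an identity can be checked before the import text is built. This is a minimal sketch with a hypothetical helper, componentsCell. Note that JSON.stringify emits quoted keys, unlike the examples above; that the importer accepts both forms is an assumption.

```javascript
// Hypothetical helper: validates a components array and serializes it
// for the "components" column of a molecule import.
function componentsCell(components) {
    components.forEach(function (component, i) {
        if (!component.type) {
            throw new Error("component " + i + " is missing a type");
        }
        // Each component refers to a registered entity by name or by identity.
        if (!component.name && !component.ident) {
            throw new Error("component " + i + " needs a name or an ident");
        }
    });
    // Assumption: strict JSON with quoted keys is accepted by the importer.
    return JSON.stringify(components);
}

var cell = componentsCell([
    { type: "protein", name: "PS-1", stoichiometry: 2 },
    { type: "protein", ident: "ips:1234" }
]);
console.log(cell);
```

This only checks the shape of each component. Whether "PS-1" or "ips:1234" is actually registered is still decided by the server at import time.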

Register via Query API

The client APIs can also be used to register new entities.

Register Nucleotide Sequence

From your browser's dev tools console, enter the following to register new nucleotide sequences:

LABKEY.Query.insertRows({
    schemaName: "exp.data",
    queryName: "NucSequence",
    rows: [{
        description: "from the client api",
        sequence: "gattaca"
    },{
        description: "another",
        sequence: "cattaga"
    }]
});
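
The call above does not report whether the insert worked. LABKEY.Query.insertRows also accepts success and failure callbacks; the sketch below attaches them through a hypothetical helper, nucSequenceInsertConfig, and simulates a server reply locally so the success path can be seen without a server.

```javascript
// Hypothetical helper: builds an insertRows config with callbacks attached.
// log is any function that accepts a message string.
function nucSequenceInsertConfig(rows, log) {
    return {
        schemaName: "exp.data",
        queryName: "NucSequence",
        rows: rows,
        success: function (data) {
            // data.rows holds the inserted rows as returned by the server.
            log("Inserted " + data.rows.length + " row(s)");
        },
        failure: function (errorInfo) {
            log("Insert failed: " + (errorInfo && errorInfo.exception));
        }
    };
}

var newRows = [{ description: "from the client api", sequence: "gattaca" }];
var messages = [];
var insertConfig = nucSequenceInsertConfig(newRows, function (msg) {
    messages.push(msg);
});

// Simulated reply (not a real server response) to exercise the callback.
insertConfig.success({ rows: [{ description: "from the client api" }] });
console.log(messages[0]); // prints "Inserted 1 row(s)"

// On a LabKey page, send the real request and log to the console.
if (typeof LABKEY !== "undefined") {
    LABKEY.Query.insertRows(
        nucSequenceInsertConfig(newRows, function (msg) { console.log(msg); })
    );
}
```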

Register Protein Sequence

To register new protein sequences:

LABKEY.Query.insertRows({
    schemaName: "exp.data",
    queryName: "ProtSequence",
    rows: [{
        name: "PS-100",
        description: "from the client api",
        chainFormatId: 1,
        sequence: "ML"
    }]
});

Register Molecule

To register new molecules:

LABKEY.Query.insertRows({
    schemaName: "exp.data",
    queryName: "Molecule",
    rows: [{
        description: "from the client api",
        structureFormatId: 1,
        components: [{
            type: "nucleotide", name: "NS-202"
        }]
    },{
        description: "another",
        structureFormatId: 1,
        components: [{
            type: "protein", name: "PS-1"
        },{
            type: "protein", name: "PS-100", count: 5
        }]
    }]
});

Lineage, Derivation, and Samples

Parent/child relationships within an entity type are modeled using derivation. For example, the details page for this nucleotide sequence (NS-3) shows that two other sequences (NS-33 and NS-34) have been derived from it.

To create new children, you can use the "experiment/derive.api" API, but note that it is still subject to change. dataInputs is an array of parents, each with an optional role. targetDataClass is the LSID of the entity type of the derived data. dataOutputs is an array of children, each with an optional role and a set of values.

LABKEY.Ajax.request({
    url: LABKEY.ActionURL.buildURL("experiment", "derive.api"),
    jsonData: {
        dataInputs: [{
            rowId: 1083
        }],
        targetDataClass: "urn:lsid:labkey.com:DataClass.Folder-5:NucSequence",
        dataOutputs: [{
            role: "derived",
            values: {
                description: "derived!",
                sequence: "CAT"
            }
        }]
    }
});

Samples will be attached to an entity using derivation. Instead of a targetDataClass and dataOutputs, use a targetSampleSet and materialOutputs. For example:

LABKEY.Ajax.request({
    url: LABKEY.ActionURL.buildURL("experiment", "derive.api"),
    jsonData: {
        dataInputs: [{
            rowId: 1083
        }],
        targetSampleSet: "urn:lsid:labkey.com:SampleSet.Folder-5:Samples",
        materialOutputs: [{
            role: "sample",
            values: {
                name: "new sample!",
                measurement: 42
            }
        }]
    }
});
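
The two derive.api payloads differ only in which target and output keys they use, so they can be produced by one function. This is a minimal sketch: buildDerivePayload and its kind argument are hypothetical, and since the API is described above as subject to change, the key names may need updating.

```javascript
// Hypothetical helper: builds the jsonData for experiment/derive.api.
// kind is "data" for derived entities or "material" for samples.
function buildDerivePayload(parentRowIds, kind, targetLsid, outputs) {
    var payload = {
        dataInputs: parentRowIds.map(function (rowId) {
            return { rowId: rowId };
        })
    };
    if (kind === "data") {
        payload.targetDataClass = targetLsid;
        payload.dataOutputs = outputs;
    } else if (kind === "material") {
        payload.targetSampleSet = targetLsid;
        payload.materialOutputs = outputs;
    } else {
        throw new Error('kind must be "data" or "material"');
    }
    return payload;
}

var samplePayload = buildDerivePayload(
    [1083],
    "material",
    "urn:lsid:labkey.com:SampleSet.Folder-5:Samples",
    [{ role: "sample", values: { name: "new sample!", measurement: 42 } }]
);
console.log(JSON.stringify(samplePayload, null, 2));

// On a LabKey page, send the payload to the server.
if (typeof LABKEY !== "undefined") {
    LABKEY.Ajax.request({
        url: LABKEY.ActionURL.buildURL("experiment", "derive.api"),
        jsonData: samplePayload
    });
}
```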

Register Parents/Inputs

To indicate parents/inputs when registering an entity, use the columns "DataInputs/<DataClassName>" or "MaterialInputs/<SampleSetName>". The value in the column is a comma-separated list of parent names in a single string. This works for both DataClass and SampleSet:

LABKEY.Query.insertRows({
    schemaName: "exp.data",
    queryName: "MyDataClass",
    rows: [{
        name: "blush",
        "DataInputs/MyDataClass": "red wine, white wine"
    }]
});

The following example inserts both parents and children into the same sample set:

LABKEY.Query.insertRows({
    schemaName: "samples",
    queryName: "SamplesAPI",
    rows: [{
        name: "red wine"
    },{
        name: "white wine"
    },{
        name: "blush",
        "MaterialInputs/SamplesAPI": "red wine, white wine"
    }]
});
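
When parent names come from an array, the column name and its comma-separated value can be assembled in code. This is a minimal sketch with a hypothetical helper, withParents. Rejecting names that contain a comma is a precaution based on the assumption that such a name would be split into two parents.

```javascript
// Hypothetical helper: returns a copy of row with a
// "DataInputs/<name>" or "MaterialInputs/<name>" column added.
function withParents(row, inputType, typeName, parentNames) {
    if (inputType !== "DataInputs" && inputType !== "MaterialInputs") {
        throw new Error("inputType must be DataInputs or MaterialInputs");
    }
    parentNames.forEach(function (name) {
        // Assumption: a comma inside a name would be read as a separator.
        if (name.indexOf(",") !== -1) {
            throw new Error("parent name contains a comma: " + name);
        }
    });
    var copy = Object.assign({}, row);
    copy[inputType + "/" + typeName] = parentNames.join(", ");
    return copy;
}

var blush = withParents(
    { name: "blush" },
    "MaterialInputs",
    "SamplesAPI",
    ["red wine", "white wine"]
);
console.log(blush["MaterialInputs/SamplesAPI"]); // prints "red wine, white wine"
```

The returned object can be placed directly in the rows array of an insertRows call like the ones above.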
