Example 1: Review a Basic XAR.xml

2024-04-19

Experiment runs are described by a researcher as a series of experimental steps performed on specific inputs, producing specific outputs. The researcher can define any attributes that may be important to the study and can associate these attributes with any step, input, or output. These attributes are known as experimental annotations. Experiment descriptions and annotations can be saved in an XML document known as an eXperimental ARchive or xar (pronounced zar) file.

The best way to understand the format of a xar.xml document is to walk through a simple example. The example experiment run starts with a sample (Material) and ends up with some analysis results (Data). In LabKey Server, this example run looks like the following:

In the summary view, the red hexagon in the middle represents the Example 1 experiment run as a whole. It starts with one input Material object and produces one output Data object. Clicking the Example 1 node brings up the details view, which shows the protocol steps that make up the run. There are two steps: a "prepare sample" step, which takes the starting Material as input and outputs a prepared Material, followed by an "analyze sample" step, which performs an assay of the prepared Material to produce data results. Note that only the data results are designated as an output of the run (that is, shown as an output of the run in the summary view, and marked with a black diamond and the word "Output" in the details view). If the prepared sample were to be used again for another assay, it too might be marked as an output of the run. The designation of which Material or Data objects constitute the output of a run is entirely up to the researcher.

The xar.xml file that produces the above experiment structure is shown in the following table. The schema document for this XML instance document is XarSchema_minimum.xsd. (This XSD file is a slightly pared-down subset of the schema that is compiled into the LabKey Server source project; it does not include some types and element nodes that are being redesigned.)

Table 1:  Xar.xml for a simple 2-step protocol

First, note the major sections of the document, highlighted in yellow:

ExperimentArchive (root): the document node, which specifies the namespaces used by the document and (optionally) a path to a schema file for validation.

Experiment: a section that describes exactly one experiment, which is associated with the run(s) described in this xar.xml.

ProtocolDefinitions: a section that describes the protocols used by the run(s) in this document. These protocols can be listed in any order. Note that four protocols are defined for this example: two detail protocols (Sample prep and Example analysis) and two “bookend” protocols. One bookend represents the start of the run (Example 1 protocol, of type ExperimentRun) and the other marks the run outputs (Mark run outputs, of type ExperimentRunOutput).

 

Also note the long string highlighted in blue, beginning with “urn:lsid:…”. This string is called an LSID, short for Life Sciences Identifier. LSIDs play a key role in LabKey Server. The highlighted LSID identifies the Protocol that describes the run as a whole. The run protocol LSID is repeated in several places in the xar.xml; the LSIDs at these locations must match for the xar.xml to load correctly. (The reason for the repetition is that the format is designed to handle multiple ExperimentRuns involving possibly different run protocols.)
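To make the structure of an LSID more concrete, the following Python sketch splits the run protocol LSID from this example into its parts. It assumes the general LSID layout urn:lsid:authority:namespace:object with an optional trailing revision; the names parse_lsid and Lsid are hypothetical helpers for illustration and are not part of LabKey Server.

```python
from typing import NamedTuple, Optional


class Lsid(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]


def parse_lsid(lsid: str) -> Lsid:
    """Split an LSID string into its colon-separated parts."""
    parts = lsid.split(":")
    # Assumed layout: urn:lsid:authority:namespace:object[:revision]
    if len(parts) not in (5, 6) or [p.lower() for p in parts[:2]] != ["urn", "lsid"]:
        raise ValueError(f"not an LSID: {lsid!r}")
    revision = parts[5] if len(parts) == 6 else None
    return Lsid(parts[2], parts[3], parts[4], revision)


run_protocol = parse_lsid("urn:lsid:localhost:Protocol:MinimalRunProtocol.FixedLSID")
print(run_protocol.authority)   # localhost
print(run_protocol.namespace)   # Protocol
print(run_protocol.object_id)   # MinimalRunProtocol.FixedLSID
```

Template values such as ${FolderLSIDBase}:Tutorial are not complete LSIDs until the server substitutes them, so this sketch rejects them rather than guessing at the expanded form.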

<?xml version="1.0" encoding="UTF-8"?>
<exp:ExperimentArchive xmlns:exp="http://cpas.fhcrc.org/exp/xml"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://cpas.fhcrc.org/exp/xml XarSchema_minimum.xsd">
   <exp:Experiment rdf:about="${FolderLSIDBase}:Tutorial">
      <exp:Name>Tutorial Examples</exp:Name>
      <exp:Comments>Examples of xar.xml files.</exp:Comments>
   </exp:Experiment>
   <exp:ProtocolDefinitions>
      <exp:Protocol rdf:about="urn:lsid:localhost:Protocol:MinimalRunProtocol.FixedLSID">
         <exp:Name>Example 1 protocol</exp:Name>
         <exp:ProtocolDescription>This protocol is the "parent" protocol of the run.  Its inputs are …</exp:ProtocolDescription>
         <exp:ApplicationType>ExperimentRun</exp:ApplicationType>
         <exp:MaxInputMaterialPerInstance xsi:nil="true"/>
         <exp:MaxInputDataPerInstance xsi:nil="true"/>
         <exp:OutputMaterialPerInstance xsi:nil="true"/>
         <exp:OutputDataPerInstance xsi:nil="true"/>
      </exp:Protocol>
      <exp:Protocol rdf:about="urn:lsid:localhost:Protocol:SamplePrep">
         <exp:Name>Sample prep protocol</exp:Name>
         <exp:ProtocolDescription>Describes sample handling and preparation steps</exp:ProtocolDescription>
         <exp:ApplicationType>ProtocolApplication</exp:ApplicationType>
         <exp:MaxInputMaterialPerInstance>1</exp:MaxInputMaterialPerInstance>
         <exp:MaxInputDataPerInstance>0</exp:MaxInputDataPerInstance>
         <exp:OutputMaterialPerInstance>1</exp:OutputMaterialPerInstance>
         <exp:OutputDataPerInstance>0</exp:OutputDataPerInstance>
      </exp:Protocol>
      <exp:Protocol rdf:about="urn:lsid:localhost:Protocol:Analyze">
         <exp:Name>Example analysis protocol</exp:Name>
         <exp:ProtocolDescription>Describes analysis procedures and settings</exp:ProtocolDescription>
         <exp:ApplicationType>ProtocolApplication</exp:ApplicationType>
         <exp:MaxInputMaterialPerInstance>1</exp:MaxInputMaterialPerInstance>
         <exp:MaxInputDataPerInstance>0</exp:MaxInputDataPerInstance>
         <exp:OutputMaterialPerInstance>0</exp:OutputMaterialPerInstance>
         <exp:OutputDataPerInstance>1</exp:OutputDataPerInstance>
         <exp:OutputDataType>Data</exp:OutputDataType>
      </exp:Protocol>
      <exp:Protocol rdf:about="urn:lsid:localhost:Protocol:MarkRunOutput">
         <exp:Name>Mark run outputs</exp:Name>
         <exp:ProtocolDescription>Mark the output data or materials for the run.  Any and all inputs…</exp:ProtocolDescription>
         <exp:ApplicationType>ExperimentRunOutput</exp:ApplicationType>
         <exp:MaxInputMaterialPerInstance xsi:nil="true"/>
         <exp:MaxInputDataPerInstance xsi:nil="true"/>
         <exp:OutputMaterialPerInstance>0</exp:OutputMaterialPerInstance>
         <exp:OutputDataPerInstance>0</exp:OutputDataPerInstance>
      </exp:Protocol>
   </exp:ProtocolDefinitions>
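
As a quick way to see the two bookend protocols and the two detail protocols, the ProtocolDefinitions section can be grouped by ApplicationType. The sketch below uses Python's standard xml.etree.ElementTree on an abbreviated copy of the section (names and types only); the function name protocols_by_type and the XAR_FRAGMENT constant are hypothetical and only serve this illustration.

```python
import xml.etree.ElementTree as ET

# Namespace prefix used in the example xar.xml.
NS = {"exp": "http://cpas.fhcrc.org/exp/xml"}

# Abbreviated copy of the ProtocolDefinitions section shown above.
XAR_FRAGMENT = """\
<exp:ExperimentArchive xmlns:exp="http://cpas.fhcrc.org/exp/xml"
                       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <exp:ProtocolDefinitions>
    <exp:Protocol rdf:about="urn:lsid:localhost:Protocol:MinimalRunProtocol.FixedLSID">
      <exp:Name>Example 1 protocol</exp:Name>
      <exp:ApplicationType>ExperimentRun</exp:ApplicationType>
    </exp:Protocol>
    <exp:Protocol rdf:about="urn:lsid:localhost:Protocol:SamplePrep">
      <exp:Name>Sample prep protocol</exp:Name>
      <exp:ApplicationType>ProtocolApplication</exp:ApplicationType>
    </exp:Protocol>
    <exp:Protocol rdf:about="urn:lsid:localhost:Protocol:Analyze">
      <exp:Name>Example analysis protocol</exp:Name>
      <exp:ApplicationType>ProtocolApplication</exp:ApplicationType>
    </exp:Protocol>
    <exp:Protocol rdf:about="urn:lsid:localhost:Protocol:MarkRunOutput">
      <exp:Name>Mark run outputs</exp:Name>
      <exp:ApplicationType>ExperimentRunOutput</exp:ApplicationType>
    </exp:Protocol>
  </exp:ProtocolDefinitions>
</exp:ExperimentArchive>
"""


def protocols_by_type(xml_text):
    """Group protocol names by their ApplicationType, in document order."""
    root = ET.fromstring(xml_text)
    grouped = {}
    for protocol in root.findall("exp:ProtocolDefinitions/exp:Protocol", NS):
        app_type = protocol.findtext("exp:ApplicationType", namespaces=NS)
        name = protocol.findtext("exp:Name", namespaces=NS)
        grouped.setdefault(app_type, []).append(name)
    return grouped


for app_type, names in protocols_by_type(XAR_FRAGMENT).items():
    print(app_type, names)
```

For this example the grouping contains one ExperimentRun protocol, two ProtocolApplication protocols, and one ExperimentRunOutput protocol.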

 

The next major section of the xar.xml is ProtocolActionDefinitions, which describes the order in which the protocols are applied in this run. A ProtocolActionSet defines a set of “child” protocols within a parent protocol. The parent protocol must be of type ExperimentRun. Each action (child protocol) within the set (experiment run protocol) is assigned an integer called an ActionSequence number. ActionSequence numbers must be positive, ascending integers, but are otherwise arbitrarily assigned. (When hand-authoring xar.xml files, it is useful to leave gaps in the numbering between actions so that new steps can be inserted between existing steps without renumbering all nodes.) The ActionSet always starts with a root action, which is the ExperimentRun node listed as a child of itself.

 

   <exp:ProtocolActionDefinitions>
      <exp:ProtocolActionSet ParentProtocolLSID="urn:lsid:localhost:Protocol:MinimalRunProtocol.FixedLSID">
         <exp:ProtocolAction ChildProtocolLSID="urn:lsid:localhost:Protocol:MinimalRunProtocol.FixedLSID" ActionSequence="1">
            <exp:PredecessorAction ActionSequenceRef="1"/>
         </exp:ProtocolAction>
         <exp:ProtocolAction ChildProtocolLSID="urn:lsid:localhost:Protocol:SamplePrep" ActionSequence="10">
            <exp:PredecessorAction ActionSequenceRef="1"/>
         </exp:ProtocolAction>
         <exp:ProtocolAction ChildProtocolLSID="urn:lsid:localhost:Protocol:Analyze" ActionSequence="20">
            <exp:PredecessorAction ActionSequenceRef="10"/>
         </exp:ProtocolAction>
         <exp:ProtocolAction ChildProtocolLSID="urn:lsid:localhost:Protocol:MarkRunOutput" ActionSequence="30">
            <exp:PredecessorAction ActionSequenceRef="20"/>
         </exp:ProtocolAction>
      </exp:ProtocolActionSet>
   </exp:ProtocolActionDefinitions>
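
The rules for a ProtocolActionSet described above can be checked mechanically before a hand-authored file is loaded. The following Python sketch is one possible check, not LabKey Server's own validation: it reads "ascending" as ascending in document order, and it assumes that every PredecessorAction should refer to an ActionSequence number defined in the same set. The function name check_action_set and the XAR_FRAGMENT constant are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"exp": "http://cpas.fhcrc.org/exp/xml"}

# Copy of the ProtocolActionDefinitions section shown above.
XAR_FRAGMENT = """\
<exp:ExperimentArchive xmlns:exp="http://cpas.fhcrc.org/exp/xml">
  <exp:ProtocolActionDefinitions>
    <exp:ProtocolActionSet ParentProtocolLSID="urn:lsid:localhost:Protocol:MinimalRunProtocol.FixedLSID">
      <exp:ProtocolAction ChildProtocolLSID="urn:lsid:localhost:Protocol:MinimalRunProtocol.FixedLSID" ActionSequence="1">
        <exp:PredecessorAction ActionSequenceRef="1"/>
      </exp:ProtocolAction>
      <exp:ProtocolAction ChildProtocolLSID="urn:lsid:localhost:Protocol:SamplePrep" ActionSequence="10">
        <exp:PredecessorAction ActionSequenceRef="1"/>
      </exp:ProtocolAction>
      <exp:ProtocolAction ChildProtocolLSID="urn:lsid:localhost:Protocol:Analyze" ActionSequence="20">
        <exp:PredecessorAction ActionSequenceRef="10"/>
      </exp:ProtocolAction>
      <exp:ProtocolAction ChildProtocolLSID="urn:lsid:localhost:Protocol:MarkRunOutput" ActionSequence="30">
        <exp:PredecessorAction ActionSequenceRef="20"/>
      </exp:ProtocolAction>
    </exp:ProtocolActionSet>
  </exp:ProtocolActionDefinitions>
</exp:ExperimentArchive>
"""


def check_action_set(action_set):
    """Return a list of problems found in one ProtocolActionSet element."""
    problems = []
    parent = action_set.get("ParentProtocolLSID")
    actions = action_set.findall("exp:ProtocolAction", NS)
    sequences = [int(a.get("ActionSequence")) for a in actions]

    # The set starts with a root action: the run protocol listed as a child of itself.
    if not actions or actions[0].get("ChildProtocolLSID") != parent:
        problems.append("first action is not the parent protocol itself")

    # ActionSequence numbers must be positive and ascending (gaps are allowed).
    if any(s <= 0 for s in sequences):
        problems.append("ActionSequence numbers must be positive")
    if any(b <= a for a, b in zip(sequences, sequences[1:])):
        problems.append("ActionSequence numbers are not ascending")

    # Assumption: each predecessor reference points at a sequence number in this set.
    known = set(sequences)
    for action in actions:
        for pred in action.findall("exp:PredecessorAction", NS):
            ref = int(pred.get("ActionSequenceRef"))
            if ref not in known:
                problems.append(f"unknown predecessor ActionSequenceRef {ref}")
    return problems


root = ET.fromstring(XAR_FRAGMENT)
for action_set in root.findall("exp:ProtocolActionDefinitions/exp:ProtocolActionSet", NS):
    print(check_action_set(action_set))
```

Because only ordering matters, the gaps between 1, 10, 20, and 30 pass the check, which is what makes it possible to add a step numbered, say, 15 later without renumbering.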