The expression matrix assay ties expression-level information to sample and feature/probe information. After appropriate files are loaded into the system, users can explore microarray results by building queries and visualizations based on feature/probe properties (such as genes) and sample properties.
Expression data may be manually extracted from Gene Expression Omnibus (GEO), transformed, and imported to LabKey Server.
Files loaded include:
- Metadata about features/probes (typically at the plate level)
- Sample information
- Actual expression data (often called a "series matrix" file)
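The "transform" step for a GEO download can be sketched as follows. This is a minimal illustration, not a LabKey tool: it assumes the GEO series matrix layout in which metadata lines start with "!" and the expression table sits between the `!series_matrix_table_begin` and `!series_matrix_table_end` marker lines. The function name `extract_matrix_table` and the sample text are hypothetical.

```python
import csv
import io


def extract_matrix_table(series_matrix_text):
    """Return the expression table of a GEO series matrix as plain TSV text.

    Assumption: the table is delimited by the '!series_matrix_table_begin'
    and '!series_matrix_table_end' marker lines.
    """
    table_lines = []
    in_table = False
    for line in series_matrix_text.splitlines():
        if line.startswith("!series_matrix_table_begin"):
            in_table = True
            continue
        if line.startswith("!series_matrix_table_end"):
            break
        if in_table and line.strip():
            table_lines.append(line)

    # The csv module strips the double quotes GEO puts around ids and headers.
    reader = csv.reader(io.StringIO("\n".join(table_lines)), delimiter="\t")
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


# Hypothetical, heavily shortened series matrix content.
example = (
    '!Series_title\t"hypothetical series"\n'
    "!series_matrix_table_begin\n"
    '"ID_REF"\t"GSM280331"\t"GSM280332"\n'
    '"1007_s_at"\t7.17\t7.32\n'
    "!series_matrix_table_end\n"
)
print(extract_matrix_table(example))
```

The result starts with an ID_REF column followed by one column per sample, which is the shape the run file is expected to have.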
Review File Formats
In order to use the assay, you will need three sets of data: a run file, a sample set, and a feature annotation file.
The run file has one column of probe ids (ID_REF) followed by a variable number of columns, each named after a sample in your sample set. Every value in the ID_REF column must match a value in the Probe_ID column of your feature annotation file, and every other column header must match a sample in your sample set.
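Because these relationships determine whether a run can be imported, it can help to check them locally first. The sketch below is an illustration only and not part of LabKey Server: `check_run` is a hypothetical helper, and the sample and probe ids passed to it are assumed to have been read from your own sample set and feature annotation file.

```python
import csv
import io


def check_run(run_tsv, sample_ids, probe_ids):
    """Return (unknown_samples, unknown_probes) for a run file given as TSV text.

    sample_ids: ids from the sample set.
    probe_ids:  values of the Probe_ID column in the feature annotation file.
    """
    reader = csv.reader(io.StringIO(run_tsv), delimiter="\t")
    header = next(reader)
    if not header or header[0] != "ID_REF":
        raise ValueError("the first column of a run file must be ID_REF")

    # Every column after ID_REF must be named after a known sample.
    unknown_samples = [name for name in header[1:] if name not in sample_ids]
    # Every ID_REF value must be a known probe id.
    unknown_probes = [row[0] for row in reader if row and row[0] not in probe_ids]
    return unknown_samples, unknown_probes


# Hypothetical inputs: one sample and one probe are deliberately unmatched.
run_text = (
    "ID_REF\tGSM280331\tGSM999999\n"
    "1007_s_at\t7.17\t7.32\n"
    "unknown_probe\t6.01\t6.02\n"
)
samples = {"GSM280331", "GSM280332"}
probes = {"1007_s_at"}
print(check_run(run_text, samples, probes))
```

Two empty lists mean every column header and every ID_REF value has a match.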
In order to import your run data, you must first import your sample set and your feature annotation set. The run import will fail if the server cannot find a match for an ID_REF value or for a sample in your sample set. If you don't have current files, you can use these small example files:
- sample_expression_matrix.tsv (sample set)
- sample_feature_annotation_set.txt (feature annotation set)
- series_matrix.tsv (run file)
Set up the Expression Matrix Assay
- Create a new folder of type Microarray.
- Add a Sample Sets web part to the Microarray Dashboard tab.
- Click the Import Sample Set button.
- On the Import Sample Set page, name your sample set. Here we use ExpressionMatrixSamples.
- In the sample set data text area, paste in a TSV of all your samples. (Or use the provided file sample_expression_matrix.tsv.)
- In the Id Columns section, use the first dropdown to select the 'id' column of your samples. (If using sample_expression_matrix.tsv, select "ID_REF".)
- Click Submit to save your sample set.
- Return to the Microarray Dashboard.
- Add a Feature Annotation Sets web part at the bottom of the left column.
- Click Import Feature Annotation Set.
- Enter the Name:
- Enter the Vendor:
- For Folder: Select the current folder.
- Browse to select the annotation file. (Or use the provided file sample_feature_annotation_set.txt.) These can be from any manufacturer (e.g. Illumina or Affymetrix), but must be a TSV file with the following column headers:
- Click Upload.
Create a New Assay Design
- Select (Admin) > Manage Assays.
- Click New Assay Design.
- Select the Expression Matrix assay type.
- Scroll down to select the Assay Location (for our samples, use the current folder).
- Name your assay and save it.
Import a Run
Runs will be in the TSV format and have a variable number of columns.
- The first column will always be ID_REF, which will contain a probe id that matches the Probe_ID column from your feature annotation set.
- The rest of the columns will be for samples from your imported sample set (ExpressionMatrixSamples).
An example of column headers:
ID_REF GSM280331 GSM280332 GSM280333 GSM280334 GSM280335 GSM280336 GSM280337 GSM280338 ...
An example of row data:
1007_s_at 7.1722616266753 7.3191207236008 7.32161337343459 7.31420082996567 7.13913363545954 ...
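To make the layout concrete, the sketch below splits one header line and one data row into a probe id and a per-sample mapping of expression values. It assumes tab-separated text as described above; `parse_run_row` is a hypothetical helper, and the example lines are shortened to two samples.

```python
def parse_run_row(header_line, row_line):
    """Split one run-file row into (probe_id, {sample_name: value})."""
    # Drop the leading ID_REF header; the rest are sample names.
    samples = header_line.rstrip("\n").split("\t")[1:]
    fields = row_line.rstrip("\n").split("\t")
    probe_id, values = fields[0], fields[1:]
    if len(values) != len(samples):
        raise ValueError(
            f"row {probe_id!r} has {len(values)} values "
            f"but the header names {len(samples)} samples"
        )
    return probe_id, {s: float(v) for s, v in zip(samples, values)}


header = "ID_REF\tGSM280331\tGSM280332"
row = "1007_s_at\t7.1722616266753\t7.3191207236008"
probe, values = parse_run_row(header, row)
print(probe, values)
```

A row whose value count does not match the header raises an error, which is a cheap way to catch truncated or misaligned files before upload.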
To import a run:
- Navigate to your ExpressionMatrix assay. (Click the name in the Assay List.)
- Click Import data.
- Select the appropriate Feature Annotation Set.
- Click Choose File and navigate to your series matrix file (or use the provided example file series_matrix.tsv).
- Click Save and Finish to begin the import.
Note: Importing a run may take a very long time as we are generally importing millions of rows of data. The Run Properties options include a checkbox named Import Values. If checked, the values for the run are imported normally. If unchecked, the values are not imported to the server, but links between the series matrix, samples, and annotations are preserved.
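To judge whether leaving Import Values unchecked is worthwhile, you can estimate how many values a run file holds. The sketch below assumes that each probe/sample pair becomes one stored value, which the text above does not state explicitly; `estimate_value_count` is a hypothetical helper.

```python
def estimate_value_count(run_lines):
    """Estimate the number of expression values in a run file.

    Assumption: one value per (probe, sample) pair, so the count is
    (data rows) x (sample columns).
    """
    lines = [line for line in run_lines if line.strip()]
    if not lines:
        return 0
    n_samples = len(lines[0].rstrip("\n").split("\t")) - 1  # exclude ID_REF
    n_probes = len(lines) - 1  # exclude the header row
    return n_probes * n_samples


# Hypothetical three-sample, two-probe file.
example_lines = [
    "ID_REF\tGSM280331\tGSM280332\tGSM280333",
    "1007_s_at\t7.17\t7.32\t7.32",
    "1053_at\t6.01\t6.02\t6.03",
]
print(estimate_value_count(example_lines))
```

For a real file, pass the open file object (for example `estimate_value_count(open(path))`), since iterating a file yields its lines.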
View Run Results
After the run is imported, to view the results:
- Click the file name in the runs grid.
There is also an alternative view of the run data, which is pivoted to have a column for each sample and a row for each probe id. To view the data as a pivoted grid:
- Select (Admin) > Go to Module > Query.
- Browse to assay > ExpressionMatrix > [YOUR_ASSAY_NAME] > FeatureDataBySample
- You can add this query to the dashboard using a Query web part.
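The pivoted layout can be illustrated with a small sketch. It does not query LabKey Server or reproduce how FeatureDataBySample is implemented; it only shows the reshaping idea on in-memory data. The tuple layout (probe id, sample, value) and the function name `pivot_by_sample` are assumptions made for the example.

```python
def pivot_by_sample(long_rows):
    """Pivot (probe_id, sample, value) tuples into one row per probe id.

    Returns {probe_id: {sample: value}}, i.e. a row for each probe id
    with a column for each sample.
    """
    pivoted = {}
    for probe_id, sample, value in long_rows:
        pivoted.setdefault(probe_id, {})[sample] = value
    return pivoted


# Hypothetical long-form data: one tuple per probe/sample value.
long_rows = [
    ("1007_s_at", "GSM280331", 7.17),
    ("1007_s_at", "GSM280332", 7.32),
    ("1053_at", "GSM280331", 6.01),
    ("1053_at", "GSM280332", 6.02),
]
for probe_id, by_sample in pivot_by_sample(long_rows).items():
    print(probe_id, by_sample)
```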