Pivot Queries: /Documentation

Pivot Queries

A pivot query helps you summarize and re-visualize data in a table. Data can be grouped or aggregated to help you focus on a particular aspect of your data. For example a pivot table can help you see how many data points of a particular kind are present, or it can represent your data by aggregating it into different categories.

Create Query
Syntax for PIVOT Query
PIVOT...BY...IN Syntax

PIVOT...BY...IN using Sub-SELECT

Grouped Headers
"Summary" Columns
Options for Pivoting by Two Columns
Examples

Create Query

To try this yourself, you can download this sample file and import it as a Standard assay type. Remember your assay design name; we use "PivotDemo" below.

DataToPivot.xlsx

Create a new SQL query and edit its source.

Select > Go To Module > Query.
Select a schema. In our example, we used the assay schema, specifically "assay.General.PivotDemo" in our example.
Click Create New Query.
Name it and confirm the correct query/table is selected. For our assay, we chose "Data" (the assay results).
Click Create and Edit Source.

Syntax for PIVOT...BY Query

A PIVOT query is essentially a SELECT specifying which columns you want and how to PIVOT and GROUP them. To write a pivot query, follow these steps.

(1) Start with a base SELECT query.

SELECT ParticipantID, Date, RunName, M1
FROM Data

(2) Identify the data cells you want to pivot and how. In this example, we focus on the values in the RunName column, to separate M1 values for each.

(3) Select an aggregating function to handle any non-unique data, even if you do not expect to need it. MAX, MIN, AVG, and SUM are possibilities. Here we have only one row for each participant/date/run combination, so all would produce the same result, but in other data you might have several sets with multiple values. When aggregating, we can also give the column a new name, here MaxM1.

(4) Identify columns which remain the same to determine the GROUP BY clause.

SELECT ParticipantId, Date, RunName, Max(M1) as MaxM1
FROM Data
GROUP BY ParticipantId, Date, RunName

(5) Finally, pivot the cells.

SELECT ParticipantId, Date, RunName, Max(M1) as MaxM1
FROM Data
GROUP BY ParticipantId, Date, RunName
PIVOT MaxM1 by RunName

PIVOT...BY...IN Syntax

You can focus on particular values using an IN clause to identify specific columns (i.e. distinct rows in the original data) to use in the pivot.

(6) In our example from above, perhaps we want to see only data from two of the runs:

SELECT ParticipantId, Date, RunName, Max(M1) as MaxM1
FROM Data
GROUP BY ParticipantId, Date, RunName
PIVOT MaxM1 by RunName IN ('Run1', 'Run2')

Note that pivot column names are case-sensitive. You may need to use lower() or upper() in your query to work around this issue if you have two values who differ only by letter case.

In addition to reducing the set of result columns, specifying column names in an IN clause has a performance benefit, since LabKey won't need to execute the query to determine the column set.

PIVOT...IN using Sub-SELECT

If you won't know the specific column names for an IN clause (i.e. the distinct row values in the original data) at the time of writing your query, you can also use a sub-SELECT statement in the IN clause. You could either pull the column names from a separate list, a simpler query, or as in this simple query, from values in the same table you will pivot. For example:

SELECT Language, Gender, MAX(Height) AS H
   FROM Demographics
   GROUP BY Language, Gender
   PIVOT H By Gender IN (SELECT DISTINCT Gender FROM Demographics)

Grouped Headers

You can add additional aggregators and present two pivoted columns, here we show both an minimum and a maximum M1 value for each participant/date/run combination. In this case they are the same, but the pattern can be applied to varying data.

SELECT ParticipantId, Date, RunName, Min(M1) as MinM1, Max(M1) as MaxM1
FROM Data
GROUP BY ParticipantId, Date, RunName
PIVOT MinM1, MaxM1 by RunName IN ('Run1', 'Run2')

"Summary" Columns

"Summary" columns are those columns not included in the group-by or pivoted list. When generating the pivot, these summary columns are aggregated as follows:

a COUNT or SUM aggregate summary column is wrapped with SUM
a MIN or MAX is wrapped with a MIN or MAX

For example:

SELECT
-- group by columns
AssignedTo,
Type,

-- summary columns. Turned into a SUM over the COUNT(*)
COUNT(*) AS Total,

-- pivoted columns
SUM(CASE WHEN Status = 'open' THEN 1 ELSE 0 END) AS "Open",
SUM(CASE WHEN Status = 'resolved' THEN 1 ELSE 0 END) AS Resolved

FROM issues.Issues
WHERE Status != 'closed'
GROUP BY AssignedTo, Type
PIVOT "Open", Resolved BY Type IN ('Defect', 'Performance', 'To Do')

Options for Pivoting by Two Columns

Two levels of PIVOT are not supported, however some options for structuring your query may give you the behavior you need. In this scenario, you have experiment data and want to pivot by both peak intensity (not known up front) and sample condition.

The closest option is to concatenate the two values together and pivot on that "calculated" column.

SELECT
  Run.Batch,
  Run.Sample,
  COUNT(*) AS ResultsCount,
  Run.SampleCondition || ' ' || PeakLabel AS ConditionPeak,
  AVG(Data.PercTimeCorrArea) AS AvgPercTimeCorrArea,
FROM Data
GROUP BY Run.Batch, Run.Sample, Run.SampleCondition || ' ' || PeakLabel
PIVOT AvgPercTimeCorrArea BY ConditionPeak

If concatenation is not practical, you could use a version like this. It will include duplicate Batch and Sample columns, which you could get rid of by explicitly listing the aggregates from each sample condition sub-query. Syntax like "SELECT NR.* EXCEPT (NR_Batch, NR_Sample)" to get all of the pivot columns except the joining columns is not supported.

SELECT
  NR.*,
  RED.*
FROM (

  SELECT
    Run.Batch AS NR_Batch,
    Run.Sample AS NR_Sample,
    --Run.SampleCondition,
    PeakLabel,
    AVG(Data.PercTimeCorrArea) AS NR_AvgPercTimeCorrArea,
  FROM Data
  WHERE Run.SampleCondition = 'NR'
  GROUP BY Run.Batch, Run.Sample, Run.SampleCondition, PeakLabel
  PIVOT NR_AvgPercTimeCorrArea BY PeakLabel
   
) NR

INNER JOIN (

  SELECT
    Run.Batch AS RED_Batch,
    Run.Sample AS RED_Sample,
    --Run.SampleCondition,
    PeakLabel,
    AVG(Data.PercTimeCorrArea) AS RED_AvgPercTimeCorrArea,
  FROM Data
  WHERE Run.SampleCondition = 'RED'
  GROUP BY Run.Batch, Run.Sample, Run.SampleCondition, PeakLabel
  PIVOT RED_AvgPercTimeCorrArea BY PeakLabel

) RED
ON NR.NR_Batch = RED.RED_Batch AND NR.NR_Sample = RED.RED_Sample

Examples

Example 1 - Open issues by priority for each area

Another practical application of a pivot query is to display a list of how many issues of each priority are open for each feature area.

SELECT Issues.Area, Issues.Priority, Count(Issues.IssueId) 
  AS CountOfIssues
  FROM Issues
  GROUP BY Issues.Area, Issues.Priority
  PIVOT CountOfIssues BY Priority IN (1,2,3,4)

Example 2 - Open issues by type grouped by who they are assigned to

SELECT
  -- Group By columns
     AssignedTo,
     Type,

  -- Summary columns. Turned into a SUM over the COUNT(*)
     COUNT(*) AS Total,

  -- Pivoted columns
     SUM(CASE WHEN Status = 'open' THEN 1 ELSE 0 END) AS "Open",
     SUM(CASE WHEN Status = 'resolved' THEN 1 ELSE 0 END) AS Resolved

  FROM issues.Issues
  WHERE Status != 'closed'
  GROUP BY AssignedTo, Type
  PIVOT "Open", Resolved BY Type IN ('Defect', 'Performance', 'To Do')

Example 3 - Luminex assay fluorescence intensity by analyte

SELECT ParticipantId, date, Analyte, 
  Count(Analyte) AS NumberOfValues,
  AVG(FI) AS AverageFI, 
  MAX(FI) AS MaxFI
FROM "Luminex Assay 100"
GROUP BY ParticipantID, date, Analyte
PIVOT AverageFI, MaxFI BY Analyte

Example 4 - Support tickets by priority and status

SELECT SupportTickets.Client AS Client, SupportTickets.Status AS Status,
  COUNT(CASE WHEN SupportTickets.Priority = 1 THEN SupportTickets.Status END) AS Pri1,
  COUNT(CASE WHEN SupportTickets.Priority = 2 THEN SupportTickets.Status END) AS Pri2,
  COUNT(CASE WHEN SupportTickets.Priority = 3 THEN SupportTickets.Status END) AS Pri3,
  COUNT(CASE WHEN SupportTickets.Priority = 4 THEN SupportTickets.Status END) AS Pri4,
  FROM SupportTickets
  WHERE SupportTickets.Created >= (curdate() - 7)
  GROUP BY SupportTickets.Client, SupportTickets.Status
  PIVOT Pri1, Pri2, Pri3, Pri4 BY Status

LabKey Support

LabKey Support

Pivot Queries

Pivot Queries

Create Query

Syntax for PIVOT...BY Query

PIVOT...BY...IN Syntax

PIVOT...IN using Sub-SELECT

Grouped Headers

"Summary" Columns

Options for Pivoting by Two Columns

Examples

Example 1 - Open issues by priority for each area

Example 2 - Open issues by type grouped by who they are assigned to

Example 3 - Luminex assay fluorescence intensity by analyte

Example 4 - Support tickets by priority and status

Related Topics

Search

Docs & Product Feedback

Was this content helpful?

Log in or register an account to provide feedback

Pages

LabKey Support

LabKey Support

Pivot Queries

Pivot Queries

Create Query

Syntax for PIVOT...BY Query

PIVOT...BY...IN Syntax

PIVOT...IN using Sub-SELECT

Grouped Headers

"Summary" Columns

Options for Pivoting by Two Columns

Examples

Example 1 - Open issues by priority for each area

Example 2 - Open issues by type grouped by who they are assigned to

Example 3 - Luminex assay fluorescence intensity by analyte

Example 4 - Support tickets by priority and status

Related Topics

Search

Scope ?

Categories ?

Sort ?

Docs & Product Feedback

Was this content helpful?

Log in or register an account to provide feedback

Pages