×
The 2025 Client Survey is now open! Responses will be accepted up until 10/31. Please feel free to share this link within your organization and we thank you in advance for your participation!
Suppose you wanted to expand the Patients dataset to also include male participants who are "Natural Controllers" of HIV.

To do this, we add a SQL query that returns a selection of records from the Demographics table, in particular all Male participants who are Natural Controllers.

And we'll create a new ETL process from scratch, drawing on the new SQL query.

Create a New Source Query

  • Locate the source code for the ETL module. Depending on where you deployed it, go to either LABKEY_HOME/externalModules/etlmodule or build/deploy/modules/etlmodule.
  • Go to the directory etlmodule/queries/study.
  • In that directory, create a file named "MaleNC.sql".
  • Open the file in a text editor and copy and paste the following code into the file:
SELECT Demographics.ParticipantId,
Demographics.StartDate,
Demographics.Gender,
Demographics.PrimaryLanguage,
Demographics.Country,
Demographics.Cohort,
Demographics.TreatmentGroup
FROM Demographics
WHERE Demographics.Gender = 'm' AND Demographics.TreatmentGroup = 'Natural Controller'
  • Save the file.
  • Restart the server.

Create a New ETL Process

ETL processes are defined by XML configuration files that specify the data source, the data target, and other properties. Here we create a new configuration that draws from the query we just created above.

  • In the etlmodule/etls directory, create a new XML file called "MaleNC.xml".
  • Copy the following into MaleNC.xml, and save.
<?xml version="1.0" encoding="UTF-8"?>
<etl xmlns="http://labkey.org/etl/xml">
<name>Demographics >>> Patients (Males)</name>
<description>Update data for study on male patients.</description>
<transforms>
<transform id="males">
<source schemaName="study" queryName="MaleNC"/>
<destination schemaName="study" queryName="Patients" targetOption="merge"/>
</transform>
</transforms>
<schedule>
<poll interval="1h"/>
</schedule>
</etl>
  • Notice that this configuration file has our query (MaleNC) as its source, and the Patients query as its target.
  • Refresh the browser. Notice that the system will add your new module to the server. Click Next if necessary to complete the update.
  • In the "ETL Workspace" folder, notice our new ETL process is now listed in the Data Transforms web part.

Run the ETL Process

  • Click Run Now next to the new process name. You will need to sign in to see this button.
  • Refresh in the pipeline window until the job completes, then click the ETLs tab.
  • New records will have been copied to the Patients table, making a total of 43 records (42 if you skipped the step of changing the gender of a participant in the previous tutorial step).

Previous Step


previousnext
 
expand allcollapse all