Code Snippets Catalog [Python]

Objective and structure

This catalog describes each Python code snippet in the Researcher Workbench. It is organized into three categories of snippets. Within each category, the snippets are listed in alphabetical order, except for snippets_setup, which always comes first.

  1. Dataset snippets
  2. SQL snippets
  3. Storage snippets

Dataset Snippets

Dataset snippets are for Workbench users who use Dataset Builder to retrieve their data.

`snippets_setup`

This snippet imports all software libraries needed to execute the dataset snippets. It also sets the default theme and dimensions for plots.
You can view the snippet GitHub repository page here.

IMPORTANT NOTE: All code snippets below assume that snippets_setup has been executed.

`add_age_to_demographics`

This snippet calculates the current age of each person in your dataframe. It assumes your dataframe has a 'DATE_OF_BIRTH' column.
The calculation does not take into account whether the person is deceased.
You can view the snippet GitHub repository page here.
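
As a rough illustration of the age calculation described above, here is a minimal standard-library sketch. It is not the snippet's actual code: the helper name `age_in_years`, the 'PERSON_ID' field, and the sample records are hypothetical, and only the 'DATE_OF_BIRTH' column name comes from the description. Like the snippet, it ignores whether a person is deceased.

```python
from datetime import date


def age_in_years(date_of_birth, today=None):
    """Return completed years between date_of_birth and today (hypothetical helper)."""
    today = today or date.today()
    # Subtract one year if this year's birthday has not happened yet.
    had_birthday = (today.month, today.day) >= (date_of_birth.month, date_of_birth.day)
    return today.year - date_of_birth.year - (0 if had_birthday else 1)


# Made-up records standing in for rows of a demographics dataframe.
demographics = [
    {"PERSON_ID": 1, "DATE_OF_BIRTH": date(1980, 6, 15)},
    {"PERSON_ID": 2, "DATE_OF_BIRTH": date(2000, 1, 1)},
]

reference_day = date(2024, 6, 14)  # fixed date so the example is reproducible
for row in demographics:
    row["AGE"] = age_in_years(row["DATE_OF_BIRTH"], reference_day)
```

With the fixed reference date, the first person has not yet reached their 2024 birthday, so one year is subtracted from the simple difference in calendar years.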

`join_dataframes`

This snippet joins two dataframes on all the columns they have in common. It performs an inner join, so the result contains all the columns of both dataframes, including the common columns, but only the rows that have matching values in both dataframes.
You can view the snippet GitHub repository page here.
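
The inner join described above can be sketched with pandas as follows. This is an illustration under assumptions, not the snippet's own code: the dataframes, their column names, and their values are made up.

```python
import pandas as pd

# Made-up frames; the column names are illustrative assumptions.
demographics = pd.DataFrame(
    {"PERSON_ID": [1, 2, 3], "GENDER": ["Female", "Male", "Female"]}
)
measurements = pd.DataFrame(
    {"PERSON_ID": [2, 3, 4], "VALUE_AS_NUMBER": [13.5, 12.1, 14.0]}
)

# Join on every column name the two frames share.
common_columns = [col for col in demographics.columns if col in measurements.columns]

# An inner join keeps only rows whose key values appear in both frames.
joined = demographics.merge(measurements, how="inner", on=common_columns)
```

Here persons 1 and 4 each appear in only one frame, so they are dropped from `joined`.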

`measurement_by_age_and_gender.plotnine`

This snippet plots numeric values of measurements by age and gender using joined demographics and measurements dataframes. The plot assumes the demographics and measurements dataframes were joined using the join_dataframes snippet.
You can view the snippet GitHub repository page here.

`measurement_by_gender.plotnine`

This snippet plots numeric values of measurements by gender using joined demographics and measurements dataframes. The plot assumes the demographics and measurements dataframes were joined using the join_dataframes snippet.
You can view the snippet GitHub repository page here.

`summarize_a_dataframe`

This snippet displays summary statistics of a dataframe. By default, the snippet code examines the first 10,000 rows. The default can be changed.
You can view the snippet GitHub repository page here.
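
To show the idea of summarizing only the first rows of a dataframe, here is a small pandas sketch. It uses `DataFrame.describe` as a stand-in; the snippet itself may rely on a different summary tool. The constant name `ROWS_TO_EXAMINE` and the sample data are assumptions.

```python
import pandas as pd

ROWS_TO_EXAMINE = 10_000  # the default mentioned above; change as needed

# Made-up data standing in for your dataframe.
df = pd.DataFrame({"VALUE_AS_NUMBER": [12.0, 13.0, 14.0, 15.0]})

# Limit the rows first, then compute count, mean, std, min, quartiles, and max.
summary = df.head(ROWS_TO_EXAMINE).describe()
```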

`summarize_a_survey_by_question_concept_id`

This snippet outputs a table and a graph of participant counts by response for one question_concept_id.
The snippet assumes that a dataframe containing survey questions and answers already exists. You must specify the desired question_concept_id.
You can view the snippet GitHub repository page here.
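
The counting step described above, participants per response for one question, might look like the following standard-library sketch. The field names, the concept id 1001, and the rows are all hypothetical, and the real snippet also draws a graph, which is omitted here.

```python
from collections import defaultdict

QUESTION_CONCEPT_ID = 1001  # hypothetical id; set this to the question you want

# Made-up survey rows; the field names are assumptions for illustration.
survey = [
    {"person_id": 1, "question_concept_id": 1001, "answer": "Yes"},
    {"person_id": 2, "question_concept_id": 1001, "answer": "No"},
    {"person_id": 3, "question_concept_id": 1001, "answer": "Yes"},
    {"person_id": 3, "question_concept_id": 1001, "answer": "Yes"},  # duplicate row
    {"person_id": 1, "question_concept_id": 1002, "answer": "No"},   # other question
]

# Collect distinct participants per answer so duplicate rows are not double counted.
people_by_answer = defaultdict(set)
for row in survey:
    if row["question_concept_id"] == QUESTION_CONCEPT_ID:
        people_by_answer[row["answer"]].add(row["person_id"])

counts = {answer: len(people) for answer, people in people_by_answer.items()}
```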

`summarize_a_survey_module`

This snippet outputs a table of participant counts by question in a module. You can optionally specify a denominator, in which case the snippet outputs participant percentages by question in the module.
The snippet assumes that a dataframe containing survey questions and answers already exists. You must specify the desired question_concept_id.
You can view the snippet GitHub repository page here.
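
The effect of the optional denominator can be sketched as below. This is an assumption-laden illustration rather than the snippet's code: the function name `participants_by_question`, the field names, and the data are invented.

```python
def participants_by_question(rows, denominator=None):
    """Count distinct participants per question (hypothetical helper).

    If a denominator is given, return percentages instead of raw counts.
    """
    people = {}
    for row in rows:
        people.setdefault(row["question"], set()).add(row["person_id"])
    if denominator is None:
        return {question: len(ids) for question, ids in people.items()}
    return {question: 100 * len(ids) / denominator for question, ids in people.items()}


# Made-up rows for one survey module.
module_rows = [
    {"person_id": 1, "question": "Q1"},
    {"person_id": 2, "question": "Q1"},
    {"person_id": 1, "question": "Q2"},
]

counts = participants_by_question(module_rows)
percentages = participants_by_question(module_rows, denominator=4)
```

A natural choice of denominator is the total number of participants in your cohort, so that the percentages reflect how many of them answered each question.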

SQL Snippets

SQL snippets are for Workbench users who either know SQL or want to learn how to use SQL.

`snippets_setup`

This snippet imports all software libraries needed to execute the SQL snippets. It also sets the default theme and dimensions for plots.
You can view the snippet GitHub repository page here.

IMPORTANT NOTE: All code snippets below assume that snippets_setup has been executed.

`measurement_of_interest.sql`

This snippet returns participants' birth dates, genders, and sites for a measurement of interest in your cohort. You must specify the following parameters:

  • a measurement_concept_id, e.g. 3000963 (Hemoglobin)
  • a unit_concept_id, e.g. 8636 (gram per liter)
  • your cohort's SQL query

You can view the snippet GitHub repository page here.
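
To make the three parameters concrete, here is a sketch that runs a query of this shape against an in-memory SQLite database. SQLite is only a stand-in for the Workbench database, and the simplified table layout, column names, and rows are assumptions, not the real schema. The two concept ids are the examples given above.

```python
import sqlite3

MEASUREMENT_CONCEPT_ID = 3000963  # Hemoglobin, from the example above
UNIT_CONCEPT_ID = 8636            # gram per liter, from the example above
# Stand-in for your cohort's SQL query; it must return person_id values.
COHORT_QUERY = "SELECT person_id FROM person WHERE person_id IN (1, 2)"

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (person_id INTEGER, birth_date TEXT, gender TEXT);
    CREATE TABLE measurement (person_id INTEGER, measurement_concept_id INTEGER,
                              unit_concept_id INTEGER, value_as_number REAL, site TEXT);
    INSERT INTO person VALUES
        (1, '1980-06-15', 'Female'),
        (2, '1975-01-02', 'Male'),
        (3, '1990-03-04', 'Female');
    INSERT INTO measurement VALUES
        (1, 3000963, 8636, 135.0, 'site_a'),  -- matches and is in the cohort
        (2, 3000963, 9999, 13.9, 'site_b'),   -- different (made-up) unit id
        (3, 3000963, 8636, 128.0, 'site_a');  -- matches but is outside the cohort
""")

# The cohort query is pasted in as a subquery; only do this with SQL text you trust.
query = f"""
    SELECT p.person_id, p.birth_date, p.gender, m.site, m.value_as_number
    FROM measurement AS m
    JOIN person AS p ON p.person_id = m.person_id
    WHERE m.measurement_concept_id = ?
      AND m.unit_concept_id = ?
      AND m.person_id IN ({COHORT_QUERY})
"""
rows = conn.execute(query, (MEASUREMENT_CONCEPT_ID, UNIT_CONCEPT_ID)).fetchall()
```

Only person 1 satisfies all three conditions, which shows how the concept filter, the unit filter, and the cohort restriction combine.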

 

`measurement_of_interest_by_age_and_gender.plotnine`

This snippet plots numeric values of a measurement of interest by age and gender of participants in your cohort. The plot assumes snippet measurement_of_interest.sql has been run.
You can view the snippet GitHub repository page here.

`measurement_of_interest_by_gender.plotnine`

This snippet plots numeric values of a measurement of interest by gender of participants in your cohort. The plot assumes snippet measurement_of_interest.sql has been run.
You can view the snippet GitHub repository page here.

`measurement_of_interest_by_site.plotnine`

This snippet plots numeric values of a measurement of interest by site for participants in your cohort. The plot assumes snippet measurement_of_interest.sql has been run.
You can view the snippet GitHub repository page here.

`measurements_of_interest_summary.sql`

This snippet displays summary statistics for a measurement of interest in your cohort. You must specify the following parameters:

  • measurement of interest: a case-insensitive string, such as "hemoglobin", to be compared to all measurement concept names to identify those of interest
  • your cohort's SQL query

You can view the snippet GitHub repository page here.
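
The case-insensitive name match and the summary statistics can be sketched as follows, again with in-memory SQLite as a stand-in. The single-table layout, the concept names, and the values are made up for illustration, and the cohort restriction is left out to keep the example short.

```python
import sqlite3

MEASUREMENT_OF_INTEREST = "hemoglobin"  # compared case-insensitively to concept names

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE measurement (person_id INTEGER, concept_name TEXT, value_as_number REAL);
    INSERT INTO measurement VALUES
        (1, 'Hemoglobin in blood', 13.0),
        (2, 'Hemoglobin in blood', 15.0),
        (3, 'HEMOGLOBIN A1c', 6.0),
        (4, 'Glucose', 95.0);
""")

# Lower-case both sides so the match ignores case, then summarize per concept name.
query = """
    SELECT concept_name, COUNT(*), MIN(value_as_number),
           MAX(value_as_number), AVG(value_as_number)
    FROM measurement
    WHERE LOWER(concept_name) LIKE '%' || LOWER(?) || '%'
    GROUP BY concept_name
"""
summary = {
    name: {"count": n, "min": low, "max": high, "mean": mean}
    for name, n, low, high, mean in conn.execute(query, (MEASUREMENT_OF_INTEREST,))
}
```

Grouping by concept name keeps differently defined measurements, such as the two hemoglobin concepts here, from being averaged together.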

`most_recent_measurement_of_interest.sql`

This snippet returns participants' birth dates, genders, and sites for a measurement of interest in your cohort. The results are limited to the most recent result per person in your cohort. You must specify the following parameters:

  • a measurement_concept_id, e.g. 3000963 (Hemoglobin)
  • a unit_concept_id, e.g. 8636 (gram per liter)
  • your cohort's SQL query

You can view the snippet GitHub repository page here.
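
One common way to keep only the most recent result per person is to join each row against that person's latest date. The sketch below shows that approach in in-memory SQLite; the real snippet may use a different technique, and the table layout and rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE measurement (person_id INTEGER, measurement_date TEXT, value_as_number REAL);
    INSERT INTO measurement VALUES
        (1, '2019-01-10', 130.0),
        (1, '2021-05-02', 141.0),
        (2, '2020-07-30', 125.0);
""")

# ISO-formatted date strings sort chronologically, so MAX() finds the latest date.
query = """
    SELECT m.person_id, m.measurement_date, m.value_as_number
    FROM measurement AS m
    JOIN (
        SELECT person_id, MAX(measurement_date) AS latest_date
        FROM measurement
        GROUP BY person_id
    ) AS latest
      ON latest.person_id = m.person_id
     AND latest.latest_date = m.measurement_date
    ORDER BY m.person_id
"""
most_recent = conn.execute(query).fetchall()
```

Note that if a person has two results on the same latest date, this approach returns both rows, so a tie-breaking rule would be needed to guarantee exactly one row per person.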

`most_recent_measurement_of_interest_by_age_and_gender.plotnine`

This snippet plots numeric values of the most recent measurement of interest by age and gender of participants in your cohort. The plot assumes snippet most_recent_measurement_of_interest.sql has been run.
You can view the snippet GitHub repository page here.

`most_recent_measurement_of_interest_by_gender.plotnine`

This snippet plots numeric values of the most recent measurement of interest by gender of participants in your cohort. The plot assumes snippet most_recent_measurement_of_interest.sql has been run.
You can view the snippet GitHub repository page here.

`most_recent_measurement_of_interest_by_site.plotnine`

This snippet plots numeric values of the most recent measurement of interest by site of participants in your cohort. The plot assumes snippet most_recent_measurement_of_interest.sql has been run.
You can view the snippet GitHub repository page here.

`number_of_participants_with_measurements.sql`

This snippet returns the count of unique participants in your cohort who have at least one measurement. You must specify your cohort's SQL query.
You can view the snippet GitHub repository page here.
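
Counting unique participants with at least one measurement comes down to a distinct count restricted to the cohort. The following sketch uses in-memory SQLite with an invented two-table layout; the cohort query shown is a placeholder for your own.

```python
import sqlite3

# Stand-in for your cohort's SQL query; it must return person_id values.
COHORT_QUERY = "SELECT person_id FROM person"

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (person_id INTEGER);
    CREATE TABLE measurement (person_id INTEGER, value_as_number REAL);
    INSERT INTO person VALUES (1), (2), (3);
    INSERT INTO measurement VALUES (1, 13.0), (1, 14.0), (3, 12.5);
""")

# DISTINCT ensures a person with several measurements is counted once.
query = f"""
    SELECT COUNT(DISTINCT m.person_id)
    FROM measurement AS m
    WHERE m.person_id IN ({COHORT_QUERY})
"""
(participant_count,) = conn.execute(query).fetchone()
```

Person 2 has no measurements and person 1 has two, so the count is 2 rather than 3.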

`number_of_participants_with_med_conditions.sql`

This snippet returns the count of unique participants in your cohort who have at least one condition. You must specify your cohort's SQL query.
You can view the snippet GitHub repository page here.

`total_number_of_participants.sql`

This snippet returns the count of unique participants in your cohort. You must specify your cohort's SQL query.
You can view the snippet GitHub repository page here.

Storage Snippets

Storage snippets are for Workbench users who directly use the workspace bucket.

`snippets_setup`

This snippet imports all software libraries needed to execute the storage snippets.
You can view the snippet GitHub repository page here.

IMPORTANT NOTE: All code snippets below assume that snippets_setup has been executed.

`copy_data_to_workspace_bucket`

This snippet saves your dataframe as a CSV file in a "data" folder in your Google Bucket.
You can view the snippet GitHub repository page here.
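
As a sketch of where the CSV file ends up, the code below builds a `gsutil cp` command that targets a "data" folder in the bucket, without running it. It rests on several assumptions: that the bucket location is available in a `WORKSPACE_BUCKET` environment variable, that the copy is done with `gsutil`, and that the helper name `build_copy_command` and the file name are hypothetical.

```python
import os


def build_copy_command(local_path, bucket, folder="data"):
    """Return the argument list for copying local_path into bucket/folder (hypothetical helper)."""
    destination = f"{bucket}/{folder}/{os.path.basename(local_path)}"
    return ["gsutil", "cp", local_path, destination]


# WORKSPACE_BUCKET is assumed to hold a value like "gs://my-bucket";
# a placeholder is used if the variable is not set.
bucket = os.getenv("WORKSPACE_BUCKET", "gs://example-bucket")
command = build_copy_command("./my_dataframe.csv", bucket)

# To actually run the copy in a notebook, you could pass the list to
# subprocess.run(command, check=True) after writing the CSV file locally.
```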

`copy_file_from_workspace_bucket`

This snippet copies a file from your Google Bucket and loads it into a dataframe.
You can view the snippet GitHub repository page here.

`list_objects_in_bucket`

This snippet returns a list of objects in your Google Bucket.
You can view the snippet GitHub repository page here.
