Aggregating Data

Aggregating data means running one query and using its results to build a second query. You can aggregate data using predefined sets or compare data across stores.

The following example runs a query against store 0 to pull information on patients who have attended Baptist Medical Center South. The patient IDs returned by this first query are then used in a second query, which can target a different store, to find patients who have also gone to another facility. In the code as written, both queries read from store 0.

The hash symbol, #, indicates that the rest of the line is a comment. Comments are ignored when the code runs.

import numpy

# First query: patients whose sending facility is Baptist Medical Center South.
qs = '()s.sending_facility:"BAPTIST MEDICAL CENTER SOUTH"'
# Run the query against store 0. The dtype option forces patient_id to be read as a string instead of a number, because .join only works with strings.
df = get_hits(qs, store=0, fields='URI, extract.patient_id', limit=100, advanced_options={'dtype': {'extract.patient_id': numpy.object_}})

# Second query: convert the patient_id column to a list of strings, join it with commas, and place it inside IN(...).
qs = '()s.patient_id:IN(%s)' % ','.join(df['extract.patient_id'].tolist())
df1 = get_hits(qs, fields=CLINICAL_REPORTS_FIELDS, store=0)
# Print the results of the second query (from store 0) to the screen.
show_hits(df1)
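
The step that turns the first query's results into the second query string can be tried on its own, without a connection to any store. The following is a minimal sketch, not part of the query tool: build_in_query is a hypothetical helper name, the sample IDs are made up, and the '()s.' prefix and IN(...) syntax are copied from the example above. It also skips missing IDs and removes duplicates, which the example above does not do.

```python
def build_in_query(field, values):
    """Return a query string of the form '()s.<field>:IN(v1,v2,...)'.

    Hypothetical helper for illustration only. The query syntax is
    assumed to match the example in this section.
    """
    seen = set()
    ids = []
    for value in values:
        # Skip missing values: None, or NaN (NaN is never equal to itself).
        if value is None or value != value:
            continue
        text = str(value).strip()  # .join only works with strings
        if text and text not in seen:
            seen.add(text)
            ids.append(text)
    if not ids:
        raise ValueError("no IDs to search for")
    return '()s.%s:IN(%s)' % (field, ','.join(ids))


# Made-up IDs standing in for df['extract.patient_id'].tolist()
sample_ids = ['1001', '1002', '1001', 1003, None]
qs = build_in_query('patient_id', sample_ids)
print(qs)  # ()s.patient_id:IN(1001,1002,1003)
```

The resulting string can be passed to get_hits in the same way as the second qs in the example above.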