Getting Counts

Learn how to analyze the distribution of data in your search results.

Sometimes it's important to get counts of fields of interest for your query. Let's look at the emergent stroke case and try to get the distribution count of machine models, series description, and study descriptions.

WITH 
    timepoint_1 AS (
        SELECT 
            Patient.patient_id, 
            Study.row_id, 
            Series.series_uid,
            Series.series_description,
            Study.study_description,
            Series.manufacturer_model_name
        FROM `gradient-pacs-siskin-172863.eris_test.patients_json` AS Patient, 
        UNNEST(studies) AS Study,
        UNNEST(Study.series) AS Series
        WHERE 
            (Study.positive_entities LIKE '%stroke%' OR Study.positive_entities LIKE '%hemorrhage%') 
            AND Series.modality = 'CT'
            AND REGEXP_CONTAINS(Series.manufacturer, "(?i)GE|G.E.|OEC Medical Systems")
            AND Series.slice_thickness <= 1.25 
    )
        
SELECT
    field,
    value,
    COUNT(value) AS count
FROM (
    SELECT
        'series_description' AS field, series_description AS value
    FROM timepoint_1
    UNION ALL
    SELECT
        'study_description' AS field, study_description AS value
    FROM timepoint_1
    UNION ALL
    SELECT
        'manufacturer_model_name' AS field, manufacturer_model_name AS value
    FROM timepoint_1
)
GROUP BY field, value
ORDER BY field, count DESC;

This query:

Creates a unified view of three different fields (series_description, study_description, manufacturer_model_name)
Counts occurrences of each unique value within each field
Orders results by field name and then by count in descending order

This is useful for understanding the distribution of your data and identifying common patterns or outliers.

WITH timepoint_1 AS ( SELECT Patient.patient_id, Study.row_id, Series.series_uid, Series.series_description, Study.study_description, Series.manufacturer_model_name FROM `gradient-pacs-siskin-172863.eris_test.patients_json` AS Patient, UNNEST(studies) AS Study, UNNEST(Study.series) AS Series WHERE (Study.positive_entities LIKE '%stroke%' OR Study.positive_entities LIKE '%hemorrhage%') AND Series.modality = 'CT' AND REGEXP_CONTAINS(Series.manufacturer, "(?i)GE|G.E.|OEC Medical Systems") AND Series.slice_thickness <= 1.25 ) SELECT field, value, COUNT(value) AS count FROM ( SELECT 'series_description' AS field, series_description AS value FROM timepoint_1 UNION ALL SELECT 'study_description' AS field, study_description AS value FROM timepoint_1 UNION ALL SELECT 'manufacturer_model_name' AS field, manufacturer_model_name AS value FROM timepoint_1 ) GROUP BY field, value ORDER BY field, count DESC;