Posts in category cancer

KUMC receives PCORI award to lead Greater Plains Collaborative

The Greater Plains Collaborative award from the Patient-Centered Outcomes Research Institute (PCORI) provides $7 million to establish a new network of nine medical centers in seven states. The network is committed to building a data set from electronic medical records to support new research on breast cancer, obesity and amyotrophic lateral sclerosis (also known as ALS, or Lou Gehrig's disease).

The principal investigator of the project is Russ Waitman, director of medical informatics at KU Medical Center. The HERON technology developed by the medical informatics team plays a prominent role.

Further reading:

HERON Bow Creek Release brings Cancer Survival Analysis, R integration

The highlights for this release include:

Our cancer survival analysis plug-in is based on work by Segagni et al.:

HERON Bow Creek Contents Summary

This month, our tour of rivers and lakes in Kansas honors Bow Creek.

The HERON repository contains approximately 600 million real observations from the hospital, clinics, and research systems:

Observation                   Patients  Source                          Go-Live   Snapshot  Issues
Demographics          16.0M   1.89M
                                        KUH Billing (O2 via SMS)        1980s     Dec 2011  various*
                                        UKP Billing                     2000      Dec 2011
                      8.18K   8.18K     Frontiers participant registry  Jun 2009  Dec 2011
                      182K    182K      Social Security Death Index     1962      Dec 2011
Diagnoses (ICD9)      17.9M   586K
                                        KUH/O2/Epic                     Nov 2007  Dec 2011  various*
                                        UKP Billing                     2000      Dec 2011
Medications           28.0M   103K
                                        KUH/O2/Epic                     Nov 2007  Dec 2011  various*
Nursing Observations  434M    ?
                                        KUH/O2/Epic                     Nov 2007  Dec 2011  various*
Lab Results           69.8M   248K
                                        KUH/O2/Epic                     2003      Dec 2011  various*
Procedures (CPT)      9.4M    531K
                                        UKP Billing                     2000      Dec 2011
Specimens             25.5K   2.66K
                                        KUMC Biospecimen Repository     ?         Dec 2011
Cancer Cases          8.96M   61.9K
                                        KUH Cancer Registry             1950s     Dec 2011  labels*
All                   606M

Beta Disclaimer

We are providing this early access to obtain feedback from you, the research community. While we are actively working on validating the data loaded into the system with hospital and clinic technical staff, there may be problems with our translation of data from our source systems (HospitalEpicSource and ClinicIdxSource) into HERON.

Please email us at heron-admin@kumc.edu if you discover information you believe may be erroneous.

We are actively working on enhancing the types of data included. Stay tuned to our roadmap to track progress toward upcoming releases.

Various Issues Still Apply

Keep in mind the issues noted in the original HERON beta notice, including:

Enhancements and Problems/Defects/Issues Addressed in this Release

No results

Outstanding Problems/Defects/Issues

Adding SEER Site Recode to HERON Tumor Registry integration

Our HERON Tuttlecreek release a couple of months ago included initial integration of data on ~60,000 cancer cases from the KUMC tumor registry. We organized the NAACCR terms based on work by colleagues at the Kimmel Cancer Center in Philadelphia and Group Health Cooperative in Seattle:

NAACCR terms for tumor registry

But because of an outstanding issue (#733), finding, for example, brain cancer cases requires expert knowledge of the codes for primary site, histology, etc.:

For our next release, based on work with John Keighley, we're providing query by SEER Site Recode, a state-of-the-art method for combining primary site and histology:

screenshot of SEER Site Recode term hierarchy

Under the hood: Using Python to convert the rules table to SQL

The SEER Site Recode ICD-O-3 (1/27/2003) Definition lays out the rules in a fairly convenient HTML table:

Converting that table to code manually might have been straightforward, but it would have been repetitive and error-prone; so, like so many geeks faced with repetitive tasks, I wrote a script to automate it.

source:tumor_reg/seer_recode.py weighs in at about 200 lines, including whitespace and a handful of test cases. It reads the HTML page (well, I feed it through tidy first to clean up some table markup) and produces

  1. A term hierarchy in CSV format (source:heron_load/curated_data/seer_recode_terms.csv)
  2. Rules to recode our ~60K cancer cases as a SQL case statement (source:heron_load/seer_recode.sql).
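
To give a feel for the second output, here is a minimal sketch of turning rule rows into `when` clauses. It is not the code in seer_recode.py: the tuple layout of `RULES`, the comma-and-dash range syntax, and the function names `range_clause`, `when_clause` and `case_statement` are all assumptions made for illustration. The two rules themselves are copied from the generated SQL excerpted in this post.

```python
# Illustrative sketch only; the real seer_recode.py parses the SEER HTML table.
# Hypothetical rule layout: (label, site spec, histology spec, recode, exclude)
RULES = [
    ("Lip", "C000-C009", "9590-9989, 9050-9055, 9140", "20010", True),
    ("Melanoma of the Skin", "C440-C449", "8720-8790", "25010", False),
]


def range_clause(column, spec):
    """Turn a spec like 'C000-C009, C129' into a SQL condition on column."""
    parts = []
    for item in spec.split(","):
        item = item.strip()
        if "-" in item:
            lo, hi = item.split("-", 1)
            parts.append(f"{column} between '{lo.strip()}' and '{hi.strip()}'")
        else:
            parts.append(f"{column} = '{item}'")
    return "(" + "\n   or ".join(parts) + ")"


def when_clause(label, site_spec, hist_spec, recode, exclude):
    """Build one 'when ... then ...' arm; exclude negates the histology test."""
    site = range_clause("site", site_spec)
    hist = range_clause("histology", hist_spec)
    if exclude:
        hist = "not " + hist
    return f"/* {label} */ when {site}\n  and {hist} then '{recode}'"


def case_statement(rules):
    """Join all arms into a single SQL case expression."""
    arms = "\n".join(when_clause(*rule) for rule in rules)
    return "case\n" + arms + "\nend"


if __name__ == "__main__":
    print(case_statement(RULES))
```

Keeping the rules as data and the SQL as generated output means a change to the published table only requires rerunning the script, not re-editing hundreds of lines of SQL.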

The resulting SQL weighs in at about 500 lines. Handling all the different kinds of rules in the table was fun; a lot more fun than writing this sort of SQL by hand:

case
/* Lip */ when (site between 'C000' and 'C009')
  and  not (histology between '9590' and '9989'
   or histology between '9050' and '9055'
   or histology = '9140') then '20010'

...

/* Melanoma of the Skin */ when (site between 'C440' and 'C449')
  and (histology between '8720' and '8790') then '25010'

...

/* Cranial Nerves Other Nervous System */ when (site between 'C710' and 'C719')
  and (histology between '9530' and '9539') then '31040'

/* ... */ when (site between 'C700' and 'C709'
   or site between 'C720' and 'C729')
  and  not (histology between '9590' and '9989'
   or histology between '9050' and '9055'
   or histology = '9140') then '31040'
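
To see how rules like these behave, the sketch below runs three of the arms shown above against a few made-up cases. It uses an in-memory SQLite database purely for convenience; that choice, the table name `cases`, the output column name `seer_site`, and the sample rows are all assumptions and not part of the HERON load process. Codes are stored as text, so `between` compares strings, and a case that matches no arm gets NULL.

```python
import sqlite3

# Three arms copied from the excerpt above, closed with 'end' so the
# expression is complete. Arms are tried in order; the first match wins.
RECODE_SQL = """
select site, histology,
  case
    /* Lip */ when (site between 'C000' and 'C009')
      and not (histology between '9590' and '9989'
        or histology between '9050' and '9055'
        or histology = '9140') then '20010'
    /* Melanoma of the Skin */ when (site between 'C440' and 'C449')
      and (histology between '8720' and '8790') then '25010'
    /* Cranial Nerves Other Nervous System */
    when (site between 'C710' and 'C719')
      and (histology between '9530' and '9539') then '31040'
  end as seer_site
from cases
order by site, histology
"""


def recode(cases):
    """Load (site, histology) pairs into a scratch table and recode them."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table cases (site text, histology text)")
        conn.executemany("insert into cases values (?, ?)", cases)
        return conn.execute(RECODE_SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # Made-up sample cases, not registry data.
    sample = [
        ("C001", "8070"),  # lip site, histology outside the excluded ranges
        ("C001", "9650"),  # lip site, but histology in the excluded 9590-9989
        ("C443", "8720"),  # skin site with a melanoma histology
        ("C443", "8070"),  # skin site, non-melanoma: no arm here applies
    ]
    for row in recode(sample):
        print(row)
```

In this reduced set of arms, the second and fourth sample cases come back with no recode; in the full rule set, later arms would be expected to pick such cases up.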

HERON Tuttlecreek release brings initial Tumor Registry integration

This month's HERON release integrates data from the KUH Tumor Registry, with 65,000 cases dating back to the 1950s (#547). We have also added support for finding patients in KCK county school districts (#531).

Russ regularly gives presentations on our work, describing the integration of various sources into HERON. Since he showed this diagram in How Medical Informatics and HERON Can Help Your Research?, given on November 17, we have integrated the Social Security death master file as well as the tumor registry:

HERON Tuttlecreek Contents Summary

This month, our tour of rivers and lakes in Kansas honors Tuttle Creek Lake.

The HERON repository contains approximately 570 million real observations from the hospital, clinics, and research systems:

Observation                   Patients  Source                          Go-Live   Snapshot  Issues
Demographics          15.9M   1.88M
                                        KUH Billing (O2 via SMS)        1980s     Oct 2011  various*
                                        UKP Billing                     2000      Oct 2011
                      6.64K   6.64K     Frontiers participant registry  Jun 2009  Oct 2011
Diagnoses (ICD9)      17.2M   571K
                                        KUH/O2/Epic                     Nov 2007  Oct 2011  various*
                                        UKP Billing                     2000      Oct 2011
Medications           26.3M   96K
                                        KUH/O2/Epic                     Nov 2007  Oct 2011  various*
Nursing Observations  407M    ?
                                        KUH/O2/Epic                     Nov 2007  Oct 2011  various*
Lab Results           67.2M   240K
                                        KUH/O2/Epic                     Nov 2007  Oct 2011  various*
Procedures (CPT)      9.1M    523K
                                        UKP Billing                     2000      Oct 2011
Specimens             23.1K   2.48K
                                        KUMC Biospecimen Repository     ?         Oct 2011
Cancer Cases          6.21M   60.4K
                                        KUH Cancer Registry             1950s     Aug 2011  labels*
All                   570M

Problems/Defects/Issues Addressed in this Release

No major issues addressed.

Outstanding Problems/Defects/Issues

No results