Opened 5 years ago

Closed 3 years ago

Last modified 3 years ago

#1965 closed enhancement (fixed)

align encounters for hospital encouters; e.g. UHC LOS visit details and flow sheet vitals

Reported by: mhoag Owned by: dconnolly
Priority: major Milestone: heron-kanopolis-update
Component: data-repository Keywords: encounter-mapping gpc needstest public-web
Cc: ngraham, dconnolly, badagarla, rwaitman, schandaka, ssuman, bzschoche Blocked By:
Blocking: 333, 2860, 3116 Sensitive: no

Description

Currently, the clarity.pat_enc.pat_enc_csn_id's (Epic Encounters) are used to create the concept of a "financial encounter". However, while investigating #1887 it was observed that a single clarity.hsp_acct_mpi.mpi_id could be associated with multiple different clarity.pat_enc.pat_enc_csn_id's. Dan noted that this may imply that clarity.hsp_acct_mpi.mpi_id might be a better source of truth for representing a "financial encounter". If this was the case then the encounter numbers should be generated using clarity.hsp_acct_mpi.mpi_id instead of clarity.pat_enc.pat_enc_csn_id.

Depending on how this implemented it could represent an expensive change.

Attachments (1)

EncounterMappingDiscussion_Russ12022014.JPG (962.5 KB) - added by schandaka 3 years ago.
EncounterMappingDiscussion_Russ

Download all attachments as: .zip

Change History (82)

comment:1 Changed 5 years ago by dconnolly

  • Cc rwaitman mnair added
  • Keywords encounter-mapping added; encounter mapping removed
  • Milestone set to heron-medicine-lodge-update

I suggest organizing a technical meeting around this encounter-mapping stuff early in the next milestone.

I added the encounter-mapping keyword to all the relevant tickets I can think of. I wonder if EMPI would have been a better keyword. Oh well.

comment:2 follow-up: Changed 5 years ago by dconnolly

notes from Informatics meeting:

RW: pls schedule meeting; after this week, so as to keep focus on milestone:trac-pub

  • hospital technical charge encounter
  • getting rid of SMS in July; we should meet with them re revenue [capture?]
  • EMPI master encounter number is what we want to use
    • we may want to develop logic using days

MH: we only have 1.9m encounters ? which is about the number of patients

comment:3 Changed 5 years ago by ngraham

  • Milestone changed from heron-medicine-lodge-update to heron-sappa-update

comment:4 Changed 5 years ago by mhoag

  • Cc schandaka added
  • Status changed from new to assigned

comment:5 Changed 4 years ago by mhoag

  • Owner changed from mhoag to schandaka

comment:6 Changed 4 years ago by ngraham

  • Blocking set to 2099

comment:7 Changed 4 years ago by schandaka

  • Milestone changed from heron-sappa-update to heron-big-blue-update

comment:8 Changed 4 years ago by mhoag

  • Blocking 2099 deleted

This should no longer be blocking the ETL for sappa as we need to move it to the next release. Sravani mentioned to me that to get traction on this ticket we need a more detailed export of the IDX table. I believe that Bhargav and Sravani have had discussion with IDX to that effect. Bhargav, Sravani could you comment further on this?

comment:9 Changed 4 years ago by ngraham

  • Blocking set to 2158

comment:10 Changed 4 years ago by rwaitman

  • Milestone changed from heron-big-blue-update to heron-chikaskia-update

comment:11 Changed 4 years ago by ngraham

  • Blocking 2158 deleted

comment:12 in reply to: ↑ 2 Changed 4 years ago by dconnolly

  • Blocked By set to 1937

Replying to dconnolly:

  • we may want to develop logic using days

While working on summarization of data sets by encounter (#2186), Russ and I just brainstormed a bit more about this logic using days.

Having an encounter_num represent the equivalence class of "all of patient X's encounters that started on D" makes a lot of sense.

comment:13 Changed 4 years ago by badagarla

  • Cc ngraham added; bgraham removed

comment:14 Changed 4 years ago by dconnolly

  • Milestone changed from heron-chikaskia-update to heron-milford-update

comment:15 Changed 4 years ago by dconnolly

  • Milestone changed from heron-toronto-update to HERONv.Next

comment:16 Changed 3 years ago by ngraham

  • Milestone changed from HERONv.Next to heron-saline-update
  • Owner changed from schandaka to mhoag

comment:17 follow-up: Changed 3 years ago by mhoag

  • Cc ssuman added; mnair removed
  • Owner changed from mhoag to schandaka

As a result of the stop gap fix for UHC encounter mapping we decide that encounter mapping was once again a high priority.

Had a meeting today (Sept 8th) to decide on an approach for encounter mapping with Nathan, Russ, Sravani, Bhargav, Suman and myself in attendance.

We began with an overview:

  • Current mapping uses pat_enc_csn_id
  • A UHC encounter (hsp_account_id) can be associated with multiple pat_enc_csn_ids
  • This cause a problem with a the encounter mapping table, because you cannot store mappings going from specific-to-general.

This was followed by a look at the instances where a hsp_account_id was associated with multiple pat_enc_csn_ids.

select uhcencounterid, hsp.pat_enc_csn_id, hsp.hsp_account_id, hsp.pat_enc_date_real, 
       zpc.name, hsp.exp_admission_time, hsp.adt_arrival_time, 
       hsp.hosp_admsn_time, hsp.hosp_disch_time, hsp.department_id, hsp.contact_date, 
       hsp.emer_adm_date, hsp.instant_of_entry_tm
-- select uhcencounterid, zpc.name, hsp.*
from
(select distinct(uhcencounterid) from heron_etl_1.uhc_encounter_mapping
 natural join
 (select uhcencounterid from heron_etl_1.uhc_encounter_mapping group by uhcencounterid having count(*) > 1))
join clarity.pat_enc_hsp hsp 
  on uhcencounterid = hsp.hsp_account_id
join clarity.zc_pat_class zpc
  on hsp.adt_pat_class_c = zpc.adt_pat_class_c
order by uhcencounterid, pat_enc_date_real;

As a group we determined that the UHC encounter (hsp_account_id) was the preferred (gold) representation of a financial encounter in HERON when it was available.

Russ went on to say:

  • That ideally we should use the hsp_account_id to define a financial encounter even when it did not link up to UHC
  • Any encounter from another source (EPIC, IDX) that occurred within the duration of the hsp_account_id should be considered part of the "overall financial encounter" even if it is not explicitly linked (pull in out-patient)
  • All other encounters that do no link up to to a hsp_account_id will be grouped as the same financial encounter if they occur on the same day.

I noted that such a generalization financial encounter will cause us to lose precision with certain facts associated with an encounter (at what point during the encounter did the diagnosis occur?). Tamara and Russ seemed okay with that, proposing that the START_TIME and END_TIME for a fact (and potentially an additional mapping table encounter -> time frame) be used to make up for that loss of precision (i.e. The diagnosis occurred during the first day of the encounter).

Russ also wanted Sravani to validate some of the assumptions that we made during the meeting:

  • No duration from a hsp_account_id collides - or is fully encapsulated within - the duration from another hsp_account_id.
  • ? (I feel like there was another assumption he wished to be validated pertaining to departments. The request was directed toward Sravani so I will ask her to fill in the details)

Next steps seem to be validating assumptions, so I am assigning this to Sravani.

comment:18 in reply to: ↑ 17 Changed 3 years ago by mhoag

Replying to mhoag:

Russ also wanted Sravani to validate some of the assumptions that we made during the meeting:

  • No duration from a hsp_account_id collides - or is fully encapsulated within - the duration from another hsp_account_id.
  • ? (I feel like there was another assumption he wished to be validated pertaining to departments. The request was directed toward Sravani so I will ask her to fill in the details)

I spoke with Sravani and she briefly clarified that Russ wanted her to check Cardiology data, specifically how encounters linked up with a particular event/visit (this is all I remember, again hopefully this will be clarified further in the ticket).

We agreed that she was too time-sliced to do the other validation piece (collision of durations for hsp_account_id for the same patient) so I will take it back after after she completes the validation of the Cardiology data.

comment:19 Changed 3 years ago by ngraham

  • Milestone changed from heron-saline-update to heron-verdigris-update

Batch update from file HERON_saline_meeting.csv

comment:20 Changed 3 years ago by ngraham

  • Keywords gpc added

We'll likely need this for GPC:ticket:155

comment:21 Changed 3 years ago by ngraham

  • Keywords heron-weekly added

comment:22 Changed 3 years ago by ngraham

In heron-weekly, Sravani aims to complete this for milestone:heron-verdigris-update with support from Matt.

comment:23 Changed 3 years ago by mhoag

  • Blocking set to 3063

comment:24 Changed 3 years ago by dconnolly

Now that we have a sketch for the design (ticket/1965#comment:17), I'd like to see this re-cast as an enhancement... something like: Consistent same financial encounter treatment across all data sources?

I'm sure various use cases have been discussed. Bonus points to anybody who clearly documents them, preferably in a form that can be copied and pasted into the release blog item.

comment:25 Changed 3 years ago by schandaka

I was able to get some inputs from the hospital Epic(Interface) folks on the hsp_account_ids. I thought it is worth noting in this ticket.

  • There is no consistent logic in how a hsp_account_id is allotted to a patient visit. It is department specific. For example, If I go to a FamMed? office visit and the provider ordered a blood work for me, one hsp_account_id is created both for the office visit and the order. In pediatrics, it could be completely different scenario where they would give two different hsp_account_ids.

Another example is, if the patient gets admitted at KUH and got a cardiac procedure done(in Cardiology department) on 01/01/2014, he will have one hsp_account_id and the following day if he got another procedure done in Radiology department it will have another hsp_account_id. Because, that is how the workflow for Cardiology and Radiology are set up.

  • When I looked at the data(output from the Matt's previous query) I have noticed that Cancer Center operates much differently. More clarification was given by interface team that when a cancer patient is starting their treatment one hsp_account_id is created for their 'series' of treatments. And that hsp_account_id is not closed until they are done with all the treatments. And the workflow is similar for the all departments who set up their 'follow-up' appointments at once.

Typically hsp_account_id is closed on the patient's discharge unless the workflow indicates something else.Here is a sample case that would show the same.

select  hsp.pat_id,hsp.pat_enc_csn_id,hsp_account_id,department_id, min(hosp_admsn_time),max(hosp_disch_time)
from clarity.pat_enc_hsp hsp 
group by pat_id,pat_enc_csn_id,hsp_account_id,department_id
order by pat_id,min(hosp_admsn_time)

results redcacted

Matt and I discussed yesterday(before the end of the day)and he suggested I should put up a stats query in the ticket where the admission and discharge dates are overlapping for multiple patient encounters with the same hsp_account_ids. I am working on it now I will add that to this ticket shortly.

comment:26 Changed 3 years ago by dconnolly

  • Blocking changed from 3063 to 333, 3063

comment:29 Changed 3 years ago by dconnolly

  • Cc bzschoche added

As I mentioned this morning, the encounter type breakdown in the summary data we reported to PCORNet (GPC:ticket:144) is pretty low-quality. It's 87% NI ("no information").

I'd like to see an encounter type breakdown using this new design. I'd like to see it broken down by year, too.

I think there's a visit type column in the visit dimension that we don't currently populate. I expect populating that would help cohort characterization work.

p.s. Did anybody take notes on the rest of the discussion? how about a whiteboard photo?

comment:30 Changed 3 years ago by dconnolly

I'd also like to see verification that we can correlate hsp_account_ids with idx.bill_inv_number (#333).

comment:31 follow-up: Changed 3 years ago by dconnolly

  • Milestone changed from heron-verdigris-update to heron-kanopolis-update

#333 is a better bang-for-the-buck for this release

comment:32 in reply to: ↑ 31 Changed 3 years ago by mhoag

  • Blocking changed from 333, 3063 to 333

Replying to dconnolly:

#333 is a better bang-for-the-buck for this release

We decided to postpone this. Unblocking ETL.

comment:33 Changed 3 years ago by dconnolly

  • Keywords heron-weekly removed

comment:34 Changed 3 years ago by ngraham

Russ scheduled a design meeting with Bhargav, Tamara, Sravani, Matt, etc to see if we can get something working (based on dates at least) for Kanopolis.

Changed 3 years ago by schandaka

EncounterMappingDiscussion_Russ

comment:35 Changed 3 years ago by schandaka

  • Keywords heron-weekly added

Adding the screenshot from yesterday evening's(12/02/2014) meeting. I did not get a chance to add detailed notes before the heron-weekly.

comment:36 Changed 3 years ago by dconnolly

I'm particularly interested to see use cases along with designs that they address.

cf. HeronLoadDev process which is a bit of a mess just now.

comment:37 follow-up: Changed 3 years ago by ngraham

Matt to add notes and use cases from heron-weekly today...

Use case sketch:
Hospitalization plus another procedure with CPT code would then link to the same encounter...

comment:38 in reply to: ↑ 37 Changed 3 years ago by mhoag

Replying to ngraham:

Matt to add notes and use cases from heron-weekly today...

The plan at this point is to go ahead with the Course of Action identified in comment:17. We will accept the fact that encounters involving Cancer Case series may be too generalized (comment:25) - an encounter may span several months.

Possible future remediation of the Cancer case issue:

  • hsp_account_ids which are involved in the Cancer Case series are excluded from consideration when performing the 3rd piece of normalization logic. i.e.

    All other encounters that do no link up to a hsp_account_id will be grouped as the same financial encounter if they occur on the same day.

Also Dan came up with a good suggestion for an "encounter is no less that a day" by codifying the encounter_num as the start_date and the patient_num (e.g. '20140201' || '123456789'). It would be worth thinking about whether this encoding would be useful.

comment:39 Changed 3 years ago by schandaka

Russ suggested that we should start with the length of stay and group all the encounters within that length of stay period as one clinical encounter. I have noticed that going by the length of stay(hosp_admn_time thru hosp_disch_time)from Epic is not always reliable. Example: A patient could have multiple admissions and discharges in multiple departments on the same day/during the entire stay at the hospital.I discussed this with Russ and he agreed that we could obtain the length of stay information(startdate of the hospital stay and dischargedate of the hospital stay) from UHC data and group all the epic encounters within those date limits as one clinical encounter.

comment:40 Changed 3 years ago by ngraham

  • Blocked By changed from 1937 to 1937, 3116

comment:41 Changed 3 years ago by ngraham

  • Blocked By changed from 1937, 3116 to 1937
  • Blocking changed from 333 to 333, 3116

comment:42 Changed 3 years ago by ngraham

In heron-weekly, we discussed a series of designs and use cases that they address. Dan, Sravani, and Matt refined them after the meeting.

Cardio and Radio visit: no UHC LOS for Radio

Pete comes in for a cardiology visit and then goes over to radiology.

pat_csn_enc_id pat_id service_line
6 Pete cardiology
7 Pete radiology

These get assigned to the same hospital account:

hsp_acct_id pat_csn_enc_id
4 6
4 7

Researcher Rex asks for cardio and radio stuff in the same visit and loses. Eek!

Then Rex asks for LOS for radiology visits and doesn't find Pete's. Eeek!

What happened? Our visit_dimension is based on enc_id, not hospital accounts. UHC goes by hospital accounts. We happen to map hospital account 4 to enc_id 6, so the cardio and UHC LOS match up. But enc_id 7 got stranded.

Design idea: base our encounter mapping more on account id.

We'll update this ticket (#1965) to match this scope. TODO: promote use cases to ticket descriptions.

Who came to E.D. and had a Radiology read?

These enc_ids are not mapped to hospital encounters because they're not inpatient.
But (as noted in #1937) we can aggregate encounters by dates

  • within the bounds of the hospital accounts (and hence UHC), else
  • within the same day

So this is a use case for #1937.

Who had lisinopril on their home meds list at a regular doctor visit? (Ambulatory)

Ambulatory visit: CPT, E&M (general doctor visit) code that lines up with an ambulatory visit: how many patients have had lisinopril on the historical med list also.

This seems to be a use case for #333.

Gestational Diabetes and Anaesthesiology (IDX, Epic, and UHC)

A researcher wants to know:

How many people had gestational diabetes diagnosis (Epic) and an anaesthesiology procedure (CPT coming from IDX) where there was a cesarean delivery during a hospital stay (UHC).

Also seems to be a use case for #333.

comment:43 Changed 3 years ago by dconnolly

Suman asked Sravani if this is ready; no, but she's working on it and hopes to have something late afternoon.

comment:44 Changed 3 years ago by ssuman

I would like to start ETL latest by 3.30 pm today (Wednesday). Sravani might need some of Matt's time for testing changes.

comment:45 Changed 3 years ago by mhoag

  • Blocked By 1937 deleted

comment:46 Changed 3 years ago by dconnolly

I gather Matt and Sravani finished the bulk of the design and development this morning and worked through various issues this afternoon. Testing of [d26c9126e402] has gone well, though it hasn't gone very far.

Much of the effort today was on duplicate keys that arose because we have facts that used to be unique by virtue of having distinct encounters that now share an encounter.

The Encounter_Mapping_1965 branch is not yet merged.

Here's hoping Sravani and Matt land this tomorrow morning.

comment:47 Changed 3 years ago by mhoag

Sravani and I looked at the encounter mapping table generated from test data (and our latest changes d26c9126e402) and looked at a few test patients and they looked good. Then we ran a ad-hoc usecase in i2b2:

  • dragged in systolic bp (flowsheet reading from Epic)
  • UHC length of stay

Encounter independent gave 1 result
Same financial encounter gave 1 result

We had a little trouble showing the negative space: where encounter independent results in a value greater than same financial encounter.

comment:48 Changed 3 years ago by mhoag

  • Resolution set to fixed
  • Status changed from assigned to closed

encounter mapping merged to default (10074f84572e). Closing.

comment:49 Changed 3 years ago by dconnolly

  • Type changed from design-issue to enhancement

comment:50 Changed 3 years ago by mhoag

Ran into an issue during ETL where we were calculating the part based off a column where that could be null (not every hsp_account_id has a correlated mpi_id). Test data did not exploit this fact so it passed on the test ETL run. fix is in d8a41606c920.

comment:51 Changed 3 years ago by mhoag

Ran into an issue during ETL where we inserted null mpi-ids into the encounter mapping table. Fix was to just filter the null ids since they are worthless in the encounter mapping table. Fixed in (9c9be7b2a129)

comment:52 Changed 3 years ago by dconnolly

  • Milestone changed from heron-kanopolis-update to dconnolly

Milestone renamed

comment:53 Changed 3 years ago by dconnolly

  • Milestone changed from dconnolly to heron-kanopolis-update

Milestone renamed

comment:54 Changed 3 years ago by dconnolly

  • Resolution fixed deleted
  • Status changed from closed to reopened

duplicate keys. see ticket:3116#comment:23

comment:55 Changed 3 years ago by dconnolly

  • Owner changed from schandaka to dconnolly
  • Status changed from reopened to accepted

I'm looking into this; I seem to be going back to a previous encounter mapping design for UHC that was un-done in #2720... hmm...

comment:56 Changed 3 years ago by dconnolly

  • Resolution set to fixed
  • Status changed from accepted to closed

I think I fixed it (with Sravani, Nathan). I wasn't confident I understood the code, so I might have made more changes than were necessary.

  1. encounter_mapping based on mpi_id, then hsp_account_id, then pat_csn_enc_id [99636e0baf92]
  2. name hsp_account_id view epic_hsp_audit_info as in 777a9bb4fda4 [eb36af450e67]
  3. Map UHC encounters via HSP_ACCOUNT_ID, and failing that, MPI_ID. [96d6180b9a80]
  4. clean up encounter mapping based on discussion with Nathan: [faa8e9d31c08]
  5. in encounter_mappings for patient_day_visit, select distinct encounters [fb616522c882]
  6. base patient_day_visit.part on pat_id rather than enc_id to make distinct work out [844c0728df4e]
  7. filter out a few hsp_account_ids with multiple pat_ids [7685f1151402]

comment:57 Changed 3 years ago by ngraham

  • Resolution fixed deleted
  • Status changed from closed to reopened

We've had several more iterations of encounter mapping work that we've documented in the ETL ticket (starting with ticket:3116#comment:25). I suppose it's better to track the work here.

ETL is still failing - apparently, due to UHC-related encounter mapping (see ticket:3116#comment:30).

comment:58 Changed 3 years ago by ngraham

  • Owner changed from dconnolly to schandaka
  • Status changed from reopened to assigned

comment:59 Changed 3 years ago by ngraham

  • Owner changed from schandaka to dconnolly

In heron-weekly, Russ suggested to wait for Dan and Matt to return work on encounter mapping and kick off the ETL again.

comment:60 Changed 3 years ago by ngraham

  • Owner changed from dconnolly to mhoag

Matt agreed to take a look at this.

comment:61 Changed 3 years ago by ngraham

  • Resolution set to fixed
  • Status changed from assigned to closed

Matt and I found that we were joining against the entire encounter mapping table when we should have been joining against the filtered version in the "with" clause. See [6b1188054274].

Actually, it seems odd Oracle allowed us to do this - we didn't use the "with" clause in the query and then we liased another table to be the same name as the unused "with" clause. I'm pretty sure Oracle 10g didn't let you do that, but 11g apparently does.

Anyway, we rebuilt the UHC encounter mapping table and verified that there weren't any duplicates when there were before. Merged to Kanopolis in [58c176250cc4].

comment:62 follow-up: Changed 3 years ago by ngraham

  • Resolution fixed deleted
  • Status changed from closed to reopened

Matt and I think we've found a problem with the encounter mapping. I've included some messy notes from our exploration - Matt to fill in details.

Matt's Details

To the question posed in 99636e0baf92 "We want every patient_day_visit no?", the answer is actually no. That view was used to generate encounter_nums for epic encounters (pat_enc_csn_ids) that do not have an associated hsp_account_ids. Specifically, encounter_nums will have already been generated for encounters where pat_enc_csn_ids were mapped to hsp_account_ids. Generating encounter numbers from everything in the patient_day_visit view will create erroneous encounter numbers where the pat_enc_csn_id maps to a hsp_account_id and break the overall mapping.

Further, I believe the correlated change in epic_dimensions_load.sql between lines 217-238 are what caused the dramatic decrease in mappings that Nathan and I identified. Specifically that was where all of the mappings for pat_enc_csn_ids -> enc_num (where pat_enc_csn_ids did not have associated hsp_account_ids) were inserted into the encounter mapping table. Now the code appears to insert a mapping of pat_day_visit -> enc_num instead of the ~16M pat_enc_csn_ids.

Messy notes

Matt and I found that the fact counts went down for many different concept codes compared to last month - total fact count down by 224M. Not only that, but the patient count (by concept) went down by a lot - in one case, we lost 220K patients for 'KUH|FLO_MEAS_ID:9501055' == "001- #9501055 Room Number [1,031,779 facts; 231,804 patients]"'

select count(*) from blueherondata.observation_fact;
-- 1,451,954,533 (a1), --1,676,060,120 (b2)

-- down by....
select 1676060120 - 1451954533 from dual; --224,105,587

Maybe that's ok? We collapsed encounters down so we get fewer facts maybe?

Fact/patient counts dropped by concept:

select * from (
  select a1.concept_cd, a1.facts, a1.patients, a1.facts-b2.facts delta_facts,
  a1.patients-b2.patients delta_patients 
  from blueheronmetadata.counts_by_concept a1
  join blueheronmetadata.counts_by_concept@<deid-prod-b2> b2
  on a1.concept_cd = b2.concept_cd
  )
where delta_facts < 0 or delta_patients < 0
order by delta_patients
;

Created on both A and B:

create table room_num_facts as (
  select pm.patient_ide, em.encounter_ide, obs.* from nightherondata.observation_fact obs
  join nightherondata.patient_mapping pm on pm.patient_num = obs.patient_num
  join nightherondata.encounter_mapping em on em.encounter_num = obs.encounter_num
  where obs.concept_cd = 'KUH|FLO_MEAS_ID:9501055'
  )
;

Of the missing patients, none of the encounters associated with the example flowsheet measure made it in to the encounter mapping:

with missing as (
  select distinct patient_ide from room_num_facts@<prod-b2> b2
  minus
  select distinct patient_ide from room_num_facts a1
  )
select em.* from missing
join room_num_facts rf on rf.patient_ide = missing.patient_ide
join nightherondata.encounter_mapping em on em.encounter_ide = rf.encounter_ide
; -- 0 rows, the associate encounters are missing

It looks like most of the pat_enc_csns didn't make it in to the encounter mapping.

select count(*) from (
with
csns_without_acct as (
  select pat_enc_csn_id from clarity.pat_enc
  minus
  select pat_enc_csn_id from clarity.pat_enc_hsp
  where hsp_account_id is not null --16.2M
  )
select * from csns_without_acct
join nightherondata.encounter_mapping em on em.encounter_ide = to_char(csns_without_acct.pat_enc_csn_id)
and em.encounter_ide_source = 'Epic@kumed.com'
)
;-- 155,510 in A1, 16,252,586 in B2

comment:63 Changed 3 years ago by mhoag

  • Owner changed from mhoag to dconnolly
  • Status changed from reopened to assigned

Nathan and I didn't have my time to wrap our heads around all the changes to get a quick fix in before EOB Friday. Hopefully, Dan and I can collaborate to create a fix on Monday and kick off ETL soon after.

comment:64 in reply to: ↑ 62 Changed 3 years ago by dconnolly

  • Resolution set to fixed
  • Status changed from assigned to closed

Replying to ngraham:

It looks like most of the pat_enc_csns didn't make it in to the encounter mapping.

Quite.

Matt and I fixed that while working through all the encounter mapping and visit dimension logic:

  1. Encounter mappings for patient day, re-using HSP_ACCOUNT_ID mappings. [0e644625fa34]
    • oops: Encounter mappings for PAT_ENC_CSN_ID only used PAT_ENC_CSN_IDs from hospital_visit_dimension_view, which is not nearly all of them.
  2. oops... SQL syntax [cbe15e06d3f8]
  3. oops... wrong alias [f69418f83392]
  4. oops... need v.part too in patient day mappings [14ca36d25642]
  5. oops: alias typo in patient mappings [407e25c3e92d]
  6. Work around null pat_id in pat_enc table. [b6fdb7de1f66]
  7. Build visit dimension from patient-day mappings [ea527487e6de]

comment:65 Changed 3 years ago by dconnolly

  • Resolution fixed deleted
  • Status changed from closed to reopened

Matt,

We neglected to get back to that performance question. The answer seems to be: it's bad. insert running since 10:32:04

Can you have a go at this? I've got gpc-dev to prepare for.

comment:66 Changed 3 years ago by dconnolly

  • Owner changed from dconnolly to mhoag
  • Status changed from reopened to assigned

comment:67 Changed 3 years ago by mhoag

  • Resolution set to fixed
  • Status changed from assigned to closed

Turned the hospital_visit_dimension_view into a table (7e00d76b56e5 merged in heron-kanopolis 87814225d553). Figured it was the most likely culprit since it joins the pat_enc_hsp back on itself.

*Crossing fingers*

Looks like that fixed it (40 min runtime to load visit dimension console output build 72 (/job/heron_etl_PROD/72/console)):

2015-01-05 14:05:25,434 DEBUG heron.load_epic_dimensions single_loader(load_epic_dimensions)
2015-01-05 14:05:25,582 DEBUG heron.load_epic_dimensions.Epic_mappings_and_dimensions.id.epic_dimensions_load run(epic_dimensions_load.sql)
2015-01-05 14:46:40,744 INFO heron.load_epic_dimensions.Epic_mappings_and_dimensions.id.epic_dimensions_load run(epic_dimensions_load.sql)
2015-01-05 14:46:40,748 DEBUG heron.load_epic_dimensions.Epic_mappings_and_dimensions.id.i2b2_star_truncate run(i2b2_star_truncate.sql)

comment:68 Changed 3 years ago by mhoag

  • Resolution fixed deleted
  • Status changed from closed to reopened

ETL fell over (/job/heron_etl_PROD/72/console) because of deid visit dimension, Dan said he had an idea how to fix it.

comment:69 Changed 3 years ago by mhoag

  • Owner changed from mhoag to dconnolly
  • Status changed from reopened to assigned

comment:70 Changed 3 years ago by dconnolly

  • Resolution set to fixed
  • Status changed from assigned to closed

fixed in bedd620b6198: skip test patients

comment:71 Changed 3 years ago by mhoag

  • Resolution fixed deleted
  • Status changed from closed to reopened

ETL fell over (/job/heron_etl_PROD/73/console) due to duplicate facts being inserted for medical history.

comment:72 Changed 3 years ago by mhoag

  • Resolution set to fixed
  • Status changed from reopened to closed

Had a problem with creating uniqueness for the instance_num using string manipulation on pat_enc_date_real. Went ahead and switched to the ora_hash pattern devised in 86588b60bcba.

Fixed in (ae131d77eee7) and (5ed5f73239c8), merged to heron-kanopolis in (dc43158bc35b)

comment:73 Changed 3 years ago by mhoag

  • Resolution fixed deleted
  • Status changed from closed to reopened

comment:74 Changed 3 years ago by mhoag

heron etl fell over again (/view/HERON%20ETL/job/heron_etl_PROD/75/console). This time it was a problem with uniqueness on the instance_num for surgical history. Went ahead and fixed surgical history (and family history since it was in the same file) on changeset 153d5d83c29b. Merged with kanopolis on 83236b4ad270.

comment:75 Changed 3 years ago by mhoag

  • Resolution set to fixed
  • Status changed from reopened to closed

Yeah this should be closed as of the fix noted in comment:74

comment:76 Changed 3 years ago by dconnolly

  • Milestone changed from heron-kanopolis-update to dconnolly

Milestone renamed

comment:77 Changed 3 years ago by dconnolly

  • Milestone changed from dconnolly to heron-kanopolis-update

Milestone renamed

comment:78 follow-up: Changed 3 years ago by dconnolly

  • Keywords needstest added

Neither Nathan, Matt, nor myself can think of a cost-effective way to verify that this is working for the Kanopolis release (#3101).

Tagging as needstest; for follow-up, see GroupOnly/HeronQA#maintenance.

comment:79 in reply to: ↑ 78 ; follow-up: Changed 3 years ago by dconnolly

  • Blocking changed from 333, 3116 to 333, 2860, 3116

Replying to dconnolly:

Neither Nathan, Matt, nor myself can think of a cost-effective way to verify that this is working ...

#2860 seems to provide one way to verify.

comment:80 Changed 3 years ago by dconnolly

  • Keywords heron-weekly removed

comment:81 in reply to: ↑ 79 Changed 3 years ago by ngraham

Replying to dconnolly:

#2860 seems to provide one way to verify.

The previously failing case in #2860 seems to work now - see ticket:2860#comment:12.

comment:82 Changed 3 years ago by mprittie

  • Keywords public-web added

comment:83 Changed 3 years ago by mprittie

  • Sensitive unset
Note: See TracTickets for help on using tickets.