Opened 5 years ago

Closed 5 years ago

Last modified 5 years ago

#1835 closed defect (fixed)

HERON uses ICD-O-2 labels for ICD-O-3 morphology/histology cancer tumor registry codes

Reported by: dconnolly Owned by: dconnolly
Priority: major Milestone: heron-cow-creek-update
Component: data-repository Keywords: public-web
Cc: Jkeighle, tmcmahon, rwaitman, ngraham Blocked By:
Blocking: 1870 Sensitive: no


The materials from WHO (ticket:733#comment:36) have two sources of morphology codes:

  1. ICD-O-3_CSV-metadata/Morphenglish.txt and
  2. ICD-O-2_CSV/icd-o-3-morph.csv

At a glance, they seem to have the same info, but only the second has a hierarchical breakdown, so I used that one.

Now I see that there are differences in the details.

For example, for NAACCR|521:98353, the ICD-O-3 book, 3rd ed., says 982-983 LYMPHOID LEUKEMIAS (C42.1), but ICD-O-2_CSV/icd-o-3-morph.csv says M982,Lymphoid leukaemias, and M983,Plasma cell leukaemia,.


("SNOMED") has adopted the ICD-O classification of morphology

-- ICD-O in wikipedia

SNOMED is in UMLS, and we have access to UMLS, so perhaps we could get the hierarchical structure there. But when we staged UMLS (#1316) we didn't stage SNOMED.

Attachments (1)

breast-except.html (63.3 KB) - added by dconnolly 5 years ago.


Change History (10)

comment:1 follow-up: Changed 5 years ago by dconnolly

I invited John K. up to validate the work in #733.

In the SEER Site Recode, we see:


C500-C509 excluding 9590-9989, and sometimes 9050-9055, 9140+

We ran the corresponding queries in HERON and the numbers looked fishy:

  • Breast under SEER Site: 8,385 patients
  • Primary site Breast: 8,392 patients
  • So we'd expect a query for primary site Breast and histology 9590 etc. (attachment:breast-except.html) to give 7, but it gave 67
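
The expectation above can be sketched in Python. This is only an illustration of the quoted SEER Site Recode rule, not the query HERON actually runs: the function name is hypothetical, and only the unconditional exclusion (9590-9989) is modeled, since the ticket does not say when the "sometimes" exclusions (9050-9055, 9140+) apply.

```python
import re

def is_seer_breast(primary_site: str, histology: int) -> bool:
    """Hypothetical check for the quoted rule: C500-C509 excluding 9590-9989.

    The conditional exclusions (9050-9055, 9140+) are deliberately left out.
    """
    m = re.fullmatch(r"C(\d{3})", primary_site)
    if m is None:
        return False
    in_site_range = 500 <= int(m.group(1)) <= 509
    excluded_histology = 9590 <= histology <= 9989
    return in_site_range and not excluded_histology

# Patients with primary site Breast minus patients under SEER Site Breast
# should equal the number of breast cases with an excluded histology.
expected_excluded = 8392 - 8385
print(expected_excluded)
```

If the two counts are right, a query for the excluded histologies within primary site Breast should return about `expected_excluded` patients, which is why 67 looked wrong.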

Changed 5 years ago by dconnolly

comment:2 follow-up: Changed 5 years ago by rwaitman

Checking it out on test. Very neat morphology now. Is histology supposed to be blank?

comment:3 in reply to: ↑ 2 Changed 5 years ago by dconnolly

Tamara, I'd like to make sure this makes sense to you for training purposes and then chat about whether more documentation than just these trac notes is worthwhile...

Replying to rwaitman:

Is histology supposed to be blank?

Yes, that's by design. John confirmed that when he talks about histology, he typically means Type&Behav.

The histology/morphology data from NAACCR is redundant. We get

  • 0419 Morph--Type&Behav ICD-O-2
  • 0420 Histology (92-00) ICD-O-2
  • 0430 Behavior (92-00) ICD-O-2

(and likewise 0521, 0522, 0523 for ICD-O-3)

For example, in 0420, when we have Morph Type&Behav of:

  • M9652/3 Hodgkin's disease, mixed cellularity NOS

then we'll have 9652 in Histology and 3 in Behavior.
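
To make the redundancy concrete, here is a minimal sketch of how a Type&Behav value decomposes into the other two items. The function and type names are hypothetical, and the accepted input shapes (an optional leading M, an optional slash) are an assumption based on the examples in this ticket.

```python
import re
from typing import NamedTuple

class Morphology(NamedTuple):
    histology: str  # 4-digit histologic type, e.g. "9652"
    behavior: str   # 1-digit behavior code, e.g. "3"

def split_type_behav(code: str) -> Morphology:
    """Split a Type&Behav value such as 'M9652/3' or '96523'."""
    m = re.fullmatch(r"M?(\d{4})/?(\d)", code.strip())
    if m is None:
        raise ValueError(f"unrecognized Type&Behav code: {code!r}")
    return Morphology(histology=m.group(1), behavior=m.group(2))

print(split_type_behav("M9652/3"))
```

Since both parts are recoverable from the combined value, the separate Histology and Behavior items carry no extra information.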

WHO provides labels for the combination of Type&Behav; not for histological type alone. Since the data is completely redundant, I figured showing the numeric codes was worse than useless.

Perhaps I should have hidden the 0420 and 0522 folders altogether, but I didn't get around to that.

Meanwhile, the main point for training is: start with the SEER Site Summary and only mess with specific histologies and such if you're sure you need to.

comment:4 Changed 5 years ago by dconnolly

  • Status changed from new to accepted

comment:5 in reply to: ↑ 1 Changed 5 years ago by dconnolly

  • Status changed from accepted to assigned

Replying to dconnolly:

The query needs to be constrained to the same financial encounter to make sense.

When I do that, I get 5. Now we're just off by 2. One has 'NAACCR|521:96993' which is only in ICD-O-3. I haven't found an explanation for the other one, yet.
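
A small sketch of why the encounter constraint matters. The rows, IDs, and concept labels below are made up for illustration and are not HERON's actual schema or codes; the point is only that "both concepts for the same patient" and "both concepts in the same encounter" are different questions.

```python
from collections import defaultdict

# (patient_id, encounter_id, concept) rows; all values are hypothetical.
facts = [
    (1, "e1", "site:breast"),
    (1, "e1", "hist:9590+"),  # both facts describe the same tumor record
    (2, "e2", "site:breast"),
    (2, "e3", "hist:9590+"),  # histology belongs to a different tumor record
    (3, "e4", "site:breast"),
]

def patients_any_encounter(rows, a, b):
    """Patients who have both concepts anywhere in their record."""
    by_patient = defaultdict(set)
    for patient, _encounter, concept in rows:
        by_patient[patient].add(concept)
    return {p for p, cs in by_patient.items() if a in cs and b in cs}

def patients_same_encounter(rows, a, b):
    """Patients who have both concepts within a single encounter."""
    by_encounter = defaultdict(set)
    for patient, encounter, concept in rows:
        by_encounter[(patient, encounter)].add(concept)
    return {p for (p, _e), cs in by_encounter.items() if a in cs and b in cs}

print(patients_any_encounter(facts, "site:breast", "hist:9590+"))
print(patients_same_encounter(facts, "site:breast", "hist:9590+"))
```

In this toy data, patient 2 matches the unconstrained query because of a second, unrelated tumor record, which is the kind of overcount the constraint removes.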

comment:6 Changed 5 years ago by ngraham

  • Blocking set to 1869

comment:7 Changed 5 years ago by dconnolly

  • Blocking changed from 1869 to 1870

This affects concepts but not observation facts.

comment:8 Changed 5 years ago by dconnolly

  • Resolution set to fixed
  • Status changed from assigned to closed

Fixed in:

Lightly tested.

comment:9 Changed 5 years ago by dconnolly

  • Keywords public-web added

Batch update from file neosho-cow-creek-public-web.xls
