Posts for the month of October 2012

Mark Medovich on Software Defined Networks

Mark Medovich, Juniper's Chief Architect for the Public Sector, gave a very interesting talk at the Kansas City Software Defined Networking Luncheon, hosted by FishNet Security.

It directly addressed a gap in my knowledge regarding our new NetworkInnovation project, where we plan to "evaluate the applicability of GENI software defined networking and OpenFlow/Openstack technologies to support the secure transmission and storage of personal health information with the Google Fiber network."

I have enough experience as a customer of VM clusters to have a vague notion of what OpenStack is about, but I'm brand new to OpenFlow and software defined networks. My first thought on exposure to them was: so what happened to The Rise of the Stupid Network?

Medovich explained that OpenFlow development was driven by high performance computing (HPC). Researchers are trying to reduce the latency for moving data between compute nodes.

The term "switch fabric" flew by... one of many buzzwords that I'm slowly picking up. "If we are going All L2..." I recognized L2 as a reference to a layer in the OSI model, but I didn't remember much about it, and I didn't have enough connectivity to look it up. Afterward, I reminded myself that it's where switches live, as opposed to hubs below and routers above.
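My mnemonic about where switches live can be made concrete. A layer-2 switch learns which port each MAC address sits behind by watching source addresses, and floods only when it doesn't know the destination (a hub floods everything; a router works on L3 addresses instead). A minimal sketch, with names of my own invention:

```python
class LearningSwitch:
    """Toy L2 switch: learn the ingress port for each source MAC;
    forward to a known port, or flood when the destination is unknown."""

    def __init__(self, num_ports):
        self.ports = range(num_ports)
        self.mac_table = {}  # MAC address -> port

    def handle_frame(self, src, dst, in_port):
        self.mac_table[src] = in_port  # learn (or refresh) the source
        if dst in self.mac_table:
            return [self.mac_table[dst]]  # known destination: one port
        return [p for p in self.ports if p != in_port]  # unknown: flood

sw = LearningSwitch(4)
print(sw.handle_frame("aa:aa", "bb:bb", 0))  # bb:bb unknown: flood to 1, 2, 3
print(sw.handle_frame("bb:bb", "aa:aa", 2))  # aa:aa was learned: forward to 0
```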

"Networks within networks" was another phrase that caught my attention. It appealed to me as akin to scale-free design in Web Architecture. It reminded me of heated discussions in the IETF about the evils of NAT vs. the end-to-end purity of IPv6. The people I trust were on the IPv6 side (and IPv6 is great for lots of other reasons), but as I reflected later, the idea of one big flat IPv6 network seems like a monoculture, not scale-free.

He talked about multi-tenant data centers:

photo of "Multi-tenant flows within an end site" slide by Medovich

He used an example from when he was at Sun, visiting CVS Caremark: they had to provision for the monthly Medicaid Monday burst, which left a lot of excess capacity for most of the month. My understanding is that Amazon's cloud services came about roughly the same way: they have to provision for Christmas, which left them with a lot of spare capacity most of the time.

Traditional three-level networks are OK provided capacity is reasonably predictable, he said, but they don't deal with dynamic demand.

photo of "SCALING Multi-tenant SERVICES" slide by Medovich

You can't make service level agreements (SLAs) for dynamic demand with traditional networks; the best you can do is a service level probability (SLoP).
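One way to read the SLA vs. SLoP distinction: with fixed capacity and bursty demand, all you can quote is the fraction of intervals in which demand is met. A toy Monte Carlo sketch, where the demand model and all the numbers are mine, not from the talk:

```python
import random

def service_level_probability(capacity, demand_fn, trials=100_000, seed=1):
    """Estimate the fraction of intervals in which fixed capacity
    covers random demand -- a 'SLoP' rather than a guaranteed SLA."""
    rng = random.Random(seed)
    met = sum(demand_fn(rng) <= capacity for _ in range(trials))
    return met / trials

# Hypothetical bursty tenant: usually ~40 units of demand, with a
# 'Medicaid Monday'-style spike toward 100 about one day in thirty.
def bursty(rng):
    return rng.gauss(100, 5) if rng.random() < 1 / 30 else rng.gauss(40, 5)

print(service_level_probability(capacity=60, demand_fn=bursty))   # high, but < 1
print(service_level_probability(capacity=110, demand_fn=bursty))  # ~1, mostly idle
```

Provisioning for the spike (capacity 110) buys a near-certain service level at the cost of sitting mostly idle, which is exactly the excess capacity Medovich described.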

This brings us to OpenFlow. "OpenFlow is all about the data center and making virtualization better."

He introduced it using a slide from the OpenFlow Presentation:

"Here's the problem: that step 2. Encapsulate and forward to controller. What controller?" The controller isn't specified, he said.
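The model behind that slide is the OpenFlow match-action pipeline: the switch checks a packet against its flow entries, and on a table miss it encapsulates the packet and punts it to the controller, whoever that turns out to be. A sketch of the lookup (the table entries and field names here are mine):

```python
def lookup(flow_table, packet):
    """Return the action for the highest-priority matching flow entry,
    or a packet-in to the (unspecified) controller on a table miss."""
    for match, action in sorted(flow_table, key=lambda e: -e[0]["priority"]):
        if all(packet.get(k) == v for k, v in match.items() if k != "priority"):
            return action
    # step 2 from the slide: encapsulate and forward to controller
    return ("packet_in", packet)

table = [
    ({"priority": 10, "dst": "10.0.0.2"}, ("output", 2)),
    ({"priority": 1, "dst": "10.0.0.1"}, ("output", 1)),
]
print(lookup(table, {"dst": "10.0.0.2"}))  # ('output', 2)
print(lookup(table, {"dst": "10.0.0.9"}))  # table miss: punt to controller
```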

Variability of OpenFlow devices (switches from this vendor or that) introduces too many variables. The only way the Juniper engineering team could see to make the scaling work was an any-to-any switch fabric. They had to collapse the network, from 3 tiers to 2 to 1.

People are building this sort of scalable multi-tenant network, he said, but it's not OpenFlow: Cisco UCS, Juniper fabric. 10,000 ports. Software programmable.

"Don't get me wrong; I'm not here to knock OpenFlow. Juniper does support OpenFlow." Just don't expect OpenFlow to be the whole solution. I gather Juniper has filled in all the gaps implicit in OpenFlow use cases.

He threw out "... close to the lambda ..." as a goal the audience would be familiar with. This audience member was not.

Lambda switching uses small amounts of fiber-optic cable and differing light wavelengths (called lambdas) to transport many high-speed datastreams to their destinations -- Network World research center

His discussion of trust, interfaces, and economics reminded me of studying Miller's work on object capability security, the principle of least authority, and patterns of cooperation without vulnerability. More on that in another item. Meanwhile...

I felt more on solid ground when he started to discuss software architecture.

Way back, Juniper decided to put an XML RPC server in every switch. This was adopted by the IETF as Netconf. Juniper has a rich SDK layered on top, which is how they rapidly ported OpenFlow to their devices. In answer to criticism that they don't have OpenFlow implemented in firmware, he compared it with developing a new platform from scratch, an argument with obvious appeal to me. Who's going to implement "legacy" protocols like PPPoE, baked into various purchasing specs?
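For the curious: the Netconf protocol (RFC 6241) really is XML RPCs over a session, and a minimal request is easy to build. A sketch of a `<get-config>` request for the running configuration, using only the standard library:

```python
import xml.etree.ElementTree as ET

# Namespace from RFC 6241 (NETCONF base). The message-id is chosen by
# the client; the server echoes it back in its <rpc-reply>.
NS = "urn:ietf:params:xml:ns:netconf:base:1.0"

def get_config_rpc(message_id, source="running"):
    """Build a minimal NETCONF <get-config> request as an XML string."""
    rpc = ET.Element(f"{{{NS}}}rpc", attrib={"message-id": str(message_id)})
    get_config = ET.SubElement(rpc, f"{{{NS}}}get-config")
    src = ET.SubElement(get_config, f"{{{NS}}}source")
    ET.SubElement(src, f"{{{NS}}}{source}")
    return ET.tostring(rpc, encoding="unicode")

print(get_config_rpc(101))
```

This only constructs the message; a real client would also handle the hello exchange, framing, and transport (usually SSH), which Juniper's SDK and libraries like ncclient wrap for you.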

While working toward Junosphere, the software team saw that they couldn't wait until the hardware was finished; they virtualized the whole thing. The result is now used by major telecom providers (Comcast, Telecom Italia) as a test lab.

At the other extreme, he explained how their architecture scales down to multi-tenancy embedded applications to meet military needs.

He mentioned in passing that their architecture includes a JBoss application server. I have a bit of a bad taste from using JBoss as the platform for HERON and hence i2b2, but I gather the version we use is ages out of date. So this nod encourages me to keep an open mind.

He mentioned a "single pane of glass" user interface with roles and permissions and templates. Again, I wondered to what extent this architecture employs the principle of least authority.

The Q&A that followed the talk quickly went over my head with "top of rack architectures" and such. But I did pick up a few more details about virtualization:

Medovich: Which are you using, Xen or KVM?

Audience member: KVM

Medovich: Good for you.

Medovich brought up SAN storage architectures and noted a trend, with the aggregate bandwidth of racks approaching a TB...

Medovich: Are you using a SAN or local storage?

Audience member: local storage

Medovich: That's the right answer.

He brought up the big Amazon outage and explained the causes in some detail. A big compute job could generate a bunch of data in one zone and then decommission its nodes. Then the customer wants to re-instantiate the 200 nodes, but that much compute is only available in another zone, so Amazon has to migrate the data. It's like de-fragmenting a disk. And eventually, the aggregate bandwidth brought the whole thing down.

The next generation architecture will have to continuously de-fragment the data center.
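To make the de-fragmentation idea concrete, here's a toy rebalancer of my own devising, not anything Medovich presented: to place a large job in one zone, migrate the smallest tenants out until enough capacity is free.

```python
def plan_migrations(zones, job_size, target):
    """Pick tenants to move out of `target` until the job fits.
    zones: {name: (capacity, {tenant: size})}.
    Returns (migrations, fits) where migrations is [(tenant, dest)]."""
    free = {z: cap - sum(tenants.values()) for z, (cap, tenants) in zones.items()}
    _, tenants = zones[target]
    moves = []
    for name, size in sorted(tenants.items(), key=lambda t: t[1]):
        if free[target] >= job_size:
            break
        # move this tenant to any other zone with room for it
        dest = next((z for z in zones if z != target and free[z] >= size), None)
        if dest is None:
            continue
        moves.append((name, dest))
        free[target] += size
        free[dest] -= size
    return moves, free[target] >= job_size

zones = {
    "east": (100, {"a": 40, "b": 30, "c": 20}),  # 10 units free
    "west": (100, {"d": 20}),                    # 80 units free
}
moves, ok = plan_migrations(zones, job_size=50, target="east")
print(moves, ok)  # migrate c (20) and b (30) to west; the 50-unit job now fits
```

Even this greedy toy shows the cost: freeing 50 units in one zone meant moving 50 units of tenant data across zones, which is the migration traffic that bites at scale.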

p.s. A capsule subset of the slides he used is available:

01/25/2012 Winter 2012 ESCC/Internet2 Joint Techs Software Defined Networks - Juniper Networks

HERON Walnut update introduces O2 smart phrases, medication by dose and age at visit searching

In O2, smart phrases (commonly used words or phrases) are used to expedite patient documentation. Find these phrases under Reports->Visit Notes->Note Concepts.

Search medications by dose for inpatient medication orders. When constructing a search, drag "Cumulative Daily Dose of Single Inpatient Order" and you will be prompted to enter desired dose criteria.

Age at visit is searchable for patient visits. Please note this is based on de-identified dates. See HERON training for more information on how dates are shifted.
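HERON's actual shifting scheme is described in the HERON training materials, not here; but a common de-identification approach is a per-patient random offset applied uniformly to all of that patient's dates, so intervals such as age at visit survive the shift. A sketch under that assumption:

```python
import datetime
import random

def date_shifter(patient_id, max_days=364, seed=42):
    """Return a function that shifts dates by a stable, per-patient
    random offset (same patient -> same offset, so intervals hold)."""
    rng = random.Random(f"{seed}:{patient_id}")  # deterministic per patient
    offset = datetime.timedelta(days=rng.randint(1, max_days))
    return lambda d: d - offset

shift = date_shifter("patient-17")
v1 = shift(datetime.date(2012, 1, 10))
v2 = shift(datetime.date(2012, 3, 10))
# The interval between two visits is preserved even though both dates moved:
print(v2 - v1 == datetime.date(2012, 3, 10) - datetime.date(2012, 1, 10))
```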

Need help using HERON? Learn how to Sponsor a HERON user, request data, and perform a search on the Informatics training video page.

HERON Walnut Contents Summary

This month, our tour of rivers and lakes in Kansas honors Walnut River.

The HERON repository contains approximately 856 million real observations from the hospital, clinics, and research systems:

| | Observations | Patients | Source | Go-Live | Snapshot | Issues |
|---|---|---|---|---|---|---|
| Demographics | 18.3M | 1.93M | KUH Billing (O2 via SMS) | 1980s | Sept 2012 | various* |
| | | | UKP Billing | 2000 | Sept 2012 | |
| | 13.2K | 13.2K | Frontiers participant registry | Jun 2009 | Sept 2012 | |
| | 185.9K | 185.9K | Social Security Death Index | 1962 | Sept 2012 | |
| Diagnoses (ICD9) | 33.6M | 638K | KUH/O2/Epic | Nov 2007 | Sept 2012 | various* |
| | | | UKP Billing | 2000 | Sept 2012 | |
| | | | University HealthSystem Consortium (UHC) | Q4 2008 | June 2012 | |
| Medications | 71.9M | 283K | Organized by VA Class | Nov 2007 | Sept 2012 | |
| | | | KUH/O2/Epic | Nov 2007 | Sept 2012 | various* |
| Nursing Observations | 524M | ? | KUH/O2/Epic | Nov 2007 | Sept 2012 | various* |
| Lab Results | 79.2M | 278K | KUH/O2/Epic | 2003 | Sept 2012 | various* |
| Procedure Orders | 52.7M | 425K | KUH/O2/Epic | 2003 (?) | Sept 2012 | various* |
| Procedures (CPT) | 10.4M | 566K | UKP Billing | 2000 | Sept 2012 | |
| Reports/Notes | 24M | 214K | KUH/O2/Epic | ? | Sept 2012 | |
| Specimens | 34.6K | 3.22K | KUMC Biospecimen Repository | ? | Sept 2012 | |
| Visit Details | ? | ? | KUH/O2/Epic | Nov 2007 | Sept 2012 | #1514 |
| Cancer Cases | 9.6M | 65.2K | KUH Cancer Registry | 1950s | Sept 2012 | labels* |
| Hospital Quality Metrics | 4.12M | 60.9K | University HealthSystem Consortium (UHC) | Q4 2008 | June 2012 | |
| Triple Negative Breast Cancer Registry (BRCA) | 17.8K | 133 | REDCap | July 2011 | Sept 2012 | |
| All | 816M | | | | | |


Some material in the UMLS Metathesaurus is from copyrighted sources of the respective copyright holders. Users of the UMLS Metathesaurus are solely responsible for compliance with any copyright, patent or trademark restrictions and are referred to the copyright, patent or trademark notices appearing in the original sources, all of which are hereby incorporated by reference.

Beta Disclaimer

We are providing this early access to obtain feedback from you, the research community. While we are actively working on validating the data loaded into the system with hospital and clinic technical staff, there may be problems with our translation of data from our source systems (HospitalEpicSource and ClinicIdxSource) into HERON.

Please email us at if you discover information you believe may be erroneous.

We are actively working on enhancing the types of data included. Stay tuned to our roadmap to track progress toward upcoming releases.

Various Issues Still Apply

Keep in mind the issues noted in the original HERON beta notice, including:

Enhancements and Problems/Defects/Issues Addressed in this Release

Approximately 2% of Medication Facts are Not Covered by our VA Med Hierarchy

Outstanding Problems/Defects/Issues

On language complexity as authority and new hope for secure systems

Why is the overwhelming majority of common networked software still not secure, despite all effort to the contrary? Why is it almost certain to get exploited so long as attackers can craft its inputs? Why is it the case that no amount of effort seems to be enough to fix software that must speak certain protocols?

The video of The Science of Insecurity by Meredith Patterson crossed my radar several times last year, but I just recently found time to watch it. She offers hope:

In this talk we'll draw a direct connection between this ubiquitous insecurity and basic computer science concepts of Turing completeness and theory of languages. We will show how well-meant protocol designs are doomed to their implementations becoming clusters of 0-days, and will show where to look for these 0-days. We will also discuss simple principles of how to avoid designing such protocols.

In memory of Len Sassaman

In discussion of Postel's Principle, she argues:

  • Treat input-handling computational power [aka input language complexity] as privilege, and reduce it whenever possible.
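That prescription translates into a coding discipline: keep the input language simple (regular or context-free), recognize it fully before acting on any of it, and fail closed rather than being "liberal in what you accept." A sketch for a message format I made up, NAME:LENGTH:PAYLOAD, whose header is a regular language:

```python
import re

# Full recognition before processing: the grammar is regular, the
# recognizer rejects anything that doesn't match exactly, and no code
# downstream ever sees raw, unvalidated input.
MESSAGE = re.compile(rb"\A([a-z]{1,16}):([0-9]{1,4}):")

def recognize(data):
    """Accept the message only if it matches the grammar exactly;
    no error recovery, no guessing at the sender's intent."""
    m = MESSAGE.match(data)
    if not m:
        raise ValueError("reject: malformed header")
    name, length = m.group(1).decode(), int(m.group(2))
    payload = data[m.end():]
    if len(payload) != length:
        raise ValueError("reject: length mismatch")
    return name, payload

print(recognize(b"ping:5:hello"))  # ('ping', b'hello')
```

The recognizer needs only a finite automaton's worth of computational power, which is exactly the "reduce it whenever possible" in the bullet above.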

This is essentially the principle of least privilege, which is the cornerstone of capability systems.
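The capability-system version of least privilege is easy to show in miniature: instead of handing a component a whole object (ambient authority), hand it a narrow, revocable facet carrying only the one right it needs. All the names here are illustrative:

```python
class Account:
    """An object with more authority than most clients should get."""
    def __init__(self, balance):
        self.balance = balance
    def deposit(self, n):
        self.balance += n
    def withdraw(self, n):
        self.balance -= n

def deposit_only_facet(account):
    """A capability: an unforgeable reference to one permitted action,
    plus a separate revoker held by whoever granted it."""
    revoked = False
    def deposit(n):
        if revoked:
            raise PermissionError("capability revoked")
        account.deposit(n)
    def revoke():
        nonlocal revoked
        revoked = True
    return deposit, revoke

acct = Account(100)
deposit, revoke = deposit_only_facet(acct)
deposit(25)          # the holder can add funds...
revoke()             # ...until the grantor revokes the facet
print(acct.balance)  # withdraw() was never reachable through the facet
```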

I have been arguing for keeping web language complexity down since I started working on HTML. The official version is the 2006 W3C Technical Architecture Group finding on The Rule of Least Power, but as far back as my 1994 essay, On Formally Unconvertable Document Formats, I wrote:

The RTF, TeX, nroff, etc. document formats provide very sophisticated automated techniques for authors of documents to express their ideas. It seems strange at first to see that plain text is still so widely used. It would seem that PostScript is the ultimate document format, in that its expressive capabilities include essentially anything that the human eye is capable of perceiving, and yet it is device-independent.

And yet if we take a look at the task of interpreting data back into the ideas that they represent, we find that plain text is much to be preferred, since reading plain text is so much easier to automate than reading GIF files (optical character recognition) or PostScript documents (halting problem). In the end, while the source to various TeX or troff documents may correspond closely to the structure of the ideas of the author, and while PostScript allows the author very precise control and tremendous expressive capability, all these documents ultimately capture an image of a document for presentation to the human eye. They don't capture the original information as symbols that can be processed by machine.

To put it another way, rendering ideas in PostScript is not going to help solve the problem of information overload -- it will only compound the situation.

But as recently as my Dec 2008 post on Web Applications security designs, I didn't see the connection between language complexity and privilege, and had little hope of things getting better:

The E system, which is a fascinating model of secure multi-party communication (not to mention lockless concurrency), [...] seems an impossibly high bar to reach, given the worse-is-better tendency in software deployment.

On the other hand, after wrestling with the patchwork of javascript security policies in browsers in the past few weeks, the capability approach in adsafe looks simple and elegant by comparison. Is there any chance we can move the state-of-the-art that far?

After all, who would be crazy enough to essentially throw out all the computing platforms we use and start over?

I've been studying CapROS: The Capability-based Reliable Operating System. Its heritage goes back through EROS in 1999 and KeyKOS in 1988 to GNOSIS in 1979. After a few hours of study, I started to wonder where the pull would come from to provide energy to complete the project. Then this headline crossed my radar:

I saw some comments encouraging them to look at EROS. I hope they do. Meanwhile, Capsicum: practical capabilities for UNIX lets capability approaches co-exist with traditional unix security.

These days, the browser is the biggest threat vector, and Turing-complete data, i.e. mobile code, remains notoriously difficult to secure:

The sort of thing that gives me hope is chromium-capsicum - a version of Google's Chromium web browser that uses capability mode and capabilities to provide effective sandboxing of high-risk web page rendering.

Another is Servo, Mozilla's exploration into a new browser architecture built on Rust. Rust is a new systems programming language designed toward concerns of “programming in the large”, that is, of creating and maintaining boundaries – both abstract and operational – that preserve large-system integrity, availability and concurrency.

It took me several hours, but the other night I managed to build Rust and Servo. While Servo is clearly in its infancy, passing a few dozen tests but not bearing much resemblance to an actual web browser, Rust is starting to feel quite mature.

I'd like to see more of a least-authority approach in the Rust standard library. Here's hoping for time to participate.