Posts in category networking

Mark Medovich on Software Defined Networks

Mark Medovich, Juniper's Chief Architect for the Public Sector, gave a very interesting talk at the Kansas City Software Defined Networking Luncheon, hosted by FishNet Security.

It directly addressed a gap in my knowledge regarding our new NetworkInnovation project, where we plan to "evaluate the applicability of GENI software defined networking and OpenFlow/Openstack technologies to support the secure transmission and storage of personal health information with the Google Fiber network."

I have enough experience as a customer of VM clusters to have a vague notion of what OpenStack is about, but I'm brand new to OpenFlow and software defined networks. My first thought on exposure to them was: so what happened to The Rise of the Stupid Network?

Medovich explained that OpenFlow development was driven by high performance computing (HPC). Researchers are trying to reduce the latency for moving data between compute nodes.

The term "switch fabric" flew by... one of many buzzwords that I'm slowly picking up. "If we are going All L2..." I recognized L2 as a reference to a layer in the OSI model, but I didn't remember much about it, and I didn't have enough connectivity to look it up. Afterward, I reminded myself that it's where switches live, as opposed to hubs below and routers above.

"Networks within networks" was another phrase that caught my attention. It appealed to me as like scale-free design in Web Architecture. It reminded me of heated discussions in the IETF about the evils of NAT vs. the end-to-end purity of IPv6. The people I trust were on the IPv6 side (and IPv6 is great for lots of other reasons) but as I reflected later, the idea of one big flat IPv6 network seems like a monoculture, not scale-free.

He talked about multi-tenant data centers:

photo of "Multi-tenant flows within an end site" slide by Medovich

He used an example from when he was at Sun, visiting CVS Caremark: they had to provision for the monthly Medicaid Monday burst, which left a lot of excess capacity for the rest of the month. My understanding is that Amazon's cloud services came about in roughly the same way: Amazon has to provision for Christmas, which leaves it with a lot of spare capacity most of the year.

Traditional three-tier networks are OK provided capacity is reasonably predictable, he said, but they don't deal with dynamic demand.

 "photo of "SCALING Multi-tenant SERVICES" slide by Medovich"

You can't make service level agreements (SLAs) for dynamic demand with traditional networks; the best you can do is a service level probability (SLoP).
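
To make that concrete, here's a toy sketch of my own (not Medovich's), with made-up numbers: given a fixed capacity and simulated bursty demand, the best you can state is the fraction of time demand fits under capacity.

```python
# Toy illustration, not from the talk: with dynamic demand you can only
# estimate a service level probability, not guarantee a service level.
# The capacity and demand distribution are made-up numbers.
import random

random.seed(0)
capacity_gbps = 40
# Simulated bursty tenant demand in Gbps (log-normal, heavy right tail).
demand_samples = [random.lognormvariate(3, 0.5) for _ in range(10_000)]

slop = sum(d <= capacity_gbps for d in demand_samples) / len(demand_samples)
print("service level probability: %.1f%%" % (100 * slop))
```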

This brings us to OpenFlow. "OpenFlow is all about the data center and making virtualization better."

He introduced it using a slide from the OpenFlow Presentation:

"Here's the problem: that step 2. Encapsulate and forward to controller. What controller?" The controller isn't specified, he said.

The variability of OpenFlow devices (switches from this vendor or that) introduces too many variables. The only way the Juniper engineering team could see to make the scaling work was an any-to-any switch fabric: they had to collapse the network from three tiers to two to one.

People are building this sort of scalable multi-tenant network, he said, but it's not OpenFlow: Cisco UCS, Juniper fabric. 10,000 ports. Software programmable.

"Don't get me wrong; I'm not here to knock OpenFlow. Juniper does support OpenFlow." Just don't expect OpenFlow to be the whole solution. I gather Juniper has filled in all the gaps implicit in OpenFlow use cases.

He threw out "... close to the lambda ..." as a goal the audience would be familiar with. This audience member was not.

Lambda switching uses small amounts of fiber-optic cable and differing light wavelengths (called lambdas) to transport many high-speed datastreams to their destinations -- Network World research center

His discussion of trust, interfaces, and economics reminded me of studying Miller's work on object capability security, the principle of least authority, and patterns of cooperation without vulnerability. More on that in another item. Meanwhile...

I felt on more solid ground when he started to discuss software architecture.

Way back, Juniper decided to put an XML RPC server in every switch; this was adopted by the IETF as NETCONF. Juniper has a rich SDK layered on top, which is how they rapidly ported OpenFlow to their devices. In answer to criticism that they don't have OpenFlow implemented in firmware, he compared it with developing a new platform from scratch, an argument with obvious appeal to me. Who's going to implement "legacy" protocols like PPPoE, baked into various purchasing specs?
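
For the record, here's roughly what talking to that per-switch XML RPC server looks like over NETCONF, sketched with the Python ncclient library. The hostname and credentials are placeholders, and this isn't Juniper's SDK, just the standard protocol.

```python
# A minimal NETCONF session sketch: fetch the running configuration from
# a switch as XML. Host and credentials below are placeholders.
from ncclient import manager

with manager.connect(host="switch.example.net",
                     port=830,
                     username="admin",
                     password="secret",
                     hostkey_verify=False) as conn:
    # The reply is the device's running configuration, as XML, the same
    # tree the CLI and vendor SDKs operate on.
    reply = conn.get_config(source="running")
    print(reply.xml)
```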

While working toward Junosphere, the software team saw that they couldn't wait for the hardware to be finished, so they virtualized the whole thing. The result is now used by major telecom providers (Comcast, Telecom Italia) as a test lab.

At the other extreme, he explained how their architecture scales down to multi-tenant embedded applications to meet military needs.

He mentioned in passing that their architecture includes a JBoss application server. I have a bit of a bad taste from using JBoss as the platform for HERON and hence i2b2, but I gather the version we use is ages out of date. So this nod encourages me to keep an open mind.

He mentioned a "single pane of glass" user interface with roles and permissions and templates. Again, I wondered to what extent this architecture employs the principle of least authority.

The Q&A that followed the talk quickly went over my head with "top of rack architectures" and such. But I did pick up a few more details about virtualization:

Medovich: Which are you using, Xen or KVM?

Audience member: KVM

Medovich: Good for you.

Medovich brought up SAN storage architectures and noted a trend, with the aggregate bandwidth of racks approaching a TB...

Medovich: Are you using a SAN or local storage?

Audience member: local storage

Medovich: That's the right answer.

He brought up the big Amazon outage and explained the causes in some detail. A big compute job could generate a bunch of data on, say, 200 nodes in one zone and then decommission the nodes. Then the customer wants to re-instantiate those 200 nodes, but that much compute is only available in another zone, so Amazon has to migrate the data. It's like de-fragmenting a disk. And eventually, the aggregate bandwidth brought the whole thing down.
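
A back-of-the-envelope sketch (my numbers, not his) shows why that migration hurts: moving the data behind 200 nodes across zones can tie up shared links for most of a day.

```python
# Back-of-the-envelope with assumed numbers: time to migrate one tenant's
# data to another zone before its 200 nodes can be re-instantiated there.
nodes = 200
data_per_node_gb = 500      # assumed data left behind per node, in GB
cross_zone_gbps = 10        # assumed share of cross-zone bandwidth, Gbit/s

total_bits = nodes * data_per_node_gb * 8e9
hours = total_bits / (cross_zone_gbps * 1e9) / 3600
print("migration takes roughly %.1f hours" % hours)   # ~22 hours
# A few tenants doing this at once is enough to saturate the shared links.
```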

The next generation architecture will have to continuously de-fragment the data center.

p.s. A capsule subset of the slides he used is available:

01/25/2012 Winter 2012 ESCC/Internet2 Joint Techs Software Defined Networks - Juniper Networks

On language complexity as authority and new hope for secure systems

Why is the overwhelming majority of common networked software still not secure, despite all effort to the contrary? Why is it almost certain to get exploited so long as attackers can craft its inputs? Why is it the case that no amount of effort seems to be enough to fix software that must speak certain protocols?

The video of The Science of Insecurity by Meredith Patterson crossed my radar several times last year, but I just recently found time to watch it. She offers hope:

In this talk we'll draw a direct connection between this ubiquitous insecurity and basic computer science concepts of Turing completeness and theory of languages. We will show how well-meant protocol designs are doomed to their implementations becoming clusters of 0-days, and will show where to look for these 0-days. We will also discuss simple principles of how to avoid designing such protocols.

In memory of Len Sassaman

In discussion of Postel's Principle, she argues:

  • Treat input-handling computational power [aka input language complexity] as privilege, and reduce it whenever possible.

This is essentially the principle of least privilege, which is the cornerstone of capability systems.
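
Here's a toy example of what that means in practice (mine, not Patterson's): the input "parser" for a field is deliberately limited to a regular language, rather than handing the input to a Turing-complete evaluator.

```python
# Toy illustration of input-handling power as privilege: accept only a
# regular language for this field, instead of something like eval().
import re

AMOUNT = re.compile(r"^[0-9]{1,9}(\.[0-9]{2})?$")  # a regular language

def parse_amount(text):
    """Accept exactly what the grammar allows; reject everything else."""
    if not AMOUNT.match(text):
        raise ValueError("not a well-formed amount: %r" % text)
    return float(text)

print(parse_amount("19.99"))          # 19.99
# parse_amount("__import__('os')")    # raises ValueError, never evaluated
# eval("__import__('os')")            # the powerful, privileged alternative
```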

I have been arguing for keeping web language complexity down since I started working on HTML. The official version is the 2006 W3C Technical Architecture Group finding on The Rule of Least Power, but as far back as my 1994 essay, On Formally Unconvertable Document Formats, I wrote:

The RTF, TeX, nroff, etc. document formats provide very sophisticated automated techniques for authors of documents to express their ideas. It seems strange at first to see that plain text is still so widely used. It would seem that PostScript is the ultimate document format, in that its expressive capabilities include essentially anything that the human eye is capable of perceiving, and yet it is device-independent.

And yet if we take a look at the task of interpreting data back into the ideas that they represent, we find that plain text is much to be preferred, since reading plain text is so much easier to automate than reading GIF files (optical character recognition) or PostScript documents (halting problem). In the end, while the source to various TeX or troff documents may correspond closely to the structure of the ideas of the author, and while PostScript allows the author very precise control and tremendous expressive capability, all these documents ultimately capture an image of a document for presentation to the human eye. They don't capture the original information as symbols that can be processed by machine.

To put it another way, rendering ideas in PostScript is not going to help solve the problem of information overload -- it will only compound the situation.

But as recently as my Dec 2008 post on Web Applications security designs, I didn't see the connection between language complexity and privilege, and had little hope of things getting better:

The E system, which is a fascinating model of secure multi-party communication (not to mention lockless concurrency), [...] seems an impossibly high bar to reach, given the worse-is-better tendency in software deployment.

On the other hand, after wrestling with the patchwork of javascript security policies in browsers in the past few weeks, the capability approach in adsafe looks simple and elegant by comparison. Is there any chance we can move the state-of-the-art that far?

After all, who would be crazy enough to essentially throw out all the computing platforms we use and start over?

I've been studying CapROS: The Capability-based Reliable Operating System. Its heritage goes back through EROS in 1999 and KeyKOS in 1988 to GNOSIS in 1979. After a few hours of study, I started to wonder where the pull would come from to provide energy to complete the project. Then this headline crossed my radar:

I saw some comments encouraging them to look at EROS. I hope they do. Meanwhile, Capsicum: practical capabilities for UNIX lets capability approaches co-exist with traditional unix security.

These days, the browser is the biggest threat vector, and Turing-complete data, i.e. mobile code, remains notoriously difficult to secure:

The sort of thing that gives me hope is chromium-capsicum - a version of Google's Chromium web browser that uses capability mode and capabilities to provide effective sandboxing of high-risk web page rendering.

Another is servo, Mozilla's exploration into a new browser architecture built on rust. Rust is a new systems programming language designed toward concerns of “programming in the large”, that is, of creating and maintaining boundaries – both abstract and operational – that preserve large-system integrity, availability and concurrency.

It took me several hours, but the other night I managed to build rust and servo. While servo is clearly in its infancy, passing a few dozen tests but not bearing much resemblance to an actual web browser, rust is starting to feel quite mature.

I'd like to see more of a least-authority approach in the rust standard library. Here's hoping for time to participate.

"In-Home Monitoring in Support of Caregivers for Patients with Dementia" obtains NSF US-Ignite grant

The U.S. National Science Foundation (NSF) awarded us an exploratory research (EAGER) grant for In-Home Monitoring in Support of Caregivers for Patients with Dementia. The investigator team is:

  • Dr. Russ Waitman, Principal Investigator, is Director of Biomedical Informatics at KU Medical Center.
  • Dr. Kristine Williams, Co-Investigator, is Associate Professor of Nursing and Associate Scientist of Gerontology at the University of Kansas.
  • Dr. James Sterbenz, Co-Investigator, is the lead PI of an NSF GENI project: The Great Plains Environment for Network Innovation (GpENI).

This project develops, integrates, and tests advanced video and networking technologies to support family caregivers in managing behavioral symptoms of individuals with dementia, a growing public health problem that adds to caregiver stress, increases morbidity and mortality, and accelerates nursing home placement. The project builds upon a recent University of Kansas Medical Center (KUMC) clinical pilot study that tested the application of video monitoring in the home to support family caregivers of persons with Alzheimer’s disease who exhibited disruptive behaviors. The proposed project focuses on expanding the in-home technological tools available to strengthen the linkage between patients and caregivers with their healthcare team via multi-camera full-motion/high definition video monitoring. Google’s deployment this year of a 1 Gbps fiber network throughout Kansas City provides the ideal environment for measuring the impact that ultra-high speed networking will have on health care.

fig 2 from US Ignite_FINAL_EAGERv_14.docx from Russ 30 Aug 2012

In a January press release, the National Science Foundation (NSF) "announced that it will serve as the lead federal agency for a White House Initiative called US Ignite, which aims to realize the potential of fast, open, next-generation networks."

Our new connection with US Ignite provides access to resources in that community such as Mozilla Ignite and the GENI network lab. If you'd like to get involved, email Dan Connolly and Russ Waitman.