Date: Tue, 7 Oct 2014 16:40:22 +0000
From: "Mehaffey, John" <>
To: "" <>
CC: "" <>
Subject: Separating code and data

> From: Tim []
> Sent: Tuesday, October 07, 2014 8:23 AM
> To:
> Cc: Hanno Böck
> Subject: Re: [oss-security] Thoughts on Shellshock and beyond
> > > What class of bug is Shellshock? "Weird feature invented in
> > > pre-Internet era"? How do you conquer this class of bugs?
> >
> > I am still struggling with this one.  I am trying to create that list here:
> >
> >
> > But to be honest, that list is pretty pathetic. This is a challenging class of vulnerability to detect or prevent ahead of time. Ideas would be very welcome.
> I wouldn't go so far as to say shellshock has a well-defined "class"
> of vulnerability or bucket that we can stick it in, but it does
> violate one of my own personal (and I think, the most important)
> _principles_ of secure software design:  don't mix code and data.
> What do I mean by that?  Concrete examples of failures:
>   * word docs with macros
>   * document markup with embedded script (yes: HTML/JS)
>   * OGNL expressions in Struts URL parameters
> Any time you design a system to accept executable code as well as data
> in the same format/context/whatever, you invite a huge number of
> possible attacks.  These attacks may not manifest themselves
> immediately or obviously.  It may require a change in the way the
> software is used, or implementation bugs to expose the risk, but it
> is a highly risky design approach.
> People expect office documents to be data, but in fact they can
> include a limited form of code as well.  In the case of word docs and
> macros, the risk was exposed by implementation bugs and the difficulty
> of keeping the language sandboxed.
> In the case of HTML/JS, the risk came from the way JS is embedded
> inline in so many locations people can't safely allow HTML (a data
> markup format) without allowing JS as well.  (If JS were only allowed
> as external resources and not as, say, events embedded in attributes,
> it would be less mixed and easier to make safe).
> In Apache Struts, OGNL is used to parse the entire POST body,
> variable names and values.  However, OGNL expressions are executable
> code, which breaks the whole assumption that POST variables are data.
> So the Struts team is now playing whack-a-mole with blacklist blocking
> of specific attack vectors.
> In the case of shellshock, the "mixing" of code and data came about
> because environment variables, normally used to carry data, were
> overloaded and used to carry code.  This is very similar to the Struts
> case.
> David: your item "Create namespaces where practicable" is effectively
> an implementation of what I'm talking about here.  By creating
> namespaces, you're creating a partition between code and data.  But
> the underlying principle is just to keep these two things separate and
> *well defined* as separate via whatever mechanism makes the most sense.
> Cheers,
> tim

I think that separating code and data belongs on David's list of "Most Important
Software Innovations", although
arguably the "Separating Text Content from Format" innovation is an example
of the class.

From allowing better cache locality (modern architectures now have both an
i-cache and a d-cache) to the security improvements mentioned above, it is a
software concept that has paid many dividends over the years.
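
To make the principle concrete, here is a minimal Python sketch of binding
request parameters strictly as data, in the spirit of Tim's Struts example.
It is my own illustration, not code from Struts, bash, or any real framework:
the names ALLOWED_FIELDS and bind_params are hypothetical, and a real
application would need more validation than this.

```python
from typing import Callable, Dict

# Hypothetical allowlist: the only parameter names the application accepts,
# each paired with a plain converter. Nothing here is evaluated as code.
ALLOWED_FIELDS: Dict[str, Callable[[str], object]] = {
    "name": str,
    "age": int,
}


def bind_params(params: Dict[str, str]) -> Dict[str, object]:
    """Bind request parameters strictly as data."""
    bound: Dict[str, object] = {}
    for key, raw in params.items():
        convert = ALLOWED_FIELDS.get(key)
        if convert is None:
            # Unknown names are dropped; they are never parsed as expressions.
            continue
        # A malformed value raises ValueError instead of being interpreted.
        bound[key] = convert(raw)
    return bound


if __name__ == "__main__":
    request = {
        "name": "__import__('os').system('id')",  # stays an inert string
        "age": "30",
        "user.role": "admin",  # expression-like name, ignored
    }
    print(bind_params(request))
```

The point of the design is that parameter names are only ever looked up in a
fixed table and values are only ever passed to a converter, so there is no
path by which attacker-supplied text gets interpreted as an expression.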

John Mehaffey
Linux System Architect
Mentor Graphics
