Message-ID: <20131213172807.GD24286@brightrain.aerifal.cx>
Date: Fri, 13 Dec 2013 12:28:07 -0500
From: Rich Felker <dalias@...ifal.cx>
To: musl@...ts.openwall.com
Subject: Re: validation of utf-8 strings passed as system call
 arguments

On Fri, Dec 13, 2013 at 05:52:35AM -0700, writeonce@...ipix.org wrote:
>    As always, you are absolutely right :-)  My situation is slightly
>    different, though: the input I receive is expected to be in utf-8, but the
>    nt kernel only accepts utf-16.  This means that I need to choose between
>    a conversion that is based on bit distribution only, which might produce
>    ill-formed utf-16 sequences, and doing all the validation on my end
>    despite the minor performance penalty.  Since path strings are normally
>    only a few hundred bytes long, and given that the nt kernel cannot be
>    (easily) debugged from my end, I'm leaning towards the latter option.

There's no way to convert between UTF-8 and UTF-16 without
parsing/decoding the UTF-8, which includes validating it for free if
your parser is written properly. Failure to validate would lead to all
sorts of bugs, many of them dangerous, including things like treating
strings not containing '/', '\', ':', '.', etc. as if they contained
those characters, resulting in directory escape vulnerabilities.
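
As a rough illustration of the point above, here is a minimal sketch in C of
a UTF-8 to UTF-16 converter in which validation falls out of the decoding
step. The function name utf8_to_utf16 and its interface are hypothetical,
made up for this example; they are not taken from musl or from any NT API.
The sketch rejects overlong forms, so the two-byte sequence 0xC0 0xAF, which
a decoder that only shuffles bits would turn into '/', is reported as an
error instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration only.
 * Decodes len bytes of UTF-8 from src into UTF-16 code units in dst,
 * which has room for cap units. Returns the number of units written,
 * or -1 if the input is ill-formed or dst is too small. */
long utf8_to_utf16(const unsigned char *src, size_t len,
                   uint16_t *dst, size_t cap)
{
    size_t i = 0, n = 0;

    while (i < len) {
        uint32_t c = src[i++];
        uint32_t min; /* smallest code point allowed for this length */
        int need;     /* number of continuation bytes expected */

        if (c < 0x80) { need = 0; min = 0; }
        else if (c >= 0xC2 && c <= 0xDF) { need = 1; min = 0x80; c &= 0x1F; }
        else if ((c & 0xF0) == 0xE0) { need = 2; min = 0x800; c &= 0x0F; }
        else if (c >= 0xF0 && c <= 0xF4) { need = 3; min = 0x10000; c &= 0x07; }
        else return -1; /* stray continuation, 0xC0, 0xC1, or 0xF5..0xFF */

        if ((size_t)need > len - i) return -1; /* truncated sequence */
        for (; need > 0; need--) {
            unsigned char b = src[i++];
            if ((b & 0xC0) != 0x80) return -1; /* not a continuation byte */
            c = (c << 6) | (b & 0x3F);
        }

        if (c < min) return -1;                    /* overlong encoding */
        if (c >= 0xD800 && c <= 0xDFFF) return -1; /* encoded surrogate */
        if (c > 0x10FFFF) return -1;               /* beyond Unicode range */

        if (c < 0x10000) {
            if (n + 1 > cap) return -1;
            dst[n++] = (uint16_t)c;
        } else {
            /* Code points above the BMP become a surrogate pair. */
            if (n + 2 > cap) return -1;
            c -= 0x10000;
            dst[n++] = (uint16_t)(0xD800 | (c >> 10));
            dst[n++] = (uint16_t)(0xDC00 | (c & 0x3FF));
        }
    }
    return (long)n;
}
```

The range checks add only a few comparisons per character on top of the bit
manipulation the conversion needs anyway, so for path-length strings the
extra cost should be small.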

Rich
