Message-ID: <ebd3d249937ca5b34ac8aac04eb4158c@ispras.ru>
Date: Wed, 25 Nov 2020 08:40:02 +0300
From: Alexey Izbyshev <izbyshev@...ras.ru>
To: musl@...ts.openwall.com
Subject: Re: realpath without procfs -- should be ready for inclusion

On 2020-11-24 23:31, Rich Felker wrote:
> On Mon, Nov 23, 2020 at 11:26:46PM -0500, Rich Felker wrote:
>> On Tue, Nov 24, 2020 at 06:39:59AM +0300, Alexey Izbyshev wrote:
>> > * ENOTDIR should be returned if the last component is not a
>> > directory and the path has one or more trailing slashes
>>
>> Yes, that's precisely what I've been working on the past couple hours.
>> I think you missed but .. will also erase a path component that's not
>> a dir (e.g. /dev/null/.. -> /dev) and these are both instances of a
>> common problem. I thought use of readlink covered all the ENOTDIR
>> cases but it doesn't when the next component isn't covered by readlink
>> or isn't present at all.
>>
>> It's trivial to fix with a check after each component but that doubles
>> the number of syscalls and mostly isn't necessary. I have a reworked
>> draft to fix the problem by advancing over /(/|./|.$)* rather than
>> just /+ after each component, so that we can lookahead and do an extra
>> readlink in the cases that need it.
>
> While this worked, it ended up being the wrong thing to do, making two
> places where readlink is called, one of them with a dummy buffer. The
> right way to do it is rework the flow so that the existing readlink is
> "naturally" hit where needed. This amounts to:
>
> - Letting .. processing that cancels path components go through the
>   same code path as new path components, rather than handling it
>   early, and just skipping the actual readlink if we already know we
>   have a dir.
>
> - Also treating a zero-length final component as something that goes
>   through the readlink code path.
>
> There was a fair amount of reorganizing needed to make this work out,
> but the end result is clean and non-redundant and code size is almost
> the same as before with the missing-ENOTDIR bugs.
>
> Speaking of code size, on 32-bit archs the proposed explicit realpath
> is roughly the same size as stat+fstat+fstatat (a little over 1k on
> i386), which were needed to implement the old lazy realpath in terms
> of procfs. So for minimal static linking, resulting code size may be
> same or smaller. (Of course it's larger if stat is already linked for
> other reasons.)
>
> New draft attached. It's possible that there are regressions since I
> haven't put together an automated testset. I'm not sure if I'll try to
> merge it in this release cycle still or not; that probably depends on
> how easy or difficult automating these tests ends up being.
>
The new draft looks good to me. I've also done some basic manual
testing (not covering all proposed cases) and haven't found any issues.

I don't see why the size of stack has to be PATH_MAX+1 though. To
address the issue with symlink targets of PATH_MAX-1 length, it seems
sufficient to just do the following:

-	ssize_t k = readlink(output, stack, p);
-	if (k==p) goto toolong;
+	ssize_t k = readlink(output, stack, p+1);
+	if (k==p+1) goto toolong;

Since p is never past the end of the stack, there is no harm in
allowing k == p.

Alexey
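
For illustration, the effect of the one-byte change on the longest accepted
link target can be sketched outside musl roughly as follows. This is a
minimal sketch and not the draft's code: read_link_checked and try_target
are hypothetical helpers, the /tmp location is an assumption, and the
argument "offered" stands in for the third argument passed to readlink
(p in the old lines, p+1 in the new ones). readlink truncates silently, so
a result that fills all the space offered has to be treated as too long.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not musl code): read the target of `path` into
 * `buf`, offering `cap` bytes to readlink. A result that fills all `cap`
 * bytes may have been truncated, so it is reported as ENAMETOOLONG.
 * On success the target is NUL-terminated and its length is returned. */
static ssize_t read_link_checked(const char *path, char *buf, size_t cap)
{
    if (cap == 0) { errno = EINVAL; return -1; }
    ssize_t k = readlink(path, buf, cap);
    if (k < 0) return -1;
    if ((size_t)k == cap) { errno = ENAMETOOLONG; return -1; }
    buf[k] = 0; /* safe: k < cap here */
    return k;
}

/* Hypothetical demo: create a symlink whose target is `len` bytes long
 * (len >= 1) in a fresh temporary directory, read it back offering
 * `offered` bytes, then clean up. Returns the helper's result, or -2 if
 * the setup itself failed. */
static ssize_t try_target(size_t len, size_t offered)
{
    char dir[] = "/tmp/rlchkXXXXXX";
    char lnk[64], target[256], buf[256];

    if (len == 0 || len >= sizeof target || offered > sizeof buf)
        return -2;
    if (!mkdtemp(dir)) return -2;

    memset(target, 'a', len);
    target[len] = 0;
    snprintf(lnk, sizeof lnk, "%s/l", dir);
    if (symlink(target, lnk)) { rmdir(dir); return -2; }

    ssize_t k = read_link_checked(lnk, buf, offered);
    int saved = errno; /* keep the helper's errno across cleanup */
    unlink(lnk);
    rmdir(dir);
    errno = saved;
    return k;
}
```

With a 10-byte target, try_target(10, 10) fails with ENAMETOOLONG, while
try_target(10, 11) succeeds. Offering one more byte therefore admits
targets exactly one byte longer, which is the k == p case the proposed
change allows.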