Message-ID: <20200908192746.GA7854@voyager>
Date: Tue, 8 Sep 2020 21:27:46 +0200
From: Markus Wichmann <nullplan@....net>
To: musl@...ts.openwall.com
Subject: Re: realpath without procfs

On Tue, Sep 08, 2020 at 01:19:04PM -0400, Rich Felker wrote:
> Since it was raised yet again on #musl, I took some time to research
> justification for, and ease of implementing, realpath without procfs.
>

I do remember dietlibc's implementation of realpath(). But that has
serious side effects that make it not thread-safe. The basic idea they
had was to use chdir() and getcwd() to get the kernel to normalize the
paths without having to read them from procfs. Not needing procfs was
one of the design goals of that project, so that is why they implemented
it that way.

Unfortunately, in some cases chdir() is irreversible (e.g. deleted
working directory), and also, there is only one working directory per
process, so while this is going on, all other threads will have trouble
finding their files. Adding locking to prevent the other threads from
noticing this would be challenging, to say the least, if not outright
impossible. There are just so many places where the working directory
plays a role.

Oh, and one more side effect: While the working directory is switched
elsewhere, another process may unmount the volume containing the
original directory. You could open "." first, to prevent this, but that
adds another two syscalls of overhead.

> - ttyname (important to things that use it)
>

I don't see much of an alternative to using procfs for that one. You
could probably search for device and inode of the fd among /dev/tty* and
/dev/pts/* but that seems like a hack. That should probably be at most a
fallback, if the normal way through /proc doesn't work.

> - dynamic linker identifying executable pathname
>

Well, Linux could just pass AT_EXECFN. But if it doesn't, unless they
want to add Solaris' getexecname() syscall, /proc/self/exe is the only
link to the executable file name.

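For reference, the chdir()/getcwd() idea described at the top of this
reply, including the open(".") safeguard, might look roughly like the
sketch below. This is not dietlibc's actual code; the function name
resolve_dir_via_chdir and its return convention are made up for
illustration, and it only handles paths that name a directory.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (an illustration, not dietlibc's implementation):
 * resolve a directory path by entering it and asking the kernel for the
 * resulting working directory. Not thread-safe, because the working
 * directory is shared by every thread in the process.
 * Returns 0 on success, -1 on failure.
 */
int resolve_dir_via_chdir(const char *dir, char *out, size_t outlen)
{
	/* Hold the original directory open so that fchdir() can return
	 * to it later and the volume stays busy while we are away.
	 * This fails if the current directory is not readable. */
	int saved = open(".", O_RDONLY | O_DIRECTORY);
	if (saved < 0) return -1;

	int rc = -1;
	if (chdir(dir) == 0 && getcwd(out, outlen))
		rc = 0;

	/* Always try to go back; not getting back is also a failure. */
	if (fchdir(saved) != 0) rc = -1;
	close(saved);
	return rc;
}
```

The open() and fchdir()/close() calls are the extra overhead mentioned
above, and between chdir() and fchdir() every other thread still sees
the wrong working directory.
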
> This is actually a lot less than I expected, and makes it reasonable
> to envision a path to eventually not needing procfs at all.
>
> So, I did the work to figure out what would be needed to write a
> procfs-free realpath, and it turns out that actually writing it was
> not any harder, so I did. Attached is a draft. It needs testing, and
> I'm undecided whether we should keep the old code and just use this as
> a fallback, or just replace it. (The old code has fixed 5-syscall
> overhead and ugly side effects on kernels too old to have O_PATH; new
> code needs one syscall per path component and might (?) have worse or
> different behavior under concurrent changes to the dir tree.)
>
> Some notes:
>
> - Attempts to support all pathnames where no intermediate exceeds
>   PATH_MAX.
>
> - Initial // is treated as special, but //. and //.. resolve to /
>
> - getcwd is expanded initially if pathname is relative. This might be
>   a bad choice since it causes failure whenever pwd is not
>   representable even if the symlink reached via a relative pathname
>   would take us to an absolute path that is representable.

I just checked, and glibc does the same thing. So at least you are in
good company with being unable to handle unreachable working directories
in realpath().

>   We could
>   accumulate a relative path, including preserving .. components,
>   until the first absolute-target symlink, and only apply it by
>   prepending (and cancelling ..) at the end if no absolute-target
>   symlink was encountered, but that requires some rework to do.
>
> Thoughts?
>
> Rich

> #define _GNU_SOURCE
> #include <stdlib.h>
> #include <limits.h>
> #include <errno.h>
> #include <unistd.h>
> #include <string.h>
>
> char *realpath(const char *restrict filename, char *restrict resolved)
> {
> 	char output[PATH_MAX], stack[PATH_MAX];
> 	size_t p, q, l, cnt=0;
>
> 	l = strlen(filename);
> 	if (l > sizeof stack) goto toolong;

Shouldn't that be strnlen(), then?

> 	p = sizeof stack - l - 1;
> 	memcpy(stack+p, filename, l+1);
>
> 	if (stack[p] != '/') {
> 		if (getcwd(output, sizeof output) < 0) return 0;
> 		q = strlen(output);
> 	} else {
> 		q = 0;
> 	}
>
> 	while (stack[p]) {
> 		if (stack[p] == '/') {
> 			q=0;
> 			p++;
> 			/* Initial // is special. */
> 			if (stack[p] == '/' && stack[p+1] != '/') {

You already incremented p here. Did you want to test for "///"? The
comment indicated otherwise.

> 				output[q++] = '/';
> 			}
> 			while (stack[p] == '/') p++;
> 		}
> 		char *z = __strchrnul(stack+p, '/');
> 		l = z-(stack+p);
> 		if (l<=2 && stack[p]=='.' && stack[p+l-1]=='.') {
> 			if (l==2) {
> 				while(q>1 && output[q-1]!='/') q--;
> 				if (q>1) q--;
> 			}
> 			p += l;
> 			while (stack[p] == '/') p++;
> 			continue;
> 		}
> 		if (l==1 && stack[p]=='.')
> 		if (l+2 > sizeof output - q) goto toolong;

I believe you forgot to finish the first "if" line here. Also, you have
already handled the "." path at this point.

> 		output[q] = '/';
> 		memcpy(output+q+1, stack+p, l);
> 		output[q+1+l] = 0;
> 		p += l;
> 		ssize_t k = readlink(output, stack, p);
> 		if (k==-1) {
> 			if (errno == EINVAL) {
> 				q += 1+l;
> 				while (stack[p] == '/') p++;
> 				continue;
> 			}
> 			return 0;
> 		}
> 		if (k==p) goto toolong;
> 		if (++cnt == SYMLOOP_MAX) {
> 			errno = ELOOP;
> 			return 0;
> 		}
> 		p -= k;
> 		memmove(stack+p, stack, k);
> 	}
> 	if (!q) output[q++] = '/';
> 	output[q] = 0;
> 	return resolved ? strcpy(resolved, output) : strdup(output);
>
> toolong:
> 	errno = ENAMETOOLONG;
> 	return 0;
> }

Ciao,
Markus
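
P.S. On the strnlen() question raised against the draft's length check:
a bounded check could look something like the sketch below. The helper
name fits_with_nul is made up for illustration. The strict "<"
comparison is an assumption about the intended bound (the string plus
its terminating NUL must fit in the buffer), not something taken from
the draft.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: report whether `s`, including its terminating
 * NUL, fits in a buffer of `cap` bytes. strnlen() stops scanning after
 * `cap` bytes, so an over-long input costs at most `cap` reads instead
 * of a walk over the whole string.
 */
int fits_with_nul(const char *s, size_t cap)
{
	return strnlen(s, cap) < cap;
}
```

With char stack[PATH_MAX], a caller would bail out with ENAMETOOLONG
whenever fits_with_nul(filename, sizeof stack) returns 0.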