Message-ID: <20180907172312.GO1878@brightrain.aerifal.cx>
Date: Fri, 7 Sep 2018 13:23:12 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: internal header proposal

I'm presently working on moving most or all inline-in-source-file
declarations of internal-use interfaces to header files, so that type
mismatches between points of use and points of declaration can be
caught, and so that I can switch them over to hidden visibility
without having to worry about inconsistent application of visibility,
where the following errors are easy to make:

1. On definition, missing from declaration at site of use: definition
   binds at link time, but caller may generate inefficient code using
   GOT/PLT unnecessarily.

2. On declaration at site of use, missing from definition: depending
   on arch and linker version, linker may produce an error about
   unsatisfiable relocation and refuse to link.

The second is a big problem (regression risk) applying visibility to
internal interfaces, so a good method to preclude it is needed.
Putting the hidden attribute only on the declaration in the headers,
and omitting it everywhere else, should avoid it entirely, and also
avoids the first problem as long as -Wmissing-declarations passes.

Anyway, it turns out we have roughly two distinct types of internal
interfaces:

1. Namespace-safe versions of standard/public interfaces that allow
   parts of one subsystem to be used to implement another in cases
   where the namespace rules would not allow the normal public
   interfaces to be used. This includes things like pthread functions
   used to implement C11 threads or thread-safety in plain-C
   interfaces, __strchrnul, resolv.h functions used in getaddrinfo,
   mman.h functions used in malloc, etc.

2. Interfaces that are private to a particular subsystem. This
   includes things like the timezone functions from __tz.c and related
   files, all the internal stdio and pthread and locale glue, etc.

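[Editor's illustration, not part of the original message.] The rule
stated above, "hidden attribute only on the declaration in the
headers", can be sketched for a type-1 interface roughly as follows.
This is a minimal sketch under assumptions: the `hidden` macro is
spelled here with the GCC/Clang visibility attribute, and
`__demo_strchrnul` is a hypothetical stand-in name, not musl's actual
declaration or implementation.

```c
#include <stddef.h>

/* Assumption: GCC/Clang-style attribute; musl's real macro may be
   defined differently in its internal headers. */
#define hidden __attribute__((__visibility__("hidden")))

/* "Header" part: the only place the attribute is written.
   __demo_strchrnul is a hypothetical name used for illustration. */
hidden char *__demo_strchrnul(const char *s, int c);

/* "Source file" part: no attribute on the definition. Because the
   declaration above is in scope, the definition picks up hidden
   visibility from it. Without that declaration in scope,
   -Wmissing-declarations would warn here. */
char *__demo_strchrnul(const char *s, int c)
{
    unsigned char ch = (unsigned char)c;

    /* Advance to the first match or to the terminating NUL. */
    while (*s && (unsigned char)*s != ch)
        s++;
    return (char *)s;
}
```

Since the attribute is written in exactly one place, the definition
and every caller that includes the header see the same visibility,
which is what rules out both of the mismatches listed above.
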
The reason I've broken them down into these two categories is that the
latter already have appropriate places to declare them: the
corresponding *_impl.h header files (sometimes named differently) for
their subsystems, but the former don't. Putting the former group in
with the latter would just massively balloon the set of source files
that need to include some *_impl.h header, and thereby obscure which
files are really intended/allowed to poke at internals of a subsystem
vs just needing access to namespace-safe public or semi-public
interfaces from that subsystem.

So, we need a new place to declare the first group, and I have two
possible ways to do it:


Option 1: The big fancy header wrapping

Add a new tree of "wrapper headers" for public headers (let's call it
$(srcdir)/src/include), and -I it before the real public ones
($(srcdir)/include). These new headers include their corresponding
public header (../../include/[self].h) then add anything else that's
supposed to be "public within musl". For example sys/mman.h would have
stuff like:

    hidden void __vm_wait(void);
    hidden void __vm_lock(void);
    hidden void __vm_unlock(void);

    hidden void *__mmap(void *, size_t, int, int, int, off_t);
    hidden int __munmap(void *, size_t);
    hidden void *__mremap(void *, size_t, size_t, int, ...);
    hidden int __madvise(void *, size_t, int);
    hidden int __mprotect(void *, size_t, int);

    hidden const unsigned char *__map_file(const char *, size_t *);

Now, every file that needs to use mman.h functions without violating
namespace can just #include <sys/mman.h> and use the above.

If we wanted, at some point we could even #define the unprefixed names
to remap to the prefixed ones, and only #undef them in the files that
define them, so that everything automatically gets the namespace-safe,
low-call-overhead names.

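[Editor's illustration, not part of the original message.] The
#define/#undef remapping described in the previous paragraph might
look roughly like the sketch below, collapsed into one translation
unit so it can be compiled on its own. All `demo_` names are
hypothetical, the function body is a dummy, and the public name is
defined as a plain forwarding function purely to keep the sketch
portable; none of this is taken from musl's tree.

```c
#include <stddef.h>

/* Assumption: GCC/Clang-style visibility attribute. */
#define hidden __attribute__((__visibility__("hidden")))

/* Stand-in for the real public header: the unprefixed name. */
int demo_munmap(void *addr, size_t len);

/* Stand-in for the wrapper header: adds the hidden, namespace-safe
   name and remaps the familiar spelling onto it. */
hidden int __demo_munmap(void *addr, size_t len);
#define demo_munmap __demo_munmap

/* Stand-in for an internal caller: written with the familiar name,
   but after preprocessing it calls __demo_munmap. */
int demo_release(void *addr, size_t len)
{
    return demo_munmap(addr, len);
}

/* Stand-in for the defining file: it drops the remapping so that it
   can define both symbols under their real names. */
#undef demo_munmap

int __demo_munmap(void *addr, size_t len)
{
    (void)addr;
    return len ? 0 : -1; /* dummy behaviour for the sketch */
}

int demo_munmap(void *addr, size_t len)
{
    return __demo_munmap(addr, len);
}
```

Only the defining file needs to know about the remapping; every other
internal caller keeps the familiar spelling and never references the
public symbol.
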
This idea is a lot like how syscall()/__syscall() work now -- the musl
source files get programmed with familiar interfaces, and a small
amount of header magic makes them do the right thing rather than
depending on a public namespace violation.

If this all seems too radical, or like it has potential pitfalls we
need to think about before committing to it, I have a less invasive
proposal too:


Option 2: New namespaced.h header

Introduce a single new header that declares all of the namespace-safe
interfaces across all subsystems, with minimal dependencies on other
headers so that it can be included everywhere it's needed with low
cost. Unfortunately some functions need types exposed, but
<sys/types.h> would probably suffice to get just those without pulling
in lots of other stuff.


I think the second option is actually more invasive to the source
tree, in terms of adding #include lines to files. Option 1 has
slightly more hidden complexity, but leads to simplification of the
source, and the complexity does not significantly detract from
readability of the source, in my opinion.

Thoughts on any of this? So far I've been staging commits moving the
subsystem-private internal declarations to appropriate headers (type 2
above), but doing nothing with the namespace-safe versions of public
interfaces (type 1 above). But I'd like to start on them soon too.

When this is all over, I'll be able to add hidden visibility on all of
these, and most of the efficiency lost in having to drop vis.h (see
commit dc2f368e565c37728b0d620380b849c3a1ddd78f) will be regained.
Dynamic linking performance should also slightly increase.

Rich