Message-ID: <20191031131926.GL16318@brightrain.aerifal.cx>
Date: Thu, 31 Oct 2019 09:19:26 -0400
From: Rich Felker <dalias@...c.org>
To: Ruinland ChuanTzu Tsai <ruinland@...estech.com>
Cc: musl@...ts.openwall.com, alankao@...estech.com
Subject: Re: Cross-compiling test flow for libc-test

On Thu, Oct 31, 2019 at 02:04:06PM +0800, Ruinland ChuanTzu Tsai wrote:
> Hi,
> sorry for sending this email out of the blue.
>
> I'm wondering whether there are any (official) guides on how to do
> cross-compiled testing with libc-test?
>
> If I understand the Makefile of libc-test correctly, it compiles the
> test units and then executes them on the _host_ platform right away.
>
> While cross-testing glibc, however, there is a script,
> cross-test-ssh.sh, which can be used with the `test-wrapper` hook to
> execute freshly compiled test units on a hetero-architecture platform
> over ssh.
>
> If there is no such mechanism for libc-test at this time, I am
> willing to develop one.
>
> That said, I'm curious what attitude the maintainers take toward this
> kind of testing flow.

I don't have any sort of automated CI setup I use at present. Generally
I run the tests myself, or ask others for results from running them, on
archs when changes potentially affecting them have been made; I do this
on x86 and sometimes a few other archs, especially before releases in
which any invasive change has been made.

The ability to auto-deploy and run tests across a number of archs would
be nice. I don't think anyone has a sufficient set of physical machines
set up for this, and that's a big barrier to entry, so a better setup
might be launching the tests via qemu system-level emulation, setting up
a virtfs root to pass into each guest and from which to read the output.
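The glibc-style `test-wrapper` idea mentioned above could be carried over in a very small script. The sketch below is a hypothetical illustration only: the `TARGET` variable, the `/tmp` staging path, and the `run_test` name are assumptions for the example, not part of libc-test or glibc.

```shell
#!/bin/sh
# Hypothetical cross-test wrapper in the spirit of glibc's
# cross-test-ssh.sh: copy a freshly built test binary to a target
# board over ssh and run it there. All names here are illustrative.
run_test() {
    if [ -n "$TARGET" ]; then
        prog=$1; shift
        # stage the binary on the target, run it, and propagate status
        scp -q "$prog" "$TARGET:/tmp/${prog##*/}" &&
        ssh "$TARGET" "/tmp/${prog##*/}" "$@"
    else
        # with no TARGET set, just run the command locally; this
        # doubles as a smoke test of the wrapper itself
        "$@"
    fi
}

TARGET=             # force local mode for this demonstration
run_test echo "wrapper smoke test: ok"
```

A real wrapper would also need to forward environment variables and handle timeouts, which is most of what glibc's cross-test-ssh.sh actually does.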
Note: qemu user-level emulation will necessarily fail lots of the tests,
due to problems with how qemu emulates signals (or rather doesn't; it
just passes them through) and how that interacts with thread
cancellation. But system-level emulation should be good, and could even
let you test different setups that exercise different code paths in
musl, like enabling/disabling vdso, or old kernels lacking new syscalls.

One thing I would encourage for any automated/CI-type testing is
layering: rather than trying to make libc-test itself into an
infrastructure-dependent framework, use it as a tool that gets pulled
in and used.

> Aside from cross-testing, I also wonder about the status of test
> reports for releases on the currently supported CPU architectures.
> When I ran libc-test on x86_64, some of the functional and regression
> tests failed.

Here's a list of what I expect to fail. It should probably be turned
into a page on the wiki, or documented somewhere more visible than this
mailing list post:

api/main fails due to some confstr and pathconf item macros we don't
yet define (we were waiting on glibc to assign numbers so we could
align with them).

functional/strptime fails due to unimplemented functionality.

functional/utime may fail due to time_t being 32-bit or the kernel
lacking time64 support.

math/* may fail due to very minor status-flag or precision issues.

musl/pleval fails in the dynamic-linked version only, because it
references an internal-use symbol that is hidden in modern musl; the
static version successfully gets the symbol.

regression/malloc-brk-fail fails conditionally on kernel behavior; the
failure is not a problem in musl or the kernel, but rather in the test
code's ability to set up the VM-space state needed to perform the
actual test, which is hard to do.

Otherwise, any failures are unexpected, I think.

> Is there a validating rule (e.g. functional/xxx and regression/yyy
> must pass) for code check-in which I can enforce locally before
> submitting patches here?
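One way such a local rule could be mechanized is to diff libc-test's failure report against the expected-failure list above. This is only a sketch: the `REPORT` file name and the `FAIL <path>` line format are assumptions about a typical libc-test run, not a documented interface, and the sample report here stands in for real test output.

```shell
#!/bin/sh
# Hypothetical local gate: fail only on *unexpected* libc-test failures.
# The "FAIL <path> [status N]" line format is an assumption.

# expected failures, taken from the list above (substring patterns)
cat > expected.txt <<'EOF'
src/api/main
src/functional/strptime
src/functional/utime
src/math/
src/musl/pleval
src/regression/malloc-brk-fail
EOF

# sample report standing in for libc-test's real output
cat > REPORT.sample <<'EOF'
FAIL src/api/main [status 1]
FAIL src/functional/strptime [status 1]
EOF

# keep only FAIL lines not covered by an expected pattern
unexpected=$(grep '^FAIL' REPORT.sample | grep -vFf expected.txt)
if [ -n "$unexpected" ]; then
    echo "unexpected failures:"
    echo "$unexpected"
    exit 1
fi
echo "only expected failures"
```

Since both sample failures are on the expected list, the sketch prints "only expected failures" and exits 0; any other FAIL line would make it exit nonzero.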
You could turn the above into a rule, but for most things the coverage
is not sufficient to tell you that a change is likely valid. Since the
musl source is not highly coupled, generally you'll at best get an
indication of a problem from test files that test the specific
component you modified, or where the test setup itself depends on
functionality you modified.

At present I think the tests are most valuable for:

1. preparing ports to new archs, where errors in bits headers or other
arch-specific files are often caught by something not working.

2. documenting conformance subtleties and ways to detect them, to
avoid regressions if the relevant component is modified, and to expose
related bugs in other libc implementations and get them fixed.

But if you're working on (modifying, or just reading) a component that
doesn't seem to have test coverage, writing and submitting tests would
be very helpful.

Rich