Message-ID: <20211123150537.GZ7074@brightrain.aerifal.cx>
Date: Tue, 23 Nov 2021 10:05:37 -0500
From: Rich Felker <dalias@...c.org>
To: "Nieminen, Jussi" <Jussi.Nieminen@...atrace.com>
Cc: "musl@...ts.openwall.com" <musl@...ts.openwall.com>
Subject: Re: Bug in getaddrinfo causing spurious returns with wrong error values

On Tue, Nov 23, 2021 at 02:47:49PM +0000, Nieminen, Jussi wrote:
> Hi,
>
> I'm a developer from the performance monitoring company Dynatrace, and I've been
> recently investigating curious problems at our customers' environments where a
> call to musl's getaddrinfo appears to spuriously return ENOENT when called from
> a node.js application that is being monitored with the Dynatrace agent.
>
> I managed to pinpoint the problem to the code that performs the AI_ADDRCONFIG
> check. If an address family that is not enabled on the host is specified, a call
> to "connect" in that code fails, the socket fd is closed, and the value of
> "errno" is then evaluated.
>
> The problem is that the call to "close" can change the value of errno, which
> will break the switch-case that follows it. Especially if aio is used (which is
> the case when the Dynatrace agent is included in the application), the call to
> close will end up setting errno to ENOENT by default (even without a failure)
> within the "aio_cancel" function if an aio operation is active. In such a case
> getaddrinfo will then incorrectly return EAI_SYSTEM with errno set to ENOENT.
>
> (After some error code translations within libuv, node.js will then print an
> error message claiming that getaddrinfo failed with ENOENT which is rather
> confusing.)

Indeed, this all makes sense.

> Even if aio is not used, the code might fail whenever "close" gets interrupted
> and returns with errno set to EINTR. As the return value of close is not
> checked, the errno might thus "silently" change before getting evaluated with
> the assumption that it still contains the value set when "connect" failed.

close can't return with EINTR but can return with EINPROGRESS which
would give the same result here, I think.

> Below is a simple patch that should take care of this problem. Let me know if I
> can provide any more information or if there is anything else I can help with.

I think this patch is probably okay. I wondered if close was in the
set of functions POSIX-future intends to require not to touch errno on
success, but it doesn't seem to be, and moreover the EINPROGRESS
semantics would undermine that anyway. So saving errno before calling
close is probably the right thing to do here.

> Thanks,
> Jussi

Thanks for the clear analysis and patch!

> -------------------------------------------------------------------------------
> diff --git a/src/network/getaddrinfo.c b/src/network/getaddrinfo.c
> index efaab306..71809856 100644
> --- a/src/network/getaddrinfo.c
> +++ b/src/network/getaddrinfo.c
> @@ -16,6 +16,7 @@ int getaddrinfo(const char *restrict host, const char *restrict serv, const stru
>  	char canon[256], *outcanon;
>  	int nservs, naddrs, nais, canon_len, i, j, k;
>  	int family = AF_UNSPEC, flags = 0, proto = 0, socktype = 0;
> +	int saved_errno = 0;
>  	struct aibuf *out;
>
>  	if (!host && !serv) return EAI_NONAME;
> @@ -66,11 +67,14 @@ int getaddrinfo(const char *restrict host, const char *restrict serv, const stru
>  				pthread_setcancelstate(
>  					PTHREAD_CANCEL_DISABLE, &cs);
>  				int r = connect(s, ta[i], tl[i]);
> +				/* The call to "close" might change errno, especially if aio is in use;
> +				 * save the value set by "connect" for the later comparison. */
> +				if (r < 0) saved_errno = errno;
>  				pthread_setcancelstate(cs, 0);
>  				close(s);
>  				if (!r) continue;
>  			}
> -			switch (errno) {
> +			switch (saved_errno) {
>  			case EADDRNOTAVAIL:
>  			case EAFNOSUPPORT:
>  			case EHOSTUNREACH:
> -------------------------------------------------------------------------------
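
For illustration only, the general idea of saving errno before close can be
sketched as a standalone helper. This is a minimal sketch and not the patch
above or anything in musl: the name close_preserving_errno is hypothetical.
Unlike the quoted patch, it restores errno after close instead of switching
on a separate variable.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper (not part of musl): close fd without letting
 * close() disturb the errno value left by an earlier failed call. */
static int close_preserving_errno(int fd)
{
	int saved_errno = errno; /* value set earlier, e.g. by a failed connect() */
	int r = close(fd);       /* may overwrite errno */
	errno = saved_errno;     /* restore it for a later switch (errno) */
	return r;
}
```

A caller that later inspects errno from a failed connect() could call this
in place of close(s). The return value of close is still available to the
caller if it wants to check it.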