Message-ID: <20220720015457.GC7074@brightrain.aerifal.cx>
Date: Tue, 19 Jul 2022 21:54:59 -0400
From: Rich Felker <dalias@...c.org>
To: "Nieminen, Jussi" <Jussi.Nieminen@...atrace.com>
Cc: "musl@...ts.openwall.com" <musl@...ts.openwall.com>
Subject: Re: Bug in getaddrinfo causing spurious returns with wrong error values

On Tue, Nov 23, 2021 at 02:47:49PM +0000, Nieminen, Jussi wrote:
> Hi,
>
> I'm a developer from the performance monitoring company Dynatrace, and I've been
> recently investigating curious problems at our customers' environments where a
> call to musl's getaddrinfo appears to spuriously return ENOENT when called from
> a node.js application that is being monitored with the Dynatrace agent.
>
> I managed to pinpoint the problem to the code that performs the AI_ADDRCONFIG
> check. If an address family that is not enabled on the host is specified, a call
> to "connect" in that code fails, the socket fd is closed, and the value of
> "errno" is then evaluated.
>
> The problem is that the call to "close" can change the value of errno, which
> will break the switch-case that follows it. Especially if aio is used (which is
> the case when the Dynatrace agent is included in the application), the call to
> close will end up setting errno to ENOENT by default (even without a failure)
> within the "aio_cancel" function if an aio operation is active. In such a case
> getaddrinfo will then incorrectly return EAI_SYSTEM with errno set to ENOENT.
>
> (After some error code translations within libuv, node.js will then print an
> error message claiming that getaddrinfo failed with ENOENT which is rather
> confusing.)
>
> Even if aio is not used, the code might fail whenever "close" gets interrupted
> and returns with errno set to EINTR. As the return value of close is not
> checked, the errno might thus "silently" change before getting evaluated with
> the assumption that it still contains the value set when "connect" failed.
>
> Below is a simple patch that should take care of this problem. Let me know if I
> can provide any more information or if there is anything else I can help with.
>
> Thanks,
> Jussi
>
>
> -------------------------------------------------------------------------------
> diff --git a/src/network/getaddrinfo.c b/src/network/getaddrinfo.c
> index efaab306..71809856 100644
> --- a/src/network/getaddrinfo.c
> +++ b/src/network/getaddrinfo.c
> @@ -16,6 +16,7 @@ int getaddrinfo(const char *restrict host, const char *restrict serv, const stru
>  	char canon[256], *outcanon;
>  	int nservs, naddrs, nais, canon_len, i, j, k;
>  	int family = AF_UNSPEC, flags = 0, proto = 0, socktype = 0;
> +	int saved_errno = 0;
>  	struct aibuf *out;
>  
>  	if (!host && !serv) return EAI_NONAME;
> @@ -66,11 +67,14 @@ int getaddrinfo(const char *restrict host, const char *restrict serv, const stru
>  				pthread_setcancelstate(
>  					PTHREAD_CANCEL_DISABLE, &cs);
>  				int r = connect(s, ta[i], tl[i]);
> +				/* The call to "close" might change errno, especially if aio is in use;
> +				 * save the value set by "connect" for the later comparison. */
> +				if (r < 0) saved_errno = errno;
>  				pthread_setcancelstate(cs, 0);
>  				close(s);
>  				if (!r) continue;
>  			}
> -			switch (errno) {
> +			switch (saved_errno) {
>  			case EADDRNOTAVAIL:
>  			case EAFNOSUPPORT:
>  			case EHOSTUNREACH:
> -------------------------------------------------------------------------------

A couple minor problems with the patch:

- The errno from socket() is not used if the failure was from
  socket(). I'm not sure yet if that matters but I think it may if
  IPv6 was disabled in a way that makes socket() fail.

- In the case where EAI_SYSTEM is returned, the error was not restored
  back into errno, so the caller cannot get the cause of error if it
  was clobbered by close.

I'll work on a fixed version. I think the right thing to do is just
save/restore errno itself rather than switching on saved_errno.

Rich
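
For illustration only, the save/restore approach could look roughly like the sketch below. This is not the fix that went into musl. The helper names close_keep_errno and probe_connect are hypothetical, and the sketch leaves out the cancellation-state handling and the errno switch found in the real AI_ADDRCONFIG code. It only shows errno surviving close(), for failures from both socket() and connect().

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not part of musl): close fd while keeping
 * whatever errno value the caller already had, even if close()
 * itself fails or sets errno as a side effect. */
static int close_keep_errno(int fd)
{
	int saved = errno;   /* value set by the earlier failing call */
	int r = close(fd);   /* may overwrite errno */
	errno = saved;       /* put the original value back */
	return r;
}

/* Hypothetical probe in the spirit of the AI_ADDRCONFIG check:
 * returns 0 if a datagram socket could be connected to sa, or -1
 * with errno set by socket() or connect(), never by close(). */
static int probe_connect(int family, const struct sockaddr *sa, socklen_t len)
{
	int s = socket(family, SOCK_DGRAM, 0);
	if (s < 0) return -1;            /* errno comes from socket() */
	int r = connect(s, sa, len);     /* on failure, errno comes from connect() */
	close_keep_errno(s);             /* errno is unchanged across close() */
	return r;
}
```

With this shape the caller can keep switching on errno directly, and when it returns EAI_SYSTEM the errno it reports is still the one from the call that actually failed.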