Message-ID: <CAKzsc6dqT_u9UyQvH1jEqJ_=yzyU+XFhn_6cA_m=decfeuMpPA@mail.gmail.com>
Date: Wed, 15 Jun 2022 08:30:02 +0200
From: Hans Harder <hans@...as.org>
To: Rich Felker <dalias@...c.org>
Cc: musl@...ts.openwall.com
Subject: Re: hostname is using a case sensitive search in function name_from_hosts

I don't have much experience in making patches, but do you mean something
like this patch?

It uses strtok and only does a case-insensitive comparison when the
token has the same length as name.
It also simplifies the remaining code.

Hans

diff -u a/src/network/lookup_name.c b/src/network/lookup_name.c
--- a/src/network/lookup_name.c 2022-04-07 17:12:40.000000000 +0000
+++ b/src/network/lookup_name.c 2022-06-15 06:15:46.680000000 +0000
@@ -6,6 +6,7 @@
 #include <ctype.h>
 #include <stdlib.h>
 #include <string.h>
+#include <strings.h>
 #include <fcntl.h>
 #include <unistd.h>
 #include <pthread.h>
@@ -49,6 +50,7 @@
 static int name_from_hosts(struct address buf[static MAXADDRS], char canon[static 256], const char *name, int family)
 {
    char line[512];
+   char sep[4] = " \t\n";
    size_t l = strlen(name);
    int cnt = 0, badfam = 0, have_canon = 0;
    unsigned char _buf[1032];
@@ -62,17 +64,22 @@
        return EAI_SYSTEM;
    }
    while (fgets(line, sizeof line, f) && cnt < MAXADDRS) {
-       char *p, *z;
-
+       char *p;
        if ((p=strchr(line, '#'))) *p++='\n', *p=0;
-       for(p=line+1; (p=strstr(p, name)) &&
-           (!isspace(p[-1]) || !isspace(p[l])); p++);
-       if (!p) continue;
+       if (line[0] == 0 || line[0]=='\n') continue;
+       p = strtok(line, sep);
+       while( p != NULL ) {
+           /* only compare case insensitive if length of both are the same */
+           if (strlen(p) == l  && strcasecmp(p,name)==0) {
+               p = strtok(line, sep);
+               break;
+           }
+           p = strtok(NULL, sep);
+       }
+       if (p == NULL) continue;

        /* Isolate IP address to parse */
-       for (p=line; *p && !isspace(*p); p++);
-       *p++ = 0;
-       switch (name_from_numeric(buf+cnt, line, family)) {
+       switch (name_from_numeric(buf+cnt, p, family)) {
        case 1:
            cnt++;
            break;
@@ -86,12 +93,10 @@
        if (have_canon) continue;

        /* Extract first name as canonical name */
-       for (; *p && isspace(*p); p++);
-       for (z=p; *z && !isspace(*z); z++);
-       *z = 0;
-       if (is_valid_hostname(p)) {
+       p = strtok(NULL, sep);
+       if (p != NULL && is_valid_hostname(p)) {
            have_canon = 1;
-           memcpy(canon, p, z-p+1);
+           strcpy(canon, p);
        }
    }
    __fclose_ca(f);
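
For comparison, the token-based matching described above can also be sketched without strtok. Two details of strtok are worth keeping in mind here: it keeps hidden static state, so a use inside libc could disturb a tokenization the caller has in progress, and calling strtok(line, sep) a second time on a buffer that has already been split sees only the first token, so a following strtok(NULL, sep) returns NULL. The sketch below sidesteps both by walking the fields with strspn/strcspn. The helper names next_token and match_hosts_line, their signatures, and the 512-byte buffer are assumptions made for illustration; they are not part of musl or of the patch, and the comparison is still ASCII-only.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <strings.h>

/* Hypothetical helper (not in musl): return the next whitespace-delimited
 * token at or after s and store its length in *len; NULL if none is left. */
static const char *next_token(const char *s, size_t *len)
{
	static const char sep[] = " \t\r\n";
	s += strspn(s, sep);
	*len = strcspn(s, sep);
	return *len ? s : NULL;
}

/* Hypothetical helper (not in musl): return 1 if name appears as one of the
 * hostname fields of a hosts-file line, compared ASCII case-insensitively.
 * On a match, copy out the address field and the first (canonical) name. */
static int match_hosts_line(const char *line, const char *name,
                            char *addr, size_t addrsz,
                            char *canon, size_t canonsz)
{
	char buf[512];
	size_t l = strlen(name), alen, clen, n;
	const char *a, *c, *t;
	char *hash;

	/* Work on a private copy with any trailing comment cut off. */
	snprintf(buf, sizeof buf, "%s", line);
	if ((hash = strchr(buf, '#'))) *hash = 0;

	/* First field is the address, second is the canonical name. */
	if (!(a = next_token(buf, &alen))) return 0;
	if (!(c = next_token(a + alen, &clen))) return 0;
	if (alen >= addrsz || clen >= canonsz) return 0;

	/* Scan the hostname fields only; compare when the lengths agree. */
	for (t = c, n = clen; t; t = next_token(t + n, &n)) {
		if (n == l && !strncasecmp(t, name, l)) {
			memcpy(addr, a, alen);
			addr[alen] = 0;
			memcpy(canon, c, clen);
			canon[clen] = 0;
			return 1;
		}
	}
	return 0;
}
```

Because the buffer is never split in place, the address and the canonical name can still be read after a match on a later alias, and no global tokenizer state is touched.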



On Thu, Jun 9, 2022 at 10:34 PM Rich Felker <dalias@...c.org> wrote:
>
> On Thu, Jun 09, 2022 at 08:42:28PM +0200, Hans Harder wrote:
> > Hi,
> > I discovered that the function name_from_hosts parses the /etc/hosts
> > file and does a case sensitive search for a name.
> > Sometimes I encounter mixed upper and lowercase hostnames in a /etc/hosts file.
> > It would be easier if the function searches for the name in a case
> > insensitive way....
> >
> > By changing line 68  in src/network/lookup_name.c
> >        for(p=line+1; (p=strstr(p, name)) &&
> > to:
> >        for(p=line+1; (p=strcasestr(p, name)) &&
> >
> > That would resolve the problem.
>
> strcasestr isn't a good match here, because it's quadratic time and
> would be potentially quite slow (depending on file contents). It's
> also not in a usable namespace, and is something of a junk function we
> included for questionable reasons.
>
> The core problem here is that strstr isn't really the right operation
> to be using, and was something of a lazy hack. Due to the linear-time
> implementation it doesn't hurt, but it would make a lot more sense to
> parse this right looking at separators. Even then though it's some
> work to make it properly case-insensitive; strcasecmp is insufficient
> and only handles single-byte characters. So the right thing to do is
> really picking up review and merge of the draft IDN handling work,
> which (if I'm remembering right) normalizes case as an inherent part
> of the process.
>
> Rich
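
The limitation Rich mentions, that strcasecmp only folds single-byte characters, can be illustrated with a small sketch. The helper name names_equal_ci is hypothetical, and the behaviour described assumes the default "C" locale and UTF-8 encoded names: ASCII letters compare equal regardless of case, while the two-byte UTF-8 encodings of "É" (0xC3 0x89) and "é" (0xC3 0xA9) do not.

```c
#include <strings.h>

/* Hypothetical helper: 1 if strcasecmp considers the two names equal.
 * In the "C" locale only the ASCII letters A-Z/a-z are case-folded. */
static int names_equal_ci(const char *a, const char *b)
{
	return strcasecmp(a, b) == 0;
}
```

Under these assumptions, names_equal_ci("EXAMPLE.com", "example.COM") is true, while a pair of names that differ only in the case of a non-ASCII letter is reported as different, which is why full case-insensitive matching needs the kind of normalization done in IDN handling.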
