Message-ID: <20180907160046.zZvDF%steffen@sdaoden.eu>
Date: Fri, 07 Sep 2018 18:00:46 +0200
From: Steffen Nurpmeso <steffen@...oden.eu>
To: Rich Felker <dalias@...c.org>
Cc: musl@...ts.openwall.com
Subject: Re: Regex: behaviour of ? after () atom

Rich Felker wrote in <20180907153302.GM1878@...ghtrain.aerifal.cx>:
|On Fri, Sep 07, 2018 at 05:25:17PM +0200, Steffen Nurpmeso wrote:
|> Rich Felker wrote in <20180907151821.GL1878@...ghtrain.aerifal.cx>:
|>|On Fri, Sep 07, 2018 at 03:38:05PM +0200, Steffen Nurpmeso wrote:
|>|> Hello.
|>|>
|>|> In perl this is
|>|>
|>|>   $x="print 1 2";
|>|>   if($x =~ /^(:[[:space:]]+)?([^[:space:]]+)(.*)$/){
|>|>      print "<$0> -> <$1> <$2> <$3>\n"
|>|>   }
|>|>
|>|> and the result is
|>|>
|>|>   </tmp/t.pl> -> <> <print> < 1 2>
|>|>
|>|> Now the same on AlpineLinux edge and musl-1.1.19-r10 with the MUA
|>|> i maintain, which uses the normal regex stuff and calls it via
|>|>
|>|>   echo eins=$3
|>|>   vput vexpr i regex "${3}" \
|>|>      '^(:[[:space:]]+)?([^[:space:]]+)(.*)$' \
|>|>      '<\$0> -> <\$1> <\$2> <\$3>'
|>|>   echo i=$i
|>|>
|>|> which in C code does
|>|>
|>|>   if((reflrv = regcomp(&re, argv[2], reflrv))){
|>|>      ...
|>|>      goto jestr;
|>|>   }
|>|>   fprintf(stderr, "GOING for <%s> -> <%s> %u\n",
|>|>      argv[1],argv[2],n_NELEM(rema));
|>|>   reflrv = regexec(&re, argv[1], n_NELEM(rema), rema, 0);
|>|>
|>|> and overall prints
|>|>
|>|>   eins=print 1 2
|>|>   GOING for <print 1 2> -> <^(:[[:space:]]+)?([^[:space:]]+)(.*)$> 17
|>|>   i=<print 1 2> -> <> <> <>
|>|>
|>|> It works correctly if i remove the ()? atom, so i thought i should
|>|> report that.
|>|
|>|What is the value of the flags argument you passed to regcomp?
|>|
|>
|> REG_EXTENDED, optional REG_ICASE:
|>
|>   reflrv = REG_EXTENDED;
|>   if(f & a_ICASE)
|>      reflrv |= REG_ICASE;
|>   if((reflrv = regcomp(&re, argv[2], reflrv))){
|
|OK, it looks like that should work, and seemed to work here when I
|passed the regex to grep -E linked with musl's regex. Can you provide
|a minimal self-contained C program to demonstrate the issue you're
|having?

Happy user that i am, here something for tests/:

  #include <stdio.h>
  #include <regex.h>

  int main(void){
     regmatch_t rema[1 + 21];
     regex_t re;
     int i;

     i = REG_EXTENDED;
     if((i = regcomp(&re, "^(:[[:space:]]+)?([^[:space:]]+)(.*)$", i)))
        return 2;

     i = regexec(&re, "print 1 2", 21, rema, 0);
     regfree(&re);
     if(i == REG_NOMATCH)
        return 3;

     for(i = 1; i < 21 && rema[i].rm_so != -1; ++i)
        ;
     return (i == 3) ? 0 : 4;
  }

i is 1 here.

|BTW which "()?" are you talking about? The whole first parenthesized
|subsexpression and the ? after it? I wouldn't call that an atom, but
|nothing seems wrong with it.

I have read regex(7) first just in case something intellectual had
to be said. Otherwise i am all for Finnish tango.

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)