Message-ID: <CAD=6tA4384iEO10x0kHEbc3oSKVag=ekKdh2yHWLxb=PBWO-LA@mail.gmail.com>
Date: Mon, 23 Mar 2026 15:15:34 +0100
From: Simon Resch <simon.resch@...e-intelligence.com>
To: musl@...ts.openwall.com
Subject: regexec infinite loop on self-referential backreference pattern

Hi,

I've found that musl's regexec enters an infinite loop on the
self-referential backreference pattern \(^.*\1\)*, even with a two-byte
input. Since the input is only two bytes, this is not exponential blowup
but appears to be a true infinite loop in the backtracking matcher.

The same bug exists in upstream TRE, from which musl's regex implementation
is derived. For comparison, glibc rejects this pattern at regcomp time with
REG_ESUBREG ("Invalid back reference"), since \1 appears inside group 1
itself, before that group has been closed.

Reproducer

  #include <regex.h>
  #include <stdio.h>

  int main(void) {
      regex_t re;

      const char pattern[] = "\\(^.*\\1\\)*";
      const char subject[] = "aa";

      int rc = regcomp(&re, pattern, 0);
      if (rc != 0) {
          char errbuf[256];
          regerror(rc, &re, errbuf, sizeof(errbuf));
          fprintf(stderr, "regcomp failed: %s\n", errbuf);
          return 1;
      }

      printf("regcomp succeeded, calling regexec ...\n");
      fflush(stdout);

      /* On musl 1.2.5 this call never returns. */
      regexec(&re, subject, 0, NULL, 0);

      regfree(&re);
      return 0;
  }

To verify on Alpine (musl 1.2.5), save the program above as reproducer.c
and run:

  docker run --rm -v "$PWD/reproducer.c:/reproducer.c" alpine sh -c \
          'apk add gcc musl-dev && gcc -o /reproducer /reproducer.c &&
           timeout 10 /reproducer; echo "Exit code: $?"'

Output:

  regcomp succeeded, calling regexec ...
  Exit code: 143

Exit code 143 (128 + SIGTERM) shows that timeout killed the process after
10 seconds. I have also let it run for over an hour without it terminating.

Any program that passes user-supplied patterns to regcomp/regexec (grep,
sed, awk, web servers with regex-based URL routing, input validators, etc.)
can be made to hang a thread/process permanently with a short pattern and a
trivial input.
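
Until the matcher itself is fixed, a caller that must accept untrusted
patterns could isolate the match in a child process with a watchdog. The
following is only a rough sketch of that idea: match_with_timeout and its
return-code convention are hypothetical names invented for illustration,
and it assumes a POSIX system where fork and alarm are acceptable.

```c
#include <errno.h>
#include <regex.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not part of musl or any library): compile and run
 * a BRE match in a child process that SIGALRM kills after `seconds`.
 * Returns 0 on match, 1 on no match, 2 if regcomp rejects the pattern,
 * and -1 on timeout or any other failure. */
int match_with_timeout(const char *pattern, const char *subject,
                       unsigned seconds)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: default SIGALRM action terminates the process. */
        signal(SIGALRM, SIG_DFL);
        alarm(seconds);

        regex_t re;
        if (regcomp(&re, pattern, 0) != 0)
            _exit(2);
        int rc = regexec(&re, subject, 0, NULL, 0);
        _exit(rc == 0 ? 0 : 1);
    }

    /* Parent: wait for the child, retrying if interrupted by a signal. */
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    return -1; /* child was killed by a signal, e.g. SIGALRM */
}
```

A fork per match is expensive, so this is a stopgap for callers rather than
a substitute for fixing regcomp/regexec.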

Suggested fix: Either reject self-referential backreferences during regcomp
(as glibc does), or add a step limit to the backtracking matcher that
returns REG_ESPACE when exceeded.
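
To illustrate the first option, here is a minimal sketch of a pre-check
that scans a BRE pattern and reports any \N that occurs before group N has
been closed, which covers the self-referential case. It is a standalone
approximation, not how glibc or TRE implement the check internally, and the
name bre_has_unclosed_backref is made up for this example.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEPTH 64

/* Hypothetical helper: returns true if the POSIX BRE `pattern` contains a
 * backreference \N (N = 1..9) at a point where group N has not yet been
 * closed by its \). */
bool bre_has_unclosed_backref(const char *pattern)
{
    bool closed[10] = { false };  /* closed[n]: group n has seen its \) */
    int open_stack[MAX_DEPTH];    /* numbers of the currently open groups */
    int depth = 0;
    int next_group = 1;

    for (const char *p = pattern; *p != '\0'; p++) {
        if (*p == '[') {
            /* Skip a bracket expression; backslash is literal inside. */
            p++;
            if (*p == '^') p++;
            if (*p == ']') p++;   /* a leading ] is a literal */
            while (*p != '\0' && *p != ']') {
                if (*p == '[' && (p[1] == ':' || p[1] == '.' || p[1] == '=')) {
                    /* Skip [:class:], [.coll.] or [=equiv=]. */
                    char kind = p[1];
                    p += 2;
                    while (*p != '\0' && !(*p == kind && p[1] == ']')) p++;
                    if (*p != '\0') p++;  /* now on the inner ] */
                }
                if (*p != '\0') p++;
            }
            if (*p == '\0') break;  /* unterminated bracket expression */
            continue;
        }
        if (*p != '\\')
            continue;
        p++;                        /* look at the escaped character */
        if (*p == '\0')
            break;
        if (*p == '(') {
            if (depth == MAX_DEPTH)
                return false;       /* too deep; leave it to regcomp */
            open_stack[depth++] = next_group++;
        } else if (*p == ')') {
            if (depth > 0) {
                int n = open_stack[--depth];
                if (n <= 9) closed[n] = true;
            }
        } else if (*p >= '1' && *p <= '9') {
            if (!closed[*p - '0'])
                return true;        /* \N before group N was closed */
        }
    }
    return false;
}
```

With this sketch, the pattern from the reproducer is flagged, while an
ordinary backreference such as \(a\)\1 is not.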

Please CC me on replies.

Regards,

-- 

Simon Resch

Senior Software Engineer

simon.resch@...e-intelligence.com



Code Intelligence GmbH

Rheinwerkallee 6 | D-53227 Bonn

www.code-intelligence.com

District Court of Bonn | HRB 23408
Managing Directors: Dr. Khaled Yakdan, Dr. Eric Brüggemann
