oss-security - Re: CPython: CVE-2024-8088: Infinite loop when iterating over zip archive entry names

Follow @Openwall on Twitter for new release announcements and other news

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <Zsis6Gx2qq_laiC7@nihonium>
Date: Fri, 23 Aug 2024 17:38:16 +0200
From: Fay Stegerman <flx@...usk.net>
To: oss-security@...ts.openwall.com
Subject: Re: CPython: CVE-2024-8088: Infinite loop when
 iterating over zip archive entry names

* Fay Stegerman <flx@...usk.net> [2024-08-22 23:12]:
> * Alan Coopersmith <alan.coopersmith@...cle.com> [2024-08-22 20:56]:
> > -------- Forwarded Message --------
> > Subject: 	[Security-announce][CVE-2024-8088] Infinite loop when iterating
> > over zip archive entry names
> > Date: 	Thu, 22 Aug 2024 13:40:20 -0500
> > From: 	Seth Larson <seth@...hon.org>
> > Reply-To: 	security-sig@...hon.org
> > To: 	security-announce@...hon.org
> >
> > There is a HIGH severity vulnerability affecting the CPython "zipfile" module.
> >
> > When iterating over names of entries in a zip archive (for example, methods
> > of "zipfile.ZipFile" like "namelist()", "iterdir()", "extractall()", etc)
> > the process can be put into an infinite loop with a maliciously crafted
> > zip archive. This defect applies when reading only metadata or extracting
> > the contents of the zip archive. Programs that are not handling
> > user-controlled zip archives are not affected.
> >
> > Please see the linked CVE ID for the latest information on affected versions:
> >
> > * https://www.cve.org/CVERecord?id=CVE-2024-8088
> > * https://github.com/python/cpython/pull/122906
> > * https://github.com/python/cpython/issues/122905
>
> A small correction/addendum based on reading the vulnerability report and the PR
> that fixes this (as well as being quite familiar with Python zipfile.ZipFile
> internals and confused how this would affect it): it's not zipfile.ZipFile and
> its methods that are affected, at least not directly, but zipfile.Path.  The
> issue being this code in zipfile._path._ancestry():
>
>   path = path.rstrip(posixpath.sep)
>   while path and path != posixpath.sep:
>       yield path
>       path, tail = posixpath.split(path)
>
> Which results in an infinite loop because for example posixpath.split("//") ==
> ("//", "") but "//" != posixpath.sep:
>
>   >>> it = zipfile._path._parents("//foo")
>   >>> next(it)
>   '//'
>   >>> next(it)
>   '//'
>   >>> next(it)
>   '//'
>
> The infinite loop has been fixed by sanitising the paths.

Forgot to mention this: the infinite loop is triggered when zipfile.Path adds
"implied directories" -- using _parents(), which calls _ancestry() -- in the
overridden .namelist() for the custom zipfile.ZipFile subclass it wraps.  Which
is (indirectly) used by almost all of the zipfile.Path methods like .iterdir(),
.glob(), .exists(), .joinpath() etc.

  >>> zf = zipfile.ZipFile(io.BytesIO(), "w")
  >>> zf.filename = "foo.zip"
  >>> zf.writestr("a/b/c", "abc")
  >>> zf.writestr("d/e", "de")
  >>> zf.namelist()
  ['a/b/c', 'd/e']
  >>> zf.__class__
  <class 'zipfile.ZipFile'>

  >>> p = zipfile.Path(zf)
  >>> p.root.namelist()
  ['a/b/c', 'd/e', 'a/b/', 'a/', 'd/']
  >>> list(p.iterdir())
  [Path('foo.zip', 'a/'), Path('foo.zip', 'd/')]
  >>> zf.__class__
  <class 'zipfile._path.CompleteDirs'>

  >>> zf.writestr("//oops", "oops")
  >>> # infinite loop via joinpath -> resolve_dir -> _name_set -> namelist ->
  >>> # _implied_dirs -> _ancestry
  >>> p / "foo"

As zipfile.Path modifies the class of the original ZipFile, calling .namelist()
or .extractall() on the original ZipFile used to create the Path afterwards is
also affected even though zipfile.ZipFile as such is not.

- Fay

Please check out the Open Source Software Security Wiki, which is counterpart to this mailing list.

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.