Message-ID: <57c462dd-f23a-4f55-a870-55c6886767a0@codean.io>
Date: Wed, 3 Jul 2024 16:07:37 +0200
From: Thomas Rinsma <thomas@...ean.io>
To: oss-security@...ts.openwall.com
Subject: Re: Ghostscript 10.03.1 (2024-05-02) fixed 5 CVEs including CVE-2024-33871 arbitrary code execution

Hi,

Per Solar's request, here is some information on recent Ghostscript bugs. They have all been fixed upstream already for either ~1 month (10.03.1) or ~4 months (10.03.0). It looks like patches have also landed in most distros, but there is not a super clear changelog or version history so this might help clarify things.

Note that this is just a subset of all vulnerabilities fixed in 10.03.0 and 10.03.1: these are just the bugs I myself found and reported.

# CVE-2024-29509 - heap buffer overflow via the PDFPassword parameter

The `runpdf` command (and friends) allows the new C-based PDF interpreter to be invoked from within PS. With this, we can pass various flags and arguments (see `pdf_impl_set_param`) that are normally passed via the command-line when the PDF interpreter is invoked directly.

It turns out that validation of several of these parameters is flawed, maybe because they were considered somewhat "trusted", being command-line arguments originally.

The fields `ctx->encryption.Password` and `ctx->encryption.PasswordLen` are set based on the value of `PDFPassword`. During the decryption process, in `check_password_R5` in `pdf_sec.c`, a buffer is allocated based on the string-length of this field:

```
code = pdfi_object_alloc(ctx, PDF_STRING, strlen(ctx->encryption.Password), (pdf_obj **)&P);
```

However, a `memcpy` later copies the full length of the PS-supplied object into this buffer:

```
memcpy(P->data, Password, PasswordLen);
```

Because PS-strings are not null-terminated, this will result in a heap buffer overflow when a value of `PDFPassword` is supplied with a null byte in the middle.

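To make the size mismatch concrete, here is a minimal standalone sketch. It is not Ghostscript code: `bytes_past_strlen_buffer` is a hypothetical helper that only computes how far a length-based copy would run past a destination sized with `strlen`, without performing the copy.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Ghostscript): `data` is a
 * NUL-terminated C buffer that logically holds `len` bytes, possibly with
 * embedded NUL bytes. Returns how many bytes a memcpy of `len` bytes would
 * write past a destination that was sized with strlen(data).
 */
size_t bytes_past_strlen_buffer(const char *data, size_t len)
{
    size_t allocated = strlen(data); /* stops at the first NUL byte */

    return len > allocated ? len - allocated : 0;
}
```

A result of 0 means the two lengths agree; any other value is the size of the overflow. Sizing the destination from the explicit length instead of `strlen` is one way to avoid the mismatch; I am not claiming this is how upstream fixed it.
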
For example, the following will result in a `memcpy` of 7 bytes into a buffer of size 3:

```
/PDFPassword (foo\000bar) def
```

This bug was fixed in 10.03.0 (2024-03-06), and is bug (1) in this report:
https://bugs.ghostscript.com/show_bug.cgi?id=707510

# CVE-2024-29506 - stack buffer overflow in pdfi_apply_filter()

The `PDFDEBUG` flag controls the value of `ctx->args.pdfdebug`. In `pdfi_apply_filter` this enables execution of a `memcpy` into a stack buffer, without bounds checks. The input (`n->data`, the PDF filter name) is an attacker-controlled buffer of arbitrary size. A filter name of 100 bytes or more will overflow the `str` buffer (at exactly 100 bytes, only the terminating null byte lands out of bounds).

```
if (ctx->args.pdfdebug) {
    char str[100];
    memcpy(str, (const char *)n->data, n->length);
    str[n->length] = '\0';
    dmprintf1(ctx->memory, "FILTER NAME:%s\n", str);
}
```

This bug was also fixed in 10.03.0 (2024-03-06), and is bug (2) in this report:
https://bugs.ghostscript.com/show_bug.cgi?id=707510

# CVE-2024-29507 - stack buffer overflow via CIDFSubstPath/Font params

Under specific conditions, the `cidfsubstpath` and `cidfsubstfont` parameters (set by corresponding PostScript objects) are used to load substitute fonts (this is in `pdfi_open_CIDFont_substitute_file`). The values are `memcpy`d into the `fontfname` buffer without bounds checks. Hence, an attacker can pass values larger than the buffer size to trigger a stack buffer overflow.

```
char fontfname[gp_file_name_sizeof]; // 4096
// .. <snip> ...
if (ctx->args.cidfsubstpath.data == NULL) {
    memcpy(fontfname, fsprefix, fsprefixlen);
} else {
    memcpy(fontfname, ctx->args.cidfsubstpath.data, ctx->args.cidfsubstpath.size);
    fsprefixlen = ctx->args.cidfsubstpath.size;
}

if (ctx->args.cidfsubstfont.data == NULL) {
    // ... <snip> ...
} else {
    memcpy(fontfname, ctx->args.cidfsubstfont.data, ctx->args.cidfsubstfont.size);
    defcidfallacklen = ctx->args.cidfsubstfont.size;
}
```

This bug was also fixed in 10.03.0 (2024-03-06), and is bug (3) in this report:
https://bugs.ghostscript.com/show_bug.cgi?id=707510

# CVE-2024-29508 - heap pointer leak in pdf_base_font_alloc()

The function `pdf_base_font_alloc` used by the `pdfwrite` device will use a hexadecimal pointer representation (`".F" PRI_INTPTR`) for the constructed BaseFont name if the input name is empty:

```
if (pfname->size > 0) {
    font_name.data = pfname->chars;
    font_name.size = pfname->size;
    while (pdf_has_subset_prefix(font_name.data, font_name.size)) {
        /* Strip off an existing subset prefix. */
        font_name.data += SUBSET_PREFIX_SIZE;
        font_name.size -= SUBSET_PREFIX_SIZE;
    }
} else {
    gs_snprintf(fnbuf, sizeof(fnbuf), ".F" PRI_INTPTR, (intptr_t)copied);
    font_name.data = (byte *)fnbuf;
    font_name.size = strlen(fnbuf);
}
```

Resulting in, for example:

```
<</BaseFont/YZKFTQ+.F0x5618b147e378/FontDescriptor 8 0 R/ToUnicode 11 0 R/Type/Font ...
```

An attacker can obtain this pointer value by reading back the output file (after writing to a temporary writable and readable location).

This bug (and various other pointer leaks) was fixed in 10.03.0 (2024-03-06), and is bug (4) in this report:
https://bugs.ghostscript.com/show_bug.cgi?id=707510

# CVE-2024-29511 - arbitrary file read/write through Tesseract config

The `ocr` family of devices invokes Tesseract to perform OCR operations. The device parameter `OCRLanguage` is used by Tesseract to load a data file for that specific language. Specifically, such a file is loaded from `./<OCRLanguage>.traineddata`.

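To illustrate why an unvalidated name is a problem here, a minimal sketch follows. It is not taken from Ghostscript or Tesseract: `build_traineddata_path` and `is_plain_language_name` are hypothetical names, and the check is deliberately stricter than real language names may require.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical: build "./<language>.traineddata" with no validation.
 * Returns 0 on success, -1 if the result does not fit in `out`. */
int build_traineddata_path(char *out, size_t outsz, const char *language)
{
    int n = snprintf(out, outsz, "./%s.traineddata", language);

    return (n >= 0 && (size_t)n < outsz) ? 0 : -1;
}

/* Hypothetical check: accept only non-empty names that contain no path
 * separators and no ".." sequence. */
bool is_plain_language_name(const char *language)
{
    if (language[0] == '\0')
        return false;
    if (strchr(language, '/') != NULL || strchr(language, '\\') != NULL)
        return false;
    return strstr(language, "..") == NULL;
}
```

The builder passes the name through unchanged, so any `../` segments in it end up in the final path; a check along the lines of `is_plain_language_name` before building the path would reject such names.
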
By using a path traversal to `/tmp/`, we can force Tesseract to load our own data file:

```
mark
/OutputFile (/tmp/notused)
/OCRLanguage (../../../../../tmp/test) % loads /tmp/test.traineddata
/OutputDevice /ocr
.dicttomark
setpagedevice
```

As it turns out, Tesseract `traineddata` files can include various configuration values, including `user_patterns_file` which will try to load patterns from the given path, and `debug_file` which will write debug information to the given path.

The debug information is quite verbose, and will print full input lines if they don’t start with a valid character in the trained language. By constructing our "language" such that no character is valid, all lines in the pattern file are printed.

For example, the configuration settings:

```
debug_file /tmp/out
user_patterns_file /etc/passwd
```

will result in a file `/tmp/out` containing:

```
Error: failed to insert pattern 'root:x:0:0:root:/root:/bin/bash'
Error: failed to insert pattern 'daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin'
Error: failed to insert pattern 'bin:x:2:2:bin:/bin:/usr/sbin/nologin'
Error: failed to insert pattern 'sys:x:3:3:sys:/dev:/usr/sbin/nologin'
Error: failed to insert pattern 'sync:x:4:65534:sync:/bin:/bin/sync'
<etc>
```

In PostScript we can:

1. Construct the traineddata file under `/tmp/`
2. Use path traversal in `OCRLanguage` to load it when initializing the `ocr` device
3. Read the resulting output data in `/tmp/out`

This allows us to read arbitrary files outside of the SAFER sandbox, and write to arbitrary file paths, although during writing, every line will start with `Error: failed to insert pattern '` and end with `'`.

Note that this is the Tesseract/OCR-related bug that was referred to by the Ghostscript changelog (and quoted earlier in this thread). Contrary to what is stated in the changelog it does not lead to RCE by itself, just file read/write. It also requires Ghostscript to be compiled with Tesseract support.

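As a small illustration of the line wrapping mentioned above, here is a sketch that recovers the original line from one line of the debug output. The helper name `unwrap_pattern_line` is hypothetical, and the prefix is simply the one shown in the sample output; I have not checked whether it differs between Tesseract versions.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: if `line` has the form
 *     Error: failed to insert pattern '<payload>'
 * (optionally followed by one newline), copy <payload> into `out` as a
 * NUL-terminated string and return 0. Otherwise return -1.
 */
int unwrap_pattern_line(const char *line, char *out, size_t outsz)
{
    static const char prefix[] = "Error: failed to insert pattern '";
    size_t plen = sizeof(prefix) - 1;
    size_t llen = strlen(line);
    size_t payload;

    /* Tolerate a single trailing newline, as left in place by fgets(). */
    if (llen > 0 && line[llen - 1] == '\n')
        llen--;

    /* Need the full prefix plus at least the closing quote. */
    if (llen < plen + 1 || strncmp(line, prefix, plen) != 0)
        return -1;
    if (line[llen - 1] != '\'')
        return -1;

    payload = llen - plen - 1;
    if (payload >= outsz)
        return -1; /* no room for payload plus terminator */

    memcpy(out, line + plen, payload);
    out[payload] = '\0';
    return 0;
}
```

Lines that do not match the expected shape are rejected instead of being guessed at, and the payload is copied only when it fits in the caller's buffer together with its terminator.
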
# CVE-2024-29510 - format string injection in uniprint device

The `uniprint` device allows the user to provide various string fragments as device options, which are later appended to the output file. Two of these parameters, `upWriteComponentCommands` and `upYMoveCommand`, are actually treated as format strings, specifically for `gp_fprintf` and `gs_snprintf`.

For these, the intention is for the user to include just one format specifier in the string, but there is no logic preventing arbitrary format strings (with multiple specifiers) from being used.

With full control over the format string (by setting a page device with the respective options), and read access to the device output (by setting it to a temporary file path), an attacker can abuse this to leak data from the stack and perform memory corruption.

This is particularly impactful in the case of `gs_snprintf` (as opposed to `gp_fprintf`), as its format-string parsing logic is not hardened by compiler measures like `_FORTIFY_SOURCE`, while it still supports the `%n` modifier.

Bug report and public blog post with more details and PoC leading to a SAFER sandbox bypass:
https://bugs.ghostscript.com/show_bug.cgi?id=707662
https://codeanlabs.com/blog/research/cve-2024-29510-ghostscript-format-string-exploitation/

---

Cheers,
Thomas