Message-ID: <CAASffotPCJ1_8KTipytV3HjSWqK1C=VTSgtLxoCk2xNB2ccfjg@mail.gmail.com>
Date: Wed, 9 Jul 2025 13:59:18 +1000
From: Stephen Von Takach <steve@...ce.technology>
To: Rich Felker <dalias@...c.org>
Cc: musl@...ts.openwall.com, Viv Briffa <viv@...ce.technology>
Subject: Re: unlink on NFS volume fails silently

Yes, we traced this.
The libc unlink function on musl returned 0 for a file on an NFS mount,
but the file was not deleted. A return value of 0 is supposed to indicate
success:
https://www.gnu.org/software/libc/manual/html_node/Deleting-Files.html

The same call to unlink on glibc returned 0 and actually removed the file.

The issue occurs when a high volume of files is being removed.
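
A minimal sketch of the kind of check that exposes this behaviour, written
in Python rather than the Crystal code linked in the quoted message. The
helper name unlink_and_verify is hypothetical, and re-checking the name with
lstat is only an assumption about how a caller might detect the problem, not
something either libc does:

```python
import os


def unlink_and_verify(path):
    """Unlink `path`, then check whether the name is really gone.

    Returns True if the name no longer exists after unlink() reported
    success, and False if unlink() reported success but the name is
    still visible. Raises OSError if unlink() itself reports failure.
    """
    # os.unlink raises OSError when the underlying unlink call fails,
    # so reaching the next line means it reported success.
    os.unlink(path)
    try:
        # lstat does not follow symlinks, so this checks the name itself.
        os.lstat(path)
    except FileNotFoundError:
        return True
    return False
```

A caller could log or retry whenever this returns False, which is the case
described above: unlink reports success but the file is still present.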


Stephen von Takach Dukai

Engineering Lead

PlaceOS

Australia, Hong Kong, London, New York

p: +61 408 419 954

e: steve@...ce.technology


On Tue, 8 Jul 2025 at 11:08, Rich Felker <dalias@...c.org> wrote:

> On Sun, Jul 06, 2025 at 04:25:31PM +1000, Stephen Von Takach wrote:
> > Hi,
> >
> > We recently had to move a service from being built on alpine linux to
> > debian linux as we were getting silent failures when deleting a directory
> > with many files on an NFS volume. Basically this call to unlink was not
> > raising an error if the file failed to delete
> >
> > https://github.com/crystal-lang/crystal/blob/master/src/crystal/system/unix/file.cr#L129
> >
> > We replicated the issue in an alpine container with rm -rf
> > /nfs_mount/git_repo_to_delete and it also failed to delete all the
> > files. It did raise an error, though (I assume it checked that the file
> > was removed before continuing; not entirely sure).
> >
> > Both these operations succeed with glibc when using debian.
> > Looks a bit like this issue:
> > https://gitlab.alpinelinux.org/alpine/aports/-/issues/10960
>
> Assuming you actually traced and saw the unlink syscall succeed, the
> root cause here is the filesystem/kernel lying about that. Standard
> "NFS Considered Harmful" stuff.
>
> But the fact that you're seeing different behavior on Alpine is almost
> surely a matter of busybox rm vs GNU coreutils differences in how they
> behave under faulty kernel behavior. The easy solution is probably
> installing the coreutils package. Otherwise, investigate what busybox
> is doing differently and if there's a way it could be made more
> reliable in this situation.
>
> Rich
>
>
