kernel-hardening - Re: [PATCH v2] fs: hardlink creation restrictions

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAGXu5j+xzLQVHqT4oEXxVi8r2j5OUG3LLZLkMM=JscNzjLmkOQ@mail.gmail.com>
Date: Tue, 21 Feb 2012 15:22:57 -0800
From: Kees Cook <keescook@...omium.org>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Ingo Molnar <mingo@...e.hu>, Marcin Slusarz <marcin.slusarz@...il.com>, 
	linux-kernel@...r.kernel.org, Randy Dunlap <rdunlap@...otime.net>, 
	Alexander Viro <viro@...iv.linux.org.uk>, linux-doc@...r.kernel.org, 
	linux-fsdevel@...r.kernel.org, kernel-hardening@...ts.openwall.com
Subject: Re: [PATCH v2] fs: hardlink creation restrictions

On Tue, Feb 21, 2012 at 3:10 PM, Andrew Morton
<akpm@...ux-foundation.org> wrote:
> On Tue, 21 Feb 2012 13:58:00 -0800
> Kees Cook <keescook@...omium.org> wrote:
>
>> On systems that have user-writable directories on the same partition
>> as system files, a long-standing class of security issues is the
>> hardlink-based time-of-check-time-of-use race, most commonly seen in
>> world-writable directories like /tmp. The common method of exploitation
>> of this flaw is to cross privilege boundaries when following a given
>> hardlink (i.e. a root process follows a hardlink created by another
>> user). Additionally, an issue exists where users can "pin" a potentially
>> vulnerable setuid/setgid file so that an administrator will not actually
>> upgrade a system fully.
>>
>> The solution is to permit hardlinks to only be created when the user is
>> already the existing file's owner, or if they already have read/write
>> access to the existing file.
>>
>> Many Linux users are surprised when they learn they can link to files
>> they have no access to, so this change appears to follow the doctrine
>> of "least surprise". Additionally, this change does not violate POSIX,
>> which states "the implementation may require that the calling process
>> has permission to access the existing file"[1].
>>
>> This change is known to break some implementations of the "at" daemon,
>> though the version used by Fedora and Ubuntu has been fixed[2] for
>> a while. Otherwise, the change has been undisruptive while in use in
>> Ubuntu for the last 1.5 years.
>>
>> This patch is based on the patch in Openwall and grsecurity.  I have
>> added a sysctl to enable the protected behavior, documentation, and an
>> audit notification.
>>
>> [1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/linkat.html
>> [2] http://anonscm.debian.org/gitweb/?p=collab-maint/at.git;a=commitdiff;h=f4114656c3a6c6f6070e315ffdf940a49eda3279
>>
>
> Looks OKish to me.

Thanks!

>> --- a/Documentation/sysctl/fs.txt
>> +++ b/Documentation/sysctl/fs.txt
>> @@ -32,7 +32,8 @@ Currently, these files are in /proc/sys/fs:
>>  - nr_open
>>  - overflowuid
>>  - overflowgid
>> -- protected_sticky_symlinks
>> +- protected_hardlinks
>> +- protected_symlinks
>
> It's silly to add protected_sticky_symlinks and to then rename it in
> the very next patch.  I have reworked my copy of the earlier
> fs-symlink-restrictions-on-sticky-directories.patch so that it adds
> "protected_symlinks".  I then reworked this patch
> (fs-hardlink-creation-restrictions.patch) to add protected_hardlinks.
> Nice and simple.

Okay, sounds good. Thanks for the clean-ups!

>>
>> ...
>>
>> +static inline void
>> +audit_log_link_denied(const char *operation, struct path *link)
>> +{
>> +     struct audit_buffer *ab;
>> +
>> +     ab = audit_log_start(current->audit_context, GFP_KERNEL, AUDIT_AVC);
>> +     audit_log_format(ab, "op=%s action=denied", operation);
>> +     audit_log_format(ab, " pid=%d comm=", current->pid);
>> +     audit_log_untrustedstring(ab, current->comm);
>> +     audit_log_d_path(ab, " path=", link);
>> +     audit_log_format(ab, " dev=");
>> +     audit_log_untrustedstring(ab, link->dentry->d_inode->i_sb->s_id);
>> +     audit_log_format(ab, " ino=%lu", link->dentry->d_inode->i_ino);
>> +     audit_log_end(ab);
>> +}
>
> This is far too large to inline.  It has two callsites and gcc is
> probably smart anough to uninline it for us.  I queued a patch to
> remove the `inline'.

Okay, seems fine to me.

>>  /**
>>   * may_follow_link - Check symlink following for unsafe situations
>> - * @dentry: The inode/dentry of the symlink
>> - * @nameidata: The path data of the symlink
>> + * @link: The path of the symlink
>>   *
>> - * In the case of the protected_sticky_symlinks sysctl being enabled,
>> + * In the case of the sysctl_protected_symlinks sysctl being enabled,
>>   * CAP_DAC_OVERRIDE needs to be specifically ignored if the symlink is
>>   * in a sticky world-writable directory. This is to protect privileged
>>   * processes from failing races against path names that may change out
>> @@ -643,19 +660,20 @@ int sysctl_protected_sticky_symlinks __read_mostly =
>>   *
>>   * Returns 0 if following the symlink is allowed, -ve on error.
>>   */
>> -static inline int
>> -may_follow_link(struct dentry *dentry, struct nameidata *nameidata)
>> +static inline int may_follow_link(struct path *link)
>>  {
>>       int error = 0;
>>       const struct inode *parent;
>>       const struct inode *inode;
>>       const struct cred *cred;
>> +     struct dentry *dentry;
>>
>> -     if (!sysctl_protected_sticky_symlinks)
>> +     if (!sysctl_protected_symlinks)
>>               return 0;
>
> This is also too large to inline.  It doesn't matter a lot - it's not a
> hot path.  I removed this `inline' as well.  gcc will probably inline
> it anyway.
>
> If we were really concerned about speed here, we might decide to inline
> the test of sysctl_protected_symlinks and to uninline the remainder of
> the function.
>
>
>>       /* Allowed if owner and follower match. */
>>       cred = current_cred();
>> +     dentry = link->dentry;
>>       inode = dentry->d_inode;
>>       if (cred->fsuid == inode->i_uid)
>>               return 0;
>>
>> ...
>>
>> +static inline int may_linkat(struct path *link)
>> +{
>> +     int error = 0;
>> +     const struct cred *cred;
>> +     struct inode *inode;
>> +     int mode;
>> +
>> +     if (!sysctl_protected_hardlinks)
>> +             return 0;
>> +
>> +     cred = current_cred();
>> +     inode = link->dentry->d_inode;
>> +     mode = inode->i_mode;
>> +
>> +     if (cred->fsuid != inode->i_uid &&
>> +         (!S_ISREG(mode) || (mode & S_ISUID) ||
>> +          ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) ||
>> +          (inode_permission(inode, MAY_READ | MAY_WRITE))) &&
>> +         !capable(CAP_FOWNER))
>> +             error = -EPERM;
>
> omigod.  Geeze man, whose side are you on?

Heh. Yeah, this was not my favorite if statement.

> I suggest something like this:
>
> --- a/fs/namei.c~fs-hardlink-creation-restrictions-fix-fix
> +++ a/fs/namei.c
> @@ -692,6 +692,23 @@ static inline int may_follow_link(struct
>        return error;
>  }
>
> +static bool foobar(struct inode *inode, umode_t mode)
> +{
> +       /* nice comment */
> +       if (!S_ISREG(mode))
> +               return true;
> +       /* nice comment */
> +       if (mode & S_ISUID)
> +               return true;
> +       /* nice comment */
> +       if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP))
> +               return true;
> +       /* nice comment */
> +       if (inode_permission(inode, MAY_READ | MAY_WRITE))
> +               return true;
> +       return false;
> +}
> +
>  /**
>  * may_linkat - Check permissions for creating a hardlink
>  * @link: the source to hardlink from
> @@ -722,11 +739,8 @@ static int may_linkat(struct path *link)
>        inode = link->dentry->d_inode;
>        mode = inode->i_mode;
>
> -       if (cred->fsuid != inode->i_uid &&
> -           (!S_ISREG(mode) || (mode & S_ISUID) ||
> -            ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) ||
> -            (inode_permission(inode, MAY_READ | MAY_WRITE))) &&
> -           !capable(CAP_FOWNER))
> +       if (cred->fsuid != inode->i_uid && foobar(inode, mode) &&
> +                       !capable(CAP_FOWNER))
>                error = -EPERM;
>
>        if (error)
>
> Please take a look at that, pick a replacement for foobar, fill in the
> nice comments which explain our reasoning, retest it and send it back
> at me?

Sounds good -- I'll get it written and tested shortly.

> Also please take a look at local variable "mode" and decide if there
> was a good reason for it being `int' instead of mode_t?

Ah, sorry. That was an oversight on my part -- I'd ported these before
mode_t existed. I'll fix that too.

Thanks!

-Kees

-- 
Kees Cook
ChromeOS Security
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.