Message-ID: <21942.1559304135@warthog.procyon.org.uk>
Date: Fri, 31 May 2019 13:02:15 +0100
From: David Howells <dhowells@...hat.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: dhowells@...hat.com, Jann Horn <jannh@...gle.com>,
	Greg KH <gregkh@...uxfoundation.org>,
	Al Viro <viro@...iv.linux.org.uk>, raven@...maw.net,
	linux-fsdevel <linux-fsdevel@...r.kernel.org>,
	Linux API <linux-api@...r.kernel.org>,
	linux-block@...r.kernel.org, keyrings@...r.kernel.org,
	linux-security-module <linux-security-module@...r.kernel.org>,
	kernel list <linux-kernel@...r.kernel.org>,
	Kees Cook <keescook@...omium.org>,
	Kernel Hardening <kernel-hardening@...ts.openwall.com>
Subject: Re: [PATCH 1/7] General notification queue with user mmap()'able ring buffer

Peter Zijlstra <peterz@...radead.org> wrote:

> Can you re-iterate the exact problem? I know we talked about this in the
> past, but I seem to have misplaced those memories :/

Take this for example:

	void afs_put_call(struct afs_call *call)
	{
		struct afs_net *net = call->net;
		int n = atomic_dec_return(&call->usage);
		int o = atomic_read(&net->nr_outstanding_calls);

		trace_afs_call(call, afs_call_trace_put, n + 1, o,
			       __builtin_return_address(0));

		ASSERTCMP(n, >=, 0);
		if (n == 0) {
			...
		}
	}

I am printing the usage count in the afs_call tracepoint so that I can use it
to debug refcount bugs. If I do it like this:

	void afs_put_call(struct afs_call *call)
	{
		struct afs_net *net = call->net;
		int n = refcount_read(&call->usage);
		int o = atomic_read(&net->nr_outstanding_calls);

		trace_afs_call(call, afs_call_trace_put, n, o,
			       __builtin_return_address(0));

		if (refcount_dec_and_test(&call->usage)) {
			...
		}
	}

then there's a temporal gap between the usage count being read and the actual
atomic decrement in which another CPU can alter the count. This can be
exacerbated by an interrupt occurring, a softirq occurring or someone enabling
the tracepoint.
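
To make the difference concrete, here is a minimal user-space sketch using C11
atomics rather than the kernel API. The names demo_call, demo_trace_put() and
demo_put_call() are hypothetical stand-ins, and atomic_fetch_sub() plays the
role of atomic_dec_return() only by analogy. The point it illustrates is that
the traced value comes out of the decrement itself, so no other CPU can change
the count between the read and the update.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical stand-in for the usage counter in struct afs_call. */
struct demo_call {
	atomic_int usage;
};

/* Hypothetical stand-in for trace_afs_call(); it only prints. */
static void demo_trace_put(int usage_before_put)
{
	printf("put: usage was %d\n", usage_before_put);
}

/*
 * Drop one reference and return the number of references left.
 * The value handed to the trace is the one the decrement observed,
 * not a separate earlier read.
 */
static int demo_put_call(struct demo_call *call)
{
	/* atomic_fetch_sub() returns the value held before the subtraction. */
	int old = atomic_fetch_sub(&call->usage, 1);

	assert(old > 0);	/* a put on a count of zero is a refcount bug */
	demo_trace_put(old);
	return old - 1;
}
```

Starting from a count of 2, two calls to demo_put_call() would be expected to
return 1 and then 0, with the trace reporting 2 and then 1.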
I can't do the tracepoint after the decrement if refcount_dec_and_test()
returns false unless I save all the values from the object that I might need,
as the object could be destroyed any time from that point on. In this
particular case, that's just call->debug_id, but it could be other things in
other cases.

Note that I also can't touch the afs_net object in that situation either, and
the outstanding calls count that I record will potentially be out of date - but
there's not a lot I can do about that.

David
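
The save-before-decrement pattern mentioned above could look roughly like the
following user-space sketch, again with C11 atomics instead of the kernel API.
The names saved_call, saved_trace_put() and saved_put_call() are hypothetical,
and free() stands in for whatever teardown the real object needs. Everything
the trace needs is copied while the caller still holds its reference, and the
object is not dereferenced again unless this put turned out to be the last one.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct afs_call. */
struct saved_call {
	atomic_int usage;
	unsigned int debug_id;
};

/* Hypothetical stand-in for the tracepoint; takes copies, not the object. */
static void saved_trace_put(unsigned int debug_id, int usage_after_put)
{
	printf("call %u: put, usage now %d\n", debug_id, usage_after_put);
}

/* Drop one reference. Returns true if this put freed the object. */
static bool saved_put_call(struct saved_call *call)
{
	/* Copy what the trace needs while our reference keeps *call alive. */
	unsigned int debug_id = call->debug_id;

	int remaining = atomic_fetch_sub(&call->usage, 1) - 1;

	assert(remaining >= 0);

	/*
	 * If remaining != 0, another holder may free *call at any moment,
	 * so only the saved copies are used from here on.
	 */
	saved_trace_put(debug_id, remaining);

	if (remaining == 0) {
		free(call);	/* we held the last reference */
		return true;
	}
	return false;
}
```

The cost is that every field the trace might want has to be copied up front on
each put, whether or not the tracepoint is enabled.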