Message-ID: <0f17eb05-c183-bec9-0076-5ddd00d70f15@rasmusvillemoes.dk>
Date: Sun, 18 Mar 2018 21:34:12 +0100
From: Rasmus Villemoes <linux@...musvillemoes.dk>
To: Lukas Wunner <lukas@...ner.de>, Rasmus Villemoes <linux@...musvillemoes.dk>
Cc: Laura Abbott <labbott@...hat.com>, Linus Walleij <linus.walleij@...aro.org>,
    Kees Cook <keescook@...omium.org>, linux-gpio@...r.kernel.org,
    linux-kernel@...r.kernel.org, kernel-hardening@...ts.openwall.com,
    Mathias Duckeck <m.duckeck@...bus.de>, Nandor Han <nandor.han@...com>,
    Semi Malinen <semi.malinen@...com>, Patrice Chotard <patrice.chotard@...com>
Subject: Re: [PATCH 1/4] gpio: Remove VLA from gpiolib

On 2018-03-18 15:23, Lukas Wunner wrote:
>>>
>>> Other random thoughts: maybe two allocations for each loop iteration is
>>> a bit much. Maybe do a first pass over the array and collect the maximal
>>> chip->ngpio, do the memory allocation and freeing outside the loop (then
>>> you'd of course need to preserve the memset() with appropriate length
>>> computed). And maybe even just do one allocation, making bits point at
>>> the second half.
>>
>> I think those are great ideas because the function is kind of a hotpath
>> and usage of VLAs was motivated by the desire to make it fast.
>>
>> I'd go one step further and store the maximum ngpio of all registered
>> chips in a global variable (and update it in gpiochip_add_data_with_key()),
>> then allocate 2 * max_ngpio once before entering the loop (as you've
>> suggested). That would avoid the first pass to determine the maximum
>> chip->ngpio. In most systems max_ngpio will be < 64, so one or two
>> unsigned longs depending on the arch's bitness.
>
> Actually, scratch that. If ngpio is usually smallish, we can just
> allocate reasonably sized space for mask and bits on the stack,

Yes.

> and fall back to the kcalloc slowpath only if chip->ngpio exceeds
> that limit.

Well, I'd suggest not adding that fallback code now, but simply adding a
check in gpiochip_add_data_with_key() to ensure ngpio is sane (and
refusing to register the chip otherwise), at least if we know that every
currently supported/known chip is covered by a limit of 256 (?). That
keeps the code simple and fast, and then if somebody has a chip with
40000 gpio lines, we can add a fallback path.

Or we could consider alternative solutions to avoid a 10000 byte
GFP_ATOMIC allocation (maybe hang a pre-allocation off the gpio_chip;
that's only two more bits per descriptor, and there's already a whole
gpio_desc for each - but I'm not sure about the locking in that case).

Rasmus
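
For illustration, here is a rough userspace sketch of the scheme discussed above: reject oversized chips at registration time, so the hot path can use fixed-size mask and bits arrays on the stack with no allocation. It is not the actual gpiolib patch. struct fake_chip, fake_chip_register(), fake_set_multiple() and the FASTPATH_NGPIO value of 256 are all hypothetical names and numbers chosen for the example; the thread only floats 256 as a possible limit.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <string.h>

/* Assumed cap; the thread only floats "256 (?)" as a candidate. */
#define FASTPATH_NGPIO 256

/* Local stand-ins for the kernel's BITS_PER_LONG / BITS_TO_LONGS(). */
#define BITS_PER_LONG_ (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS_(n) (((n) + BITS_PER_LONG_ - 1) / BITS_PER_LONG_)

/* Hypothetical stand-in for struct gpio_chip. */
struct fake_chip {
	unsigned int ngpio;
	int registered;
};

/* Registration-time check: refuse chips too large for the stack buffers. */
static int fake_chip_register(struct fake_chip *chip)
{
	if (chip->ngpio == 0 || chip->ngpio > FASTPATH_NGPIO)
		return -EINVAL;
	chip->registered = 1;
	return 0;
}

/*
 * Hot path: build mask/bits in fixed-size stack arrays, no allocation.
 * Returns the number of lines that would be driven high, or -EINVAL
 * if an offset is out of range for the chip.
 */
static int fake_set_multiple(const struct fake_chip *chip,
			     const unsigned int *offsets,
			     const int *values, unsigned int count)
{
	unsigned long mask[BITS_TO_LONGS_(FASTPATH_NGPIO)];
	unsigned long bits[BITS_TO_LONGS_(FASTPATH_NGPIO)];
	unsigned int i, nset = 0;

	/* Guaranteed by fake_chip_register(), so no slowpath is needed. */
	assert(chip->registered && chip->ngpio <= FASTPATH_NGPIO);

	memset(mask, 0, sizeof(mask));
	memset(bits, 0, sizeof(bits));

	for (i = 0; i < count; i++) {
		unsigned int off = offsets[i];

		if (off >= chip->ngpio)
			return -EINVAL;
		mask[off / BITS_PER_LONG_] |= 1UL << (off % BITS_PER_LONG_);
		if (values[i])
			bits[off / BITS_PER_LONG_] |= 1UL << (off % BITS_PER_LONG_);
	}

	/* A real driver would hand mask/bits to the chip here; we just count. */
	for (i = 0; i < chip->ngpio; i++) {
		unsigned long bit = 1UL << (i % BITS_PER_LONG_);

		if ((mask[i / BITS_PER_LONG_] & bit) &&
		    (bits[i / BITS_PER_LONG_] & bit))
			nset++;
	}
	return (int)nset;
}
```

With a cap of 256 the two arrays cost 64 bytes of stack in total, and a chip with 40000 lines is simply refused at registration until someone needs a fallback path.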