- 10 6月, 2022 5 次提交
-
-
由 Jason A. Donenfeld 提交于
With arch randomness being used by every distro and enabled in defconfigs, the distinction between rng_has_arch_random() and rng_is_initialized() is now rather small. In fact, the places where they differ are now places where paranoid users and system builders really don't want arch randomness to be used, in which case we should respect that choice, or places where arch randomness is known to be broken, in which case that choice is all the more important. So this commit just removes the function and its one user. Reviewed-by: Petr Mladek <pmladek@suse.com> # for vsprintf.c Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
由 Jason A. Donenfeld 提交于
Stephen reported that a static key warning splat appears during early boot on systems that credit randomness from device trees that contain an "rng-seed" property, because because setup_machine_fdt() is called before jump_label_init() during setup_arch(): static_key_enable_cpuslocked(): static key '0xffffffe51c6fcfc0' used before call to jump_label_init() WARNING: CPU: 0 PID: 0 at kernel/jump_label.c:166 static_key_enable_cpuslocked+0xb0/0xb8 Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 5.18.0+ #224 44b43e377bfc84bc99bb5ab885ff694984ee09ff pstate: 600001c9 (nZCv dAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : static_key_enable_cpuslocked+0xb0/0xb8 lr : static_key_enable_cpuslocked+0xb0/0xb8 sp : ffffffe51c393cf0 x29: ffffffe51c393cf0 x28: 000000008185054c x27: 00000000f1042f10 x26: 0000000000000000 x25: 00000000f10302b2 x24: 0000002513200000 x23: 0000002513200000 x22: ffffffe51c1c9000 x21: fffffffdfdc00000 x20: ffffffe51c2f0831 x19: ffffffe51c6fcfc0 x18: 00000000ffff1020 x17: 00000000e1e2ac90 x16: 00000000000000e0 x15: ffffffe51b710708 x14: 0000000000000066 x13: 0000000000000018 x12: 0000000000000000 x11: 0000000000000000 x10: 00000000ffffffff x9 : 0000000000000000 x8 : 0000000000000000 x7 : 61632065726f6665 x6 : 6220646573752027 x5 : ffffffe51c641d25 x4 : ffffffe51c13142c x3 : ffff0a00ffffff05 x2 : 40000000ffffe003 x1 : 00000000000001c0 x0 : 0000000000000065 Call trace: static_key_enable_cpuslocked+0xb0/0xb8 static_key_enable+0x2c/0x40 crng_set_ready+0x24/0x30 execute_in_process_context+0x80/0x90 _credit_init_bits+0x100/0x154 add_bootloader_randomness+0x64/0x78 early_init_dt_scan_chosen+0x140/0x184 early_init_dt_scan_nodes+0x28/0x4c early_init_dt_scan+0x40/0x44 setup_machine_fdt+0x7c/0x120 setup_arch+0x74/0x1d8 start_kernel+0x84/0x44c __primary_switched+0xc0/0xc8 ---[ end trace 0000000000000000 ]--- random: crng init done Machine model: Google Lazor (rev1 - 2) with LTE A trivial fix went in to address this on arm64, 73e2d827 ("arm64: Initialize jump labels before setup_machine_fdt()"). I wrote patches as well for arm32 and risc-v. But still patches are needed on xtensa, powerpc, arc, and mips. So that's 7 platforms where things aren't quite right. This sort of points to larger issues that might need a larger solution. Instead, this commit just defers setting the static branch until later in the boot process. random_init() is called after jump_label_init() has been called, and so is always a safe place from which to adjust the static branch. Fixes: f5bda35f ("random: use static branch for crng_ready()") Reported-by: NStephen Boyd <swboyd@chromium.org> Reported-by: NPhil Elwell <phil@raspberrypi.com> Tested-by: NPhil Elwell <phil@raspberrypi.com> Reviewed-by: NArd Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
由 Jason A. Donenfeld 提交于
Rather than accounting in bytes and multiplying (shifting), we can just account in bits and avoid the shift. The main motivation for this is there are other patches in flux that expand this code a bit, and avoiding the duplication of "* 8" everywhere makes things a bit clearer. Cc: stable@vger.kernel.org Fixes: 12e45a2a ("random: credit architectural init the exact amount") Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
由 Jason A. Donenfeld 提交于
add_bootloader_randomness() and the variables it touches are only used during __init and not after, so mark these as __init. At the same time, unexport this, since it's only called by other __init code that's built-in. Cc: stable@vger.kernel.org Fixes: 428826f5 ("fdt: add support for rng-seed") Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
由 Jason A. Donenfeld 提交于
The current flow expands to: if (crng_ready()) ... else if (...) if (!crng_ready()) ... The second crng_ready() call is redundant, but can't so easily be optimized out by the compiler. This commit simplifies that to: if (crng_ready() ... else if (...) ... Fixes: 560181c2 ("random: move initialization functions out of hot pages") Cc: stable@vger.kernel.org Cc: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
- 23 5月, 2022 1 次提交
-
-
由 Jason A. Donenfeld 提交于
get_random_bytes_user() checks for signals after producing a PAGE_SIZE worth of output, just like /dev/zero does. write_pool() is doing basically the same work (actually, slightly more expensive), and so should stop to check for signals in the same way. Let's also name it write_pool_user() to match get_random_bytes_user(), so this won't be misused in the future. Before this patch, massive writes to /dev/urandom would tie up the process for an extremely long time and make it unterminatable. After, it can be successfully interrupted. The following test program can be used to see this works as intended: #include <unistd.h> #include <fcntl.h> #include <signal.h> #include <stdio.h> static unsigned char x[~0U]; static void handle(int) { } int main(int argc, char *argv[]) { pid_t pid = getpid(), child; int fd; signal(SIGUSR1, handle); if (!(child = fork())) { for (;;) kill(pid, SIGUSR1); } fd = open("/dev/urandom", O_WRONLY); pause(); printf("interrupted after writing %zd bytes\n", write(fd, x, sizeof(x))); close(fd); kill(child, SIGTERM); return 0; } Result before: "interrupted after writing 2147479552 bytes" Result after: "interrupted after writing 4096 bytes" Cc: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
- 21 5月, 2022 3 次提交
-
-
由 Jens Axboe 提交于
Now that random/urandom is using {read,write}_iter, we can wire it up to using the generic splice handlers. Fixes: 36e2c742 ("fs: don't allow splice read/write without explicit ops") Signed-off-by: NJens Axboe <axboe@kernel.dk> [Jason: added the splice_write path. Note that sendfile() and such still does not work for read, though it does for write, because of a file type restriction in splice_direct_to_actor(), which I'll address separately.] Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
由 Jens Axboe 提交于
Now that the read side has been converted to fix a regression with splice, convert the write side as well to have some symmetry in the interface used (and help deprecate ->write()). Signed-off-by: NJens Axboe <axboe@kernel.dk> [Jason: cleaned up random_ioctl a bit, require full writes in RNDADDENTROPY since it's crediting entropy, simplify control flow of write_pool(), and incorporate suggestions from Al.] Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
由 Jens Axboe 提交于
This is a pre-requisite to wiring up splice() again for the random and urandom drivers. It also allows us to remove the INT_MAX check in getrandom(), because import_single_range() applies capping internally. Signed-off-by: NJens Axboe <axboe@kernel.dk> [Jason: rewrote get_random_bytes_user() to simplify and also incorporate additional suggestions from Al.] Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
- 19 5月, 2022 7 次提交
-
-
由 Jason A. Donenfeld 提交于
There are currently two separate batched entropy implementations, for u32 and u64, with nearly identical code, with the goal of avoiding unaligned memory accesses and letting the buffers be used more efficiently. Having to maintain these two functions independently is a bit of a hassle though, considering that they always need to be kept in sync. This commit factors them out into a type-generic macro, so that the expansion produces the same code as before, such that diffing the assembly shows no differences. This will also make it easier in the future to add u16 and u8 batches. This was initially tested using an always_inline function and letting gcc constant fold the type size in, but the code gen was less efficient, and in general it was more verbose and harder to follow. So this patch goes with the boring macro solution, similar to what's already done for the _wait functions in random.h. Cc: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
由 Jason A. Donenfeld 提交于
randomize_page is an mm function. It is documented like one. It contains the history of one. It has the naming convention of one. It looks just like another very similar function in mm, randomize_stack_top(). And it has always been maintained and updated by mm people. There is no need for it to be in random.c. In the "which shape does not look like the other ones" test, pointing to randomize_page() is correct. So move randomize_page() into mm/util.c, right next to the similar randomize_stack_top() function. This commit contains no actual code changes. Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
由 Jason A. Donenfeld 提交于
The register_random_ready_notifier() notifier is somewhat complicated, and was already recently rewritten to use notifier blocks. It is only used now by one consumer in the kernel, vsprintf.c, for which the async mechanism is really overly complex for what it actually needs. This commit removes register_random_ready_notifier() and unregister_random_ ready_notifier(), because it just adds complication with little utility, and changes vsprintf.c to just check on `!rng_is_initialized() && !rng_has_arch_random()`, which will eventually be true. Performance- wise, that code was already using a static branch, so there's basically no overhead at all to this change. Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Acked-by: Petr Mladek <pmladek@suse.com> # for vsprintf.c Reviewed-by: NPetr Mladek <pmladek@suse.com> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
由 Jason A. Donenfeld 提交于
The RNG incorporates RDRAND into its state at boot and every time it reseeds, so there's no reason for callers to use it directly. The hashing that the RNG does on it is preferable to using the bytes raw. The only current use case of get_random_bytes_arch() is vsprintf's siphash key for pointer hashing, which uses it to initialize the pointer secret earlier than usual if RDRAND is available. In order to replace this narrow use case, just expose whether RDRAND is mixed into the RNG, with a new function called rng_has_arch_random(). With that taken care of, there are no users of get_random_bytes_arch() left, so it can be removed. Later, if trust_cpu gets turned on by default (as most distros are doing), this one use of rng_has_arch_random() can probably go away as well. Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Acked-by: Petr Mladek <pmladek@suse.com> # for vsprintf.c Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
由 Jason A. Donenfeld 提交于
Much of random.c is devoted to initializing the rng and accounting for when a sufficient amount of entropy has been added. In a perfect world, this would all happen during init, and so we could mark these functions as __init. But in reality, this isn't the case: sometimes the rng only finishes initializing some seconds after system init is finished. For this reason, at the moment, a whole host of functions that are only used relatively close to system init and then never again are intermixed with functions that are used in hot code all the time. This creates more cache misses than necessary. In order to pack the hot code closer together, this commit moves the initialization functions that can't be marked as __init into .text.unlikely by way of the __cold attribute. Of particular note is moving credit_init_bits() into a macro wrapper that inlines the crng_ready() static branch check. This avoids a function call to a nop+ret, and most notably prevents extra entropy arithmetic from being computed in mix_interrupt_randomness(). Reviewed-by: NDominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
由 Jason A. Donenfeld 提交于
The current code was a mix of "nbytes", "count", "size", "buffer", "in", and so forth. Instead, let's clean this up by naming input parameters "buf" (or "ubuf") and "len", so that you always understand that you're reading this variety of function argument. Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
由 Jason A. Donenfeld 提交于
Since crng_ready() is only false briefly during initialization and then forever after becomes true, we don't need to evaluate it after, making it a prime candidate for a static branch. One complication, however, is that it changes state in a particular call to credit_init_bits(), which might be made from atomic context, which means we must kick off a workqueue to change the static key. Further complicating things, credit_init_bits() may be called sufficiently early on in system initialization such that system_wq is NULL. Fortunately, there exists the nice function execute_in_process_context(), which will immediately execute the function if !in_interrupt(), and otherwise defer it to a workqueue. During early init, before workqueues are available, in_interrupt() is always false, because interrupts haven't even been enabled yet, which means the function in that case executes immediately. Later on, after workqueues are available, in_interrupt() might be true, but in that case, the work is queued in system_wq and all goes well. Cc: Theodore Ts'o <tytso@mit.edu> Cc: Sultan Alsawaf <sultan@kerneltoast.com> Reviewed-by: NDominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
- 18 5月, 2022 10 次提交
-
-
由 Jason A. Donenfeld 提交于
RDRAND and RDSEED can fail sometimes, which is fine. We currently initialize the RNG with 512 bits of RDRAND/RDSEED. We only need 256 bits of those to succeed in order to initialize the RNG. Instead of the current "all or nothing" approach, actually credit these contributions the amount that is actually contributed. Reviewed-by: NDominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
由 Jason A. Donenfeld 提交于
Currently, start_kernel() adds latent entropy and the command line to the entropy bool *after* the RNG has been initialized, deferring when it's actually used by things like stack canaries until the next time the pool is seeded. This surely is not intended. Rather than splitting up which entropy gets added where and when between start_kernel() and random_init(), just do everything in random_init(), which should eliminate these kinds of bugs in the future. While we're at it, rename the awkwardly titled "rand_initialize()" to the more standard "random_init()" nomenclature. Reviewed-by: NDominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
由 Jason A. Donenfeld 提交于
This expands to exactly the same code that it replaces, but makes things consistent by using the same macro for jiffy comparisons throughout. Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
由 Jason A. Donenfeld 提交于
The CONFIG_WARN_ALL_UNSEEDED_RANDOM debug option controls whether the kernel warns about all unseeded randomness or just the first instance. There's some complicated rate limiting and comparison to the previous caller, such that even with CONFIG_WARN_ALL_UNSEEDED_RANDOM enabled, developers still don't see all the messages or even an accurate count of how many were missed. This is the result of basically parallel mechanisms aimed at accomplishing more or less the same thing, added at different points in random.c history, which sort of compete with the first-instance-only limiting we have now. It turns out, however, that nobody cares about the first unseeded randomness instance of in-kernel users. The same first user has been there for ages now, and nobody is doing anything about it. It isn't even clear that anybody _can_ do anything about it. Most places that can do something about it have switched over to using get_random_bytes_wait() or wait_for_random_bytes(), which is the right thing to do, but there is still much code that needs randomness sometimes during init, and as a geeneral rule, if you're not using one of the _wait functions or the readiness notifier callback, you're bound to be doing it wrong just based on that fact alone. So warning about this same first user that can't easily change is simply not an effective mechanism for anything at all. Users can't do anything about it, as the Kconfig text points out -- the problem isn't in userspace code -- and kernel developers don't or more often can't react to it. Instead, show the warning for all instances when CONFIG_WARN_ALL_UNSEEDED_RANDOM is set, so that developers can debug things need be, or if it isn't set, don't show a warning at all. At the same time, CONFIG_WARN_ALL_UNSEEDED_RANDOM now implies setting random.ratelimit_disable=1 on by default, since if you care about one you probably care about the other too. And we can clean up usage around the related urandom_warning ratelimiter as well (whose behavior isn't changing), so that it properly counts missed messages after the 10 message threshold is reached. Cc: Theodore Ts'o <tytso@mit.edu> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
由 Jason A. Donenfeld 提交于
Initialization happens once -- by way of credit_init_bits() -- and then it never happens again. Therefore, it doesn't need to be in crng_reseed(), which is a hot path that is called multiple times. It also doesn't make sense to have there, as initialization activity is better associated with initialization routines. After the prior commit, crng_reseed() now won't be called by multiple concurrent callers, which means that we can safely move the "finialize_init" logic into crng_init_bits() unconditionally. Reviewed-by: NDominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
由 Jason A. Donenfeld 提交于
Since all changes of crng_init now go through credit_init_bits(), we can fix a long standing race in which two concurrent callers of credit_init_bits() have the new bit count >= some threshold, but are doing so with crng_init as a lower threshold, checked outside of a lock, resulting in crng_reseed() or similar being called twice. In order to fix this, we can use the original cmpxchg value of the bit count, and only change crng_init when the bit count transitions from below a threshold to meeting the threshold. Reviewed-by: NDominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
由 Jason A. Donenfeld 提交于
crng_init represents a state machine, with three states, and various rules for transitions. For the longest time, we've been managing these with "0", "1", and "2", and expecting people to figure it out. To make the code more obvious, replace these with proper enum values representing the transition, and then redocument what each of these states mean. Reviewed-by: NDominik Brodowski <linux@dominikbrodowski.net> Cc: Joe Perches <joe@perches.com> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
由 Jason A. Donenfeld 提交于
The SipHash family of permutations is currently used in three places: - siphash.c itself, used in the ordinary way it was intended. - random32.c, in a construction from an anonymous contributor. - random.c, as part of its fast_mix function. Each one of these places reinvents the wheel with the same C code, same rotation constants, and same symmetry-breaking constants. This commit tidies things up a bit by placing macros for the permutations and constants into siphash.h, where each of the three .c users can access them. It also leaves a note dissuading more users of them from emerging. Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
由 Jason A. Donenfeld 提交于
Now that fast_mix() has more than one caller, gcc no longer inlines it. That's fine. But it also doesn't handle the compound literal argument we pass it very efficiently, nor does it handle the loop as well as it could. So just expand the code to spell out this function so that it generates the same code as it did before. Performance-wise, this now behaves as it did before the last commit. The difference in actual code size on x86 is 45 bytes, which is less than a cache line. Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
由 Jason A. Donenfeld 提交于
Years ago, a separate fast pool was added for interrupts, so that the cost associated with taking the input pool spinlocks and mixing into it would be avoided in places where latency is critical. However, one oversight was that add_input_randomness() and add_disk_randomness() still sometimes are called directly from the interrupt handler, rather than being deferred to a thread. This means that some unlucky interrupts will be caught doing a blake2s_compress() call and potentially spinning on input_pool.lock, which can also be taken by unprivileged users by writing into /dev/urandom. In order to fix this, add_timer_randomness() now checks whether it is being called from a hard IRQ and if so, just mixes into the per-cpu IRQ fast pool using fast_mix(), which is much faster and can be done lock-free. A nice consequence of this, as well, is that it means hard IRQ context FPU support is likely no longer useful. The entropy estimation algorithm used by add_timer_randomness() is also somewhat different than the one used for add_interrupt_randomness(). The former looks at deltas of deltas of deltas, while the latter just waits for 64 interrupts for one bit or for one second since the last bit. In order to bridge these, and since add_interrupt_randomness() runs after an add_timer_randomness() that's called from hard IRQ, we add to the fast pool credit the related amount, and then subtract one to account for add_interrupt_randomness()'s contribution. A downside of this, however, is that the num argument is potentially attacker controlled, which puts a bit more pressure on the fast_mix() sponge to do more than it's really intended to do. As a mitigating factor, the first 96 bits of input aren't attacker controlled (a cycle counter followed by zeros), which means it's essentially two rounds of siphash rather than one, which is somewhat better. It's also not that much different from add_interrupt_randomness()'s use of the irq stack instruction pointer register. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Filipe Manana <fdmanana@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
- 16 5月, 2022 1 次提交
-
-
由 Jason A. Donenfeld 提交于
There are no code changes here; this is just a reordering of functions, so that in subsequent commits, the timer entropy functions can call into the interrupt ones. Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
- 15 5月, 2022 1 次提交
-
-
由 Jason A. Donenfeld 提交于
Per the thread linked below, "premature next" is not considered to be a realistic threat model, and leads to more serious security problems. "Premature next" is the scenario in which: - Attacker compromises the current state of a fully initialized RNG via some kind of infoleak. - New bits of entropy are added directly to the key used to generate the /dev/urandom stream, without any buffering or pooling. - Attacker then, somehow having read access to /dev/urandom, samples RNG output and brute forces the individual new bits that were added. - Result: the RNG never "recovers" from the initial compromise, a so-called violation of what academics term "post-compromise security". The usual solutions to this involve some form of delaying when entropy gets mixed into the crng. With Fortuna, this involves multiple input buckets. With what the Linux RNG was trying to do prior, this involves entropy estimation. However, by delaying when entropy gets mixed in, it also means that RNG compromises are extremely dangerous during the window of time before the RNG has gathered enough entropy, during which time nonces may become predictable (or repeated), ephemeral keys may not be secret, and so forth. Moreover, it's unclear how realistic "premature next" is from an attack perspective, if these attacks even make sense in practice. Put together -- and discussed in more detail in the thread below -- these constitute grounds for just doing away with the current code that pretends to handle premature next. I say "pretends" because it wasn't doing an especially great job at it either; should we change our mind about this direction, we would probably implement Fortuna to "fix" the "problem", in which case, removing the pretend solution still makes sense. This also reduces the crng reseed period from 5 minutes down to 1 minute. The rationale from the thread might lead us toward reducing that even further in the future (or even eliminating it), but that remains a topic of a future commit. At a high level, this patch changes semantics from: Before: Seed for the first time after 256 "bits" of estimated entropy have been accumulated since the system booted. Thereafter, reseed once every five minutes, but only if 256 new "bits" have been accumulated since the last reseeding. After: Seed for the first time after 256 "bits" of estimated entropy have been accumulated since the system booted. Thereafter, reseed once every minute. Most of this patch is renaming and removing: POOL_MIN_BITS becomes POOL_INIT_BITS, credit_entropy_bits() becomes credit_init_bits(), crng_reseed() loses its "force" parameter since it's now always true, the drain_entropy() function no longer has any use so it's removed, entropy estimation is skipped if we've already init'd, the various notifiers for "low on entropy" are now only active prior to init, and finally, some documentation comments are cleaned up here and there. Link: https://lore.kernel.org/lkml/YmlMGx6+uigkGiZ0@zx2c4.com/ Cc: Theodore Ts'o <tytso@mit.edu> Cc: Nadia Heninger <nadiah@cs.ucsd.edu> Cc: Tom Ristenpart <ristenpart@cornell.edu> Reviewed-by: NEric Biggers <ebiggers@google.com> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
- 14 5月, 2022 5 次提交
-
-
由 Jason A. Donenfeld 提交于
Before, the first 64 bytes of input, regardless of how entropic it was, would be used to mutate the crng base key directly, and none of those bytes would be credited as having entropy. Then 256 bits of credited input would be accumulated, and only then would the rng transition from the earlier "fast init" phase into being actually initialized. The thinking was that by mixing and matching fast init and real init, an attacker who compromised the fast init state, considered easy to do given how little entropy might be in those first 64 bytes, would then be able to bruteforce bits from the actual initialization. By keeping these separate, bruteforcing became impossible. However, by not crediting potentially creditable bits from those first 64 bytes of input, we delay initialization, and actually make the problem worse, because it means the user is drawing worse random numbers for a longer period of time. Instead, we can take the first 128 bits as fast init, and allow them to be credited, and then hold off on the next 128 bits until they've accumulated. This is still a wide enough margin to prevent bruteforcing the rng state, while still initializing much faster. Then, rather than trying to piecemeal inject into the base crng key at various points, instead just extract from the pool when we need it, for the crng_init==0 phase. Performance may even be better for the various inputs here, since there are likely more calls to mix_pool_bytes() then there are to get_random_bytes() during this phase of system execution. Since the preinit injection code is gone, bootloader randomness can then do something significantly more straight forward, removing the weird system_wq hack in hwgenerator randomness. Cc: Theodore Ts'o <tytso@mit.edu> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
由 Jason A. Donenfeld 提交于
It's too hard to keep the batches synchronized, and pointless anyway, since in !crng_ready(), we're updating the base_crng key really often, where batching only hurts. So instead, if the crng isn't ready, just call into get_random_bytes(). At this stage nothing is performance critical anyhow. Cc: Theodore Ts'o <tytso@mit.edu> Reviewed-by: NDominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
由 Jason A. Donenfeld 提交于
Since the RNG loses freshness with system suspend/hibernation, when we resume, immediately reseed using whatever data we can, which for this particular case is the various timestamps regarding system suspend time, in addition to more generally the RDSEED/RDRAND/RDTSC values that happen whenever the crng reseeds. On systems that suspend and resume automatically all the time -- such as Android -- we skip the reseeding on suspend resumption, since that could wind up being far too busy. This is the same trade-off made in WireGuard. In addition to reseeding upon resumption always mix into the pool these various stamps on every power notification event. Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
由 Jason A. Donenfeld 提交于
Currently, we do the jitter dance if two consecutive reads to the cycle counter return different values. If they do, then we consider the cycle counter to be fast enough that one trip through the scheduler will yield one "bit" of credited entropy. If those two reads return the same value, then we assume the cycle counter is too slow to show meaningful differences. This methodology is flawed for a variety of reasons, one of which Eric posted a patch to fix in [1]. The issue that patch solves is that on a system with a slow counter, you might be [un]lucky and read the counter _just_ before it changes, so that the second cycle counter you read differs from the first, even though there's usually quite a large period of time in between the two. For example: | real time | cycle counter | | --------- | ------------- | | 3 | 5 | | 4 | 5 | | 5 | 5 | | 6 | 5 | | 7 | 5 | <--- a | 8 | 6 | <--- b | 9 | 6 | <--- c If we read the counter at (a) and compare it to (b), we might be fooled into thinking that it's a fast counter, when in reality it is not. The solution in [1] is to also compare counter (b) to counter (c), on the theory that if the counter is _actually_ slow, and (a)!=(b), then certainly (b)==(c). This helps solve this particular issue, in one sense, but in another sense, it mostly functions to disallow jitter entropy on these systems, rather than simply taking more samples in that case. Instead, this patch takes a different approach. Right now we assume that a difference in one set of consecutive samples means one "bit" of credited entropy per scheduler trip. We can extend this so that a difference in two sets of consecutive samples means one "bit" of credited entropy per /two/ scheduler trips, and three for three, and four for four. In other words, we can increase the amount of jitter "work" we require for each "bit", depending on how slow the cycle counter is. So this patch takes whole bunch of samples, sees how many of them are different, and divides to find the amount of work required per "bit", and also requires that at least some minimum of them are different in order to attempt any jitter entropy. Note that this approach is still far from perfect. It's not a real statistical estimate on how much these samples vary; it's not a real-time analysis of the relevant input data. That remains a project for another time. However, it makes the same (partly flawed) assumptions as the code that's there now, so it's probably not worse than the status quo, and it handles the issue Eric mentioned in [1]. But, again, it's probably a far cry from whatever a really robust version of this would be. [1] https://lore.kernel.org/lkml/20220421233152.58522-1-ebiggers@kernel.org/ https://lore.kernel.org/lkml/20220421192939.250680-1-ebiggers@kernel.org/ Cc: Eric Biggers <ebiggers@google.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
由 Jason A. Donenfeld 提交于
All platforms are now guaranteed to provide some value for random_get_entropy(). In case some bug leads to this not being so, we print a warning, because that indicates that something is really very wrong (and likely other things are impacted too). This should never be hit, but it's a good and cheap way of finding out if something ever is problematic. Since we now have viable fallback code for random_get_entropy() on all platforms, which is, in the worst case, not worse than jiffies, we can count on getting the best possible value out of it. That means there's no longer a use for using jiffies as entropy input. It also means we no longer have a reason for doing the round-robin register flow in the IRQ handler, which was always of fairly dubious value. Instead we can greatly simplify the IRQ handler inputs and also unify the construction between 64-bits and 32-bits. We now collect the cycle counter and the return address, since those are the two things that matter. Because the return address and the irq number are likely related, to the extent we mix in the irq number, we can just xor it into the top unchanging bytes of the return address, rather than the bottom changing bytes of the cycle counter as before. Then, we can do a fixed 2 rounds of SipHash/HSipHash. Finally, we use the same construction of hashing only half of the [H]SipHash state on 32-bit and 64-bit. We're not actually discarding any entropy, since that entropy is carried through until the next time. And more importantly, it lets us do the same sponge-like construction everywhere. Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
- 25 4月, 2022 1 次提交
-
-
由 Jason A. Donenfeld 提交于
This reverts 35a33ff3 ("random: use memmove instead of memcpy for remaining 32 bytes"), which was made on a totally bogus basis. The thing it was worried about overlapping came from the stack, not from one of its arguments, as Eric pointed out. But the fact that this confusion even happened draws attention to the fact that it's a bit non-obvious that the random_data parameter can alias chacha_state, and in fact should do so when the caller can't rely on the stack being cleared in a timely manner. So this commit documents that. Reported-by: NEric Biggers <ebiggers@kernel.org> Reviewed-by: NEric Biggers <ebiggers@google.com> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
- 16 4月, 2022 1 次提交
-
-
由 Jason A. Donenfeld 提交于
In order to immediately overwrite the old key on the stack, before servicing a userspace request for bytes, we use the remaining 32 bytes of block 0 as the key. This means moving indices 8,9,a,b,c,d,e,f -> 4,5,6,7,8,9,a,b. Since 4 < 8, for the kernel implementations of memcpy(), this doesn't actually appear to be a problem in practice. But relying on that characteristic seems a bit brittle. So let's change that to a proper memmove(), which is the by-the-books way of handling overlapping memory copies. Reviewed-by: NDominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
- 13 4月, 2022 2 次提交
-
-
由 Jason A. Donenfeld 提交于
Some implementations were returning type `unsigned long`, while others that fell back to get_cycles() were implicitly returning a `cycles_t` or an untyped constant int literal. That makes for weird and confusing code, and basically all code in the kernel already handled it like it was an `unsigned long`. I recently tried to handle it as the largest type it could be, a `cycles_t`, but doing so doesn't really help with much. Instead let's just make random_get_entropy() return an unsigned long all the time. This also matches the commonly used `arch_get_random_long()` function, so now RDRAND and RDTSC return the same sized integer, which means one can fallback to the other more gracefully. Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Theodore Ts'o <tytso@mit.edu> Acked-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
由 Jason A. Donenfeld 提交于
Rather than failing entirely if a copy_to_user() fails at some point, instead we should return a partial read for the amount that succeeded prior, unless none succeeded at all, in which case we return -EFAULT as before. This makes it consistent with other reader interfaces. For example, the following snippet for /dev/zero outputs "4" followed by "1": int fd; void *x = mmap(NULL, 4096, PROT_WRITE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0); assert(x != MAP_FAILED); fd = open("/dev/zero", O_RDONLY); assert(fd >= 0); printf("%zd\n", read(fd, x, 4)); printf("%zd\n", read(fd, x + 4095, 4)); close(fd); This brings that same standard behavior to the various RNG reader interfaces. While we're at it, we can streamline the loop logic a little bit. Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org> Cc: Jann Horn <jannh@google.com> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
- 07 4月, 2022 1 次提交
-
-
由 Jason A. Donenfeld 提交于
In 1448769c ("random: check for signal_pending() outside of need_resched() check"), Jann pointed out that we previously were only checking the TIF_NOTIFY_SIGNAL and TIF_SIGPENDING flags if the process had TIF_NEED_RESCHED set, which meant in practice, super long reads to /dev/[u]random would delay signal handling by a long time. I tried this using the below program, and indeed I wasn't able to interrupt a /dev/urandom read until after several megabytes had been read. The bug he fixed has always been there, and so code that reads from /dev/urandom without checking the return value of read() has mostly worked for a long time, for most sizes, not just for <= 256. Maybe it makes sense to keep that code working. The reason it was so small prior, ignoring the fact that it didn't work anyway, was likely because /dev/random used to block, and that could happen for pretty large lengths of time while entropy was gathered. But now, it's just a chacha20 call, which is extremely fast and is just operating on pure data, without having to wait for some external event. In that sense, /dev/[u]random is a lot more like /dev/zero. Taking a page out of /dev/zero's read_zero() function, it always returns at least one chunk, and then checks for signals after each chunk. Chunk sizes there are of length PAGE_SIZE. Let's just copy the same thing for /dev/[u]random, and check for signals and cond_resched() for every PAGE_SIZE amount of data. This makes the behavior more consistent with expectations, and should mitigate the impact of Jann's fix for the age-old signal check bug. ---- test program ---- #include <unistd.h> #include <signal.h> #include <stdio.h> #include <sys/random.h> static unsigned char x[~0U]; static void handle(int) { } int main(int argc, char *argv[]) { pid_t pid = getpid(), child; signal(SIGUSR1, handle); if (!(child = fork())) { for (;;) kill(pid, SIGUSR1); } pause(); printf("interrupted after reading %zd bytes\n", getrandom(x, sizeof(x), 0)); kill(child, SIGTERM); return 0; } Cc: Jann Horn <jannh@google.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
- 06 4月, 2022 2 次提交
-
-
由 Jann Horn 提交于
signal_pending() checks TIF_NOTIFY_SIGNAL and TIF_SIGPENDING, which signal that the task should bail out of the syscall when possible. This is a separate concept from need_resched(), which checks TIF_NEED_RESCHED, signaling that the task should preempt. In particular, with the current code, the signal_pending() bailout probably won't work reliably. Change this to look like other functions that read lots of data, such as read_zero(). Fixes: 1da177e4 ("Linux-2.6.12-rc2") Signed-off-by: NJann Horn <jannh@google.com> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
由 Jason A. Donenfeld 提交于
The fast key erasure RNG design relies on the key that's used to be used and then discarded. We do this, making judicious use of memzero_explicit(). However, reads to /dev/urandom and calls to getrandom() involve a copy_to_user(), and userspace can use FUSE or userfaultfd, or make a massive call, dynamically remap memory addresses as it goes, and set the process priority to idle, in order to keep a kernel stack alive indefinitely. By probing /proc/sys/kernel/random/entropy_avail to learn when the crng key is refreshed, a malicious userspace could mount this attack every 5 minutes thereafter, breaking the crng's forward secrecy. In order to fix this, we just overwrite the stack's key with the first 32 bytes of the "free" fast key erasure output. If we're returning <= 32 bytes to the user, then we can still return those bytes directly, so that short reads don't become slower. And for long reads, the difference is hopefully lost in the amortization, so it doesn't change much, with that amortization helping variously for medium reads. We don't need to do this for get_random_bytes() and the various kernel-space callers, and later, if we ever switch to always batching, this won't be necessary either, so there's no need to change the API of these functions. Cc: Theodore Ts'o <tytso@mit.edu> Reviewed-by: NJann Horn <jannh@google.com> Fixes: c92e040d ("random: add backtracking protection to the CRNG") Fixes: 186873c5 ("random: use simpler fast key erasure flow on per-cpu keys") Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-