1. 30 September 2022 (3 commits)
    • random: add 8-bit and 16-bit batches · 585cd5fe
      Committed by Jason A. Donenfeld
      There are numerous places in the kernel that would be sped up by having
      smaller batches. Currently those callsites do `get_random_u32() & 0xff`
      or similar. Since these are pretty spread out, and will require patches
      to multiple different trees, let's get ahead of the curve and lay the
      foundation for `get_random_u8()` and `get_random_u16()`, so that it's
      then possible to start submitting conversion patches leisurely.
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
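      As an illustration of the kind of conversion this enables (a
      sketch, not part of this patch), a callsite goes from masking a
      32-bit draw to requesting the small size directly:

          /* Before: fetches 32 random bits and discards 24 of them. */
          u8 tag = get_random_u32() & 0xff;

          /* After: served from a dedicated 8-bit batch, so a single
           * ChaCha block output satisfies many more callers. */
          u8 tag = get_random_u8();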
    • random: use init_utsname() instead of utsname() · dd54fd7d
      Committed by Jason A. Donenfeld
      Rather than going through the current-> indirection for utsname:
      at this point in boot, init_utsname() == utsname(), so just use
      init_utsname() directly. Additionally, init_utsname() appears to
      be available nearly always, so move its use into
      random_init_early().
      Suggested-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
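      For context, the two accessors differ only in the indirection;
      their definitions in include/linux/utsname.h are roughly:

          static inline struct new_utsname *utsname(void)
          {
                  return &current->nsproxy->uts_ns->name;
          }

          static inline struct new_utsname *init_utsname(void)
          {
                  return &init_uts_ns.name;
          }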
    • random: split initialization into early step and later step · f6238499
      Committed by Jason A. Donenfeld
      The full RNG initialization relies on some timestamps, made possible
      with initialization functions like time_init() and timekeeping_init().
      However, these are only available rather late in initialization.
      Meanwhile, other things, such as memory allocator functions, make use of
      the RNG much earlier.
      
      So split RNG initialization into two phases. We can provide arch
      randomness very early on, and then later, after timekeeping and such are
      available, initialize the rest.
      
      This ensures that, for example, slabs are properly randomized if RDRAND
      is available. Without this, CONFIG_SLAB_FREELIST_RANDOM=y loses a degree
      of its security, because its random seed is potentially deterministic,
      since it hasn't yet incorporated RDRAND. It also makes it possible to
      use a better seed in kfence, which currently relies on only the cycle
      counter.
      
      Another positive consequence is that on systems with RDRAND, running
      with CONFIG_WARN_ALL_UNSEEDED_RANDOM=y results in no warnings at all.
      
      One subtle side effect of this change is that on systems with no RDRAND,
      RDTSC is now only queried by random_init() once, committing the moment
      of the function call, instead of multiple times as before. This is
      intentional, as the multiple RDTSCs in a loop before weren't
      accomplishing very much, with jitter being better provided by
      try_to_generate_entropy(). Plus, filling blocks with RDTSC is still
      being done in extract_entropy(), which is necessarily called before
      random bytes are served anyway.
      
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
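      A sketch of the resulting split, using the function names from
      this series:

          /* Phase 1: arch randomness only; safe to call before
           * timekeeping is up. */
          void __init random_init_early(const char *command_line);

          /* Phase 2: after time_init() and timekeeping_init(),
           * fold in timestamps and finish initialization. */
          void __init random_init(void);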
  2. 29 September 2022 (2 commits)
    • random: use expired timer rather than wq for mixing fast pool · 748bc4dd
      Committed by Jason A. Donenfeld
      Previously, the fast pool was dumped into the main pool periodically in
      the fast pool's hard IRQ handler. This worked fine and there weren't
      problems with it, until RT came around. Since RT converts spinlocks into
      sleeping locks, problems cropped up. Rather than switching to raw
      spinlocks, the RT developers preferred we make the transformation from
      originally doing:
      
          do_some_stuff()
          spin_lock()
          do_some_other_stuff()
          spin_unlock()
      
      to doing:
      
          do_some_stuff()
          queue_work_on(some_other_stuff_worker)
      
      This is an ordinary pattern done all over the kernel. However, Sherry
      noticed a 10% performance regression in qperf TCP over a 40gbps
      InfiniBand card. Quoting her message:
      
      > MT27500 Family [ConnectX-3] cards:
      > Infiniband device 'mlx4_0' port 1 status:
      > default gid: fe80:0000:0000:0000:0010:e000:0178:9eb1
      > base lid: 0x6
      > sm lid: 0x1
      > state: 4: ACTIVE
      > phys state: 5: LinkUp
      > rate: 40 Gb/sec (4X QDR)
      > link_layer: InfiniBand
      >
      > Cards are configured with IP addresses on private subnet for IPoIB
      > performance testing.
      > Regression identified in this bug is in TCP latency in this stack as reported
      > by qperf tcp_lat metric:
      >
      > We have one system listen as a qperf server:
      > [root@yourQperfServer ~]# qperf
      >
      > Have the other system connect to qperf server as a client (in this
      > case, it’s X7 server with Mellanox card):
      > [root@yourQperfClient ~]# numactl -m0 -N0 qperf 20.20.20.101 -v -uu -ub --time 60 --wait_server 20 -oo msg_size:4K:1024K:*2 tcp_lat
      
      Rather than incur the scheduling latency from queue_work_on, we can
      instead switch to running on the next timer tick, on the same core. This
      also batches things a bit more -- once per jiffy -- which is okay now
      that mix_interrupt_randomness() can credit multiple bits at once.
      Reported-by: Sherry Yang <sherry.yang@oracle.com>
      Tested-by: Paul Webb <paul.x.webb@oracle.com>
      Cc: Sherry Yang <sherry.yang@oracle.com>
      Cc: Phillip Goerl <phillip.goerl@oracle.com>
      Cc: Jack Vogel <jack.vogel@oracle.com>
      Cc: Nicky Veitch <nicky.veitch@oracle.com>
      Cc: Colm Harrington <colm.harrington@oracle.com>
      Cc: Ramanan Govindarajan <ramanan.govindarajan@oracle.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Sultan Alsawaf <sultan@kerneltoast.com>
      Cc: stable@vger.kernel.org
      Fixes: 58340f8e ("random: defer fast pool mixing to worker")
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
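      The mechanism, roughly (a sketch, not the verbatim patch):
      instead of queueing work, arm a per-CPU timer whose expiry is
      already in the past, so the mix runs on the next tick on the
      same core:

          if (!timer_pending(&fast_pool->mix)) {
                  fast_pool->mix.expires = jiffies;  /* already expired */
                  add_timer_on(&fast_pool->mix, raw_smp_processor_id());
          }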
    • random: avoid reading two cache lines on irq randomness · 9ee0507e
      Committed by Jason A. Donenfeld
      In order to avoid reading and dirtying two cache lines on every
      IRQ, move the work_struct to the bottom of the fast_pool struct.
      add_interrupt_randomness() always touches .pool and .count, which
      are currently split across cache lines because .mix pushes
      everything down. Instead, move .mix to the bottom, so that .pool
      and .count always sit in the first cache line, since .mix is only
      accessed when the pool is full.
      
      Fixes: 58340f8e ("random: defer fast pool mixing to worker")
      Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
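      The resulting layout, roughly, keeps both hot fields inside the
      first cache line:

          struct fast_pool {
                  unsigned long pool[4];   /* touched on every IRQ */
                  unsigned long last;
                  unsigned int count;      /* touched on every IRQ */
                  struct work_struct mix;  /* only touched when full */
          };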
  3. 23 September 2022 (4 commits)
  4. 30 July 2022 (1 commit)
  5. 25 July 2022 (1 commit)
    • random: handle archrandom with multiple longs · d349ab99
      Committed by Jason A. Donenfeld
      The archrandom interface was originally designed for x86, which supplies
      RDRAND/RDSEED for receiving random words into registers, resulting in
      one function to generate an int and another to generate a long. However,
      other architectures don't follow this.
      
      On arm64, the SMCCC TRNG interface can return between one and three
      longs. On s390, the CPACF TRNG interface can return arbitrary amounts,
      with four longs having the same cost as one. On UML, the os_getrandom()
      interface can return arbitrary amounts.
      
      So change the API signature to take a "max_longs" parameter
      designating the maximum number of longs requested, and then
      return the number of longs generated.
      
      Since callers need to check this return value and loop anyway, each arch
      implementation does not bother implementing its own loop to try again to
      fill the maximum number of longs. Additionally, all existing callers
      pass in a constant max_longs parameter. Taken together, these two things
      mean that the codegen doesn't really change much for one-word-at-a-time
      platforms, while performance is greatly improved on platforms such as
      s390.
      Acked-by: Heiko Carstens <hca@linux.ibm.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au>
      Acked-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
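      A sketch of the new interface shape and the caller pattern it
      implies:

          size_t arch_get_random_longs(unsigned long *v, size_t max_longs);

          /* Callers loop until the buffer is full or the source
           * stops producing: */
          unsigned long entropy[8];
          size_t i = 0, n;

          while (i < ARRAY_SIZE(entropy)) {
                  n = arch_get_random_longs(entropy + i,
                                            ARRAY_SIZE(entropy) - i);
                  if (!n)
                          break;
                  i += n;
          }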
  6. 18 July 2022 (1 commit)
    • random: use try_cmpxchg in _credit_init_bits · b7a68f67
      Committed by Uros Bizjak
      Use `!try_cmpxchg(ptr, &orig, new)` instead of `cmpxchg(ptr, orig, new)
      != orig` in _credit_init_bits. This has two benefits:
      
      - The x86 cmpxchg instruction returns success in the ZF flag, so this
        change saves a compare after cmpxchg, as well as a related move
        instruction in front of cmpxchg.
      
      - try_cmpxchg implicitly assigns the *ptr value to &orig when cmpxchg
        fails, enabling further code simplifications.
      
      This patch has no functional change.
      Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
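      Concretely, the loop in _credit_init_bits() goes from roughly the
      first form to the second:

          /* Before: re-read and compare manually each iteration. */
          do {
                  orig = READ_ONCE(input_pool.init_bits);
                  new = min_t(unsigned int, POOL_BITS, orig + bits);
          } while (cmpxchg(&input_pool.init_bits, orig, new) != orig);

          /* After: on failure, try_cmpxchg writes the current value
           * back into orig, so one read outside the loop suffices. */
          orig = READ_ONCE(input_pool.init_bits);
          do {
                  new = min_t(unsigned int, POOL_BITS, orig + bits);
          } while (!try_cmpxchg(&input_pool.init_bits, &orig, new));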
  7. 17 July 2022 (1 commit)
  8. 01 July 2022 (1 commit)
  9. 20 June 2022 (3 commits)
    • random: update comment from copy_to_user() -> copy_to_iter() · 63b8ea5e
      Committed by Jason A. Donenfeld
      This comment wasn't updated when we moved from read() to read_iter(), so
      this patch makes the trivial fix.
      
      Fixes: 1b388e77 ("random: convert to using fops->read_iter()")
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: quiet urandom warning ratelimit suppression message · c01d4d0a
      Committed by Jason A. Donenfeld
      random.c ratelimits how much it warns about uninitialized urandom reads
      using __ratelimit(). When the RNG is finally initialized, it prints the
      number of missed messages due to ratelimiting.
      
      It has been this way since that functionality was introduced back in
      2018. Recently, cc1e127b ("random: remove ratelimiting for in-kernel
      unseeded randomness") put a bit more stress on the urandom ratelimiting,
      which teased out a bug in the implementation.
      
      Specifically, when under pressure, __ratelimit() will print its own
      message and reset the count back to 0, making the final message at the
      end less useful. Secondly, it does so as a pr_warn(), which apparently
      is undesirable for people's CI.
      
      Fortunately, __ratelimit() has the RATELIMIT_MSG_ON_RELEASE flag exactly
      for this purpose, so we set the flag.
      
      Fixes: 4e00b339 ("random: rate limit unseeded randomness warnings")
      Cc: stable@vger.kernel.org
      Reported-by: Jon Hunter <jonathanh@nvidia.com>
      Reported-by: Ron Economos <re@w6rz.net>
      Tested-by: Ron Economos <re@w6rz.net>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
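      A minimal sketch of the fix using the generic ratelimit helpers
      (the exact initializer used by the patch may differ):

          static DEFINE_RATELIMIT_STATE(urandom_warning, HZ, 3);

          /* Report the suppressed-message count only when the state
           * is released, rather than resetting it mid-stream with a
           * pr_warn(). */
          ratelimit_set_flags(&urandom_warning, RATELIMIT_MSG_ON_RELEASE);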
    • random: schedule mix_interrupt_randomness() less often · 534d2eaf
      Committed by Jason A. Donenfeld
      It used to be that mix_interrupt_randomness() would credit 1 bit each
      time it ran, and so add_interrupt_randomness() would schedule mix() to
      run every 64 interrupts, a fairly arbitrary number, but nonetheless
      considered to be a decent enough conservative estimate.
      
      Since e3e33fc2 ("random: do not use input pool from hard IRQs"),
      mix() is now able to credit multiple bits, depending on the number of
      calls to add(). This was done for reasons separate from this commit, but
      it has the nice side effect of enabling this patch to schedule mix()
      less often.
      
      Currently the rules are:
      a) Credit 1 bit for every 64 calls to add().
      b) Schedule mix() once per second during which add() is called.
      c) Schedule mix() once every 64 calls to add().
      
      Rules (a) and (c) no longer need to be coupled. It's still
      important to have _some_ value in (c), so that we don't
      "over-saturate" the fast pool, but the once per second we get
      from rule (b) is already a sufficient baseline. So, by increasing
      the 64 in rule (c) to something larger, we avoid calling
      queue_work_on() as frequently during irq storms.
      
      This commit changes that 64 in rule (c) to be 1024, which means we
      schedule mix() 16 times less often. And it does *not* need to change the
      64 in rule (a).
      
      Fixes: 58340f8e ("random: defer fast pool mixing to worker")
      Cc: stable@vger.kernel.org
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
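      The resulting gate in add_interrupt_randomness() looks roughly
      like this, encoding rules (b) and (c):

          /* Rule (c): 1024 calls, or rule (b): one second elapsed. */
          if (new_count < 1024 &&
              !time_is_before_jiffies(fast_pool->last + HZ))
                  return;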
  10. 10 June 2022 (5 commits)
    • random: remove rng_has_arch_random() · e052a478
      Committed by Jason A. Donenfeld
      With arch randomness being used by every distro and enabled in
      defconfigs, the distinction between rng_has_arch_random() and
      rng_is_initialized() is now rather small. In fact, the places where they
      differ are now places where paranoid users and system builders really
      don't want arch randomness to be used, in which case we should respect
      that choice, or places where arch randomness is known to be broken, in
      which case that choice is all the more important. So this commit just
      removes the function and its one user.
      
      Reviewed-by: Petr Mladek <pmladek@suse.com> # for vsprintf.c
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: do not use jump labels before they are initialized · 60e5b288
      Committed by Jason A. Donenfeld
      Stephen reported that a static key warning splat appears during
      early boot on systems that credit randomness from device trees
      that contain an "rng-seed" property, because setup_machine_fdt()
      is called before jump_label_init() during setup_arch():
      
       static_key_enable_cpuslocked(): static key '0xffffffe51c6fcfc0' used before call to jump_label_init()
       WARNING: CPU: 0 PID: 0 at kernel/jump_label.c:166 static_key_enable_cpuslocked+0xb0/0xb8
       Modules linked in:
       CPU: 0 PID: 0 Comm: swapper Not tainted 5.18.0+ #224 44b43e377bfc84bc99bb5ab885ff694984ee09ff
       pstate: 600001c9 (nZCv dAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
       pc : static_key_enable_cpuslocked+0xb0/0xb8
       lr : static_key_enable_cpuslocked+0xb0/0xb8
       sp : ffffffe51c393cf0
       x29: ffffffe51c393cf0 x28: 000000008185054c x27: 00000000f1042f10
       x26: 0000000000000000 x25: 00000000f10302b2 x24: 0000002513200000
       x23: 0000002513200000 x22: ffffffe51c1c9000 x21: fffffffdfdc00000
       x20: ffffffe51c2f0831 x19: ffffffe51c6fcfc0 x18: 00000000ffff1020
       x17: 00000000e1e2ac90 x16: 00000000000000e0 x15: ffffffe51b710708
       x14: 0000000000000066 x13: 0000000000000018 x12: 0000000000000000
       x11: 0000000000000000 x10: 00000000ffffffff x9 : 0000000000000000
       x8 : 0000000000000000 x7 : 61632065726f6665 x6 : 6220646573752027
       x5 : ffffffe51c641d25 x4 : ffffffe51c13142c x3 : ffff0a00ffffff05
       x2 : 40000000ffffe003 x1 : 00000000000001c0 x0 : 0000000000000065
       Call trace:
        static_key_enable_cpuslocked+0xb0/0xb8
        static_key_enable+0x2c/0x40
        crng_set_ready+0x24/0x30
        execute_in_process_context+0x80/0x90
        _credit_init_bits+0x100/0x154
        add_bootloader_randomness+0x64/0x78
        early_init_dt_scan_chosen+0x140/0x184
        early_init_dt_scan_nodes+0x28/0x4c
        early_init_dt_scan+0x40/0x44
        setup_machine_fdt+0x7c/0x120
        setup_arch+0x74/0x1d8
        start_kernel+0x84/0x44c
        __primary_switched+0xc0/0xc8
       ---[ end trace 0000000000000000 ]---
       random: crng init done
       Machine model: Google Lazor (rev1 - 2) with LTE
      
      A trivial fix went in to address this on arm64, 73e2d827 ("arm64:
      Initialize jump labels before setup_machine_fdt()"). I wrote patches as
      well for arm32 and risc-v, but patches are still needed on xtensa,
      powerpc, arc, and mips. So that's 7 platforms where things aren't quite
      right. This sort of points to larger issues that might need a larger
      solution.
      
      Instead, this commit just defers setting the static branch until later
      in the boot process. random_init() is called after jump_label_init() has
      been called, and so is always a safe place from which to adjust the
      static branch.
      
      Fixes: f5bda35f ("random: use static branch for crng_ready()")
      Reported-by: Stephen Boyd <swboyd@chromium.org>
      Reported-by: Phil Elwell <phil@raspberrypi.com>
      Tested-by: Phil Elwell <phil@raspberrypi.com>
      Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
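      A sketch of the deferral (simplified; the static_key_initialized
      check and the exact call sites here are assumptions based on the
      description above, not the verbatim patch):

          /* In the credit path: only poke the key once jump labels
           * are initialized. */
          if (static_key_initialized)
                  execute_in_process_context(crng_set_ready, &set_ready);

          /* In random_init(), which runs after jump_label_init(),
           * catch up if readiness was reached earlier: */
          if (crng_ready())
                  crng_set_ready(NULL);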
    • random: account for arch randomness in bits · 77fc95f8
      Committed by Jason A. Donenfeld
      Rather than accounting in bytes and multiplying (shifting), we can just
      account in bits and avoid the shift. The main motivation for this is
      that there are other patches in flux that expand this code a bit, and
      avoiding the duplication of "* 8" everywhere makes things a bit clearer.
      
      Cc: stable@vger.kernel.org
      Fixes: 12e45a2a ("random: credit architectural init the exact amount")
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: mark bootloader randomness code as __init · 39e0f991
      Committed by Jason A. Donenfeld
      add_bootloader_randomness() and the variables it touches are only used
      during __init and not after, so mark these as __init. At the same time,
      unexport this, since it's only called by other __init code that's
      built-in.
      
      Cc: stable@vger.kernel.org
      Fixes: 428826f5 ("fdt: add support for rng-seed")
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: avoid checking crng_ready() twice in random_init() · 9b29b6b2
      Committed by Jason A. Donenfeld
      The current flow expands to:
      
          if (crng_ready())
             ...
          else if (...)
              if (!crng_ready())
                  ...
      
      The second crng_ready() call is redundant, but can't so easily be
      optimized out by the compiler.
      
      This commit simplifies that to:
      
          if (crng_ready())
              ...
          else if (...)
              ...
      
      Fixes: 560181c2 ("random: move initialization functions out of hot pages")
      Cc: stable@vger.kernel.org
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
  11. 23 May 2022 (1 commit)
    • random: check for signals after page of pool writes · 1ce6c8d6
      Committed by Jason A. Donenfeld
      get_random_bytes_user() checks for signals after producing a PAGE_SIZE
      worth of output, just like /dev/zero does. write_pool() is doing
      basically the same work (actually, slightly more expensive), and so
      should stop to check for signals in the same way. Let's also name it
      write_pool_user() to match get_random_bytes_user(), so this won't be
      misused in the future.
      
      Before this patch, massive writes to /dev/urandom would tie up the
      process for an extremely long time and make it impossible to terminate. After, it
      can be successfully interrupted. The following test program can be used
      to see this works as intended:
      
        #include <unistd.h>
        #include <fcntl.h>
        #include <signal.h>
        #include <stdio.h>
      
        static unsigned char x[~0U];
      
        static void handle(int sig) { } /* no-op; merely interrupts the blocked write */
      
        int main(int argc, char *argv[])
        {
          pid_t pid = getpid(), child;
          int fd;
          signal(SIGUSR1, handle);
          if (!(child = fork())) {
            for (;;)
              kill(pid, SIGUSR1);
          }
          fd = open("/dev/urandom", O_WRONLY);
          pause();
          printf("interrupted after writing %zd bytes\n", write(fd, x, sizeof(x)));
          close(fd);
          kill(child, SIGTERM);
          return 0;
        }
      
      Result before: "interrupted after writing 2147479552 bytes"
      Result after: "interrupted after writing 4096 bytes"
      
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
  12. 21 May 2022 (3 commits)
    • random: wire up fops->splice_{read,write}_iter() · 79025e72
      Committed by Jens Axboe
      Now that random/urandom is using {read,write}_iter, we can wire it up to
      using the generic splice handlers.
      
      Fixes: 36e2c742 ("fs: don't allow splice read/write without explicit ops")
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      [Jason: added the splice_write path. Note that sendfile() and such
       still do not work for read, though they do for write, because of a
       file type restriction in splice_direct_to_actor(), which I'll
       address separately.]
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
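      The wiring itself is just the generic iter-based splice handlers
      in the file_operations; a sketch, with the read/write handler
      names illustrative and the remaining fields elided:

          const struct file_operations urandom_fops = {
                  .read_iter = urandom_read_iter,
                  .write_iter = random_write_iter,
                  .splice_read = generic_file_splice_read,
                  .splice_write = iter_file_splice_write,
          };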
    • random: convert to using fops->write_iter() · 22b0a222
      Committed by Jens Axboe
      Now that the read side has been converted to fix a regression with
      splice, convert the write side as well to have some symmetry in the
      interface used (and help deprecate ->write()).
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      [Jason: cleaned up random_ioctl a bit, require full writes in
       RNDADDENTROPY since it's crediting entropy, simplify control flow of
       write_pool(), and incorporate suggestions from Al.]
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: convert to using fops->read_iter() · 1b388e77
      Committed by Jens Axboe
      This is a pre-requisite to wiring up splice() again for the random
      and urandom drivers. It also allows us to remove the INT_MAX check in
      getrandom(), because import_single_range() applies capping internally.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      [Jason: rewrote get_random_bytes_user() to simplify and also incorporate
       additional suggestions from Al.]
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
  13. 19 May 2022 (7 commits)
    • random: unify batched entropy implementations · 3092adce
      Committed by Jason A. Donenfeld
      There are currently two separate batched entropy implementations, for
      u32 and u64, with nearly identical code, with the goal of avoiding
      unaligned memory accesses and letting the buffers be used more
      efficiently. Having to maintain these two functions independently is a
      bit of a hassle though, considering that they always need to be kept in
      sync.
      
      This commit factors them out into a type-generic macro, so that the
      expansion produces the same code as before, such that diffing the
      assembly shows no differences. This will also make it easier in the
      future to add u16 and u8 batches.
      
      This was initially tested using an always_inline function and letting
      gcc constant fold the type size in, but the code gen was less efficient,
      and in general it was more verbose and harder to follow. So this patch
      goes with the boring macro solution, similar to what's already done for
      the _wait functions in random.h.
      
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
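      The shape of the macro, heavily simplified (the real version also
      handles per-CPU batches, locking, and a generation counter for
      reseeds):

          #define DEFINE_BATCHED_ENTROPY(type)                           \
          static struct {                                                \
                  type entropy[CHACHA_BLOCK_SIZE * 3 / (2 * sizeof(type))]; \
                  unsigned int position;                                 \
          } batch_ ##type = { .position = UINT_MAX };                    \
                                                                         \
          type get_random_ ##type(void)                                  \
          {                                                              \
                  if (batch_ ##type.position >= ARRAY_SIZE(batch_ ##type.entropy)) { \
                          _get_random_bytes(batch_ ##type.entropy,       \
                                            sizeof(batch_ ##type.entropy)); \
                          batch_ ##type.position = 0;                    \
                  }                                                      \
                  return batch_ ##type.entropy[batch_ ##type.position++]; \
          }

          DEFINE_BATCHED_ENTROPY(u64)
          DEFINE_BATCHED_ENTROPY(u32)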
    • random: move randomize_page() into mm where it belongs · 5ad7dd88
      Committed by Jason A. Donenfeld
      randomize_page is an mm function. It is documented like one. It contains
      the history of one. It has the naming convention of one. It looks
      just like another very similar function in mm, randomize_stack_top().
      And it has always been maintained and updated by mm people. There is no
      need for it to be in random.c. In the "which shape does not look like
      the other ones" test, pointing to randomize_page() is correct.
      
      So move randomize_page() into mm/util.c, right next to the similar
      randomize_stack_top() function.
      
      This commit contains no actual code changes.
      
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: remove mostly unused async readiness notifier · 6701de6c
      Committed by Jason A. Donenfeld
      The register_random_ready_notifier() notifier is somewhat complicated,
      and was already recently rewritten to use notifier blocks. It is only
      used now by one consumer in the kernel, vsprintf.c, for which the async
      mechanism is really overly complex for what it actually needs. This
      commit removes register_random_ready_notifier() and
      unregister_random_ready_notifier(), because it just adds complication
      with little utility, and changes vsprintf.c to just check on
      `!rng_is_initialized() && !rng_has_arch_random()`, which will
      eventually be true. Performance-wise, that code was already using a
      static branch, so there's basically no overhead at all to this change.
      
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
      Acked-by: Petr Mladek <pmladek@suse.com> # for vsprintf.c
      Reviewed-by: Petr Mladek <pmladek@suse.com>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: remove get_random_bytes_arch() and add rng_has_arch_random() · 248561ad
      Committed by Jason A. Donenfeld
      The RNG incorporates RDRAND into its state at boot and every time it
      reseeds, so there's no reason for callers to use it directly. The
      hashing that the RNG does on it is preferable to using the bytes raw.
      
      The only current use case of get_random_bytes_arch() is vsprintf's
      siphash key for pointer hashing, which uses it to initialize the pointer
      secret earlier than usual if RDRAND is available. In order to replace
      this narrow use case, just expose whether RDRAND is mixed into the RNG,
      with a new function called rng_has_arch_random(). With that taken care
      of, there are no users of get_random_bytes_arch() left, so it can be
      removed.
      
      Later, if trust_cpu gets turned on by default (as most distros are
      doing), this one use of rng_has_arch_random() can probably go away as
      well.
      
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
      Acked-by: Petr Mladek <pmladek@suse.com> # for vsprintf.c
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: move initialization functions out of hot pages · 560181c2
      Committed by Jason A. Donenfeld
      Much of random.c is devoted to initializing the rng and accounting for
      when a sufficient amount of entropy has been added. In a perfect world,
      this would all happen during init, and so we could mark these functions
      as __init. But in reality, this isn't the case: sometimes the rng only
      finishes initializing some seconds after system init is finished.
      
      For this reason, at the moment, a whole host of functions that are only
      used relatively close to system init and then never again are intermixed
      with functions that are used in hot code all the time. This creates more
      cache misses than necessary.
      
      In order to pack the hot code closer together, this commit moves the
      initialization functions that can't be marked as __init into
      .text.unlikely by way of the __cold attribute.
      
      Of particular note is moving credit_init_bits() into a macro wrapper
      that inlines the crng_ready() static branch check. This avoids a
      function call to a nop+ret, and most notably prevents extra entropy
      arithmetic from being computed in mix_interrupt_randomness().
      Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
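      The wrapper described above is essentially:

          #define credit_init_bits(bits)                       \
                  do {                                         \
                          if (!crng_ready())                   \
                                  _credit_init_bits(bits);     \
                  } while (0)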
    • random: make consistent use of buf and len · a1940263
      Committed by Jason A. Donenfeld
      The current code was a mix of "nbytes", "count", "size", "buffer", "in",
      and so forth. Instead, let's clean this up by naming input parameters
      "buf" (or "ubuf") and "len", so that you always understand that you're
      reading this variety of function argument.
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: use static branch for crng_ready() · f5bda35f
      Committed by Jason A. Donenfeld
      Since crng_ready() is only false briefly during initialization and then
      forever after becomes true, we don't need to keep evaluating it
      afterwards, making it a prime candidate for a static branch.
      
      One complication, however, is that it changes state in a particular call
      to credit_init_bits(), which might be made from atomic context, which
      means we must kick off a workqueue to change the static key. Further
      complicating things, credit_init_bits() may be called sufficiently early
      on in system initialization such that system_wq is NULL.
      
      Fortunately, there exists the nice function execute_in_process_context(),
      which will immediately execute the function if !in_interrupt(), and
      otherwise defer it to a workqueue. During early init, before workqueues
      are available, in_interrupt() is always false, because interrupts
      haven't even been enabled yet, which means the function in that case
      executes immediately. Later on, after workqueues are available,
      in_interrupt() might be true, but in that case, the work is queued in
      system_wq and all goes well.
      
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Sultan Alsawaf <sultan@kerneltoast.com>
      Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
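      The pieces fit together roughly as follows:

          static DEFINE_STATIC_KEY_FALSE(crng_is_ready);
          #define crng_ready() \
                  (static_branch_likely(&crng_is_ready) || crng_init >= CRNG_READY)

          static void crng_set_ready(struct work_struct *work)
          {
                  static_branch_enable(&crng_is_ready);
          }

          /* From _credit_init_bits(), possibly in atomic context: */
          static struct execute_work set_ready;
          execute_in_process_context(crng_set_ready, &set_ready);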
  14. 18 May 2022 (7 commits)
    • random: credit architectural init the exact amount · 12e45a2a
      Committed by Jason A. Donenfeld
      RDRAND and RDSEED can fail sometimes, which is fine. We currently
      initialize the RNG with 512 bits of RDRAND/RDSEED. We only need 256 bits
      of those to succeed in order to initialize the RNG. Instead of the
      current "all or nothing" approach, actually credit these contributions
      the amount that is actually contributed.
      Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: handle latent entropy and command line from random_init() · 2f14062b
      Committed by Jason A. Donenfeld
      Currently, start_kernel() adds latent entropy and the command line to
      the entropy pool *after* the RNG has been initialized, deferring when
      it's actually used by things like stack canaries until the next time
      the pool is seeded. This surely is not intended.
      
      Rather than splitting up which entropy gets added where and when between
      start_kernel() and random_init(), just do everything in random_init(),
      which should eliminate these kinds of bugs in the future.
      
      While we're at it, rename the awkwardly titled "rand_initialize()" to
      the more standard "random_init()" nomenclature.
      Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: use proper jiffies comparison macro · 8a5b8a4a
      Committed by Jason A. Donenfeld
      This expands to exactly the same code that it replaces, but makes things
      consistent by using the same macro for jiffy comparisons throughout.
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
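      For background (a generic illustration, not the exact hunk): raw
      jiffies comparisons break at wraparound, while the time_after()
      family compares via signed difference:

          unsigned long timeout = jiffies + 5 * HZ;

          /* Wrong near the wraparound point: */
          if (jiffies > timeout) { /* expired? */ }

          /* Correct, wraparound-safe: */
          if (time_after(jiffies, timeout)) { /* expired */ }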
    • random: remove ratelimiting for in-kernel unseeded randomness · cc1e127b
      Committed by Jason A. Donenfeld
      The CONFIG_WARN_ALL_UNSEEDED_RANDOM debug option controls whether the
      kernel warns about all unseeded randomness or just the first instance.
      There's some complicated rate limiting and comparison to the previous
      caller, such that even with CONFIG_WARN_ALL_UNSEEDED_RANDOM enabled,
      developers still don't see all the messages or even an accurate count of
      how many were missed. This is the result of basically parallel
      mechanisms aimed at accomplishing more or less the same thing, added at
      different points in random.c history, which sort of compete with the
      first-instance-only limiting we have now.
      
      It turns out, however, that nobody cares about the first unseeded
      randomness instance of in-kernel users. The same first user has been
      there for ages now, and nobody is doing anything about it. It isn't even
      clear that anybody _can_ do anything about it. Most places that can do
      something about it have switched over to using get_random_bytes_wait()
      or wait_for_random_bytes(), which is the right thing to do, but there is
      still much code that needs randomness sometimes during init, and as a
      general rule, if you're not using one of the _wait functions or the
      readiness notifier callback, you're bound to be doing it wrong just
      based on that fact alone.
      
      So warning about this same first user that can't easily change is simply
      not an effective mechanism for anything at all. Users can't do anything
      about it, as the Kconfig text points out -- the problem isn't in
      userspace code -- and kernel developers don't or more often can't react
      to it.
      
      Instead, show the warning for all instances when CONFIG_WARN_ALL_UNSEEDED_RANDOM
      is set, so that developers can debug things as need be, or if it isn't
      set, don't show a warning at all.
      
      At the same time, CONFIG_WARN_ALL_UNSEEDED_RANDOM now implies setting
      random.ratelimit_disable=1 by default, since if you care about one
      you probably care about the other too. And we can clean up usage around
      the related urandom_warning ratelimiter as well (whose behavior isn't
      changing), so that it properly counts missed messages after the 10
      message threshold is reached.
      
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: move initialization out of reseeding hot path · 68c9c8b1
      Committed by Jason A. Donenfeld
      Initialization happens once -- by way of credit_init_bits() -- and then
      it never happens again. Therefore, it doesn't need to be in
      crng_reseed(), which is a hot path that is called multiple times. It
      also doesn't make sense to have there, as initialization activity is
      better associated with initialization routines.
      
      After the prior commit, crng_reseed() now won't be called by multiple
      concurrent callers, which means that we can safely move the
      "finalize_init" logic into credit_init_bits() unconditionally.
      Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: avoid initializing twice in credit race · fed7ef06
      Committed by Jason A. Donenfeld
      Since all changes of crng_init now go through credit_init_bits(), we can
      fix a long standing race in which two concurrent callers of
      credit_init_bits() have the new bit count >= some threshold, but are
      doing so with crng_init as a lower threshold, checked outside of a lock,
      resulting in crng_reseed() or similar being called twice.
      
      In order to fix this, we can use the original cmpxchg value of the bit
      count, and only change crng_init when the bit count transitions from
      below a threshold to meeting the threshold.
      Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
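      In sketch form: only the caller whose cmpxchg observed the
      below-threshold original value performs the transition work:

          if (orig < POOL_READY_BITS && new >= POOL_READY_BITS) {
                  /* Exactly one concurrent caller sees this
                   * transition, so crng_reseed() runs once. */
                  crng_reseed();
          }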
    • random: use symbolic constants for crng_init states · e3d2c5e7
      Committed by Jason A. Donenfeld
      crng_init represents a state machine, with three states, and various
      rules for transitions. For the longest time, we've been managing these
      with "0", "1", and "2", and expecting people to figure it out. To make
      the code more obvious, replace these with proper enum values
      representing each state, and then redocument what each of these
      states means.
      Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
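      The replacement states look roughly like:

          enum {
                  CRNG_EMPTY = 0, /* little to no entropy collected */
                  CRNG_EARLY = 1, /* POOL_EARLY_BITS collected */
                  CRNG_READY = 2  /* fully initialized */
          } crng_init = CRNG_EMPTY;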