1. 30 September 2022 (3 commits)
    • random: add 8-bit and 16-bit batches · 585cd5fe
      Committed by Jason A. Donenfeld
      There are numerous places in the kernel that would be sped up by having
      smaller batches. Currently those callsites do `get_random_u32() & 0xff`
      or similar. Since these are pretty spread out, and will require patches
      to multiple different trees, let's get ahead of the curve and lay the
      foundation for `get_random_u8()` and `get_random_u16()`, so that it's
      then possible to start submitting conversion patches leisurely.
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
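      As an illustration of the kind of conversion this enables (a
      sketch, not part of this patch), a callsite goes from masking a
      32-bit draw to requesting the small size directly:

          /* Before: fetches 32 random bits and discards 24 of them. */
          u8 tag = get_random_u32() & 0xff;

          /* After: served from a dedicated 8-bit batch, so a single
           * ChaCha block output satisfies many more callers. */
          u8 tag = get_random_u8();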
    • random: use init_utsname() instead of utsname() · dd54fd7d
      Committed by Jason A. Donenfeld
      Rather than going through the current-> indirection for utsname:
      at this point in boot, init_utsname() == utsname(), so just use
      init_utsname() directly. Additionally, init_utsname() appears to
      be available nearly always, so move its use into
      random_init_early().
      Suggested-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
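      For context, the two accessors differ only in the indirection;
      their definitions in include/linux/utsname.h are roughly:

          static inline struct new_utsname *utsname(void)
          {
                  return &current->nsproxy->uts_ns->name;
          }

          static inline struct new_utsname *init_utsname(void)
          {
                  return &init_uts_ns.name;
          }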
    • random: split initialization into early step and later step · f6238499
      Committed by Jason A. Donenfeld
      The full RNG initialization relies on some timestamps, made possible
      with initialization functions like time_init() and timekeeping_init().
      However, these are only available rather late in initialization.
      Meanwhile, other things, such as memory allocator functions, make use of
      the RNG much earlier.
      
      So split RNG initialization into two phases. We can provide arch
      randomness very early on, and then later, after timekeeping and such are
      available, initialize the rest.
      
      This ensures that, for example, slabs are properly randomized if RDRAND
      is available. Without this, CONFIG_SLAB_FREELIST_RANDOM=y loses a degree
      of its security, because its random seed is potentially deterministic,
      since it hasn't yet incorporated RDRAND. It also makes it possible to
      use a better seed in kfence, which currently relies on only the cycle
      counter.
      
      Another positive consequence is that on systems with RDRAND, running
      with CONFIG_WARN_ALL_UNSEEDED_RANDOM=y results in no warnings at all.
      
      One subtle side effect of this change is that on systems with no RDRAND,
      RDTSC is now only queried by random_init() once, committing the moment
      of the function call, instead of multiple times as before. This is
      intentional, as the multiple RDTSCs in a loop before weren't
      accomplishing very much, with jitter being better provided by
      try_to_generate_entropy(). Plus, filling blocks with RDTSC is still
      being done in extract_entropy(), which is necessarily called before
      random bytes are served anyway.
      
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
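      A sketch of the resulting split, using the function names from
      this series:

          /* Phase 1: arch randomness only; safe to call before
           * timekeeping is up. */
          void __init random_init_early(const char *command_line);

          /* Phase 2: after time_init() and timekeeping_init(),
           * fold in timestamps and finish initialization. */
          void __init random_init(void);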
  2. 29 September 2022 (2 commits)
    • random: use expired timer rather than wq for mixing fast pool · 748bc4dd
      Committed by Jason A. Donenfeld
      Previously, the fast pool was dumped into the main pool periodically in
      the fast pool's hard IRQ handler. This worked fine and there weren't
      problems with it, until RT came around. Since RT converts spinlocks into
      sleeping locks, problems cropped up. Rather than switching to raw
      spinlocks, the RT developers preferred we make the transformation from
      originally doing:
      
          do_some_stuff()
          spin_lock()
          do_some_other_stuff()
          spin_unlock()
      
      to doing:
      
          do_some_stuff()
          queue_work_on(some_other_stuff_worker)
      
      This is an ordinary pattern done all over the kernel. However, Sherry
      noticed a 10% performance regression in qperf TCP over a 40gbps
      InfiniBand card. Quoting her message:
      
      > MT27500 Family [ConnectX-3] cards:
      > Infiniband device 'mlx4_0' port 1 status:
      > default gid: fe80:0000:0000:0000:0010:e000:0178:9eb1
      > base lid: 0x6
      > sm lid: 0x1
      > state: 4: ACTIVE
      > phys state: 5: LinkUp
      > rate: 40 Gb/sec (4X QDR)
      > link_layer: InfiniBand
      >
      > Cards are configured with IP addresses on private subnet for IPoIB
      > performance testing.
      > Regression identified in this bug is in TCP latency in this stack as reported
      > by qperf tcp_lat metric:
      >
      > We have one system listen as a qperf server:
      > [root@yourQperfServer ~]# qperf
      >
      > Have the other system connect to qperf server as a client (in this
      > case, it’s X7 server with Mellanox card):
      > [root@yourQperfClient ~]# numactl -m0 -N0 qperf 20.20.20.101 -v -uu -ub --time 60 --wait_server 20 -oo msg_size:4K:1024K:*2 tcp_lat
      
      Rather than incur the scheduling latency from queue_work_on, we can
      instead switch to running on the next timer tick, on the same core. This
      also batches things a bit more -- once per jiffy -- which is okay now
      that mix_interrupt_randomness() can credit multiple bits at once.
      Reported-by: Sherry Yang <sherry.yang@oracle.com>
      Tested-by: Paul Webb <paul.x.webb@oracle.com>
      Cc: Sherry Yang <sherry.yang@oracle.com>
      Cc: Phillip Goerl <phillip.goerl@oracle.com>
      Cc: Jack Vogel <jack.vogel@oracle.com>
      Cc: Nicky Veitch <nicky.veitch@oracle.com>
      Cc: Colm Harrington <colm.harrington@oracle.com>
      Cc: Ramanan Govindarajan <ramanan.govindarajan@oracle.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Sultan Alsawaf <sultan@kerneltoast.com>
      Cc: stable@vger.kernel.org
      Fixes: 58340f8e ("random: defer fast pool mixing to worker")
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
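      The mechanism, roughly (a sketch, not the verbatim patch):
      instead of queueing work, arm a per-CPU timer whose expiry is
      already in the past, so the mix runs on the next tick on the
      same core:

          if (!timer_pending(&fast_pool->mix)) {
                  fast_pool->mix.expires = jiffies;  /* already expired */
                  add_timer_on(&fast_pool->mix, raw_smp_processor_id());
          }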
    • random: avoid reading two cache lines on irq randomness · 9ee0507e
      Committed by Jason A. Donenfeld
      In order to avoid reading and dirtying two cache lines on every
      IRQ, move the work_struct to the bottom of the fast_pool struct.
      add_interrupt_randomness() always touches .pool and .count, which
      are currently split across cache lines because .mix pushes
      everything down. Instead, move .mix to the bottom, so that .pool
      and .count always sit in the first cache line, since .mix is only
      accessed when the pool is full.
      
      Fixes: 58340f8e ("random: defer fast pool mixing to worker")
      Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
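      The resulting layout, roughly, keeps both hot fields inside the
      first cache line:

          struct fast_pool {
                  unsigned long pool[4];   /* touched on every IRQ */
                  unsigned long last;
                  unsigned int count;      /* touched on every IRQ */
                  struct work_struct mix;  /* only touched when full */
          };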
  3. 23 September 2022 (4 commits)
  4. 30 July 2022 (1 commit)
  5. 25 July 2022 (1 commit)
    • random: handle archrandom with multiple longs · d349ab99
      Committed by Jason A. Donenfeld
      The archrandom interface was originally designed for x86, which supplies
      RDRAND/RDSEED for receiving random words into registers, resulting in
      one function to generate an int and another to generate a long. However,
      other architectures don't follow this.
      
      On arm64, the SMCCC TRNG interface can return between one and three
      longs. On s390, the CPACF TRNG interface can return arbitrary amounts,
      with four longs having the same cost as one. On UML, the os_getrandom()
      interface can return arbitrary amounts.
      
      So change the API signature to take a "max_longs" parameter
      designating the maximum number of longs requested, and then
      return the number of longs generated.
      
      Since callers need to check this return value and loop anyway, each arch
      implementation does not bother implementing its own loop to try again to
      fill the maximum number of longs. Additionally, all existing callers
      pass in a constant max_longs parameter. Taken together, these two things
      mean that the codegen doesn't really change much for one-word-at-a-time
      platforms, while performance is greatly improved on platforms such as
      s390.
      Acked-by: Heiko Carstens <hca@linux.ibm.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au>
      Acked-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
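      A sketch of the new interface shape and the caller pattern it
      implies:

          size_t arch_get_random_longs(unsigned long *v, size_t max_longs);

          /* Callers loop until the buffer is full or the source
           * stops producing: */
          unsigned long entropy[8];
          size_t i = 0, n;

          while (i < ARRAY_SIZE(entropy)) {
                  n = arch_get_random_longs(entropy + i,
                                            ARRAY_SIZE(entropy) - i);
                  if (!n)
                          break;
                  i += n;
          }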
  6. 18 July 2022 (1 commit)
    • random: use try_cmpxchg in _credit_init_bits · b7a68f67
      Committed by Uros Bizjak
      Use `!try_cmpxchg(ptr, &orig, new)` instead of `cmpxchg(ptr, orig, new)
      != orig` in _credit_init_bits. This has two benefits:
      
      - The x86 cmpxchg instruction returns success in the ZF flag, so this
        change saves a compare after cmpxchg, as well as a related move
        instruction in front of cmpxchg.
      
      - try_cmpxchg implicitly assigns the *ptr value to &orig when cmpxchg
        fails, enabling further code simplifications.
      
      This patch has no functional change.
      Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
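      Concretely, the loop in _credit_init_bits() goes from roughly the
      first form to the second:

          /* Before: re-read and compare manually each iteration. */
          do {
                  orig = READ_ONCE(input_pool.init_bits);
                  new = min_t(unsigned int, POOL_BITS, orig + bits);
          } while (cmpxchg(&input_pool.init_bits, orig, new) != orig);

          /* After: on failure, try_cmpxchg writes the current value
           * back into orig, so one read outside the loop suffices. */
          orig = READ_ONCE(input_pool.init_bits);
          do {
                  new = min_t(unsigned int, POOL_BITS, orig + bits);
          } while (!try_cmpxchg(&input_pool.init_bits, &orig, new));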
  7. 17 July 2022 (1 commit)
  8. 01 July 2022 (1 commit)
  9. 20 June 2022 (3 commits)
    • random: update comment from copy_to_user() -> copy_to_iter() · 63b8ea5e
      Committed by Jason A. Donenfeld
      This comment wasn't updated when we moved from read() to read_iter(), so
      this patch makes the trivial fix.
      
      Fixes: 1b388e77 ("random: convert to using fops->read_iter()")
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: quiet urandom warning ratelimit suppression message · c01d4d0a
      Committed by Jason A. Donenfeld
      random.c ratelimits how much it warns about uninitialized urandom reads
      using __ratelimit(). When the RNG is finally initialized, it prints the
      number of missed messages due to ratelimiting.
      
      It has been this way since that functionality was introduced back in
      2018. Recently, cc1e127b ("random: remove ratelimiting for in-kernel
      unseeded randomness") put a bit more stress on the urandom ratelimiting,
      which teased out a bug in the implementation.
      
      Specifically, when under pressure, __ratelimit() will print its own
      message and reset the count back to 0, making the final message at the
      end less useful. Secondly, it does so as a pr_warn(), which apparently
      is undesirable for people's CI.
      
      Fortunately, __ratelimit() has the RATELIMIT_MSG_ON_RELEASE flag exactly
      for this purpose, so we set the flag.
      
      Fixes: 4e00b339 ("random: rate limit unseeded randomness warnings")
      Cc: stable@vger.kernel.org
      Reported-by: Jon Hunter <jonathanh@nvidia.com>
      Reported-by: Ron Economos <re@w6rz.net>
      Tested-by: Ron Economos <re@w6rz.net>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
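      A minimal sketch of the fix using the generic ratelimit helpers
      (the exact initializer used by the patch may differ):

          static DEFINE_RATELIMIT_STATE(urandom_warning, HZ, 3);

          /* Report the suppressed-message count only when the state
           * is released, rather than resetting it mid-stream with a
           * pr_warn(). */
          ratelimit_set_flags(&urandom_warning, RATELIMIT_MSG_ON_RELEASE);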
    • random: schedule mix_interrupt_randomness() less often · 534d2eaf
      Committed by Jason A. Donenfeld
      It used to be that mix_interrupt_randomness() would credit 1 bit each
      time it ran, and so add_interrupt_randomness() would schedule mix() to
      run every 64 interrupts, a fairly arbitrary number, but nonetheless
      considered to be a decent enough conservative estimate.
      
      Since e3e33fc2 ("random: do not use input pool from hard IRQs"),
      mix() is now able to credit multiple bits, depending on the number of
      calls to add(). This was done for reasons separate from this commit, but
      it has the nice side effect of enabling this patch to schedule mix()
      less often.
      
      Currently the rules are:
      a) Credit 1 bit for every 64 calls to add().
      b) Schedule mix() once per second during which add() is called.
      c) Schedule mix() once every 64 calls to add().
      
      Rules (a) and (c) no longer need to be coupled. It's still
      important to have _some_ value in (c), so that we don't
      "over-saturate" the fast pool, but the once per second we get
      from rule (b) is already a sufficient baseline. So, by increasing
      the 64 in rule (c) to something larger, we avoid calling
      queue_work_on() as frequently during irq storms.
      
      This commit changes that 64 in rule (c) to be 1024, which means we
      schedule mix() 16 times less often. And it does *not* need to change the
      64 in rule (a).
      
      Fixes: 58340f8e ("random: defer fast pool mixing to worker")
      Cc: stable@vger.kernel.org
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
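      The resulting gate in add_interrupt_randomness() looks roughly
      like this, encoding rules (b) and (c):

          /* Rule (c): 1024 calls, or rule (b): one second elapsed. */
          if (new_count < 1024 &&
              !time_is_before_jiffies(fast_pool->last + HZ))
                  return;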
  10. 10 June 2022 (5 commits)
    • random: remove rng_has_arch_random() · e052a478
      Committed by Jason A. Donenfeld
      With arch randomness being used by every distro and enabled in
      defconfigs, the distinction between rng_has_arch_random() and
      rng_is_initialized() is now rather small. In fact, the places where they
      differ are now places where paranoid users and system builders really
      don't want arch randomness to be used, in which case we should respect
      that choice, or places where arch randomness is known to be broken, in
      which case that choice is all the more important. So this commit just
      removes the function and its one user.
      
      Reviewed-by: Petr Mladek <pmladek@suse.com> # for vsprintf.c
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: do not use jump labels before they are initialized · 60e5b288
      Committed by Jason A. Donenfeld
      Stephen reported that a static key warning splat appears during
      early boot on systems that credit randomness from device trees
      that contain an "rng-seed" property, because setup_machine_fdt()
      is called before jump_label_init() during setup_arch():
      
       static_key_enable_cpuslocked(): static key '0xffffffe51c6fcfc0' used before call to jump_label_init()
       WARNING: CPU: 0 PID: 0 at kernel/jump_label.c:166 static_key_enable_cpuslocked+0xb0/0xb8
       Modules linked in:
       CPU: 0 PID: 0 Comm: swapper Not tainted 5.18.0+ #224 44b43e377bfc84bc99bb5ab885ff694984ee09ff
       pstate: 600001c9 (nZCv dAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
       pc : static_key_enable_cpuslocked+0xb0/0xb8
       lr : static_key_enable_cpuslocked+0xb0/0xb8
       sp : ffffffe51c393cf0
       x29: ffffffe51c393cf0 x28: 000000008185054c x27: 00000000f1042f10
       x26: 0000000000000000 x25: 00000000f10302b2 x24: 0000002513200000
       x23: 0000002513200000 x22: ffffffe51c1c9000 x21: fffffffdfdc00000
       x20: ffffffe51c2f0831 x19: ffffffe51c6fcfc0 x18: 00000000ffff1020
       x17: 00000000e1e2ac90 x16: 00000000000000e0 x15: ffffffe51b710708
       x14: 0000000000000066 x13: 0000000000000018 x12: 0000000000000000
       x11: 0000000000000000 x10: 00000000ffffffff x9 : 0000000000000000
       x8 : 0000000000000000 x7 : 61632065726f6665 x6 : 6220646573752027
       x5 : ffffffe51c641d25 x4 : ffffffe51c13142c x3 : ffff0a00ffffff05
       x2 : 40000000ffffe003 x1 : 00000000000001c0 x0 : 0000000000000065
       Call trace:
        static_key_enable_cpuslocked+0xb0/0xb8
        static_key_enable+0x2c/0x40
        crng_set_ready+0x24/0x30
        execute_in_process_context+0x80/0x90
        _credit_init_bits+0x100/0x154
        add_bootloader_randomness+0x64/0x78
        early_init_dt_scan_chosen+0x140/0x184
        early_init_dt_scan_nodes+0x28/0x4c
        early_init_dt_scan+0x40/0x44
        setup_machine_fdt+0x7c/0x120
        setup_arch+0x74/0x1d8
        start_kernel+0x84/0x44c
        __primary_switched+0xc0/0xc8
       ---[ end trace 0000000000000000 ]---
       random: crng init done
       Machine model: Google Lazor (rev1 - 2) with LTE
      
      A trivial fix went in to address this on arm64, 73e2d827 ("arm64:
      Initialize jump labels before setup_machine_fdt()"). I wrote patches as
      well for arm32 and risc-v, but patches are still needed on xtensa,
      powerpc, arc, and mips. So that's 7 platforms where things aren't quite
      right. This sort of points to larger issues that might need a larger
      solution.
      
      Instead, this commit just defers setting the static branch until later
      in the boot process. random_init() is called after jump_label_init() has
      been called, and so is always a safe place from which to adjust the
      static branch.
      
      Fixes: f5bda35f ("random: use static branch for crng_ready()")
      Reported-by: Stephen Boyd <swboyd@chromium.org>
      Reported-by: Phil Elwell <phil@raspberrypi.com>
      Tested-by: Phil Elwell <phil@raspberrypi.com>
      Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
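      A sketch of the deferral (simplified; the static_key_initialized
      check and the exact call sites here are assumptions based on the
      description above, not the verbatim patch):

          /* In the credit path: only poke the key once jump labels
           * are initialized. */
          if (static_key_initialized)
                  execute_in_process_context(crng_set_ready, &set_ready);

          /* In random_init(), which runs after jump_label_init(),
           * catch up if readiness was reached earlier: */
          if (crng_ready())
                  crng_set_ready(NULL);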
    • random: account for arch randomness in bits · 77fc95f8
      Committed by Jason A. Donenfeld
      Rather than accounting in bytes and multiplying (shifting), we can just
      account in bits and avoid the shift. The main motivation for this is
      that there are other patches in flux that expand this code a bit, and
      avoiding the duplication of "* 8" everywhere makes things a bit clearer.
      
      Cc: stable@vger.kernel.org
      Fixes: 12e45a2a ("random: credit architectural init the exact amount")
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: mark bootloader randomness code as __init · 39e0f991
      Committed by Jason A. Donenfeld
      add_bootloader_randomness() and the variables it touches are only used
      during __init and not after, so mark these as __init. At the same time,
      unexport this, since it's only called by other __init code that's
      built-in.
      
      Cc: stable@vger.kernel.org
      Fixes: 428826f5 ("fdt: add support for rng-seed")
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: avoid checking crng_ready() twice in random_init() · 9b29b6b2
      Committed by Jason A. Donenfeld
      The current flow expands to:
      
          if (crng_ready())
             ...
          else if (...)
              if (!crng_ready())
                  ...
      
      The second crng_ready() call is redundant, but can't so easily be
      optimized out by the compiler.
      
      This commit simplifies that to:
      
          if (crng_ready())
              ...
          else if (...)
              ...
      
      Fixes: 560181c2 ("random: move initialization functions out of hot pages")
      Cc: stable@vger.kernel.org
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
  11. 23 May 2022 (1 commit)
    • random: check for signals after page of pool writes · 1ce6c8d6
      Committed by Jason A. Donenfeld
      get_random_bytes_user() checks for signals after producing a PAGE_SIZE
      worth of output, just like /dev/zero does. write_pool() is doing
      basically the same work (actually, slightly more expensive), and so
      should stop to check for signals in the same way. Let's also name it
      write_pool_user() to match get_random_bytes_user(), so this won't be
      misused in the future.
      
      Before this patch, massive writes to /dev/urandom would tie up the
      process for an extremely long time and make it impossible to terminate. After, it
      can be successfully interrupted. The following test program can be used
      to see this works as intended:
      
        #include <unistd.h>
        #include <fcntl.h>
        #include <signal.h>
        #include <stdio.h>
      
        static unsigned char x[~0U];
      
        static void handle(int sig) { } /* no-op; merely interrupts the blocked write */
      
        int main(int argc, char *argv[])
        {
          pid_t pid = getpid(), child;
          int fd;
          signal(SIGUSR1, handle);
          if (!(child = fork())) {
            for (;;)
              kill(pid, SIGUSR1);
          }
          fd = open("/dev/urandom", O_WRONLY);
          pause();
          printf("interrupted after writing %zd bytes\n", write(fd, x, sizeof(x)));
          close(fd);
          kill(child, SIGTERM);
          return 0;
        }
      
      Result before: "interrupted after writing 2147479552 bytes"
      Result after: "interrupted after writing 4096 bytes"
      
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
  12. 21 May 2022 (3 commits)
    • random: wire up fops->splice_{read,write}_iter() · 79025e72
      Committed by Jens Axboe
      Now that random/urandom is using {read,write}_iter, we can wire it up to
      using the generic splice handlers.
      
      Fixes: 36e2c742 ("fs: don't allow splice read/write without explicit ops")
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      [Jason: added the splice_write path. Note that sendfile() and such
       still do not work for read, though they do for write, because of a
       file type restriction in splice_direct_to_actor(), which I'll
       address separately.]
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
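      The wiring itself is just the generic iter-based splice handlers
      in the file_operations; a sketch, with the read/write handler
      names illustrative and the remaining fields elided:

          const struct file_operations urandom_fops = {
                  .read_iter = urandom_read_iter,
                  .write_iter = random_write_iter,
                  .splice_read = generic_file_splice_read,
                  .splice_write = iter_file_splice_write,
          };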
    • random: convert to using fops->write_iter() · 22b0a222
      Committed by Jens Axboe
      Now that the read side has been converted to fix a regression with
      splice, convert the write side as well to have some symmetry in the
      interface used (and help deprecate ->write()).
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      [Jason: cleaned up random_ioctl a bit, require full writes in
       RNDADDENTROPY since it's crediting entropy, simplify control flow of
       write_pool(), and incorporate suggestions from Al.]
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: convert to using fops->read_iter() · 1b388e77
      Committed by Jens Axboe
      This is a pre-requisite to wiring up splice() again for the random
      and urandom drivers. It also allows us to remove the INT_MAX check in
      getrandom(), because import_single_range() applies capping internally.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      [Jason: rewrote get_random_bytes_user() to simplify and also incorporate
       additional suggestions from Al.]
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
  13. 19 May 2022 (7 commits)
    • random: unify batched entropy implementations · 3092adce
      Committed by Jason A. Donenfeld
      There are currently two separate batched entropy implementations, for
      u32 and u64, with nearly identical code, with the goal of avoiding
      unaligned memory accesses and letting the buffers be used more
      efficiently. Having to maintain these two functions independently is a
      bit of a hassle though, considering that they always need to be kept in
      sync.
      
      This commit factors them out into a type-generic macro, so that the
      expansion produces the same code as before, such that diffing the
      assembly shows no differences. This will also make it easier in the
      future to add u16 and u8 batches.
      
      This was initially tested using an always_inline function and letting
      gcc constant fold the type size in, but the code gen was less efficient,
      and in general it was more verbose and harder to follow. So this patch
      goes with the boring macro solution, similar to what's already done for
      the _wait functions in random.h.
      
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
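      The shape of the macro, heavily simplified (the real version also
      handles per-CPU batches, locking, and a generation counter for
      reseeds):

          #define DEFINE_BATCHED_ENTROPY(type)                           \
          static struct {                                                \
                  type entropy[CHACHA_BLOCK_SIZE * 3 / (2 * sizeof(type))]; \
                  unsigned int position;                                 \
          } batch_ ##type = { .position = UINT_MAX };                    \
                                                                         \
          type get_random_ ##type(void)                                  \
          {                                                              \
                  if (batch_ ##type.position >= ARRAY_SIZE(batch_ ##type.entropy)) { \
                          _get_random_bytes(batch_ ##type.entropy,       \
                                            sizeof(batch_ ##type.entropy)); \
                          batch_ ##type.position = 0;                    \
                  }                                                      \
                  return batch_ ##type.entropy[batch_ ##type.position++]; \
          }

          DEFINE_BATCHED_ENTROPY(u64)
          DEFINE_BATCHED_ENTROPY(u32)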
    • random: move randomize_page() into mm where it belongs · 5ad7dd88
      Committed by Jason A. Donenfeld
      randomize_page is an mm function. It is documented like one. It contains
      the history of one. It has the naming convention of one. It looks
      just like another very similar function in mm, randomize_stack_top().
      And it has always been maintained and updated by mm people. There is no
      need for it to be in random.c. In the "which shape does not look like
      the other ones" test, pointing to randomize_page() is correct.
      
      So move randomize_page() into mm/util.c, right next to the similar
      randomize_stack_top() function.
      
      This commit contains no actual code changes.
      
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: remove mostly unused async readiness notifier · 6701de6c
      Committed by Jason A. Donenfeld
      The register_random_ready_notifier() notifier is somewhat complicated,
      and was already recently rewritten to use notifier blocks. It is only
      used now by one consumer in the kernel, vsprintf.c, for which the async
      mechanism is really overly complex for what it actually needs. This
      commit removes register_random_ready_notifier() and
      unregister_random_ready_notifier(), because it just adds complication
      with little utility, and changes vsprintf.c to just check on
      `!rng_is_initialized() && !rng_has_arch_random()`, which will
      eventually be true. Performance-wise, that code was already using a
      static branch, so there's basically no overhead at all to this change.
      
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
      Acked-by: Petr Mladek <pmladek@suse.com> # for vsprintf.c
      Reviewed-by: Petr Mladek <pmladek@suse.com>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: remove get_random_bytes_arch() and add rng_has_arch_random() · 248561ad
      Committed by Jason A. Donenfeld
      The RNG incorporates RDRAND into its state at boot and every time it
      reseeds, so there's no reason for callers to use it directly. The
      hashing that the RNG does on it is preferable to using the bytes raw.
      
      The only current use case of get_random_bytes_arch() is vsprintf's
      siphash key for pointer hashing, which uses it to initialize the pointer
      secret earlier than usual if RDRAND is available. In order to replace
      this narrow use case, just expose whether RDRAND is mixed into the RNG,
      with a new function called rng_has_arch_random(). With that taken care
      of, there are no users of get_random_bytes_arch() left, so it can be
      removed.
      
      Later, if trust_cpu gets turned on by default (as most distros are
      doing), this one use of rng_has_arch_random() can probably go away as
      well.
      
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
      Acked-by: Petr Mladek <pmladek@suse.com> # for vsprintf.c
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: move initialization functions out of hot pages · 560181c2
      Committed by Jason A. Donenfeld
      Much of random.c is devoted to initializing the rng and accounting for
      when a sufficient amount of entropy has been added. In a perfect world,
      this would all happen during init, and so we could mark these functions
      as __init. But in reality, this isn't the case: sometimes the rng only
      finishes initializing some seconds after system init is finished.
      
      For this reason, at the moment, a whole host of functions that are only
      used relatively close to system init and then never again are intermixed
      with functions that are used in hot code all the time. This creates more
      cache misses than necessary.
      
      In order to pack the hot code closer together, this commit moves the
      initialization functions that can't be marked as __init into
      .text.unlikely by way of the __cold attribute.
      
      Of particular note is moving credit_init_bits() into a macro wrapper
      that inlines the crng_ready() static branch check. This avoids a
      function call to a nop+ret, and most notably prevents extra entropy
      arithmetic from being computed in mix_interrupt_randomness().
      Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
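      The wrapper described above is essentially:

          #define credit_init_bits(bits)                       \
                  do {                                         \
                          if (!crng_ready())                   \
                                  _credit_init_bits(bits);     \
                  } while (0)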
    • random: make consistent use of buf and len · a1940263
      Committed by Jason A. Donenfeld
      The current code was a mix of "nbytes", "count", "size", "buffer", "in",
      and so forth. Instead, let's clean this up by naming input parameters
      "buf" (or "ubuf") and "len", so that you always understand that you're
      reading this variety of function argument.
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: use static branch for crng_ready() · f5bda35f
      Committed by Jason A. Donenfeld
      Since crng_ready() is only false briefly during initialization and then
      forever after becomes true, we don't need to keep evaluating it
      afterwards, making it a prime candidate for a static branch.
      
      One complication, however, is that it changes state in a particular call
      to credit_init_bits(), which might be made from atomic context, which
      means we must kick off a workqueue to change the static key. Further
      complicating things, credit_init_bits() may be called sufficiently early
      on in system initialization such that system_wq is NULL.
      
      Fortunately, there exists the nice function execute_in_process_context(),
      which will immediately execute the function if !in_interrupt(), and
      otherwise defer it to a workqueue. During early init, before workqueues
      are available, in_interrupt() is always false, because interrupts
      haven't even been enabled yet, which means the function in that case
      executes immediately. Later on, after workqueues are available,
      in_interrupt() might be true, but in that case, the work is queued in
      system_wq and all goes well.
      
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Sultan Alsawaf <sultan@kerneltoast.com>
      Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
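      The pieces fit together roughly as follows:

          static DEFINE_STATIC_KEY_FALSE(crng_is_ready);
          #define crng_ready() \
                  (static_branch_likely(&crng_is_ready) || crng_init >= CRNG_READY)

          static void crng_set_ready(struct work_struct *work)
          {
                  static_branch_enable(&crng_is_ready);
          }

          /* From _credit_init_bits(), possibly in atomic context: */
          static struct execute_work set_ready;
          execute_in_process_context(crng_set_ready, &set_ready);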
  14. 18 May 2022 (7 commits)
    • random: credit architectural init the exact amount · 12e45a2a
      Committed by Jason A. Donenfeld
      RDRAND and RDSEED can fail sometimes, which is fine. We currently
      initialize the RNG with 512 bits of RDRAND/RDSEED. We only need 256 bits
      of those to succeed in order to initialize the RNG. Instead of the
      current "all or nothing" approach, actually credit these contributions
      the amount that is actually contributed.
      Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: handle latent entropy and command line from random_init() · 2f14062b
      Committed by Jason A. Donenfeld
      Currently, start_kernel() adds latent entropy and the command line to
      the entropy pool *after* the RNG has been initialized, deferring when
      it's actually used by things like stack canaries until the next time
      the pool is seeded. This surely is not intended.
      
      Rather than splitting up which entropy gets added where and when between
      start_kernel() and random_init(), just do everything in random_init(),
      which should eliminate these kinds of bugs in the future.
      
      While we're at it, rename the awkwardly titled "rand_initialize()" to
      the more standard "random_init()" nomenclature.
      Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: use proper jiffies comparison macro · 8a5b8a4a
      Committed by Jason A. Donenfeld
      This expands to exactly the same code that it replaces, but makes things
      consistent by using the same macro for jiffy comparisons throughout.
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
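      For background (a generic illustration, not the exact hunk): raw
      jiffies comparisons break at wraparound, while the time_after()
      family compares via signed difference:

          unsigned long timeout = jiffies + 5 * HZ;

          /* Wrong near the wraparound point: */
          if (jiffies > timeout) { /* expired? */ }

          /* Correct, wraparound-safe: */
          if (time_after(jiffies, timeout)) { /* expired */ }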
    • random: remove ratelimiting for in-kernel unseeded randomness · cc1e127b
      Committed by Jason A. Donenfeld
      The CONFIG_WARN_ALL_UNSEEDED_RANDOM debug option controls whether the
      kernel warns about all unseeded randomness or just the first instance.
      There's some complicated rate limiting and comparison to the previous
      caller, such that even with CONFIG_WARN_ALL_UNSEEDED_RANDOM enabled,
      developers still don't see all the messages or even an accurate count of
      how many were missed. This is the result of basically parallel
      mechanisms aimed at accomplishing more or less the same thing, added at
      different points in random.c history, which sort of compete with the
      first-instance-only limiting we have now.
      
      It turns out, however, that nobody cares about the first unseeded
      randomness instance of in-kernel users. The same first user has been
      there for ages now, and nobody is doing anything about it. It isn't even
      clear that anybody _can_ do anything about it. Most places that can do
      something about it have switched over to using get_random_bytes_wait()
      or wait_for_random_bytes(), which is the right thing to do, but there is
      still much code that needs randomness sometimes during init, and as a
      general rule, if you're not using one of the _wait functions or the
      readiness notifier callback, you're bound to be doing it wrong just
      based on that fact alone.
      
      So warning about this same first user that can't easily change is simply
      not an effective mechanism for anything at all. Users can't do anything
      about it, as the Kconfig text points out -- the problem isn't in
      userspace code -- and kernel developers don't or more often can't react
      to it.
      
      Instead, show the warning for all instances when CONFIG_WARN_ALL_UNSEEDED_RANDOM
      is set, so that developers can debug things as need be, or if it isn't
      set, don't show a warning at all.
      
      At the same time, CONFIG_WARN_ALL_UNSEEDED_RANDOM now implies setting
      random.ratelimit_disable=1 by default, since if you care about one
      you probably care about the other too. And we can clean up usage around
      the related urandom_warning ratelimiter as well (whose behavior isn't
      changing), so that it properly counts missed messages after the 10
      message threshold is reached.
      
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: move initialization out of reseeding hot path · 68c9c8b1
      Committed by Jason A. Donenfeld
      Initialization happens once -- by way of credit_init_bits() -- and then
      it never happens again. Therefore, it doesn't need to be in
      crng_reseed(), which is a hot path that is called multiple times. It
      also doesn't make sense to have there, as initialization activity is
      better associated with initialization routines.
      
      After the prior commit, crng_reseed() now won't be called by multiple
      concurrent callers, which means that we can safely move the
      "finalize_init" logic into credit_init_bits() unconditionally.
      Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: avoid initializing twice in credit race · fed7ef06
      Committed by Jason A. Donenfeld
      Since all changes of crng_init now go through credit_init_bits(), we can
      fix a long standing race in which two concurrent callers of
      credit_init_bits() have the new bit count >= some threshold, but are
      doing so with crng_init as a lower threshold, checked outside of a lock,
      resulting in crng_reseed() or similar being called twice.
      
      In order to fix this, we can use the original cmpxchg value of the bit
      count, and only change crng_init when the bit count transitions from
      below a threshold to meeting the threshold.
      Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
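      In sketch form: only the caller whose cmpxchg observed the
      below-threshold original value performs the transition work:

          if (orig < POOL_READY_BITS && new >= POOL_READY_BITS) {
                  /* Exactly one concurrent caller sees this
                   * transition, so crng_reseed() runs once. */
                  crng_reseed();
          }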
    • random: use symbolic constants for crng_init states · e3d2c5e7
      Committed by Jason A. Donenfeld
      crng_init represents a state machine, with three states, and various
      rules for transitions. For the longest time, we've been managing these
      with "0", "1", and "2", and expecting people to figure it out. To make
      the code more obvious, replace these with proper enum values
      representing each state, and then redocument what each of these
      states means.
      Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
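      The replacement states look roughly like:

          enum {
                  CRNG_EMPTY = 0, /* little to no entropy collected */
                  CRNG_EARLY = 1, /* POOL_EARLY_BITS collected */
                  CRNG_READY = 2  /* fully initialized */
          } crng_init = CRNG_EMPTY;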