1. 29 9月, 2022 2 次提交
    • J
      random: use expired timer rather than wq for mixing fast pool · 748bc4dd
      Jason A. Donenfeld 提交于
      Previously, the fast pool was dumped into the main pool periodically in
      the fast pool's hard IRQ handler. This worked fine and there weren't
      problems with it, until RT came around. Since RT converts spinlocks into
      sleeping locks, problems cropped up. Rather than switching to raw
      spinlocks, the RT developers preferred we make the transformation from
      originally doing:
      
          do_some_stuff()
          spin_lock()
          do_some_other_stuff()
          spin_unlock()
      
      to doing:
      
          do_some_stuff()
          queue_work_on(some_other_stuff_worker)
      
      This is an ordinary pattern done all over the kernel. However, Sherry
      noticed a 10% performance regression in qperf TCP over a 40gbps
      InfiniBand card. Quoting her message:
      
      > MT27500 Family [ConnectX-3] cards:
      > Infiniband device 'mlx4_0' port 1 status:
      > default gid: fe80:0000:0000:0000:0010:e000:0178:9eb1
      > base lid: 0x6
      > sm lid: 0x1
      > state: 4: ACTIVE
      > phys state: 5: LinkUp
      > rate: 40 Gb/sec (4X QDR)
      > link_layer: InfiniBand
      >
      > Cards are configured with IP addresses on private subnet for IPoIB
      > performance testing.
      > Regression identified in this bug is in TCP latency in this stack as reported
      > by qperf tcp_lat metric:
      >
      > We have one system listen as a qperf server:
      > [root@yourQperfServer ~]# qperf
      >
      > Have the other system connect to qperf server as a client (in this
      > case, it’s X7 server with Mellanox card):
      > [root@yourQperfClient ~]# numactl -m0 -N0 qperf 20.20.20.101 -v -uu -ub --time 60 --wait_server 20 -oo msg_size:4K:1024K:*2 tcp_lat
      
      Rather than incur the scheduling latency from queue_work_on, we can
      instead switch to running on the next timer tick, on the same core. This
      also batches things a bit more -- once per jiffy -- which is okay now
      that mix_interrupt_randomness() can credit multiple bits at once.
      Reported-by: NSherry Yang <sherry.yang@oracle.com>
      Tested-by: NPaul Webb <paul.x.webb@oracle.com>
      Cc: Sherry Yang <sherry.yang@oracle.com>
      Cc: Phillip Goerl <phillip.goerl@oracle.com>
      Cc: Jack Vogel <jack.vogel@oracle.com>
      Cc: Nicky Veitch <nicky.veitch@oracle.com>
      Cc: Colm Harrington <colm.harrington@oracle.com>
      Cc: Ramanan Govindarajan <ramanan.govindarajan@oracle.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Sultan Alsawaf <sultan@kerneltoast.com>
      Cc: stable@vger.kernel.org
      Fixes: 58340f8e ("random: defer fast pool mixing to worker")
      Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
      748bc4dd
    • J
      random: avoid reading two cache lines on irq randomness · 9ee0507e
      Jason A. Donenfeld 提交于
      In order to avoid reading and dirtying two cache lines on every IRQ,
      move the work_struct to the bottom of the fast_pool struct. add_
      interrupt_randomness() always touches .pool and .count, which are
      currently split, because .mix pushes everything down. Instead, move .mix
      to the bottom, so that .pool and .count are always in the first cache
      line, since .mix is only accessed when the pool is full.
      
      Fixes: 58340f8e ("random: defer fast pool mixing to worker")
      Reviewed-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
      9ee0507e
  2. 23 9月, 2022 8 次提交
    • J
      random: clamp credited irq bits to maximum mixed · e78a802a
      Jason A. Donenfeld 提交于
      Since the most that's mixed into the pool is sizeof(long)*2, don't
      credit more than that many bytes of entropy.
      
      Fixes: e3e33fc2 ("random: do not use input pool from hard IRQs")
      Cc: stable@vger.kernel.org
      Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
      e78a802a
    • J
      random: throttle hwrng writes if no entropy is credited · d775335e
      Jason A. Donenfeld 提交于
      If a hwrng source does not provide an entropy estimate, it currently
      does not contribute at all to the CRNG. In order to help fix this, in
      case add_hwgenerator_randomness() is called with the entropy parameter
      set to zero, go to sleep until one reseed interval has passed.
      
      While the hwrng thread currently only runs under conditions where this
      is non-zero, this change is not harmful and prepares for future updates
      to the hwrng core.
      
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Reviewed-by: NDominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
      d775335e
    • D
      random: use hwgenerator randomness more frequently at early boot · 745558f9
      Dominik Brodowski 提交于
      Mix in randomness from hw-rng sources more frequently during early
      boot, approximately once for every rng reseed.
      Signed-off-by: NDominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
      745558f9
    • J
      random: restore O_NONBLOCK support · cd4f24ae
      Jason A. Donenfeld 提交于
      Prior to 5.6, when /dev/random was opened with O_NONBLOCK, it would
      return -EAGAIN if there was no entropy. When the pools were unified in
      5.6, this was lost. The post 5.6 behavior of blocking until the pool is
      initialized, and ignoring O_NONBLOCK in the process, went unnoticed,
      with no reports about the regression received for two and a half years.
      However, eventually this indeed did break somebody's userspace.
      
      So we restore the old behavior, by returning -EAGAIN if the pool is not
      initialized. Unlike the old /dev/random, this can only occur during
      early boot, after which it never blocks again.
      
      In order to make this O_NONBLOCK behavior consistent with other
      expectations, also respect users reading with preadv2(RWF_NOWAIT) and
      similar.
      
      Fixes: 30c08efe ("random: make /dev/random be almost like /dev/urandom")
      Reported-by: NGuozihua <guozihua@huawei.com>
      Reported-by: NZhongguohua <zhongguohua1@huawei.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Andrew Lutomirski <luto@kernel.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
      cd4f24ae
    • L
      Merge tag 'net-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 504c25cb
      Linus Torvalds 提交于
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from wifi, netfilter and can.
      
        A handful of awaited fixes here - revert of the FEC changes, bluetooth
        fix, fixes for iwlwifi spew.
      
        We added a warning in PHY/MDIO code which is triggering on a couple of
        platforms in a false-positive-ish way. If we can't iron that out over
        the week we'll drop it and re-add for 6.1.
      
        I've added a new "follow up fixes" section for fixes to fixes in
        6.0-rcs but it may actually give the false impression that those are
        problematic or that more testing time would have caught them. So
        likely a one time thing.
      
        Follow up fixes:
      
         - nf_tables_addchain: fix nft_counters_enabled underflow
      
         - ebtables: fix memory leak when blob is malformed
      
         - nf_ct_ftp: fix deadlock when nat rewrite is needed
      
        Current release - regressions:
      
         - Revert "fec: Restart PPS after link state change" and the related
           "net: fec: Use a spinlock to guard `fep->ptp_clk_on`"
      
         - Bluetooth: fix HCIGETDEVINFO regression
      
         - wifi: mt76: fix 5 GHz connection regression on mt76x0/mt76x2
      
         - mptcp: fix fwd memory accounting on coalesce
      
         - rwlock removal fall out:
            - ipmr: always call ip{,6}_mr_forward() from RCU read-side
              critical section
            - ipv6: fix crash when IPv6 is administratively disabled
      
         - tcp: read multiple skbs in tcp_read_skb()
      
         - mdio_bus_phy_resume state warning fallout:
            - eth: ravb: fix PHY state warning splat during system resume
            - eth: sh_eth: fix PHY state warning splat during system resume
      
        Current release - new code bugs:
      
         - wifi: iwlwifi: don't spam logs with NSS>2 messages
      
         - eth: mtk_eth_soc: enable XDP support just for MT7986 SoC
      
        Previous releases - regressions:
      
         - bonding: fix NULL deref in bond_rr_gen_slave_id
      
         - wifi: iwlwifi: mark IWLMEI as broken
      
        Previous releases - always broken:
      
         - nf_conntrack helpers:
            - irc: tighten matching on DCC message
            - sip: fix ct_sip_walk_headers
            - osf: fix possible bogus match in nf_osf_find()
      
         - ipvlan: fix out-of-bound bugs caused by unset skb->mac_header
      
         - core: fix flow symmetric hash
      
         - bonding, team: unsync device addresses on ndo_stop
      
         - phy: micrel: fix shared interrupt on LAN8814"
      
      * tag 'net-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (83 commits)
        selftests: forwarding: add shebang for sch_red.sh
        bnxt: prevent skb UAF after handing over to PTP worker
        net: marvell: Fix refcounting bugs in prestera_port_sfp_bind()
        net: sched: fix possible refcount leak in tc_new_tfilter()
        net: sunhme: Fix packet reception for len < RX_COPY_THRESHOLD
        udp: Use WARN_ON_ONCE() in udp_read_skb()
        selftests: bonding: cause oops in bond_rr_gen_slave_id
        bonding: fix NULL deref in bond_rr_gen_slave_id
        net: phy: micrel: fix shared interrupt on LAN8814
        net/smc: Stop the CLC flow if no link to map buffers on
        ice: Fix ice_xdp_xmit() when XDP TX queue number is not sufficient
        net: atlantic: fix potential memory leak in aq_ndev_close()
        can: gs_usb: gs_usb_set_phys_id(): return with error if identify is not supported
        can: gs_usb: gs_can_open(): fix race dev->can.state condition
        can: flexcan: flexcan_mailbox_read() fix return value for drop = true
        net: sh_eth: Fix PHY state warning splat during system resume
        net: ravb: Fix PHY state warning splat during system resume
        netfilter: nf_ct_ftp: fix deadlock when nat rewrite is needed
        netfilter: ebtables: fix memory leak when blob is malformed
        netfilter: nf_tables: fix percpu memory leak at nf_tables_addchain()
        ...
      504c25cb
    • L
      Merge tag 'efi-urgent-for-v6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi · 129e7152
      Linus Torvalds 提交于
      Pull EFI fixes from Ard Biesheuvel:
      
       - Use the right variable to check for shim insecure mode
      
       - Wipe setup_data field when booting via EFI
      
       - Add missing error check to efibc driver
      
      * tag 'efi-urgent-for-v6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
        efi: libstub: check Shim mode using MokSBStateRT
        efi: x86: Wipe setup_data on pure EFI boot
        efi: efibc: Guard against allocation failure
      129e7152
    • L
      Merge tag 'gpio-fixes-for-v6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux · 5e0a93e4
      Linus Torvalds 提交于
      Pull gpio fixes from Bartosz Golaszewski:
      
       - fix a NULL-pointer dereference at driver unbind and a potential
         resource leak in error path in gpio-mockup
      
       - make the irqchip immutable in gpio-ftgpio010
      
       - fix dereferencing a potentially uninitialized variable in gpio-tqmx86
      
       - fix interrupt registering in gpiolib's character device code
      
      * tag 'gpio-fixes-for-v6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
        gpiolib: cdev: Set lineevent_state::irq after IRQ register successfully
        gpio: tqmx86: fix uninitialized variable girq
        gpio: ftgpio010: Make irqchip immutable
        gpio: mockup: Fix potential resource leakage when register a chip
        gpio: mockup: fix NULL pointer dereference when removing debugfs
      5e0a93e4
    • L
      Merge tag 'perf-tools-fixes-for-v6.0-2022-09-21' of... · 9597f088
      Linus Torvalds 提交于
      Merge tag 'perf-tools-fixes-for-v6.0-2022-09-21' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
      
      Pull perf tools fixes from Arnaldo Carvalho de Melo:
      
       - Fix polling of system-wide events related to mixing per-cpu and
         per-thread events.
      
       - Do not check if /proc/modules is unchanged when copying /proc/kcore,
         that doesn't get in the way of post processing analysis.
      
       - Include program header in ELF files generated for JIT files, so that
         they can be opened by tools using elfutils libraries.
      
       - Enter namespaces when synthesizing build-ids.
      
       - Fix some bugs related to a recent cpu_map overhaul where we should be
         using an index and not the cpu number.
      
       - Fix BPF program ELF section name, using the naming expected by libbpf
         when using BPF counters in 'perf stat'.
      
       - Add a new test for perf stat cgroup BPF counter.
      
       - Adjust check on 'perf test wp' for older kernels, where the
         PERF_EVENT_IOC_MODIFY_ATTRIBUTES ioctl isn't supported.
      
       - Sync x86 cpufeatures with the kernel sources, no changes in tooling.
      
      * tag 'perf-tools-fixes-for-v6.0-2022-09-21' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
        perf tools: Honor namespace when synthesizing build-ids
        tools headers cpufeatures: Sync with the kernel sources
        perf kcore_copy: Do not check /proc/modules is unchanged
        libperf evlist: Fix polling of system-wide events
        perf record: Fix cpu mask bit setting for mixed mmaps
        perf test: Skip wp modify test on old kernels
        perf jit: Include program header in ELF files
        perf test: Add a new test for perf stat cgroup BPF counter
        perf stat: Use evsel->core.cpus to iterate cpus in BPF cgroup counters
        perf stat: Fix cpu map index in bperf cgroup code
        perf stat: Fix BPF program section name
      9597f088
  3. 22 9月, 2022 27 次提交
  4. 21 9月, 2022 3 次提交