1. 10 Nov, 2020 2 commits
  2. 04 Nov, 2020 1 commit
    • O
      can: can_create_echo_skb(): fix echo skb generation: always use skb_clone() · 286228d3
      Oleksij Rempel committed
      All user space generated SKBs are owned by a socket (unless injected into the
      kernel via AF_PACKET). If a socket is closed, all associated skbs will be
      cleaned up.
      
      This leads to a problem when a CAN driver calls can_put_echo_skb() on an
      unshared SKB. If the socket is closed before the TX complete handler calls
      can_get_echo_skb() and delivers the echo SKB to all registered callbacks,
      an SKB with a refcount of 0 is delivered.
      
      To avoid the problem, in can_create_echo_skb() the original SKB is now always
      cloned, regardless of shared SKB or not. If the process exits it can now
      safely discard its SKBs, without disturbing the delivery of the echo SKB.
      
      The problem shows up in the j1939 stack, when it clones the incoming skb, which
      detects the already 0 refcount.
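
      The ownership problem can be illustrated with a toy userspace model. This is
      only a sketch: struct toy_skb and the toy_skb_*() helpers are hypothetical
      stand-ins, not the kernel's sk_buff API. It shows why an echo buffer that is
      a clone with its own reference count survives the socket dropping the original.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct sk_buff: a refcount and an owner flag. */
struct toy_skb {
	int users;          /* models the skb reference count */
	int owned_by_sock;  /* nonzero while a socket owns this buffer */
};

static struct toy_skb *toy_skb_alloc(int owned_by_sock)
{
	struct toy_skb *skb = malloc(sizeof(*skb));

	if (!skb)
		return NULL;
	skb->users = 1;
	skb->owned_by_sock = owned_by_sock;
	return skb;
}

/* Models cloning: the clone has its own refcount and no socket owner. */
static struct toy_skb *toy_skb_clone(const struct toy_skb *orig)
{
	assert(orig != NULL && orig->users > 0);
	return toy_skb_alloc(0);
}

/* Drops one reference, frees the buffer at zero, returns what is left. */
static int toy_skb_put(struct toy_skb *skb)
{
	int left;

	assert(skb != NULL && skb->users > 0);
	left = --skb->users;
	if (left == 0)
		free(skb);
	return left;
}
```

      In this model, closing the socket corresponds to calling toy_skb_put() on
      the original; the clone kept for the echo path still has a count of 1.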
      
      We can easily reproduce this with the following example:
      
      testj1939 -B -r can0: &
      cansend can0 1823ff40#0123
      
      WARNING: CPU: 0 PID: 293 at lib/refcount.c:25 refcount_warn_saturate+0x108/0x174
      refcount_t: addition on 0; use-after-free.
      Modules linked in: coda_vpu imx_vdoa videobuf2_vmalloc dw_hdmi_ahb_audio vcan
      CPU: 0 PID: 293 Comm: cansend Not tainted 5.5.0-rc6-00376-g9e20dcb7040d #1
      Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
      Backtrace:
      [<c010f570>] (dump_backtrace) from [<c010f90c>] (show_stack+0x20/0x24)
      [<c010f8ec>] (show_stack) from [<c0c3e1a4>] (dump_stack+0x8c/0xa0)
      [<c0c3e118>] (dump_stack) from [<c0127fec>] (__warn+0xe0/0x108)
      [<c0127f0c>] (__warn) from [<c01283c8>] (warn_slowpath_fmt+0xa8/0xcc)
      [<c0128324>] (warn_slowpath_fmt) from [<c0539c0c>] (refcount_warn_saturate+0x108/0x174)
      [<c0539b04>] (refcount_warn_saturate) from [<c0ad2cac>] (j1939_can_recv+0x20c/0x210)
      [<c0ad2aa0>] (j1939_can_recv) from [<c0ac9dc8>] (can_rcv_filter+0xb4/0x268)
      [<c0ac9d14>] (can_rcv_filter) from [<c0aca2cc>] (can_receive+0xb0/0xe4)
      [<c0aca21c>] (can_receive) from [<c0aca348>] (can_rcv+0x48/0x98)
      [<c0aca300>] (can_rcv) from [<c09b1fdc>] (__netif_receive_skb_one_core+0x64/0x88)
      [<c09b1f78>] (__netif_receive_skb_one_core) from [<c09b2070>] (__netif_receive_skb+0x38/0x94)
      [<c09b2038>] (__netif_receive_skb) from [<c09b2130>] (netif_receive_skb_internal+0x64/0xf8)
      [<c09b20cc>] (netif_receive_skb_internal) from [<c09b21f8>] (netif_receive_skb+0x34/0x19c)
      [<c09b21c4>] (netif_receive_skb) from [<c0791278>] (can_rx_offload_napi_poll+0x58/0xb4)
      
      Fixes: 0ae89beb ("can: add destructor for self generated skbs")
      Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
      Link: http://lore.kernel.org/r/20200124132656.22156-1-o.rempel@pengutronix.de
      Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
  3. 03 Nov, 2020 3 commits
    • J
      mm: always have io_remap_pfn_range() set pgprot_decrypted() · f8f6ae5d
      Jason Gunthorpe committed
      The purpose of io_remap_pfn_range() is to map IO memory, such as a
      memory mapped IO exposed through a PCI BAR.  IO devices do not
      understand encryption, so this memory must always be decrypted.
      Automatically call pgprot_decrypted() as part of the generic
      implementation.
      
      This fixes a bug where enabling AMD SME causes subsystems, such as RDMA,
      using io_remap_pfn_range() to expose BAR pages to user space to fail.
      The CPU will encrypt access to those BAR pages instead of passing
      unencrypted IO directly to the device.
      
      Places not mapping IO should use remap_pfn_range().
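
      The intended split between the two helpers can be sketched in userspace.
      This is a toy model under stated assumptions: toy_pgprot_t, TOY_PROT_ENC
      (including its bit position) and the toy_*() functions are hypothetical
      and only mimic the roles of pgprot_decrypted(), remap_pfn_range() and
      io_remap_pfn_range() described above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical protection value with an illustrative "encrypted" bit. */
typedef uint64_t toy_pgprot_t;
#define TOY_PROT_ENC (UINT64_C(1) << 47)

/* Models pgprot_decrypted(): clear the encryption bit. */
static toy_pgprot_t toy_pgprot_decrypted(toy_pgprot_t prot)
{
	return prot & ~TOY_PROT_ENC;
}

/* Models remap_pfn_range(): the protection bits are used as given. */
static toy_pgprot_t toy_remap_pfn_range(toy_pgprot_t prot)
{
	return prot;
}

/* Models the generic io_remap_pfn_range(): always strip encryption first. */
static toy_pgprot_t toy_io_remap_pfn_range(toy_pgprot_t prot)
{
	toy_pgprot_t io_prot = toy_pgprot_decrypted(prot);

	assert((io_prot & TOY_PROT_ENC) == 0);
	return toy_remap_pfn_range(io_prot);
}
```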
      
      Fixes: aca20d54 ("x86/mm: Add support to make use of Secure Memory Encryption")
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brijesh Singh <brijesh.singh@amd.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: "Dave Young" <dyoung@redhat.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Larry Woodman <lwoodman@redhat.com>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Toshimitsu Kani <toshi.kani@hpe.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lkml.kernel.org/r/0-v1-025d64bdf6c4+e-amd_sme_fix_jgg@nvidia.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • R
      PM: runtime: Drop pm_runtime_clean_up_links() · d6e36668
      Rafael J. Wysocki committed
      After commit d12544fb ("PM: runtime: Remove link state checks in
      rpm_get/put_supplier()") nothing prevents the consumer device's
      runtime PM from acquiring additional references to the supplier
      device after pm_runtime_clean_up_links() has run (or even while it
      is running), so calling this function from __device_release_driver()
      may be pointless (or even harmful).
      
      Moreover, it ignores stateless device links, which makes the runtime PM
      handling of managed and stateless device links inconsistent, so it is
      better to get rid of it entirely.
      
      Fixes: d12544fb ("PM: runtime: Remove link state checks in rpm_get/put_supplier()")
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: 5.1+ <stable@vger.kernel.org> # 5.1+
      Tested-by: Xiang Chen <chenxiang66@hisilicon.com>
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • R
      PM: runtime: Drop runtime PM references to supplier on link removal · e0e398e2
      Rafael J. Wysocki committed
      While removing a device link, drop the supplier device's runtime PM
      usage counter as many times as needed to drop all of the runtime PM
      references to it from the consumer in addition to dropping the
      consumer's link count.
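
      The bookkeeping described above can be sketched as a small userspace model.
      The types and helpers here (toy_device, toy_link, toy_link_get_supplier(),
      toy_link_drop_all()) are hypothetical and simplified; they are not the
      kernel's device link implementation.

```c
#include <assert.h>

/* Hypothetical device: only the runtime PM usage counter is modeled. */
struct toy_device {
	int usage_count;
};

/* Hypothetical link: counts references taken on the supplier through it. */
struct toy_link {
	struct toy_device *supplier;
	int rpm_active;
};

/* The consumer takes one runtime PM reference on the supplier. */
static void toy_link_get_supplier(struct toy_link *link)
{
	link->rpm_active++;
	link->supplier->usage_count++;
}

/* On link removal, drop every reference that was taken through the link. */
static void toy_link_drop_all(struct toy_link *link)
{
	while (link->rpm_active > 0) {
		assert(link->supplier->usage_count > 0);
		link->supplier->usage_count--;
		link->rpm_active--;
	}
}
```

      References held on the supplier by anyone other than the consumer are left
      untouched, because only the link's own count drives the loop.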
      
      Fixes: baa8809f ("PM / runtime: Optimize the use of device links")
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: 5.1+ <stable@vger.kernel.org> # 5.1+
      Tested-by: Xiang Chen <chenxiang66@hisilicon.com>
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  4. 31 Oct, 2020 1 commit
  5. 30 Oct, 2020 9 commits
  6. 29 Oct, 2020 8 commits
  7. 28 Oct, 2020 4 commits
    • A
      module: use hidden visibility for weak symbol references · 13150bc5
      Ard Biesheuvel committed
      Geert reports that commit be288182 ("arm64/build: Assert for
      unwanted sections") results in build errors on arm64 for configurations
      that have CONFIG_MODULES disabled.
      
      The commit in question added ASSERT()s to the arm64 linker script to
      ensure that linker generated sections such as .got.plt etc are empty,
      but as it turns out, there are corner cases where the linker does emit
      content into those sections. More specifically, weak references to
      function symbols (which can remain unsatisfied, and can therefore not
      be emitted as relative references) will be emitted as GOT and PLT
      entries when linking the kernel in PIE mode (which is the case when
      CONFIG_RELOCATABLE is enabled, which is on by default).
      
      What happens is that code such as
      
      	struct device *(*fn)(struct device *dev);
      	struct device *iommu_device;
      
      	fn = symbol_get(mdev_get_iommu_device);
      	if (fn) {
      		iommu_device = fn(dev);
      
      essentially gets converted into the following when CONFIG_MODULES is off:
      
      	struct device *iommu_device;
      
      	if (&mdev_get_iommu_device) {
      		iommu_device = mdev_get_iommu_device(dev);
      
      where mdev_get_iommu_device is emitted as a weak symbol reference into
      the object file. The first reference is decorated with an ordinary
      ABS64 data relocation (which yields 0x0 if the reference remains
      unsatisfied). However, the indirect call is turned into a direct call
      covered by a R_AARCH64_CALL26 relocation, which is converted into a
      call via a PLT entry taking the target address from the associated
      GOT entry.
      
      Given that such GOT and PLT entries are unnecessary for fully linked
      binaries such as the kernel, let's give these weak symbol references
      hidden visibility, so that the linker knows that the weak reference
      via R_AARCH64_CALL26 can simply remain unsatisfied.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Reviewed-by: Fangrui Song <maskray@google.com>
      Acked-by: Jessica Yu <jeyu@kernel.org>
      Cc: Jessica Yu <jeyu@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Link: https://lore.kernel.org/r/20201027151132.14066-1-ardb@kernel.org
      Signed-off-by: Will Deacon <will@kernel.org>
    • M
      usb: fix kernel-doc markups · cbdc0f54
      Mauro Carvalho Chehab committed
      One ordinary comment is marked with kernel-doc notation
      by mistake.
      
      Also, some identifiers have different names between their
      prototypes and the kernel-doc markup.
      Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
      Acked-by: Felipe Balbi <balbi@kernel.org>
      Link: https://lore.kernel.org/r/0b964be3884def04fcd20ea5c12cb90d0014871c.1603469755.git.mchehab+huawei@kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • S
      KVM: arm64: ARM_SMCCC_ARCH_WORKAROUND_1 doesn't return SMCCC_RET_NOT_REQUIRED · 1de111b5
      Stephen Boyd committed
      According to the SMCCC spec[1] (7.5.2 Discovery) the
      ARM_SMCCC_ARCH_WORKAROUND_1 function id only returns 0, 1, and
      SMCCC_RET_NOT_SUPPORTED.
      
       0 is "workaround required and safe to call this function"
       1 is "workaround not required but safe to call this function"
       SMCCC_RET_NOT_SUPPORTED is "might be vulnerable or might not be, who knows, I give up!"
      
      SMCCC_RET_NOT_SUPPORTED might as well mean "workaround required, except
      calling this function may not work because it isn't implemented in some
      cases". Wonderful. We map this SMC call to
      
       0 is SPECTRE_MITIGATED
       1 is SPECTRE_UNAFFECTED
       SMCCC_RET_NOT_SUPPORTED is SPECTRE_VULNERABLE
      
      For KVM hypercalls (hvc), we've implemented this function id to return
      SMCCC_RET_NOT_SUPPORTED, 0, and SMCCC_RET_NOT_REQUIRED. One of those
      isn't supposed to be there. Per the code we call
      arm64_get_spectre_v2_state() to figure out what to return for this
      feature discovery call.
      
       0 is SPECTRE_MITIGATED
       SMCCC_RET_NOT_REQUIRED is SPECTRE_UNAFFECTED
       SMCCC_RET_NOT_SUPPORTED is SPECTRE_VULNERABLE
      
      Let's clean this up so that KVM tells the guest this mapping:
      
       0 is SPECTRE_MITIGATED
       1 is SPECTRE_UNAFFECTED
       SMCCC_RET_NOT_SUPPORTED is SPECTRE_VULNERABLE
      
      Note: SMCCC_RET_NOT_AFFECTED is 1 but isn't part of the SMCCC spec
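
      The cleaned-up mapping can be written as a small function. This is a
      sketch only: workaround_1_discovery() is a hypothetical name, the enum
      reuses the state names from the text, and the numeric values -1 and -2
      for SMCCC_RET_NOT_SUPPORTED and SMCCC_RET_NOT_REQUIRED are stated here
      as assumptions rather than taken from the message above.

```c
#include <assert.h>

/* Assumed numeric values for the SMCCC return codes named in the text. */
#define SMCCC_RET_NOT_SUPPORTED (-1)
#define SMCCC_RET_NOT_REQUIRED  (-2)

/* Spectre v2 states as named in the text. */
enum spectre_state {
	SPECTRE_UNAFFECTED,
	SPECTRE_MITIGATED,
	SPECTRE_VULNERABLE,
};

/* Hypothetical helper: value reported to the guest for the discovery call. */
static long workaround_1_discovery(enum spectre_state state)
{
	long ret = SMCCC_RET_NOT_SUPPORTED;

	switch (state) {
	case SPECTRE_MITIGATED:
		ret = 0;	/* workaround required and safe to call */
		break;
	case SPECTRE_UNAFFECTED:
		ret = 1;	/* workaround not required but safe to call */
		break;
	case SPECTRE_VULNERABLE:
		ret = SMCCC_RET_NOT_SUPPORTED;
		break;
	}

	/* The fixed mapping never reports SMCCC_RET_NOT_REQUIRED. */
	assert(ret != SMCCC_RET_NOT_REQUIRED);
	return ret;
}
```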
      
      Fixes: c118bbb5 ("arm64: KVM: Propagate full Spectre v2 workaround state to KVM guests")
      Signed-off-by: Stephen Boyd <swboyd@chromium.org>
      Acked-by: Marc Zyngier <maz@kernel.org>
      Acked-by: Will Deacon <will@kernel.org>
      Cc: Andre Przywara <andre.przywara@arm.com>
      Cc: Steven Price <steven.price@arm.com>
      Cc: Marc Zyngier <maz@kernel.org>
      Cc: stable@vger.kernel.org
      Link: https://developer.arm.com/documentation/den0028/latest [1]
      Link: https://lore.kernel.org/r/20201023154751.1973872-1-swboyd@chromium.org
      Signed-off-by: Will Deacon <will@kernel.org>
    • R
      cpufreq: Introduce CPUFREQ_NEED_UPDATE_LIMITS driver flag · 1c534352
      Rafael J. Wysocki committed
      Generally, a cpufreq driver may need to update some internal upper
      and lower frequency boundaries on policy max and min changes,
      respectively, but currently this does not work if the target
      frequency does not change along with the policy limit.
      
      Namely, if the target frequency does not change along with the
      policy min or max, the "target_freq == policy->cur" check in
      __cpufreq_driver_target() prevents driver callbacks from being
      invoked and they do not even have a chance to update the
      corresponding internal boundary.
      
      This particularly affects the "powersave" and "performance"
      governors that always set the target frequency to one of the
      policy limits and it never changes when the other limit is updated.
      
      To allow cpufreq drivers that need to update internal frequency
      boundaries on policy limit changes to avoid this issue, introduce
      a new driver flag, CPUFREQ_NEED_UPDATE_LIMITS, that (when set) will
      neutralize the check mentioned above.
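
      The effect of the flag on the early-exit check can be modeled in
      userspace. This is a simplified sketch: struct toy_policy,
      TOY_NEED_UPDATE_LIMITS and toy_driver_target() are hypothetical names
      standing in for the policy, CPUFREQ_NEED_UPDATE_LIMITS and
      __cpufreq_driver_target().

```c
#include <assert.h>

/* Stands in for CPUFREQ_NEED_UPDATE_LIMITS; the bit value is illustrative. */
#define TOY_NEED_UPDATE_LIMITS (1u << 0)

struct toy_policy {
	unsigned int cur;           /* current frequency */
	unsigned int min, max;      /* policy limits */
	unsigned int driver_flags;
	unsigned int driver_calls;  /* times the driver callback was reached */
};

/* Returns 1 if the driver callback ran, 0 if the request was skipped. */
static int toy_driver_target(struct toy_policy *p, unsigned int target_freq)
{
	assert(p->min <= p->max);

	/* Clamp the request to the policy limits. */
	if (target_freq < p->min)
		target_freq = p->min;
	if (target_freq > p->max)
		target_freq = p->max;

	/* Without the flag, an unchanged target never reaches the driver. */
	if (target_freq == p->cur &&
	    !(p->driver_flags & TOY_NEED_UPDATE_LIMITS))
		return 0;

	p->driver_calls++;  /* the driver can refresh its internal limits here */
	p->cur = target_freq;
	return 1;
}
```

      With the flag set, a request for the current frequency still reaches the
      driver, which is the behavior the new flag is meant to enable.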
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
  8. 27 Oct, 2020 1 commit
    • P
      RDMA/mlx5: Fix devlink deadlock on net namespace deletion · fbdd0049
      Parav Pandit committed
      When a mlx5 core devlink instance is reloaded in a different net namespace,
      its associated IB device is deleted and recreated.
      
      Example sequence is:
      $ ip netns add foo
      $ devlink dev reload pci/0000:00:08.0 netns foo
      $ ip netns del foo
      
      The mlx5 IB device needs to attach and detach the netdevice to it through the
      netdev notifier chain during the load and unload sequence.  Below is the call
      graph of the unload flow.
      
      cleanup_net()
         down_read(&pernet_ops_rwsem); <- first sem acquired
           ops_pre_exit_list()
             pre_exit()
               devlink_pernet_pre_exit()
                 devlink_reload()
                   mlx5_devlink_reload_down()
                     mlx5_unload_one()
                     [...]
                       mlx5_ib_remove()
                         mlx5_ib_unbind_slave_port()
                           mlx5_remove_netdev_notifier()
                             unregister_netdevice_notifier()
                               down_write(&pernet_ops_rwsem); <- recursive lock
      
      Hence, when a net namespace is deleted, the mlx5 reload results in a deadlock.

      When the deadlock occurs, the devlink mutex is also held. This not only
      deadlocks the mlx5 device under reload, but also all processes that attempt
      to access unrelated devlink devices.

      Hence, fix this by having the mlx5 ib driver register a per net netdev
      notifier instead of the global one, which operates on the net namespace
      without holding the pernet_ops_rwsem.
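
      The recursive acquisition in the call graph can be illustrated with a
      single-threaded toy reader/writer semaphore. This is a sketch under
      stated assumptions: struct toy_rwsem and the toy_*() helpers are
      hypothetical and use non-blocking "try" operations so that the conflict
      shows up as a failed attempt instead of a hang.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reader/writer semaphore state. */
struct toy_rwsem {
	int readers;
	bool writer;
};

/* Readers may enter as long as no writer holds the semaphore. */
static bool toy_down_read_trylock(struct toy_rwsem *s)
{
	if (s->writer)
		return false;
	s->readers++;
	return true;
}

/* A writer needs exclusive access: no readers and no other writer. */
static bool toy_down_write_trylock(struct toy_rwsem *s)
{
	if (s->writer || s->readers > 0)
		return false;
	s->writer = true;
	return true;
}

static void toy_up_read(struct toy_rwsem *s)
{
	assert(s->readers > 0);
	s->readers--;
}
```

      A blocking write acquisition made while the same context still holds
      the read side would wait forever, which corresponds to the deadlock
      described above.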
      
      Fixes: 4383cfcc ("net/mlx5: Add devlink reload")
      Link: https://lore.kernel.org/r/20201026134359.23150-1-parav@nvidia.com
      Signed-off-by: Parav Pandit <parav@nvidia.com>
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
  9. 26 Oct, 2020 3 commits
  10. 25 Oct, 2020 2 commits
    • W
      random32: add noise from network and scheduling activity · 3744741a
      Willy Tarreau committed
      With the removal of the interrupt perturbations in the previous random32
      change (random32: make prandom_u32() output unpredictable), the PRNG
      has become 100% deterministic again. While SipHash is expected to be
      way more robust against brute force than the previous Tausworthe LFSR,
      there's still the risk that whoever has even one temporary access to
      the PRNG's internal state is able to predict all subsequent draws till
      the next reseed (roughly every minute). This may happen through a side
      channel attack or any data leak.
      
      This patch restores the spirit of commit f227e3ec ("random32: update
      the net random state on interrupt and activity") in that it will perturb
      the PRNG's internal state using externally collected noise, except that
      it will not pick that noise from the random pool's bits nor upon
      interrupt, but will rather combine a few elements along the Tx path
      that are collectively hard to predict, such as dev, skb and txq
      pointers, packet length and jiffies values. These ones are combined
      using a single round of SipHash into a single long variable that is
      mixed with the net_rand_state upon each invocation.
      
      The operation was inlined because it produces very small and efficient
      code, typically 3 xor, 2 add and 2 rol. The performance was measured
      to be the same as (even very slightly better than) before the switch to
      SipHash; on a 6-core 12-thread Core i7-8700k equipped with a 40G NIC
      (i40e), the connection rate dropped from 556k/s to 555k/s while the
      SYN cookie rate grew from 5.38 Mpps to 5.45 Mpps.
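
      The mixing step can be sketched in userspace as follows. This is an
      illustration, not the patch's code: toy_noise and toy_add_noise() are
      hypothetical names, the noise word is a single global here, and the
      sketch runs one full standard SipHash round, so the operation count
      quoted above does not apply to it.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 64-bit word left by r bits (0 < r < 64). */
static inline uint64_t rol64(uint64_t x, unsigned int r)
{
	assert(r > 0 && r < 64);
	return (x << r) | (x >> (64 - r));
}

/* One standard SipHash round over four 64-bit words. */
static void sipround(uint64_t *v0, uint64_t *v1, uint64_t *v2, uint64_t *v3)
{
	*v0 += *v1; *v1 = rol64(*v1, 13); *v1 ^= *v0; *v0 = rol64(*v0, 32);
	*v2 += *v3; *v3 = rol64(*v3, 16); *v3 ^= *v2;
	*v0 += *v3; *v3 = rol64(*v3, 21); *v3 ^= *v0;
	*v2 += *v1; *v1 = rol64(*v1, 17); *v1 ^= *v2; *v2 = rol64(*v2, 32);
}

/* Hypothetical noise accumulator; a single global in this sketch. */
static uint64_t toy_noise;

/* Fold four hard-to-predict words (e.g. pointers, length, jiffies) in. */
static void toy_add_noise(uint64_t a, uint64_t b, uint64_t c, uint64_t d)
{
	a ^= toy_noise;            /* chain in the previous noise value */
	sipround(&a, &b, &c, &d);
	toy_noise = d;             /* keep one word as the new noise value */
}
```

      Each call depends on the previous noise value, so the accumulated word
      reflects the whole history of inputs rather than only the latest one.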
      
      Link: https://lore.kernel.org/netdev/20200808152628.GA27941@SDF.ORG/
      Cc: George Spelvin <lkml@sdf.org>
      Cc: Amit Klein <aksecurity@gmail.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: tytso@mit.edu
      Cc: Florian Westphal <fw@strlen.de>
      Cc: Marc Plumb <lkml.mplumb@gmail.com>
      Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • G
      random32: make prandom_u32() output unpredictable · c51f8f88
      George Spelvin committed
      Non-cryptographic PRNGs may have great statistical properties, but
      are usually trivially predictable to someone who knows the algorithm,
      given a small sample of their output.  An LFSR like prandom_u32() is
      particularly simple, even if the sample is widely scattered bits.
      
      It turns out the network stack uses prandom_u32() for some things like
      random port numbers which it would prefer are *not* trivially predictable.
      Predictability led to a practical DNS spoofing attack.  Oops.
      
      This patch replaces the LFSR with a homebrew cryptographic PRNG based
      on the SipHash round function, which is in turn seeded with 128 bits
      of strong random key.  (The authors of SipHash have *not* been consulted
      about this abuse of their algorithm.)  Speed is prioritized over security;
      attacks are rare, while performance is always wanted.
      
      Replacing all callers of prandom_u32() is the quick fix.
      Whether to reinstate a weaker PRNG for uses which can tolerate it
      is an open question.
      
      Commit f227e3ec ("random32: update the net random state on interrupt
      and activity") was an earlier attempt at a solution.  This patch replaces
      it.
      Reported-by: Amit Klein <aksecurity@gmail.com>
      Cc: Willy Tarreau <w@1wt.eu>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: tytso@mit.edu
      Cc: Florian Westphal <fw@strlen.de>
      Cc: Marc Plumb <lkml.mplumb@gmail.com>
      Fixes: f227e3ec ("random32: update the net random state on interrupt and activity")
      Signed-off-by: George Spelvin <lkml@sdf.org>
      Link: https://lore.kernel.org/netdev/20200808152628.GA27941@SDF.ORG/
      [ willy: partial reversal of f227e3ec; moved SIPROUND definitions
        to prandom.h for later use; merged George's prandom_seed() proposal;
        inlined siprand_u32(); replaced the net_rand_state[] array with 4
        members to fix a build issue; cosmetic cleanups to make checkpatch
        happy; fixed RANDOM32_SELFTEST build ]
      Signed-off-by: Willy Tarreau <w@1wt.eu>
  11. 23 Oct, 2020 4 commits
  12. 22 Oct, 2020 2 commits