1. 27 Jul, 2019 3 commits
  2. 26 Jul, 2019 2 commits
    • L
      Merge tag 'pm-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 6789f873
      Linus Torvalds committed
      Pull power management fixes from Rafael Wysocki
       "These fix two issues related to the RAPL MMIO interface support added
        recently and one cpufreq driver issue.
      
        Specifics:
      
         - Initialize the power capping subsystem and the RAPL driver earlier
           in case the int340X thermal driver is built-in and attempts to
           register an MMIO interface for RAPL which must not happen before
           the requisite infrastructure is ready (Zhang Rui)
      
         - Fix the int340X thermal driver's RAPL MMIO interface registration
           error path (Rafael Wysocki)
      
         - Fix possible use-after-free in the pasemi cpufreq driver (Wen
           Yang)"
      
      * tag 'pm-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        cpufreq/pasemi: fix use-after-free in pas_cpufreq_cpu_init()
        int340X/processor_thermal_device: Fix proc_thermal_rapl_remove()
        powercap: Invoke powercap_init() and rapl_init() earlier
      6789f873
    • L
      Merge tag 'riscv/for-v5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · a51edf75
      Linus Torvalds committed
      Pull RISC-V updates from Paul Walmsley:
       "Four minor RISC-V-related changes:
      
         - Add support for the new clone3 syscall for RV64, relying on the
           generic support
      
         - Add DT data for the gigabit Ethernet controller on the SiFive FU540
           and the HiFive Unleashed board
      
         - Update MAINTAINERS to add me to the arch/riscv maintainers' list
      
         - Add support for PCIe message-signaled interrupts by reusing the
           generic header file"
      
      * tag 'riscv/for-v5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        riscv: dts: Add DT node for SiFive FU540 Ethernet controller driver
        riscv: include generic support for MSI irqdomains
        MAINTAINERS: Add Paul as a RISC-V maintainer
        riscv: enable sys_clone3 syscall for rv64
      a51edf75
  3. 25 Jul, 2019 8 commits
    • L
      Merge tag 'ktest-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest · da3cc2e6
      Linus Torvalds committed
      Pull ktest fixlets from Steven Rostedt:
       "This contains only simple spelling fixes"
      
      * tag 'ktest-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest:
        ktest: Fix some typos in config-bisect.pl
      da3cc2e6
    • L
      Merge branch 'access-creds' · a29a0a46
      Linus Torvalds committed
      The access() (and faccessat()) credentials change can cause an
      unnecessary load on the RCU machinery because every access() call ends
      up freeing the temporary access credential using RCU.
      
      This isn't really noticeable on small machines, but if you have hundreds
      of cores you can cause huge slowdowns due to RCU storms.
      
      It's easy to avoid: the temporary access credentials aren't actually
      normally accessed using RCU at all, so we can avoid the whole issue by
      just marking them as such.
      
      * access-creds:
        access: avoid the RCU grace period for the temporary subjective credentials
      a29a0a46
    • R
      Merge branch 'pm-cpufreq' · fdc75701
      Rafael J. Wysocki committed
      * pm-cpufreq:
        cpufreq/pasemi: fix use-after-free in pas_cpufreq_cpu_init()
      fdc75701
    • M
      aecea57f
    • L
      access: avoid the RCU grace period for the temporary subjective credentials · d7852fbd
      Linus Torvalds committed
      It turns out that 'access()' (and 'faccessat()') can cause a lot of RCU
      work because it installs a temporary credential that gets allocated and
      freed for each system call.
      
      The allocation and freeing overhead is mostly benign, but because
      credentials can be accessed under the RCU read lock, the freeing
      involves an RCU grace period.

      Which is not a huge deal normally, but if you have a lot of access()
      calls, this causes a fair amount of secondary damage: instead of having a
      nice alloc/free pattern that hits in hot per-CPU slab caches, you have
      all those delayed frees, and on big machines with hundreds of cores,
      the RCU overhead can end up being enormous.
      
      But it turns out that all of this is entirely unnecessary.  Exactly
      because access() only installs the credential as the thread-local
      subjective credential, the temporary cred pointer doesn't actually need
      to be RCU free'd at all.  Once we're done using it, we can just free it
      synchronously and avoid all the RCU overhead.
      
      So add a 'non_rcu' flag to 'struct cred', which can be set by users that
      know they only use it in non-RCU context (there are other potential
      users for this).  We can make it a union with the rcu freeing list head
      that we need for the RCU case, so this doesn't need any extra storage.
      
      Note that this also makes 'get_current_cred()' clear the new non_rcu
      flag, in case we have filesystems that take a long-term reference to the
      cred and then expect the RCU delayed freeing afterwards.  It's not
      entirely clear that this is required, but it makes for clear semantics:
      the subjective cred remains non-RCU as long as you only access it
      synchronously using the thread-local accessors, but you _can_ use it as
      a generic cred if you want to.
      
      It is possible that we should just remove the whole RCU markings for
      ->cred entirely.  Only ->real_cred is really supposed to be accessed
      through RCU, and the long-term cred copies that nfs uses might want to
      explicitly re-enable RCU freeing if required, rather than have
      get_current_cred() do it implicitly.
      
      But this is a "minimal semantic changes" change for the immediate
      problem.
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Paul E. McKenney <paulmck@linux.ibm.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Jan Glauber <jglauber@marvell.com>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Jayachandran Chandrasekharan Nair <jnair@marvell.com>
      Cc: Greg KH <greg@kroah.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d7852fbd
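The storage-sharing idea described in the commit above can be sketched in plain C. This is a minimal user-space illustration, not the kernel's `struct cred`: every `demo_` name is hypothetical, the RCU head is a stand-in type, and the deferred path is only counted rather than actually delayed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct rcu_head. */
struct demo_rcu_head {
    struct demo_rcu_head *next;
    void (*func)(struct demo_rcu_head *head);
};

/* Simplified credential: the flag shares storage with the RCU list head,
 * so adding it does not grow the structure. */
struct demo_cred {
    int usage;
    union {
        int non_rcu;               /* set by users that never expose the cred via RCU */
        struct demo_rcu_head rcu;  /* only needed once the cred is queued for RCU freeing */
    };
};

/* Counts frees that a real implementation would defer past a grace period. */
int demo_deferred_frees;

struct demo_cred *demo_cred_alloc(int non_rcu)
{
    struct demo_cred *cred = calloc(1, sizeof(*cred));

    assert(cred != NULL);
    cred->usage = 1;
    cred->non_rcu = non_rcu;
    return cred;
}

void demo_put_cred(struct demo_cred *cred)
{
    assert(cred != NULL && cred->usage > 0);
    if (--cred->usage > 0)
        return;
    if (!cred->non_rcu)
        demo_deferred_frees++;  /* the kernel would queue an RCU callback here */
    free(cred);                 /* this sketch always frees immediately */
}
```

A temporary credential created with `demo_cred_alloc(1)` is released without touching the deferred counter, while one created with `demo_cred_alloc(0)` goes through the path that would wait for a grace period in a real implementation.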
    • L
      Merge tag 'powerpc-5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · bed38c3e
      Linus Torvalds committed
      Pull powerpc fixes from Michael Ellerman:
       "An assortment of non-regression fixes that have accumulated since the
        start of the merge window.
      
         - A fix for a user triggerable oops on machines where transactional
           memory is disabled, eg. Power9 bare metal, Power8 with TM disabled
           on the command line, or all Power7 or earlier machines.
      
         - Three fixes for handling of PMU and power saving registers when
           running nested KVM on Power9.
      
         - Two fixes for bugs found while stress testing the XIVE interrupt
           controller code, also on Power9.
      
         - A fix to allow guests to boot under Qemu/KVM on Power9 using the
           Hash MMU with >= 1TB of memory.
      
         - Two fixes for bugs in the recent DMA cleanup, one of which could
           lead to checkstops.
      
         - And finally three fixes for the PAPR SCM nvdimm driver.
      
        Thanks to: Alexey Kardashevskiy, Andrea Arcangeli, Cédric Le Goater,
        Christoph Hellwig, David Gibson, Gautham R. Shenoy, Michael Neuling,
        Oliver O'Halloran, Satheesh Rajendran, Shawn Anastasio, Suraj Jitindar
        Singh, Vaibhav Jain"
      
      * tag 'powerpc-5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/papr_scm: Force a scm-unbind if initial scm-bind fails
        powerpc/papr_scm: Update drc_pmem_unbind() to use H_SCM_UNBIND_ALL
        powerpc/pseries: Update SCM hcall op-codes in hvcall.h
        powerpc/tm: Fix oops on sigreturn on systems without TM
        powerpc/dma: Fix invalid DMA mmap behavior
        KVM: PPC: Book3S HV: XIVE: fix rollback when kvmppc_xive_create fails
        powerpc/xive: Fix loop exit-condition in xive_find_target_in_mask()
        powerpc: fix off by one in max_zone_pfn initialization for ZONE_DMA
        KVM: PPC: Book3S HV: Save and restore guest visible PSSCR bits on pseries
        powerpc/pmu: Set pmcregs_in_use in paca when running as LPAR
        KVM: PPC: Book3S HV: Always save guest pmu for guest capable of nesting
        powerpc/mm: Limit rma_size to 1TB when running without HV mode
      bed38c3e
    • L
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 76260774
      Linus Torvalds committed
      Pull KVM fixes from Paolo Bonzini:
       "Bugfixes, a pvspinlock optimization, and documentation moving"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: X86: Boost queue head vCPU to mitigate lock waiter preemption
        Documentation: move Documentation/virtual to Documentation/virt
        KVM: nVMX: Set cached_vmcs12 and cached_shadow_vmcs12 NULL after free
        KVM: X86: Dynamically allocate user_fpu
        KVM: X86: Fix fpu state crash in kvm guest
        Revert "kvm: x86: Use task structs fpu field for user"
        KVM: nVMX: Clear pending KVM_REQ_GET_VMCS12_PAGES when leaving nested
      76260774
    • L
      Merge tag 'dma-mapping-5.3-2' of git://git.infradead.org/users/hch/dma-mapping · c2626876
      Linus Torvalds committed
      Pull dma-mapping regression fix from Christoph Hellwig:
       "Ensure that dma_addressing_limited doesn't crash on devices without a
        dma mask (Eric Auger)"
      
      * tag 'dma-mapping-5.3-2' of git://git.infradead.org/users/hch/dma-mapping:
        dma-mapping: use dma_get_mask in dma_addressing_limited
      c2626876
  4. 24 Jul, 2019 3 commits
    • W
      KVM: X86: Boost queue head vCPU to mitigate lock waiter preemption · 266e85a5
      Wanpeng Li committed
      Commit 11752adb (locking/pvqspinlock: Implement hybrid PV queued/unfair locks)
      introduces hybrid PV queued/unfair locks:
       - queued mode (no starvation)
       - unfair mode (good performance on a lock that is not heavily contended)
      A lock waiter goes into unfair mode especially in VMs with over-committed
      vCPUs, since increasing over-commitment increases the likelihood that the
      queue head vCPU has been preempted and is not actively spinning.

      However, promptly rescheduling the queue head vCPU so that it can acquire
      the lock still gives better performance than depending on lock stealing
      alone in over-subscribed scenarios.
      
      Testing on 80 HT 2 socket Xeon Skylake server, with 80 vCPUs VM 80GB RAM:
      ebizzy -M
                   vanilla     boosting    improved
       1VM          23520        25040         6%
       2VM           8000        13600        70%
       3VM           3100         5400        74%
      
      On unlock, the lock holder vCPU yields to the queue head vCPU. This boosts
      a queue head vCPU that was either involuntarily preempted or voluntarily
      halted after failing to acquire the lock following a short spin in the guest.
      
      Cc: Waiman Long <longman@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      266e85a5
    • C
      Documentation: move Documentation/virtual to Documentation/virt · 2f5947df
      Christoph Hellwig committed
      Renaming docs seems to be en vogue at the moment, so fix one of the
      grossly misnamed directories.  We usually never use "virtual" as
      a shortcut for virtualization in the kernel, but always virt,
      as seen in the virt/ top-level directory.  Fix up the documentation
      to match that.
      
      Fixes: ed16648e ("Move kvm, uml, and lguest subdirectories under a common "virtual" directory, I.E:")
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      2f5947df
    • L
      Merge branch 'parisc-5.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · ad5e427e
      Linus Torvalds committed
      Pull parisc fixes from Helge Deller:
      
       - Fix build issues when kprobes are enabled
      
       - Speed up ITLB/DTLB cache flushes when running on machines with
         combined TLBs
      
      * 'parisc-5.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
        parisc: Flush ITLB in flush_tlb_all_local() only on split TLB machines
        parisc: add kprobe_fault_handler()
      ad5e427e
  5. 23 Jul, 2019 13 commits
  6. 22 Jul, 2019 11 commits
    • S
      iommu/vt-d: Print pasid table entries MSB to LSB in debugfs · 7f6cade5
      Sai Praneeth Prakhya committed
      Commit dd5142ca ("iommu/vt-d: Add debugfs support to show scalable mode
      DMAR table internals") prints the content of pasid table entries from LSB to
      MSB, whereas other entries are printed MSB to LSB. So, to maintain
      uniformity among all entries and to not confuse the user, print MSB first.
      
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Lu Baolu <baolu.lu@linux.intel.com>
      Cc: Sohil Mehta <sohil.mehta@intel.com>
      Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Fixes: dd5142ca ("iommu/vt-d: Add debugfs support to show scalable mode DMAR table internals")
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      7f6cade5
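As a rough illustration of what "MSB first" means for a multi-word entry, the sketch below formats an entry with the highest-indexed word first. It assumes an entry made of three 64-bit words where index 0 holds the least significant bits; the function name and the layout are assumptions for illustration, not the driver's actual debugfs code.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

/* Formats a three-word entry with the most significant word first.
 * Assumes val[0] is the least significant word (illustrative layout). */
int demo_format_entry(char *buf, size_t len, const uint64_t val[3])
{
    assert(buf != NULL && val != NULL);
    return snprintf(buf, len,
                    "0x%016" PRIx64 ":0x%016" PRIx64 ":0x%016" PRIx64,
                    val[2], val[1], val[0]);
}
```

Printing every table with the same word order lets a reader compare entries column by column without mentally reversing one of them.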
    • C
      iommu/iova: Remove stale cached32_node · 9eed17d3
      Chris Wilson committed
      Since the cached32_node is allowed to be advanced above dma_32bit_pfn
      (to provide a shortcut into the limited range), we need to be careful to
      remove the to be freed node if it is the cached32_node.
      
      [   48.477773] BUG: KASAN: use-after-free in __cached_rbnode_delete_update+0x68/0x110
      [   48.477812] Read of size 8 at addr ffff88870fc19020 by task kworker/u8:1/37
      [   48.477843]
      [   48.477879] CPU: 1 PID: 37 Comm: kworker/u8:1 Tainted: G     U            5.2.0+ #735
      [   48.477915] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017
      [   48.478047] Workqueue: i915 __i915_gem_free_work [i915]
      [   48.478075] Call Trace:
      [   48.478111]  dump_stack+0x5b/0x90
      [   48.478137]  print_address_description+0x67/0x237
      [   48.478178]  ? __cached_rbnode_delete_update+0x68/0x110
      [   48.478212]  __kasan_report.cold.3+0x1c/0x38
      [   48.478240]  ? __cached_rbnode_delete_update+0x68/0x110
      [   48.478280]  ? __cached_rbnode_delete_update+0x68/0x110
      [   48.478308]  __cached_rbnode_delete_update+0x68/0x110
      [   48.478344]  private_free_iova+0x2b/0x60
      [   48.478378]  iova_magazine_free_pfns+0x46/0xa0
      [   48.478403]  free_iova_fast+0x277/0x340
      [   48.478443]  fq_ring_free+0x15a/0x1a0
      [   48.478473]  queue_iova+0x19c/0x1f0
      [   48.478597]  cleanup_page_dma.isra.64+0x62/0xb0 [i915]
      [   48.478712]  __gen8_ppgtt_cleanup+0x63/0x80 [i915]
      [   48.478826]  __gen8_ppgtt_cleanup+0x42/0x80 [i915]
      [   48.478940]  __gen8_ppgtt_clear+0x433/0x4b0 [i915]
      [   48.479053]  __gen8_ppgtt_clear+0x462/0x4b0 [i915]
      [   48.479081]  ? __sg_free_table+0x9e/0xf0
      [   48.479116]  ? kfree+0x7f/0x150
      [   48.479234]  i915_vma_unbind+0x1e2/0x240 [i915]
      [   48.479352]  i915_vma_destroy+0x3a/0x280 [i915]
      [   48.479465]  __i915_gem_free_objects+0xf0/0x2d0 [i915]
      [   48.479579]  __i915_gem_free_work+0x41/0xa0 [i915]
      [   48.479607]  process_one_work+0x495/0x710
      [   48.479642]  worker_thread+0x4c7/0x6f0
      [   48.479687]  ? process_one_work+0x710/0x710
      [   48.479724]  kthread+0x1b2/0x1d0
      [   48.479774]  ? kthread_create_worker_on_cpu+0xa0/0xa0
      [   48.479820]  ret_from_fork+0x1f/0x30
      [   48.479864]
      [   48.479907] Allocated by task 631:
      [   48.479944]  save_stack+0x19/0x80
      [   48.479994]  __kasan_kmalloc.constprop.6+0xc1/0xd0
      [   48.480038]  kmem_cache_alloc+0x91/0xf0
      [   48.480082]  alloc_iova+0x2b/0x1e0
      [   48.480125]  alloc_iova_fast+0x58/0x376
      [   48.480166]  intel_alloc_iova+0x90/0xc0
      [   48.480214]  intel_map_sg+0xde/0x1f0
      [   48.480343]  i915_gem_gtt_prepare_pages+0xb8/0x170 [i915]
      [   48.480465]  huge_get_pages+0x232/0x2b0 [i915]
      [   48.480590]  ____i915_gem_object_get_pages+0x40/0xb0 [i915]
      [   48.480712]  __i915_gem_object_get_pages+0x90/0xa0 [i915]
      [   48.480834]  i915_gem_object_prepare_write+0x2d6/0x330 [i915]
      [   48.480955]  create_test_object.isra.54+0x1a9/0x3e0 [i915]
      [   48.481075]  igt_shared_ctx_exec+0x365/0x3c0 [i915]
      [   48.481210]  __i915_subtests.cold.4+0x30/0x92 [i915]
      [   48.481341]  __run_selftests.cold.3+0xa9/0x119 [i915]
      [   48.481466]  i915_live_selftests+0x3c/0x70 [i915]
      [   48.481583]  i915_pci_probe+0xe7/0x220 [i915]
      [   48.481620]  pci_device_probe+0xe0/0x180
      [   48.481665]  really_probe+0x163/0x4e0
      [   48.481710]  device_driver_attach+0x85/0x90
      [   48.481750]  __driver_attach+0xa5/0x180
      [   48.481796]  bus_for_each_dev+0xda/0x130
      [   48.481831]  bus_add_driver+0x205/0x2e0
      [   48.481882]  driver_register+0xca/0x140
      [   48.481927]  do_one_initcall+0x6c/0x1af
      [   48.481970]  do_init_module+0x106/0x350
      [   48.482010]  load_module+0x3d2c/0x3ea0
      [   48.482058]  __do_sys_finit_module+0x110/0x180
      [   48.482102]  do_syscall_64+0x62/0x1f0
      [   48.482147]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [   48.482190]
      [   48.482224] Freed by task 37:
      [   48.482273]  save_stack+0x19/0x80
      [   48.482318]  __kasan_slab_free+0x12e/0x180
      [   48.482363]  kmem_cache_free+0x70/0x140
      [   48.482406]  __free_iova+0x1d/0x30
      [   48.482445]  fq_ring_free+0x15a/0x1a0
      [   48.482490]  queue_iova+0x19c/0x1f0
      [   48.482624]  cleanup_page_dma.isra.64+0x62/0xb0 [i915]
      [   48.482749]  __gen8_ppgtt_cleanup+0x63/0x80 [i915]
      [   48.482873]  __gen8_ppgtt_cleanup+0x42/0x80 [i915]
      [   48.482999]  __gen8_ppgtt_clear+0x433/0x4b0 [i915]
      [   48.483123]  __gen8_ppgtt_clear+0x462/0x4b0 [i915]
      [   48.483250]  i915_vma_unbind+0x1e2/0x240 [i915]
      [   48.483378]  i915_vma_destroy+0x3a/0x280 [i915]
      [   48.483500]  __i915_gem_free_objects+0xf0/0x2d0 [i915]
      [   48.483622]  __i915_gem_free_work+0x41/0xa0 [i915]
      [   48.483659]  process_one_work+0x495/0x710
      [   48.483704]  worker_thread+0x4c7/0x6f0
      [   48.483748]  kthread+0x1b2/0x1d0
      [   48.483787]  ret_from_fork+0x1f/0x30
      [   48.483831]
      [   48.483868] The buggy address belongs to the object at ffff88870fc19000
      [   48.483868]  which belongs to the cache iommu_iova of size 40
      [   48.483920] The buggy address is located 32 bytes inside of
      [   48.483920]  40-byte region [ffff88870fc19000, ffff88870fc19028)
      [   48.483964] The buggy address belongs to the page:
      [   48.484006] page:ffffea001c3f0600 refcount:1 mapcount:0 mapping:ffff8888181a91c0 index:0x0 compound_mapcount: 0
      [   48.484045] flags: 0x8000000000010200(slab|head)
      [   48.484096] raw: 8000000000010200 ffffea001c421a08 ffffea001c447e88 ffff8888181a91c0
      [   48.484141] raw: 0000000000000000 0000000000120012 00000001ffffffff 0000000000000000
      [   48.484188] page dumped because: kasan: bad access detected
      [   48.484230]
      [   48.484265] Memory state around the buggy address:
      [   48.484314]  ffff88870fc18f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [   48.484361]  ffff88870fc18f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [   48.484406] >ffff88870fc19000: fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc
      [   48.484451]                                ^
      [   48.484494]  ffff88870fc19080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [   48.484530]  ffff88870fc19100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108602
      Fixes: e60aa7b5 ("iommu/iova: Extend rbtree node caching")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: <stable@vger.kernel.org> # v4.15+
      Reviewed-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      9eed17d3
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 83768245
      Linus Torvalds committed
      Pull networking fixes from David Miller:
      
       1) Several netfilter fixes including a nfnetlink deadlock fix from
          Florian Westphal and fix for dropping VRF packets from Miaohe Lin.
      
       2) Flow offload fixes from Pablo Neira Ayuso including a fix to restore
          proper block sharing.
      
       3) Fix r8169 PHY init from Thomas Voegtle.
      
       4) Fix memory leak in mac80211, from Lorenzo Bianconi.
      
       5) Missing NULL check on object allocation in cxgb4, from Navid
          Emamdoost.
      
       6) Fix scaling of RX power in sfp phy driver, from Andrew Lunn.
      
       7) Check that there is actually an ip header to access in skb->data in
          VRF, from Peter Kosyh.
      
       8) Remove spurious rcu unlock in hv_netvsc, from Haiyang Zhang.
      
       9) One more tweak to the TCP fragmentation memory limit changes, to be
          less harmful to applications setting small SO_SNDBUF values. From
          Eric Dumazet.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (40 commits)
        tcp: be more careful in tcp_fragment()
        hv_netvsc: Fix extra rcu_read_unlock in netvsc_recv_callback()
        vrf: make sure skb->data contains ip header to make routing
        connector: remove redundant input callback from cn_dev
        qed: Prefer pcie_capability_read_word()
        igc: Prefer pcie_capability_read_word()
        cxgb4: Prefer pcie_capability_read_word()
        be2net: Synchronize be_update_queues with dev_watchdog
        bnx2x: Prevent load reordering in tx completion processing
        net: phy: sfp: hwmon: Fix scaling of RX power
        net: sched: verify that q!=NULL before setting q->flags
        chelsio: Fix a typo in a function name
        allocate_flower_entry: should check for null deref
        net: hns3: typo in the name of a constant
        kbuild: add net/netfilter/nf_tables_offload.h to header-test blacklist.
        tipc: Fix a typo
        mac80211: don't warn about CW params when not using them
        mac80211: fix possible memory leak in ieee80211_assign_beacon
        nl80211: fix NL80211_HE_MAX_CAPABILITY_LEN
        nl80211: fix VENDOR_CMD_RAW_DATA
        ...
      83768245
    • D
      iommu/vt-d: Check if domain->pgd was allocated · 3ee9eca7
      Dmitry Safonov committed
      There are a couple of places where domain_exit() is called on
      domain_init() failure, while currently domain_init() can fail only if
      alloc_pgtable_page() has failed.

      Make domain_exit() check that domain->pgd is present before calling
      domain_unmap(), as it should otherwise theoretically crash when clearing
      pte entries in dma_pte_clear_level().
      
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Lu Baolu <baolu.lu@linux.intel.com>
      Cc: iommu@lists.linux-foundation.org
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      3ee9eca7
    • D
      iommu/vt-d: Don't queue_iova() if there is no flush queue · effa4678
      Dmitry Safonov committed
      The Intel VT-d driver was reworked to use the common deferred flushing
      implementation. Previously there was one global per-cpu flush queue;
      afterwards, one per domain.

      Before deferring a flush, the queue should be allocated and initialized.

      Currently only domains of type IOMMU_DOMAIN_DMA initialize their flush
      queue. It is probably worth initializing it for static or unmanaged
      domains too, but that is arguable, so I'm leaving it to the iommu folks.

      Prevent queuing an iova flush if the domain doesn't have a queue.
      The defensive check seems worth keeping even if the queue were
      initialized for all kinds of domains, and it is easy to backport.

      On the 4.19.43 stable kernel this has a user-visible effect: previously
      devices in the si domain caused crashes, for example on sata devices:
      
       BUG: spinlock bad magic on CPU#6, swapper/0/1
        lock: 0xffff88844f582008, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
       CPU: 6 PID: 1 Comm: swapper/0 Not tainted 4.19.43 #1
       Call Trace:
        <IRQ>
        dump_stack+0x61/0x7e
        spin_bug+0x9d/0xa3
        do_raw_spin_lock+0x22/0x8e
        _raw_spin_lock_irqsave+0x32/0x3a
        queue_iova+0x45/0x115
        intel_unmap+0x107/0x113
        intel_unmap_sg+0x6b/0x76
        __ata_qc_complete+0x7f/0x103
        ata_qc_complete+0x9b/0x26a
        ata_qc_complete_multiple+0xd0/0xe3
        ahci_handle_port_interrupt+0x3ee/0x48a
        ahci_handle_port_intr+0x73/0xa9
        ahci_single_level_irq_intr+0x40/0x60
        __handle_irq_event_percpu+0x7f/0x19a
        handle_irq_event_percpu+0x32/0x72
        handle_irq_event+0x38/0x56
        handle_edge_irq+0x102/0x121
        handle_irq+0x147/0x15c
        do_IRQ+0x66/0xf2
        common_interrupt+0xf/0xf
       RIP: 0010:__do_softirq+0x8c/0x2df
      
      The same for usb devices that use ehci-pci:
       BUG: spinlock bad magic on CPU#0, swapper/0/1
        lock: 0xffff88844f402008, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
       CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.43 #4
       Call Trace:
        <IRQ>
        dump_stack+0x61/0x7e
        spin_bug+0x9d/0xa3
        do_raw_spin_lock+0x22/0x8e
        _raw_spin_lock_irqsave+0x32/0x3a
        queue_iova+0x77/0x145
        intel_unmap+0x107/0x113
        intel_unmap_page+0xe/0x10
        usb_hcd_unmap_urb_setup_for_dma+0x53/0x9d
        usb_hcd_unmap_urb_for_dma+0x17/0x100
        unmap_urb_for_dma+0x22/0x24
        __usb_hcd_giveback_urb+0x51/0xc3
        usb_giveback_urb_bh+0x97/0xde
        tasklet_action_common.isra.4+0x5f/0xa1
        tasklet_action+0x2d/0x30
        __do_softirq+0x138/0x2df
        irq_exit+0x7d/0x8b
        smp_apic_timer_interrupt+0x10f/0x151
        apic_timer_interrupt+0xf/0x20
        </IRQ>
       RIP: 0010:_raw_spin_unlock_irqrestore+0x17/0x39
      
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Lu Baolu <baolu.lu@linux.intel.com>
      Cc: iommu@lists.linux-foundation.org
      Cc: <stable@vger.kernel.org> # 4.14+
      Fixes: 13cf0174 ("iommu/vt-d: Make use of iova deferred flushing")
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      effa4678
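The defensive check from the commit above can be sketched as follows. This is a simplified user-space model under stated assumptions: the `demo_` types, the fixed queue size, and the "flush counter" are all hypothetical and only mirror the idea of falling back to an immediate flush when a domain has no flush queue.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_FQ_SIZE 4  /* illustrative queue capacity */

struct demo_flush_queue {
    unsigned long pending[DEMO_FQ_SIZE];
    unsigned int count;
};

struct demo_domain {
    struct demo_flush_queue *fq;  /* NULL for domains that never set one up */
    unsigned int sync_flushes;    /* number of immediate flushes performed */
};

void demo_unmap(struct demo_domain *dom, unsigned long pfn)
{
    assert(dom != NULL);
    if (dom->fq == NULL) {
        dom->sync_flushes++;      /* no queue: flush right away */
        return;
    }
    if (dom->fq->count == DEMO_FQ_SIZE) {
        dom->fq->count = 0;       /* queue full: drain it with one flush */
        dom->sync_flushes++;
    }
    dom->fq->pending[dom->fq->count++] = pfn;
}
```

Without the NULL check, the queue-less case would dereference uninitialized queue state, which is the kind of failure the "spinlock bad magic" traces in the commit message point to.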
    • L
      iommu/vt-d: Avoid duplicated pci dma alias consideration · 55752949
      Lu Baolu committed
      As we have abandoned the home-made lazy domain allocation
      and delegated the DMA domain life cycle up to the default
      domain mechanism defined in the generic iommu layer, we
      needn't consider pci alias anymore when mapping/unmapping
      the context entries. Without this fix, we see kernel NULL
      pointer dereference during pci device hot-plug test.
      
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Cc: Kevin Tian <kevin.tian@intel.com>
      Fixes: fa954e68 ("iommu/vt-d: Delegate the dma domain to upper layer")
      Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
      Reported-and-tested-by: Xu Pengfei <pengfei.xu@intel.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      55752949
    • J
      Revert "iommu/vt-d: Consolidate domain_init() to avoid duplication" · 301e7ee1
      Joerg Roedel committed
      This reverts commit 123b2ffc.
      
      This commit reportedly caused boot failures on some systems
      and needs to be reverted for now.
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      301e7ee1
    • S
      pidfd: fix a poll race when setting exit_state · b191d649
      Suren Baghdasaryan committed
      There is a race between reading task->exit_state in pidfd_poll and
      writing it after do_notify_parent calls do_notify_pidfd. Expected
      sequence of events is:
      
      CPU 0                            CPU 1
      ------------------------------------------------
      exit_notify
        do_notify_parent
          do_notify_pidfd
        tsk->exit_state = EXIT_DEAD
                                        pidfd_poll
                                           if (tsk->exit_state)
      
      However nothing prevents the following sequence:
      
      CPU 0                            CPU 1
      ------------------------------------------------
      exit_notify
        do_notify_parent
          do_notify_pidfd
                                         pidfd_poll
                                            if (tsk->exit_state)
        tsk->exit_state = EXIT_DEAD
      
      This causes a polling task to wait forever, since poll blocks because
      exit_state is 0 and the waiting task is not notified again. A stress
      test continuously doing pidfd poll and process exits uncovered this bug.
      
      To fix it, we make sure that the task's exit_state is always set before
      calling do_notify_pidfd.
      
      Fixes: b53b0b9d ("pidfd: add polling support")
      Cc: kernel-team@android.com
      Cc: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Suren Baghdasaryan <surenb@google.com>
      Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
      Link: https://lore.kernel.org/r/20190717172100.261204-1-joel@joelfernandes.org
      [christian@brauner.io: adapt commit message and drop unneeded changes from wait_task_zombie]
      Signed-off-by: Christian Brauner <christian@brauner.io>
      b191d649
    • V
      powerpc/papr_scm: Force a scm-unbind if initial scm-bind fails · 3a855b7a
      Vaibhav Jain committed
      In some cases the initial bind of scm memory for an lpar can fail if
      it wasn't previously released using a scm-unbind hcall. This situation
      can arise due to a panic of the previous kernel or a forced lpar
      fadump. In such cases H_SCM_BIND_MEM returns an H_OVERLAP error.

      To mitigate such cases the patch updates papr_scm_probe() to force a
      call to drc_pmem_unbind() in case the initial bind of scm memory fails
      with an EBUSY error. In case the scm-bind operation fails again after
      the forced scm-unbind, we follow the existing error path. We also
      update drc_pmem_bind() to handle the H_OVERLAP error returned by phyp
      and report it as an EBUSY error back to the caller.
      Suggested-by: "Oliver O'Halloran" <oohall@gmail.com>
      Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
      Reviewed-by: Oliver O'Halloran <oohall@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20190629160610.23402-4-vaibhav@linux.ibm.com
      3a855b7a
    • V
      powerpc/papr_scm: Update drc_pmem_unbind() to use H_SCM_UNBIND_ALL · 0d7fc080
      Vaibhav Jain committed
      A new hcall named H_SCM_UNBIND_ALL has been introduced that can
      unbind all or specific scm memory assigned to an lpar. This is
      more efficient than using H_SCM_UNBIND_MEM, as currently we don't
      support partial unbind of scm memory.

      Hence this patch proposes the following changes to drc_pmem_unbind():

          * Update drc_pmem_unbind() to replace hcall H_SCM_UNBIND_MEM with
            H_SCM_UNBIND_ALL.

          * Update drc_pmem_unbind() to handle cases when PHYP asks the guest
            kernel to wait for a specific amount of time before retrying the
            hcall via the 'LONG_BUSY' return value.

          * Ensure an appropriate error code is returned from the function
            in case of an error.
      Reviewed-by: Oliver O'Halloran <oohall@gmail.com>
      Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20190629160610.23402-3-vaibhav@linux.ibm.com
      0d7fc080
    • V
      powerpc/pseries: Update SCM hcall op-codes in hvcall.h · 6d140e75
      Vaibhav Jain committed
      Update hvcall.h to include op-codes for the new hcalls introduced to
      manage SCM memory. Also update existing hcall definitions to reflect
      the current papr specification for SCM.

      The removed hcall op-codes H_SCM_MEM_QUERY, H_SCM_BLOCK_CLEAR were
      transient proposals; their support was never implemented by
      Power-VM, nor were they used anywhere in the Linux kernel. Hence we
      don't expect anyone to be impacted by this change.
      Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20190629160610.23402-2-vaibhav@linux.ibm.com
      6d140e75