1. 31 1月, 2018 4 次提交
  2. 25 1月, 2018 1 次提交
  3. 24 1月, 2018 1 次提交
    • J
      PCI: Add pci_enable_atomic_ops_to_root() · 430a2368
      Jay Cornwall 提交于
      The Atomic Operations feature (PCIe r4.0, sec 6.15) allows atomic
      transctions to be requested by, routed through and completed by PCIe
      components. Routing and completion do not require software support.
      Component support for each is detectable via the DEVCAP2 register.
      
      A Requester may use AtomicOps only if its PCI_EXP_DEVCTL2_ATOMIC_REQ is
      set. This should be set only if the Completer and all intermediate routing
      elements support AtomicOps.
      
      A concrete example is the AMD Fiji-class GPU (which is capable of making
      AtomicOp requests), below a PLX 8747 switch (advertising AtomicOp routing)
      with a Haswell host bridge (advertising AtomicOp completion support).
      
      Add pci_enable_atomic_ops_to_root() for per-device control over AtomicOp
      requests. This checks to be sure the Root Port supports completion of the
      desired AtomicOp sizes and the path to the Root Port supports routing the
      AtomicOps.
      Signed-off-by: NJay Cornwall <Jay.Cornwall@amd.com>
      Signed-off-by: NFelix Kuehling <Felix.Kuehling@amd.com>
      [bhelgaas: changelog, comments, whitespace]
      Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
      430a2368
  4. 20 1月, 2018 1 次提交
    • N
      PCI: Add dummy pci_irqd_intx_xlate() for CONFIG_PCI=n build · 80db6f08
      Niklas Cassel 提交于
      Some hardware can operate in either "host" or "endpoint" mode, which means
      there can be both a host bridge driver and an endpoint driver for the same
      device.  Those drivers share a lot of code, so sometimes they live in the
      same source file.
      
      The host bridge driver requires CONFIG_PCI=y because it enumerates PCI
      devices below the bridge using the PCI core.  The endpoint driver does not
      require CONFIG_PCI=y because it runs in an embedded kernel on the other
      side of the device, e.g., on an adapter card.
      
      pci-dra7xx.c contains both host and endpoint drivers.  If we select only
      the endpoint driver (CONFIG_PCI=n and CONFIG_PCI_DRA7XX_EP=y), the unneeded
      host driver is still compiled.  It references pci_irqd_intx_xlate(), which
      is not present when CONFIG_PCI=n, which causes this error:
      
        drivers/pci/dwc/pci-dra7xx.c:229:11: error: 'pci_irqd_intx_xlate' undeclared here (not in a function)
      
      Add a dummy pci_irqd_intx_xlate() for the CONFIG_PCI=n case.
      
      [bhelgaas: changelog]
      Signed-off-by: NNiklas Cassel <niklas.cassel@axis.com>
      Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
      Acked-by: NArnd Bergmann <arnd@arndb.de>
      Acked-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      80db6f08
  5. 19 1月, 2018 1 次提交
  6. 18 1月, 2018 2 次提交
  7. 19 12月, 2017 4 次提交
  8. 17 12月, 2017 3 次提交
  9. 15 12月, 2017 6 次提交
  10. 12 12月, 2017 3 次提交
    • M
      compiler.h: Remove ACCESS_ONCE() · b899a850
      Mark Rutland 提交于
      There are no longer any kernelspace uses of ACCESS_ONCE(), so we can
      remove the definition from <linux/compiler.h>.
      
      This patch removes the ACCESS_ONCE() definition, and updates comments
      which referred to it. At the same time, some inconsistent and redundant
      whitespace is removed from comments.
      Tested-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: apw@canonical.com
      Link: http://lkml.kernel.org/r/20171127103824.36526-4-mark.rutland@arm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      b899a850
    • I
      locking/lockdep: Remove the cross-release locking checks · e966eaee
      Ingo Molnar 提交于
      This code (CONFIG_LOCKDEP_CROSSRELEASE=y and CONFIG_LOCKDEP_COMPLETIONS=y),
      while it found a number of old bugs initially, was also causing too many
      false positives that caused people to disable lockdep - which is arguably
      a worse overall outcome.
      
      If we disable cross-release by default but keep the code upstream then
      in practice the most likely outcome is that we'll allow the situation
      to degrade gradually, by allowing entropy to introduce more and more
      false positives, until it overwhelms maintenance capacity.
      
      Another bad side effect was that people were trying to work around
      the false positives by uglifying/complicating unrelated code. There's
      a marked difference between annotating locking operations and
      uglifying good code just due to bad lock debugging code ...
      
      This gradual decrease in quality happened to a number of debugging
      facilities in the kernel, and lockdep is pretty complex already,
      so we cannot risk this outcome.
      
      Either cross-release checking can be done right with no false positives,
      or it should not be included in the upstream kernel.
      
      ( Note that it might make sense to maintain it out of tree and go through
        the false positives every now and then and see whether new bugs were
        introduced. )
      
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      e966eaee
    • W
      locking/core: Remove break_lock field when CONFIG_GENERIC_LOCKBREAK=y · d89c7035
      Will Deacon 提交于
      When CONFIG_GENERIC_LOCKBEAK=y, locking structures grow an extra int ->break_lock
      field which is used to implement raw_spin_is_contended() by setting the field
      to 1 when waiting on a lock and clearing it to zero when holding a lock.
      However, there are a few problems with this approach:
      
        - There is a write-write race between a CPU successfully taking the lock
          (and subsequently writing break_lock = 0) and a waiter waiting on
          the lock (and subsequently writing break_lock = 1). This could result
          in a contended lock being reported as uncontended and vice-versa.
      
        - On machines with store buffers, nothing guarantees that the writes
          to break_lock are visible to other CPUs at any particular time.
      
        - READ_ONCE/WRITE_ONCE are not used, so the field is potentially
          susceptible to harmful compiler optimisations,
      
      Consequently, the usefulness of this field is unclear and we'd be better off
      removing it and allowing architectures to implement raw_spin_is_contended() by
      providing a definition of arch_spin_is_contended(), as they can when
      CONFIG_GENERIC_LOCKBREAK=n.
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1511894539-7988-3-git-send-email-will.deacon@arm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      d89c7035
  11. 11 12月, 2017 2 次提交
  12. 09 12月, 2017 1 次提交
  13. 08 12月, 2017 2 次提交
  14. 07 12月, 2017 2 次提交
  15. 06 12月, 2017 2 次提交
    • E
      net: remove hlist_nulls_add_tail_rcu() · d7efc6c1
      Eric Dumazet 提交于
      Alexander Potapenko reported use of uninitialized memory [1]
      
      This happens when inserting a request socket into TCP ehash,
      in __sk_nulls_add_node_rcu(), since sk_reuseport is not initialized.
      
      Bug was added by commit d894ba18 ("soreuseport: fix ordering for
      mixed v4/v6 sockets")
      
      Note that d296ba60 ("soreuseport: Resolve merge conflict for v4/v6
      ordering fix") missed the opportunity to get rid of
      hlist_nulls_add_tail_rcu() :
      
      Both UDP sockets and TCP/DCCP listeners no longer use
      __sk_nulls_add_node_rcu() for their hash insertion.
      
      Since all other sockets have unique 4-tuple, the reuseport status
      has no special meaning, so we can always use hlist_nulls_add_head_rcu()
      for them and save few cycles/instructions.
      
      [1]
      
      ==================================================================
      BUG: KMSAN: use of uninitialized memory in inet_ehash_insert+0xd40/0x1050
      CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.13.0+ #3288
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
      Call Trace:
       <IRQ>
       __dump_stack lib/dump_stack.c:16
       dump_stack+0x185/0x1d0 lib/dump_stack.c:52
       kmsan_report+0x13f/0x1c0 mm/kmsan/kmsan.c:1016
       __msan_warning_32+0x69/0xb0 mm/kmsan/kmsan_instr.c:766
       __sk_nulls_add_node_rcu ./include/net/sock.h:684
       inet_ehash_insert+0xd40/0x1050 net/ipv4/inet_hashtables.c:413
       reqsk_queue_hash_req net/ipv4/inet_connection_sock.c:754
       inet_csk_reqsk_queue_hash_add+0x1cc/0x300 net/ipv4/inet_connection_sock.c:765
       tcp_conn_request+0x31e7/0x36f0 net/ipv4/tcp_input.c:6414
       tcp_v4_conn_request+0x16d/0x220 net/ipv4/tcp_ipv4.c:1314
       tcp_rcv_state_process+0x42a/0x7210 net/ipv4/tcp_input.c:5917
       tcp_v4_do_rcv+0xa6a/0xcd0 net/ipv4/tcp_ipv4.c:1483
       tcp_v4_rcv+0x3de0/0x4ab0 net/ipv4/tcp_ipv4.c:1763
       ip_local_deliver_finish+0x6bb/0xcb0 net/ipv4/ip_input.c:216
       NF_HOOK ./include/linux/netfilter.h:248
       ip_local_deliver+0x3fa/0x480 net/ipv4/ip_input.c:257
       dst_input ./include/net/dst.h:477
       ip_rcv_finish+0x6fb/0x1540 net/ipv4/ip_input.c:397
       NF_HOOK ./include/linux/netfilter.h:248
       ip_rcv+0x10f6/0x15c0 net/ipv4/ip_input.c:488
       __netif_receive_skb_core+0x36f6/0x3f60 net/core/dev.c:4298
       __netif_receive_skb net/core/dev.c:4336
       netif_receive_skb_internal+0x63c/0x19c0 net/core/dev.c:4497
       napi_skb_finish net/core/dev.c:4858
       napi_gro_receive+0x629/0xa50 net/core/dev.c:4889
       e1000_receive_skb drivers/net/ethernet/intel/e1000/e1000_main.c:4018
       e1000_clean_rx_irq+0x1492/0x1d30
      drivers/net/ethernet/intel/e1000/e1000_main.c:4474
       e1000_clean+0x43aa/0x5970 drivers/net/ethernet/intel/e1000/e1000_main.c:3819
       napi_poll net/core/dev.c:5500
       net_rx_action+0x73c/0x1820 net/core/dev.c:5566
       __do_softirq+0x4b4/0x8dd kernel/softirq.c:284
       invoke_softirq kernel/softirq.c:364
       irq_exit+0x203/0x240 kernel/softirq.c:405
       exiting_irq+0xe/0x10 ./arch/x86/include/asm/apic.h:638
       do_IRQ+0x15e/0x1a0 arch/x86/kernel/irq.c:263
       common_interrupt+0x86/0x86
      
      Fixes: d894ba18 ("soreuseport: fix ordering for mixed v4/v6 sockets")
      Fixes: d296ba60 ("soreuseport: Resolve merge conflict for v4/v6 ordering fix")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NAlexander Potapenko <glider@google.com>
      Acked-by: NCraig Gallek <kraig@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d7efc6c1
    • R
      x86,kvm: move qemu/guest FPU switching out to vcpu_run · f775b13e
      Rik van Riel 提交于
      Currently, every time a VCPU is scheduled out, the host kernel will
      first save the guest FPU/xstate context, then load the qemu userspace
      FPU context, only to then immediately save the qemu userspace FPU
      context back to memory. When scheduling in a VCPU, the same extraneous
      FPU loads and saves are done.
      
      This could be avoided by moving from a model where the guest FPU is
      loaded and stored with preemption disabled, to a model where the
      qemu userspace FPU is swapped out for the guest FPU context for
      the duration of the KVM_RUN ioctl.
      
      This is done under the VCPU mutex, which is also taken when other
      tasks inspect the VCPU FPU context, so the code should already be
      safe for this change. That should come as no surprise, given that
      s390 already has this optimization.
      
      This can fix a bug where KVM calls get_user_pages while owning the
      FPU, and the file system ends up requesting the FPU again:
      
          [258270.527947]  __warn+0xcb/0xf0
          [258270.527948]  warn_slowpath_null+0x1d/0x20
          [258270.527951]  kernel_fpu_disable+0x3f/0x50
          [258270.527953]  __kernel_fpu_begin+0x49/0x100
          [258270.527955]  kernel_fpu_begin+0xe/0x10
          [258270.527958]  crc32c_pcl_intel_update+0x84/0xb0
          [258270.527961]  crypto_shash_update+0x3f/0x110
          [258270.527968]  crc32c+0x63/0x8a [libcrc32c]
          [258270.527975]  dm_bm_checksum+0x1b/0x20 [dm_persistent_data]
          [258270.527978]  node_prepare_for_write+0x44/0x70 [dm_persistent_data]
          [258270.527985]  dm_block_manager_write_callback+0x41/0x50 [dm_persistent_data]
          [258270.527988]  submit_io+0x170/0x1b0 [dm_bufio]
          [258270.527992]  __write_dirty_buffer+0x89/0x90 [dm_bufio]
          [258270.527994]  __make_buffer_clean+0x4f/0x80 [dm_bufio]
          [258270.527996]  __try_evict_buffer+0x42/0x60 [dm_bufio]
          [258270.527998]  dm_bufio_shrink_scan+0xc0/0x130 [dm_bufio]
          [258270.528002]  shrink_slab.part.40+0x1f5/0x420
          [258270.528004]  shrink_node+0x22c/0x320
          [258270.528006]  do_try_to_free_pages+0xf5/0x330
          [258270.528008]  try_to_free_pages+0xe9/0x190
          [258270.528009]  __alloc_pages_slowpath+0x40f/0xba0
          [258270.528011]  __alloc_pages_nodemask+0x209/0x260
          [258270.528014]  alloc_pages_vma+0x1f1/0x250
          [258270.528017]  do_huge_pmd_anonymous_page+0x123/0x660
          [258270.528021]  handle_mm_fault+0xfd3/0x1330
          [258270.528025]  __get_user_pages+0x113/0x640
          [258270.528027]  get_user_pages+0x4f/0x60
          [258270.528063]  __gfn_to_pfn_memslot+0x120/0x3f0 [kvm]
          [258270.528108]  try_async_pf+0x66/0x230 [kvm]
          [258270.528135]  tdp_page_fault+0x130/0x280 [kvm]
          [258270.528149]  kvm_mmu_page_fault+0x60/0x120 [kvm]
          [258270.528158]  handle_ept_violation+0x91/0x170 [kvm_intel]
          [258270.528162]  vmx_handle_exit+0x1ca/0x1400 [kvm_intel]
      
      No performance changes were detected in quick ping-pong tests on
      my 4 socket system, which is expected since an FPU+xstate load is
      on the order of 0.1us, while ping-ponging between CPUs is on the
      order of 20us, and somewhat noisy.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NRik van Riel <riel@redhat.com>
      Suggested-by: NChristian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      [Fixed a bug where reset_vcpu called put_fpu without preceding load_fpu,
       which happened inside from KVM_CREATE_VCPU ioctl. - Radim]
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      f775b13e
  16. 05 12月, 2017 3 次提交
    • H
      bpf: correct broken uapi for BPF_PROG_TYPE_PERF_EVENT program type · c895f6f7
      Hendrik Brueckner 提交于
      Commit 0515e599 ("bpf: introduce BPF_PROG_TYPE_PERF_EVENT
      program type") introduced the bpf_perf_event_data structure which
      exports the pt_regs structure.  This is OK for multiple architectures
      but fail for s390 and arm64 which do not export pt_regs.  Programs
      using them, for example, the bpf selftest fail to compile on these
      architectures.
      
      For s390, exporting the pt_regs is not an option because s390 wants
      to allow changes to it.  For arm64, there is a user_pt_regs structure
      that covers parts of the pt_regs structure for use by user space.
      
      To solve the broken uapi for s390 and arm64, introduce an abstract
      type for pt_regs and add an asm/bpf_perf_event.h file that concretes
      the type.  An asm-generic header file covers the architectures that
      export pt_regs today.
      
      The arch-specific enablement for s390 and arm64 follows in separate
      commits.
      Reported-by: NThomas Richter <tmricht@linux.vnet.ibm.com>
      Fixes: 0515e599 ("bpf: introduce BPF_PROG_TYPE_PERF_EVENT program type")
      Signed-off-by: NHendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Reviewed-and-tested-by: NThomas Richter <tmricht@linux.vnet.ibm.com>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      c895f6f7
    • T
      Revert "cpuset: Make cpuset hotplug synchronous" · 11db855c
      Tejun Heo 提交于
      This reverts commit 1599a185.
      
      This and the previous commit led to another circular locking scenario
      and the scenario which is fixed by this commit no longer exists after
      e8b3f8db ("workqueue/hotplug: simplify workqueue_offline_cpu()")
      which removes work item flushing from hotplug path.
      
      Revert it for now.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      11db855c
    • W
      irqdesc: Use bool return type instead of int · 4ce413d1
      Will Deacon 提交于
      The irq_balancing_disabled and irq_is_percpu{,_devid} functions are
      clearly intended to return bool like the functions in
      kernel/irq/settings.h, but actually return an int containing a masked
      value of desc->status_use_accessors. This can lead to subtle breakage
      if, for example, the return value is subsequently truncated when
      assigned to a narrower type.
      
      As Linus points out:
      
      | In particular, what can (and _has_ happened) is that people end up
      | using these functions that return true or false, and they assign the
      | result to something like a bitfield (or a char) or whatever.
      |
      | And the code looks *obviously* correct, when you have things like
      |
      |      dev->percpu = irq_is_percpu_devid(dev->irq);
      |
      | and that "percpu" thing is just one status bit among many. It may even
      | *work*, because maybe that "percpu" flag ends up not being all that
      | important, or it just happens to never be set on the particular
      | hardware that people end up testing.
      |
      | But while it looks obviously correct, and might even work, it's really
      | fundamentally broken. Because that "true or false" function didn't
      | actually return 0/1, it returned 0 or 0x20000.
      |
      | And 0x20000 may not fit in a bitmask or a "char" or whatever.
      
      Fix the problem by consistently using bool as the return type for these
      functions.
      Reported-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: marc.zyngier@arm.com
      Link: https://lkml.kernel.org/r/1512142179-24616-1-git-send-email-will.deacon@arm.com
      4ce413d1
  17. 04 12月, 2017 1 次提交
  18. 02 12月, 2017 1 次提交
    • A
      iio: stm32: fix adc/trigger link error · 6d745ee8
      Arnd Bergmann 提交于
      The ADC driver can trigger on either the timer or the lptim
      trigger, but it only uses a Kconfig 'select' statement
      to ensure that the first of the two is present. When the lptim
      trigger is enabled as a loadable module, and the adc driver
      is built-in, we now get a link error:
      
      drivers/iio/adc/stm32-adc.o: In function `stm32_adc_get_trig_extsel':
      stm32-adc.c:(.text+0x4e0): undefined reference to `is_stm32_lptim_trigger'
      
      We could use a second 'select' statement and always have both
      trigger drivers enabled when the adc driver is, but it seems that
      the lptimer trigger was intentionally left optional, so it seems
      better to keep it that way.
      
      This adds a hack to use 'IS_REACHABLE()' rather than 'IS_ENABLED()',
      which avoids the link error, but instead leads to the lptimer trigger
      not being used in the broken configuration. I've added a runtime
      warning for this case to help users figure out what they did wrong
      if this should ever be done by accident.
      
      Fixes: f0b638a7 ("iio: adc: stm32: add support for lptimer triggers")
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Cc: <Stable@vger.kernel.org>
      Signed-off-by: NJonathan Cameron <Jonathan.Cameron@huawei.com>
      6d745ee8