1. 17 Jan, 2016 6 commits
  2. 16 Jan, 2016 2 commits
    • mm, dax, pmem: introduce {get|put}_dev_pagemap() for dax-gup · 5c2c2587
      Authored by Dan Williams
      get_dev_pagemap() enables paths like get_user_pages() to pin a dynamically
      mapped pfn-range (devm_memremap_pages()) while the resulting struct page
      objects are in use.  Unlike get_page() it may fail if the device is, or
      is in the process of being, disabled.  While the initial lookup of the
      range may be an expensive list walk, the result is cached to speed up
      subsequent lookups which are likely to be in the same mapped range.
      
      devm_memremap_pages() now requires a reference counter to be specified
      at init time.  For pmem this means moving request_queue allocation into
      pmem_alloc() so the existing queue usage counter can track "device
      pages".
      
      ZONE_DEVICE pages always have an elevated count and will never be on an
      lru reclaim list.  That space in 'struct page' can be redirected for
      other uses, but for safety introduce a poison value that will always
      trip __list_add() to assert.  This allows half of the struct list_head
      storage to be reclaimed with some assurance to back up the assumption
      that the page count never goes to zero and a list_add() is never
      attempted.
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Tested-by: Logan Gunthorpe <logang@deltatee.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Matthew Wilcox <willy@linux.intel.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
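The lookup-and-pin behaviour described in the commit above can be sketched as a small userspace model. This is not the kernel implementation: `DevPagemap`, `PagemapRegistry`, the `dying` flag and the plain integer reference count are hypothetical stand-ins for the mapped pfn range, the range list and the reference counter that devm_memremap_pages() now requires. The sketch only illustrates two points from the message: pinning can fail while the device is being disabled, and a cached result avoids repeating the list walk.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DevPagemap:
    """Hypothetical stand-in for one devm_memremap_pages() range."""
    start_pfn: int
    end_pfn: int          # inclusive
    refs: int = 0         # models the reference counter given at init time
    dying: bool = False   # set once the device starts being disabled

    def covers(self, pfn: int) -> bool:
        return self.start_pfn <= pfn <= self.end_pfn

    def tryget(self) -> bool:
        # Unlike get_page(), pinning fails once the device is going away.
        if self.dying:
            return False
        self.refs += 1
        return True

    def put(self) -> None:
        assert self.refs > 0, "unbalanced put"
        self.refs -= 1


class PagemapRegistry:
    def __init__(self) -> None:
        self.ranges: List[DevPagemap] = []
        self.walks = 0    # counts the (potentially expensive) list walks

    def find(self, pfn: int) -> Optional[DevPagemap]:
        self.walks += 1
        for pgmap in self.ranges:
            if pgmap.covers(pfn):
                return pgmap
        return None


def get_dev_pagemap(registry, pfn, cached=None):
    # Fast path: the previous result still covers this pfn and the caller
    # already holds a reference on it, so no walk is needed.
    if cached is not None:
        if cached.covers(pfn):
            return cached
        cached.put()      # moving to a different range: drop the old pin
    pgmap = registry.find(pfn)
    if pgmap is not None and pgmap.tryget():
        return pgmap
    return None


def put_dev_pagemap(pgmap):
    if pgmap is not None:
        pgmap.put()
```

A caller walking consecutive pfns would pass the previous return value as `cached` and call `put_dev_pagemap()` once at the end, so a run of pages in one range costs a single walk and a single reference.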
    • page-flags: introduce page flags policies wrt compound pages · 95ad9755
      Authored by Kirill A. Shutemov
      This patch adds a third argument to macros which create function
      definitions for page flags.  This argument defines how page-flags
      helpers behave on compound pages.
      
      For now we define four policies:
      
       - PF_ANY: the helper function operates on the page it gets, regardless
         of whether it's non-compound, head or tail.
      
       - PF_HEAD: the helper function operates on the head page of the
         compound page if it gets a tail page.
      
       - PF_NO_TAIL: only head and non-compound pages are acceptable for this
         helper function.
      
       - PF_NO_COMPOUND: only non-compound pages are acceptable for this
         helper function.
      
      For now we use policy PF_ANY for all helpers, which matches current
      behaviour.
      
      We do not enforce the policy for TESTPAGEFLAG, because we have flags
      checked for random pages all over the kernel.  A notable exception to
      this is PageTransHuge(), which triggers VM_BUG_ON() for a tail page.
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Steve Capper <steve.capper@linaro.org>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
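The four policies listed in the commit above can be illustrated with a toy model. Everything here is an assumption made for illustration: the `Page` class, the `make_page_flag()` factory and the use of `ValueError` in place of a kernel assertion are hypothetical, and the `enforce` argument is simply one way to express that test helpers skip the policy check while set and clear helpers apply it.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass(eq=False)
class Page:
    flags: Set[str] = field(default_factory=set)
    head: Optional["Page"] = None   # set only on tail pages
    is_head: bool = False

    def is_tail(self) -> bool:
        return self.head is not None

    def is_compound(self) -> bool:
        return self.is_head or self.is_tail()


def compound_head(page):
    return page.head if page.is_tail() else page


def pf_any(page, enforce):
    # Operate on whatever page was passed in.
    return page


def pf_head(page, enforce):
    # Redirect tail pages to their head page.
    return compound_head(page)


def pf_no_tail(page, enforce):
    if enforce and page.is_tail():
        raise ValueError("tail page not acceptable for this helper")
    return compound_head(page)


def pf_no_compound(page, enforce):
    if enforce and page.is_compound():
        raise ValueError("compound page not acceptable for this helper")
    return page


def make_page_flag(name, policy):
    """Return (test, set, clear) helpers for one flag under one policy."""
    def test(page):
        # Test helpers do not enforce the policy, mirroring TESTPAGEFLAG.
        return name in policy(page, enforce=False).flags

    def set_flag(page):
        policy(page, enforce=True).flags.add(name)

    def clear_flag(page):
        policy(page, enforce=True).flags.discard(name)

    return test, set_flag, clear_flag
```

With `pf_head`, setting a flag through a tail page lands on the head page. With `pf_no_tail`, the same call raises, while the test helper still answers quietly.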
  3. 15 Jan, 2016 1 commit
  4. 09 Jan, 2016 2 commits
    • restrict /dev/mem to idle io memory ranges · 90a545e9
      Authored by Dan Williams
      This effectively promotes IORESOURCE_BUSY to IORESOURCE_EXCLUSIVE
      semantics by default.  If userspace really believes it is safe to access
      the memory region it can also perform the extra step of disabling an
      active driver.  This protects device address ranges with read side
      effects and otherwise directs userspace to use the driver.
      
      Persistent memory presents a large "mistake surface" to /dev/mem as now
      accidental writes can corrupt a filesystem.
      
      In general if a device driver is busily using a memory region it already
      informs other parts of the kernel to not touch it via
      request_mem_region().  /dev/mem should honor the same safety restriction
      by default.  Debugging a device driver from userspace becomes more
      difficult with this enabled.  Any application using /dev/mem or mmap of
      sysfs pci resources will now need to perform the extra step of either:
      
      1/ Disabling the driver, for example:
      
         echo <device id> > /sys/bus/<parent bus>/drivers/<driver name>/unbind
      
      2/ Rebooting with "iomem=relaxed" on the command line
      
      3/ Recompiling with CONFIG_IO_STRICT_DEVMEM=n
      
      Traditional users of /dev/mem like dosemu are unaffected because the
      first 1MB of memory is not subject to the IO_STRICT_DEVMEM restriction.
      Legacy X configurations use /dev/mem to talk to graphics hardware, but
      that functionality has since moved to kernel graphics drivers.
      
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Acked-by: Kees Cook <keescook@chromium.org>
      Acked-by: Ingo Molnar <mingo@redhat.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
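The access rule described in the commit above can be modelled in a few lines. This is a sketch under stated assumptions, not the kernel's check: `Resource`, `devmem_access_allowed()` and its parameters are hypothetical names, `io_strict_devmem` stands in for CONFIG_IO_STRICT_DEVMEM, and `iomem_relaxed` stands in for booting with "iomem=relaxed".

```python
from dataclasses import dataclass
from typing import Iterable

LOW_MEMORY_LIMIT = 0x100000  # first 1MB, exempt per the commit message


@dataclass(frozen=True)
class Resource:
    name: str
    start: int
    end: int                 # inclusive
    busy: bool = False       # models IORESOURCE_BUSY (claimed by a driver)
    exclusive: bool = False  # models IORESOURCE_EXCLUSIVE


def devmem_access_allowed(addr: int,
                          resources: Iterable[Resource],
                          io_strict_devmem: bool = True,
                          iomem_relaxed: bool = False) -> bool:
    # Legacy users such as dosemu only touch the first 1MB.
    if addr < LOW_MEMORY_LIMIT:
        return True
    # "iomem=relaxed" switches the exclusivity checks off.
    if iomem_relaxed:
        return True
    for res in resources:
        if res.start <= addr <= res.end:
            # With strict devmem, a busy range is treated like an
            # exclusive one.
            if res.exclusive or (io_strict_devmem and res.busy):
                return False
    return True
```

Unbinding the driver corresponds to the range no longer being marked busy, at which point the same address becomes accessible again.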
    • arch: consolidate CONFIG_STRICT_DEVMEM in lib/Kconfig.debug · 21266be9
      Authored by Dan Williams
      Let all the archs that implement devmem_is_allowed() opt in to a common
      definition of CONFIG_STRICT_DEVMEM in lib/Kconfig.debug.
      
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      [heiko: drop 'default y' for s390]
      Acked-by: Ingo Molnar <mingo@redhat.com>
      Suggested-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  5. 07 Jan, 2016 1 commit
  6. 04 Jan, 2016 1 commit
  7. 24 Dec, 2015 1 commit
  8. 23 Dec, 2015 4 commits
  9. 22 Dec, 2015 1 commit
  10. 19 Dec, 2015 2 commits
  11. 17 Dec, 2015 2 commits
    • dma-debug: Fix dma_debug_entry offset calculation · 0354aec1
      Authored by Daniel Mentz
      dma-debug uses struct dma_debug_entry to keep track of dma coherent
      memory allocation requests. The virtual address is converted into a pfn
      and an offset. Previously, the offset was calculated using an incorrect
      bit mask.  As a result, we saw incorrect error messages from dma-debug
      like the following:
      
      "DMA-API: exceeded 7 overlapping mappings of cacheline 0x03e00000"
      
      Cacheline 0x03e00000 does not exist on our platform.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 0abdd7a8 ("dma-debug: introduce debug_dma_assert_idle()")
      Signed-off-by: Daniel Mentz <danielmentz@google.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
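The pfn-plus-offset split in the commit above is easy to get wrong by one bitwise complement. The message does not say which mask the old code used, so the "wrong" variant below is only an illustrative guess at this class of bug. The sketch also assumes 4 KiB pages and treats the address as a plain integer, skipping the virtual-to-physical translation a kernel would perform.

```python
PAGE_SHIFT = 12                      # assumption: 4 KiB pages
PAGE_SIZE = 1 << PAGE_SHIFT
PAGE_MASK = ~(PAGE_SIZE - 1)         # keeps the page-aligned upper bits


def split_address(addr):
    """Split an address into (pfn, offset within the page)."""
    pfn = addr >> PAGE_SHIFT
    offset = addr & ~PAGE_MASK       # low bits only: 0 .. PAGE_SIZE - 1
    return pfn, offset


def split_address_wrong_mask(addr):
    """Illustrative bug: the mask keeps the page bits, not the offset."""
    pfn = addr >> PAGE_SHIFT
    offset = addr & PAGE_MASK        # page-aligned address, not an offset
    return pfn, offset


def rebuild(pfn, offset):
    return (pfn << PAGE_SHIFT) + offset
```

For `0x12345678` the correct split is pfn `0x12345` with offset `0x678`, which recombines to the original address. The wrong mask yields an "offset" of `0x12345000`, so anything derived from it, such as a cacheline number, points at memory that may not exist.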
    • rhashtable: Fix walker list corruption · c6ff5268
      Authored by Herbert Xu
      The commit ba7c95ea ("rhashtable:
      Fix sleeping inside RCU critical section in walk_stop") introduced
      a new spinlock for the walker list.  However, it did not convert
      all existing users of the list over to the new spin lock.  Some
      continued to use the old mutex for this purpose.  This obviously
      led to corruption of the list.
      
      The fix is to use the spin lock everywhere where we touch the list.
      
      This also allows us to do rcu_read_lock before we take the lock in
      rhashtable_walk_start.  With the old mutex this would've deadlocked
      but it's safe with the new spin lock.
      
      Fixes: ba7c95ea ("rhashtable: Fix sleeping inside RCU...")
      Reported-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  12. 16 Dec, 2015 1 commit
  13. 09 Dec, 2015 2 commits
    • rhashtable: Remove unnecessary wmb for future_tbl · 46c749ea
      Authored by Herbert Xu
      The patch 9497df88 ("rhashtable:
      Fix reader/rehash race") added a pair of barriers.  In fact the
      wmb is superfluous because every subsequent write to the old or
      new hash table uses rcu_assign_pointer, which itself carries a
      full barrier prior to the assignment.
      
      Therefore we may remove the explicit wmb.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Acked-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • workqueue: implement lockup detector · 82607adc
      Authored by Tejun Heo
      Workqueue stalls can happen from a variety of usage bugs such as a
      missing WQ_MEM_RECLAIM flag or a concurrency-managed work item
      staying RUNNING indefinitely.  These stalls can be extremely difficult
      to hunt down because the usual warning mechanisms can't detect
      workqueue stalls and the internal state is pretty opaque.
      
      To alleviate the situation, this patch implements a workqueue lockup
      detector.  It periodically monitors all worker_pools and, if any pool
      has failed to make forward progress for longer than the threshold
      duration, triggers a warning and dumps the workqueue state as follows.
      
       BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 31s!
       Showing busy workqueues and worker pools:
       workqueue events: flags=0x0
         pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=17/256
           pending: monkey_wrench_fn, e1000_watchdog, cache_reap, vmstat_shepherd, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, cgroup_release_agent
       workqueue events_power_efficient: flags=0x80
         pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=2/256
           pending: check_lifetime, neigh_periodic_work
       workqueue cgroup_pidlist_destroy: flags=0x0
         pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/1
           pending: cgroup_pidlist_destroy_work_fn
       ...
      
      The detection mechanism is controlled through the kernel parameter
      workqueue.watchdog_thresh and can be updated at runtime through the
      sysfs module parameter file.
      
      v2: Decoupled from softlockup control knobs.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Don Zickus <dzickus@redhat.com>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Chris Mason <clm@fb.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
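The detection rule in the commit above, where a pool with pending work makes no forward progress for longer than a threshold, can be sketched as follows. `WorkerPool`, `last_progress` and `check_pools()` are hypothetical names, and the 30 second default is an assumption chosen to fit the "stuck for 31s" sample output. The real detector runs from a timer inside the kernel and also dumps the workqueue state.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WorkerPool:
    desc: str                                    # e.g. "cpus=0 node=0 ..."
    pending: List[str] = field(default_factory=list)
    last_progress: float = 0.0                   # seconds; last forward progress


def check_pools(pools, now, thresh=30.0):
    """Return one warning line per pool that looks stalled at time `now`."""
    reports = []
    for pool in pools:
        # A pool with nothing queued cannot be stalled.
        if not pool.pending:
            continue
        stuck_for = now - pool.last_progress
        if stuck_for > thresh:
            reports.append(
                "BUG: workqueue lockup - pool %s stuck for %ds!"
                % (pool.desc, int(stuck_for))
            )
    return reports
```

A periodic caller would invoke `check_pools()` with the current time and print whatever it returns. Worker code would update `last_progress` each time it starts a work item.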
  14. 08 Dec, 2015 1 commit
  15. 07 Dec, 2015 2 commits
  16. 06 Dec, 2015 2 commits
  17. 05 Dec, 2015 2 commits
  18. 04 Dec, 2015 1 commit
  19. 02 Dec, 2015 1 commit
    • net: add support for netdev notifier error injection · 02fff96a
      Authored by Nikolay Aleksandrov
      This module allows injecting errors into some of the netdevice notifier
      events. All network drivers use these notifiers to signal various events
      and to check if they are allowed, e.g. PRECHANGEMTU and CHANGEMTU
      afterwards. Until recently I had to run failure tests by injecting
      a custom module, but now this infrastructure makes it trivial to test
      these failure paths. Some of the recent bugs I fixed were found using
      this module.
      Here's an example:
       $ cd /sys/kernel/debug/notifier-error-inject/netdev
       $ echo -22 > actions/NETDEV_CHANGEMTU/error
       $ ip link set eth0 mtu 1024
       RTNETLINK answers: Invalid argument
      
      CC: Akinobu Mita <akinobu.mita@gmail.com>
      CC: "David S. Miller" <davem@davemloft.net>
      CC: netdev <netdev@vger.kernel.org>
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
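The shell example in the commit above makes an MTU change fail with -22 (EINVAL). The mechanism can be sketched with a toy notifier chain. All names here (`NotifierChain`, `ErrorInjector`, `set_mtu()`) are hypothetical, the device is a plain dict, and the rollback is simplified to restoring the old value.

```python
import errno


class NotifierChain:
    def __init__(self):
        self._callbacks = []

    def register(self, callback):
        self._callbacks.append(callback)

    def call(self, event, dev):
        # Stop at the first callback that reports an error.
        for callback in self._callbacks:
            rc = callback(event, dev)
            if rc != 0:
                return rc
        return 0


class ErrorInjector:
    """Callback that fails selected events with a configured errno."""

    def __init__(self):
        self.errors = {}   # event name -> negative errno, like the debugfs file

    def __call__(self, event, dev):
        return self.errors.get(event, 0)


def set_mtu(chain, dev, new_mtu):
    # Listeners may veto the change before it happens ...
    rc = chain.call("NETDEV_PRECHANGEMTU", dev)
    if rc:
        return rc
    old_mtu = dev["mtu"]
    dev["mtu"] = new_mtu
    # ... or reject it afterwards, in which case the change is undone.
    rc = chain.call("NETDEV_CHANGEMTU", dev)
    if rc:
        dev["mtu"] = old_mtu
    return rc
```

Setting `injector.errors["NETDEV_CHANGEMTU"] = -errno.EINVAL` plays the role of `echo -22 > actions/NETDEV_CHANGEMTU/error`: the next `set_mtu()` call returns -22 and leaves the MTU unchanged, which exercises the failure path without a custom module.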
  20. 01 Dec, 2015 1 commit
  21. 24 Nov, 2015 4 commits