1. 18 Nov, 2014 2 commits
    • D
      x86, mpx: Cleanup unused bound tables · 1de4fa14
      Dave Hansen committed
      The previous patch allocates bounds tables on-demand.  As noted in
      an earlier description, these can add up to *HUGE* amounts of
      memory.  This has caused OOMs in practice when running tests.
      
      This patch adds support for freeing bounds tables when they are no
      longer in use.
      
      There are two types of mappings in play when unmapping tables:
       1. The mapping with the actual data, which userspace is
          munmap()ing or brk()ing away, etc...
       2. The mapping for the bounds table *backing* the data
          (is tagged with VM_MPX, see the patch "add MPX specific
          mmap interface").
      
      If userspace uses the prctl() introduced earlier in this patchset
      to enable the management of bounds tables in the kernel, then when it
      unmaps the first type of mapping with the actual data, the kernel
      needs to free the mapping for the bounds table backing the data.
      This patch hooks in at the very end of do_unmap() to do so.
      We look at the addresses being unmapped and find the bounds
      directory entries and tables which cover those addresses.  If
      an entire table is unused, we clear the associated directory entry
      and free the table.
      
      Once we unmap the bounds table, we would have a bounds directory
      entry pointing at empty address space. That address space might
      now be allocated for some other (random) use, and the MPX
      hardware might now try to walk it as if it were a bounds table.
      That would be bad.  So any unmapping of an entire bounds table
      has to be accompanied by a corresponding write to the bounds
      directory entry to invalidate it.  That write to the bounds
      directory can fault, which causes the following problem:
      
      Since we are doing the freeing from munmap() (and other paths
      like it), we hold mmap_sem for write. If we fault, the page
      fault handler will attempt to acquire mmap_sem for read and
      we will deadlock.  To avoid the deadlock, we pagefault_disable()
      when touching the bounds directory entry and use a
      get_user_pages() to resolve the fault.
      
      The unmapping of bounds tables happens under vm_munmap().  We
      also (indirectly) call vm_munmap() to _do_ the unmapping of the
      bounds tables.  We avoid unbounded recursion by disallowing
      freeing of bounds tables *for* bounds tables.  This would not
      occur normally, so should not have any practical impact.  Being
      strict about it here helps ensure that we do not have an
      exploitable stack overflow.
      Based-on-patch-by: Qiaowei Ren <qiaowei.ren@intel.com>
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: linux-mm@kvack.org
      Cc: linux-mips@linux-mips.org
      Cc: Dave Hansen <dave@sr71.net>
      Link: http://lkml.kernel.org/r/20141114151831.E4531C4A@viggo.jf.intel.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      1de4fa14
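The lookup described in the cleanup commit above (find the bounds directory entries that cover an unmapped range, then act only on tables that are entirely unused) can be sketched in plain C. This is a minimal illustration, not the kernel's code: the helper names are hypothetical, and the layout constants (a 28-bit directory index taken from address bits 47:20, so that each bounds table backs 1 MB of address space) are assumptions about the 64-bit MPX layout rather than something stated in the commit text.

```c
#include <stdint.h>

/* Assumed 64-bit MPX layout; not taken from the commit message. */
#define BD_INDEX_SHIFT   20                       /* one table per 1 MB  */
#define BD_INDEX_BITS    28                       /* address bits 47:20  */
#define BT_COVERED_BYTES (1ULL << BD_INDEX_SHIFT)

/* Hypothetical helper: bounds directory slot that covers addr. */
static uint64_t bd_index(uint64_t addr)
{
	return (addr >> BD_INDEX_SHIFT) & ((1ULL << BD_INDEX_BITS) - 1);
}

/*
 * Hypothetical helper: number of bounds tables whose whole 1 MB window
 * lies inside the unmapped range [start, end).  Only these are obvious
 * candidates for freeing; a partially covered window may still back
 * neighbouring mappings, which this sketch does not inspect.
 */
static uint64_t fully_covered_tables(uint64_t start, uint64_t end)
{
	/* Round start up and end down to a 1 MB boundary. */
	uint64_t first = (start + BT_COVERED_BYTES - 1) >> BD_INDEX_SHIFT;
	uint64_t last = end >> BD_INDEX_SHIFT;

	return last > first ? last - first : 0;
}
```

For instance, unmapping the 2 MB range starting at 1 MB covers two whole windows, while a 1 MB range that straddles a window boundary covers none, so in the second case the sketch would leave both tables alone.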
    • D
      x86, mpx: On-demand kernel allocation of bounds tables · fe3d197f
      Dave Hansen committed
      This is really the meat of the MPX patch set.  If there is one patch to
      review in the entire series, this is the one.  There is a new ABI here
      and this kernel code also interacts with userspace memory in a
      relatively unusual manner.  (small FAQ below).
      
      Long Description:
      
      This patch adds two prctl() commands to enable or disable the
      management of bounds tables in the kernel, including on-demand kernel
      allocation (See the patch "on-demand kernel allocation of bounds tables")
      and cleanup (See the patch "cleanup unused bound tables"). Applications
      do not strictly need the kernel to manage bounds tables and we expect
      some applications to use MPX without taking advantage of this kernel
      support. This means the kernel can not simply infer whether an application
      needs bounds table management from the MPX registers.  The prctl() is an
      explicit signal from userspace.
      
      PR_MPX_ENABLE_MANAGEMENT is meant to be a signal from userspace to
      request the kernel's help in managing bounds tables.
      
      PR_MPX_DISABLE_MANAGEMENT is the opposite, meaning that userspace doesn't
      want the kernel's help any more. With PR_MPX_DISABLE_MANAGEMENT, the kernel
      won't allocate and free bounds tables even if the CPU supports MPX.
      
      PR_MPX_ENABLE_MANAGEMENT will fetch the base address of the bounds
      directory out of a userspace register (bndcfgu) and then cache it into
      a new field (->bd_addr) in  the 'mm_struct'.  PR_MPX_DISABLE_MANAGEMENT
      will set "bd_addr" to an invalid address.  Using this scheme, we can
      use "bd_addr" to determine whether the management of bounds tables in
      kernel is enabled.
      
      Also, the only way to access that bndcfgu register is via an xsaves,
      which can be expensive.  Caching "bd_addr" like this also helps reduce
      the cost of those xsaves when doing table cleanup at munmap() time.
      Unfortunately, we can not apply this optimization to #BR fault time
      because we need an xsave to get the value of BNDSTATUS.
      
      ==== Why does the hardware even have these Bounds Tables? ====
      
      MPX only has 4 hardware registers for storing bounds information.
      If MPX-enabled code needs more than these 4 registers, it needs to
      spill them somewhere. It has two special instructions for this
      which allow the bounds to be moved between the bounds registers
      and some new "bounds tables".
      
      #BR exceptions are conceptually similar to a page fault and are raised
      by the MPX hardware both on bounds violations and when the tables
      are not present. This patch handles those #BR exceptions for
      not-present tables by carving the space out of the normal process's
      address space (essentially calling the new mmap() interface introduced
      earlier in this patch set) and then pointing the bounds-directory
      over to it.
      
      The tables *need* to be accessed and controlled by userspace because
      the instructions for moving bounds in and out of them are extremely
      frequent. They potentially happen every time a register pointing to
      memory is dereferenced. Any direct kernel involvement (like a syscall)
      to access the tables would obviously destroy performance.
      
      ==== Why not do this in userspace? ====
      
      This patch is obviously doing this allocation in the kernel.
      However, MPX does not strictly *require* anything in the kernel.
      It can theoretically be done completely from userspace. Here are
      a few ways this *could* be done. I don't think any of them are
      practical in the real-world, but here they are.
      
      Q: Can virtual space simply be reserved for the bounds tables so
         that we never have to allocate them?
      A: As noted earlier, these tables are *HUGE*. An X-GB virtual
         area needs 4*X GB of virtual space, plus 2GB for the bounds
         directory. If we were to preallocate them for the 128TB of
         user virtual address space, we would need to reserve 512TB+2GB,
         which is larger than the entire virtual address space today.
         This means they can not be reserved ahead of time. Also, a
         single process's pre-populated bounds directory consumes 2GB
         of virtual *AND* physical memory. IOW, it's completely
         infeasible to prepopulate bounds directories.
      
      Q: Can we preallocate bounds table space at the same time memory
         is allocated which might contain pointers that might eventually
         need bounds tables?
      A: This would work if we could hook the site of each and every
         memory allocation syscall. This can be done for small,
         constrained applications. But, it isn't practical at a larger
         scale since a given app has no way of controlling how all the
         parts of the app might allocate memory (think libraries). The
         kernel is really the only place to intercept these calls.
      
      Q: Could a bounds fault be handed to userspace and the tables
         allocated there in a signal handler instead of in the kernel?
      A: (thanks to tglx) mmap() is not on the list of safe async
         handler functions and even if mmap() would work it still
         requires locking or nasty tricks to keep track of the
         allocation state there.
      
      Having ruled out all of the userspace-only approaches for managing
      bounds tables that we could think of, we create them on demand in
      the kernel.
      Based-on-patch-by: Qiaowei Ren <qiaowei.ren@intel.com>
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: linux-mm@kvack.org
      Cc: linux-mips@linux-mips.org
      Cc: Dave Hansen <dave@sr71.net>
      Link: http://lkml.kernel.org/r/20141114151829.AD4310DE@viggo.jf.intel.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      fe3d197f
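The sizing argument in the FAQ of the commit above (4*X GB of tables for X GB of data, plus 2GB for the directory) can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the entry counts and entry sizes are assumptions about the 64-bit MPX layout, chosen because they reproduce the 2GB and 4x figures the commit quotes.

```c
#include <stdint.h>

/* Assumed 64-bit MPX layout; only the resulting sizes appear in the commit. */
#define BD_ENTRIES     (1ULL << 28)  /* one directory entry per 1 MB of addresses */
#define BD_ENTRY_BYTES 8ULL
#define BT_ENTRIES     (1ULL << 17)  /* one table entry per 8-byte pointer slot   */
#define BT_ENTRY_BYTES 32ULL

/* Size of the whole bounds directory: 2^28 * 8 bytes = 2 GB. */
static uint64_t bounds_directory_bytes(void)
{
	return BD_ENTRIES * BD_ENTRY_BYTES;
}

/* Size of one bounds table: 2^17 * 32 bytes = 4 MB, backing 1 MB of data. */
static uint64_t bounds_table_bytes(void)
{
	return BT_ENTRIES * BT_ENTRY_BYTES;
}

/*
 * Worst-case virtual space for bounds metadata if every megabyte of
 * data_bytes had its own table.  Assumes data_bytes is a multiple of 1 MB.
 */
static uint64_t worst_case_bounds_bytes(uint64_t data_bytes)
{
	uint64_t tables = (data_bytes >> 20) * bounds_table_bytes();

	return tables + bounds_directory_bytes();
}
```

Feeding in 128TB of user address space yields 512TB + 2GB, which matches the figure in the first FAQ answer and shows why reserving the space ahead of time is not an option.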
  2. 14 Oct, 2014 1 commit
    • P
      mm: softdirty: enable write notifications on VMAs after VM_SOFTDIRTY cleared · 64e45507
      Peter Feiner committed
      For VMAs that don't want write notifications, PTEs created for read faults
      have their write bit set.  If the read fault happens after VM_SOFTDIRTY is
      cleared, then the PTE's softdirty bit will remain clear after subsequent
      writes.
      
      Here's a simple code snippet to demonstrate the bug:
      
        char* m = mmap(NULL, getpagesize(), PROT_READ | PROT_WRITE,
                       MAP_ANONYMOUS | MAP_SHARED, -1, 0);
        system("echo 4 > /proc/$PPID/clear_refs"); /* clear VM_SOFTDIRTY */
        assert(*m == '\0');     /* new PTE allows write access */
        assert(!soft_dirty(m));
        *m = 'x';               /* should dirty the page */
        assert(soft_dirty(m));  /* fails */
      
      With this patch, write notifications are enabled when VM_SOFTDIRTY is
      cleared.  Furthermore, to avoid unnecessary faults, write notifications
      are disabled when VM_SOFTDIRTY is set.
      
      As a side effect of enabling and disabling write notifications with
      care, this patch fixes a bug in mprotect where vm_page_prot bits set by
      drivers were zapped on mprotect.  An analogous bug was fixed in mmap by
      commit c9d0bf24 ("mm: uncached vma support with writenotify").
      Signed-off-by: Peter Feiner <pfeiner@google.com>
      Reported-by: Peter Feiner <pfeiner@google.com>
      Suggested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Jamie Liu <jamieliu@google.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      64e45507
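The reproducer in the softdirty commit above relies on a soft_dirty() helper that the message does not define. One way such a helper might be written is to read the page's entry from /proc/self/pagemap and test bit 55, which the kernel's pagemap documentation describes as the soft-dirty bit. The function names and error convention below are assumptions made for this sketch, not the author's code.

```c
#define _XOPEN_SOURCE 700	/* for pread() */
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define PM_SOFT_DIRTY_BIT 55	/* pagemap entry bit: pte is soft-dirty */

/* Extract the soft-dirty flag from one 64-bit pagemap entry. */
static int pagemap_entry_soft_dirty(uint64_t entry)
{
	return (int)((entry >> PM_SOFT_DIRTY_BIT) & 1);
}

/* Byte offset in the pagemap file of the entry describing vaddr. */
static off_t pagemap_offset(uintptr_t vaddr, long page_size)
{
	return (off_t)(vaddr / (uintptr_t)page_size) * (off_t)sizeof(uint64_t);
}

/* Returns 1 if the page holding addr is soft-dirty, 0 if not, -1 on error. */
static int soft_dirty(const void *addr)
{
	uint64_t entry;
	ssize_t n;
	long page_size = sysconf(_SC_PAGESIZE);
	int fd;

	if (page_size <= 0)
		return -1;

	fd = open("/proc/self/pagemap", O_RDONLY);
	if (fd < 0)
		return -1;

	n = pread(fd, &entry, sizeof(entry),
		  pagemap_offset((uintptr_t)addr, page_size));
	close(fd);

	if (n != (ssize_t)sizeof(entry))
		return -1;

	return pagemap_entry_soft_dirty(entry);
}
```

The result is only meaningful on kernels built with soft-dirty tracking; elsewhere the bit simply reads as zero or the file may be unavailable, which is why the sketch reports errors separately from a clean page.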
  3. 10 Oct, 2014 3 commits
    • G
      nosave: consolidate __nosave_{begin,end} in <asm/sections.h> · 7f8998c7
      Geert Uytterhoeven committed
      The different architectures used their own (and different) declarations:
      
          extern __visible const void __nosave_begin, __nosave_end;
          extern const void __nosave_begin, __nosave_end;
          extern long __nosave_begin, __nosave_end;
      
      Consolidate them using the first variant in <asm/sections.h>.
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7f8998c7
    • L
      common: dma-mapping: introduce common remapping functions · 513510dd
      Laura Abbott committed
      For architectures without coherent DMA, memory for DMA may need to be
      remapped with coherent attributes.  Factor out the remapping code from
      arm and put it in a common location to reduce code duplication.
      
      As part of this, the arm APIs are now migrated away from
      ioremap_page_range to the common APIs which use map_vm_area for remapping.
      This should be an equivalent change and using map_vm_area is more correct
      as ioremap_page_range is intended to bring in io addresses into the cpu
      space and not regular kernel managed memory.
      Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: David Riley <davidriley@chromium.org>
      Cc: Olof Johansson <olof@lixom.net>
      Cc: Ritesh Harjain <ritesh.harjani@gmail.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Thierry Reding <thierry.reding@gmail.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Laura Abbott <lauraa@codeaurora.org>
      Cc: Mitchel Humpherys <mitchelh@codeaurora.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      513510dd
    • M
      mm: remove misleading ARCH_USES_NUMA_PROT_NONE · 6a33979d
      Mel Gorman committed
      ARCH_USES_NUMA_PROT_NONE was defined for architectures that implemented
      _PAGE_NUMA using _PROT_NONE.  This saved using an additional PTE bit and
      relied on the fact that PROT_NONE vmas were skipped by the NUMA hinting
      fault scanner.  This was found to be conceptually confusing with a lot of
      implicit assumptions and it was asked that an alternative be found.
      
      Commit c46a7c81 "x86: define _PAGE_NUMA by reusing software bits on the
      PMD and PTE levels" redefined _PAGE_NUMA on x86 to be one of the swap PTE
      bits and shrunk the maximum possible swap size but it did not go far
      enough.  There are no architectures that reuse _PROT_NONE as _PROT_NUMA
      but the relics still exist.
      
      This patch removes ARCH_USES_NUMA_PROT_NONE and removes some unnecessary
      duplication in powerpc vs the generic implementation by defining the types
      the core NUMA helpers expected to exist from x86 with their ppc64
      equivalent.  This necessitated that a PTE bit mask be created that
      identified the bits that distinguish present from NUMA pte entries but it
      is expected this will only differ between arches based on _PAGE_PROTNONE.
      The naming for the generic helpers was taken from x86 originally but ppc64
      has types that are equivalent for the purposes of the helper so they are
      mapped instead of duplicating code.
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6a33979d
  4. 03 Oct, 2014 3 commits
  5. 01 Oct, 2014 1 commit
  6. 30 Sep, 2014 1 commit
  7. 26 Sep, 2014 1 commit
    • M
      asm-generic: COMMON_CLK defines __clk_{get,put} · b52f4914
      Mike Turquette committed
      If CONFIG_COMMON_CLK is selected then __clk_get and __clk_put are
      defined in drivers/clk/clk.c and declared in include/linux/clkdev.h.
      
      Sylwester's series[0] to properly support clk_{get,put} in the common
      clock framework made changes to the asm-specific clkdev.h headers, but
      not the asm-generic version. Tomeu's recent changes[1] to introduce a
      provider/consumer split in the clock framework uncovered this problem,
      causing the following build error on any architecture using the
      asm-generic clkdev.h (e.g. x86 architecture and the ACPI LPSS driver):
      
      In file included from drivers/acpi/acpi_lpss.c:15:0:
      include/linux/clkdev.h:59:5: error: conflicting types for ‘__clk_get’
       int __clk_get(struct clk_core *clk);
           ^
      In file included from arch/x86/include/generated/asm/clkdev.h:1:0,
                       from include/linux/clkdev.h:15,
                       from drivers/acpi/acpi_lpss.c:15:
      include/asm-generic/clkdev.h:20:19: note: previous definition of ‘__clk_get’ was here
       static inline int __clk_get(struct clk *clk) { return 1; }
                         ^
      
      Fixed by only declaring __clk_get and __clk_put when
      CONFIG_COMMON_CLK is set.
      
      [0] http://lkml.kernel.org/r/<1386177127-2894-5-git-send-email-s.nawrocki@samsung.com>
      [1] http://lkml.kernel.org/r/<1409758148-20104-1-git-send-email-tomeu.vizoso@collabora.com>
      Signed-off-by: Mike Turquette <mturquette@linaro.org>
      b52f4914
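The build error quoted in the clkdev commit above comes from inline stubs in the asm-generic header clashing with the prototypes the common clock framework declares. A guard of the following shape would avoid the clash by compiling the stubs only when the framework is absent. This is a sketch of how such a guard might look, written for a userspace compiler; the exact form of the real change is an assumption here, and struct clk is only forward-declared.

```c
/* Forward declaration only; the sketch never looks inside struct clk. */
struct clk;

/*
 * Assumed guard: when CONFIG_COMMON_CLK is defined, the framework supplies
 * its own __clk_get()/__clk_put(), so these trivial fallbacks must not be
 * visible.  Without the framework, "getting" a clock always succeeds and
 * "putting" it is a no-op.
 */
#ifndef CONFIG_COMMON_CLK
static inline int __clk_get(struct clk *clk)
{
	(void)clk;	/* unused in the fallback */
	return 1;
}

static inline void __clk_put(struct clk *clk)
{
	(void)clk;	/* unused in the fallback */
}
#endif
```

Building the same file with -DCONFIG_COMMON_CLK removes both stubs, which is the situation in which the conflicting-types error from the log can no longer occur.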
  8. 24 Sep, 2014 1 commit
  9. 23 Sep, 2014 1 commit
    • M
      gpio: Increase ARCH_NR_GPIOs to 512 · 7ca267fa
      Mika Westerberg committed
      Some newer Intel SoCs, like Braswell, already have more than 256 GPIOs
      available so the default limit is exceeded. Instead of adding more
      architecture specific gpio.h files with custom ARCH_NR_GPIOs we increase
      the gpiolib default limit to be twice the current.
      
      Current generic ARCH_NR_GPIOS limit is 256 which starts to be too small
      for newer Intel SoCs like Braswell. In order to support GPIO controllers
      on these SoCs we increase ARCH_NR_GPIOS to be 512 which should be
      sufficient for now.
      
      The kernel size increases a bit with this change. Below is an example of
      x86_64 kernel image.
      
      ARCH_NR_GPIOS=256
       text     data    bss     dec      hex    filename
       11476173 1971328 1265664 14713165 e0814d vmlinux
      
      ARCH_NR_GPIOS=512
       text     data    bss     dec      hex    filename
       11476173 1971328 1269760 14717261 e0914d vmlinux
      
      So the BSS size and thus the kernel image size increase by 4k.
      Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      7ca267fa
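The 4k figure in the GPIO commit above follows directly from the two size listings, and the same numbers give a rough per-GPIO cost. The helpers below are hypothetical and only redo that arithmetic; the per-slot figure assumes the whole BSS growth comes from statically sized per-GPIO storage, which the commit does not state explicitly.

```c
#include <stddef.h>

/* Difference between the two bss columns reported by size(1). */
static size_t bss_growth(size_t bss_before, size_t bss_after)
{
	return bss_after - bss_before;
}

/*
 * Bytes of BSS per additional GPIO slot, assuming all of the growth is
 * caused by raising the limit from old_limit to new_limit.
 */
static size_t bytes_per_added_gpio(size_t growth, unsigned int old_limit,
				   unsigned int new_limit)
{
	return growth / (new_limit - old_limit);
}
```

With the values from the listings, bss_growth(1265664, 1269760) is 4096 bytes, and spreading that over the 256 extra slots works out to 16 bytes per slot under the stated assumption.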
  10. 22 Sep, 2014 1 commit
  11. 14 Sep, 2014 1 commit
  12. 26 Aug, 2014 1 commit
  13. 14 Aug, 2014 1 commit
  14. 09 Aug, 2014 1 commit
    • J
      pci-dma-compat: add pci_zalloc_consistent helper · 82bf0baa
      Joe Perches committed
      Add this helper for consistency with dma_zalloc_coherent
      and the ability to remove unnecessary memset(,0,) uses.
      Signed-off-by: Joe Perches <joe@perches.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
      Cc: "John W. Linville" <linville@tuxdriver.com>
      Cc: "Stephen M. Cameron" <scameron@beardog.cce.hp.com>
      Cc: Adam Radford <linuxraid@lsi.com>
      Cc: Chaoming Li <chaoming_li@realsil.com.cn>
      Cc: Chas Williams <chas@cmf.nrl.navy.mil>
      Cc: Christian Benvenuti <benve@cisco.com>
      Cc: Christopher Harrer <charrer@alacritech.com>
      Cc: Dario Ballabio <ballabio_dario@emc.com>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Don Fry <pcnet32@frontier.com>
      Cc: Faisal Latif <faisal.latif@intel.com>
      Cc: Forest Bond <forest@alittletooquiet.net>
      Cc: Govindarajulu Varadarajan <_govind@gmx.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
      Cc: Hans Verkuil <hverkuil@xs4all.nl>
      Cc: Jayamohan Kallickal <jayamohan.kallickal@emulex.com>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
      Cc: Larry Finger <Larry.Finger@lwfinger.net>
      Cc: Lennert Buytenhek <buytenh@wantstofly.org>
      Cc: Lior Dotan <liodot@gmail.com>
      Cc: Manish Chopra <manish.chopra@qlogic.com>
      Cc: Manohar Vanga <manohar.vanga@gmail.com>
      Cc: Martyn Welch <martyn.welch@ge.com>
      Cc: Mauro Carvalho Chehab <m.chehab@samsung.com>
      Cc: Michael Neuffer <mike@i-Connect.Net>
      Cc: Mirko Lindner <mlindner@marvell.com>
      Cc: Neel Patel <neepatel@cisco.com>
      Cc: Neela Syam Kolli <megaraidlinux@lsi.com>
      Cc: Rajesh Borundia <rajesh.borundia@qlogic.com>
      Cc: Roland Dreier <roland@kernel.org>
      Cc: Ron Mercer <ron.mercer@qlogic.com>
      Cc: Samuel Ortiz <samuel@sortiz.org>
      Cc: Sean Hefty <sean.hefty@intel.com>
      Cc: Shahed Shaikh <shahed.shaikh@qlogic.com>
      Cc: Sony Chacko <sony.chacko@qlogic.com>
      Cc: Stanislav Yakovlev <stas.yakovlev@gmail.com>
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Cc: Steve Wise <swise@opengridcomputing.com>
      Cc: Sujith Sankar <ssujith@cisco.com>
      Cc: Tom Tucker <tom@opengridcomputing.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      82bf0baa
  15. 25 Jul, 2014 1 commit
  16. 23 Jul, 2014 2 commits
    • A
      gpio: move gpio_ensure_requested() into legacy C file · d82da797
      Alexandre Courbot committed
      gpio_ensure_requested() only makes sense when using the integer-based
      GPIO API, so make sure it is called from there instead of the gpiod
      API which we know cannot be called with a non-requested GPIO anyway.
      
      The uses of gpio_ensure_requested() in the gpiod API were kind of
      out-of-place anyway, so putting them in gpio-legacy.c helps clean up the
      code.
      
      Actually, considering the time this ensure_requested mechanism has been
      around, maybe we should just turn this patch into "remove
      gpio_ensure_requested()" if we know for sure that no user depends on it
      anymore?
      Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      d82da797
    • A
      gpio: remove gpiod_lock/unlock_as_irq() · d74be6df
      Alexandre Courbot committed
      gpio_lock/unlock_as_irq() are working with (chip, offset) arguments and
      are thus not using the old integer namespace. Therefore, there is no
      reason to have gpiod variants of these functions working with
      descriptors, especially since the (chip, offset) tuple is more suitable
      to the users of these functions (GPIO drivers, whereas GPIO descriptors
      are targeted at GPIO consumers).
      Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      d74be6df
  17. 04 Jul, 2014 1 commit
  18. 02 Jul, 2014 1 commit
    • Z
      core: fix typo in percpu read_mostly section · 330d2822
      Zhengyu He committed
      This fixes a typo that named the read_mostly section of percpu as
      readmostly. It works fine with SMP because the linker script specifies
      .data..percpu..readmostly. However, UP kernel builds don't have percpu
      sections defined and the non-percpu version of the section is called
      data..read_mostly, so .data..readmostly will float around and may break
      things unexpectedly.
      
      Looking at the original change that introduced data..percpu..readmostly
      (commit c957ef2c), it looks like this
      was the original intention.
      
      Tested: Built UP kernel and confirmed the sections got merged.
      
      - Before the patch:
      $ objdump -h vmlinux.o  | grep '\.data\.\.read.*mostly'
      38 .data..read_mostly 00004418  0000000000000000  0000000000000000  00431ac0  2**6
      50 .data..readmostly 00000014  0000000000000000  0000000000000000  00444000  2**3
      
      - After the patch:
      $ objdump -h vmlinux.o  | grep '\.data\.\.read.*mostly'
      38 .data..read_mostly 00004438  0000000000000000  0000000000000000  00431ac0  2**6
      Signed-off-by: Zhengyu He <hzy@google.com>
      Signed-off-by: Filipe Brandenburger <filbranden@google.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      330d2822
  19. 20 Jun, 2014 1 commit
  20. 18 Jun, 2014 6 commits
    • T
      percpu: prettify percpu header files · eba11788
      Tejun Heo committed
      percpu macros are difficult to read.  It's partly because they're
      fairly complex but also because they simply lack visual and
      conventional consistency to an unusual degree.  The preceding patches
      tried to organize macro definitions consistently by their roles.  This
      patch makes the following cosmetic changes to improve overall
      readability.
      
      * Use consistent convention for multi-line macro definitions - "do {"
        or "({" are now put on their own lines and the line continuing '\'
        are all put on the same column.
      
      * Temp variables used inside macro are consistently given "__" prefix.
      
      * When a macro argument is passed to another macro or a function,
        putting extra parentheses around it doesn't help anything.  Don't put
        them.
      
      * _this_cpu_generic_*() are renamed to this_cpu_generic_*() so that
        they're consistent with raw_cpu_generic_*().
      
      * Reorganize raw_cpu_*() and this_cpu_*() definitions so that trivial
        wrappers are collected in one place after actual operation
        definitions.
      
      * Other misc cleanups including reorganizing comments.
      
      All changes in this patch are cosmetic and cause no functional
      difference.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Christoph Lameter <cl@linux.com>
      eba11788
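The formatting conventions listed in the prettify commit above are easier to picture on a concrete macro. The sketch below is a made-up example that follows three of them: "do {" on its own line with the continuation backslashes in one column, a "__" prefix on the temporary, and no extra parentheses around an argument that is simply passed on to a function. The macro and helper names are hypothetical and do not come from the percpu headers.

```c
/* Hypothetical helper standing in for whatever a real macro would call. */
static inline void sketch_store_sum(long *dst, long a, long b)
{
	*dst = a + b;
}

/*
 * Add val to the long variable var.
 *  - "do {" sits on its own line; every '\' is in the same column.
 *  - The temporary carries a "__" prefix.
 *  - val is handed to a function as-is: the argument list already
 *    delimits it, so wrapping it in parentheses would add nothing.
 *  - var still needs parentheses because '&' is applied to it directly.
 */
#define sketch_add_to(var, val)						\
do {									\
	long *__ptr = &(var);						\
	sketch_store_sum(__ptr, *__ptr, val);				\
} while (0)
```

Calling sketch_add_to(n, 1 + 2) on a variable holding 5 leaves 8 in it; the unparenthesized val is safe because the whole expression arrives as one function argument.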
    • T
      percpu: reorder macros in percpu header files · 9c28278a
      Tejun Heo committed
      * In include/asm-generic/percpu.h, collect {raw|_this}_cpu_generic*()
        macros into one place.  They were dispersed through
        {raw|this}_cpu_*_N() definitions and the visual inconsistency was
        making following the code unnecessarily difficult.
      
      * In include/linux/percpu-defs.h, move __verify_pcpu_ptr() later in
        the file so that it's right above accessor definitions where it's
        actually used.
      
      This is pure reorganization.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Christoph Lameter <cl@linux.com>
      9c28278a
    • T
      percpu: move generic {raw|this}_cpu_*_N() definitions to include/asm-generic/percpu.h · 47b69ad6
      Tejun Heo committed
      {raw|this}_cpu_*_N() operations are expected to be provided by archs
      and the generic definitions are provided as fallbacks.  As such, these
      firmly belong to include/asm-generic/percpu.h.
      
      Move the generic definitions to include/asm-generic/percpu.h.  The
      code is moved mostly verbatim; however, raw_cpu_*_N() are placed above
      this_cpu_*_N() which is more conventional as the raw operations may be
      used to define other variants.
      
      This is pure reorganization.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Christoph Lameter <cl@linux.com>
      47b69ad6
    • T
      percpu: include/asm-generic/percpu.h should contain only arch-overridable parts · 62fde541
      Tejun Heo committed
      The roles of the various percpu header files have become unclear.
      There are four header files involved.
      
       include/linux/percpu-defs.h
       include/linux/percpu.h
       include/asm-generic/percpu.h
       arch/*/include/asm/percpu.h
      
      The original intention for include/asm-generic/percpu.h is providing
      generic definitions for arch-overridable parts; however, it now hosts
      various stuff which can't be overridden by archs.
      
      Also, include/linux/percpu-defs.h was initially added to contain
      section and percpu variable definition macros so that arch header
      files can make use of them without worrying about introducing cyclic
      inclusion dependency by including include/linux/percpu.h; however,
      arch headers sometimes need to access percpu variables too and this is
      one of the reasons why some accessors were implemented in
      include/asm-generic/percpu.h.
      
      Let's clear up the situation by making include/asm-generic/percpu.h
      contain only arch-overridable parts and moving accessors and
      operations into include/linux/percpu-defs.h.  Note that this patch only
      moves things from include/asm-generic/percpu.h.
      include/linux/percpu.h will be taken care of by later patches.
      
      This patch moves the following.
      
      * SHIFT_PERCPU_PTR() / VERIFY_PERCPU_PTR()
      * per_cpu()
      * raw_cpu_ptr()
      * this_cpu_ptr()
      * __get_cpu_var()
      * __raw_get_cpu_var()
      * __this_cpu_ptr()
      * PER_CPU_[SHARED_]ALIGNED_SECTION
      * PER_CPU_FIRST_SECTION
      
      This patch is pure reorganization.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Christoph Lameter <cl@linux.com>
      62fde541
    • T
      percpu: introduce arch_raw_cpu_ptr() · bbc344e1
      Tejun Heo committed
      Currently, archs can override raw_cpu_ptr() directly; however, we
      wanna build a layer of indirection in the generic part of percpu so
      that we can implement generic features there without affecting archs.
      
      Introduce arch_raw_cpu_ptr() which is used to define raw_cpu_ptr() by
      generic percpu code.  The two are identical for now.  x86 is currently
      the only arch which overrides raw_cpu_ptr() and is converted to
      define arch_raw_cpu_ptr() instead.
      
      This doesn't introduce any functional difference.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      bbc344e1
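The layering introduced by the arch_raw_cpu_ptr() commit above can be imitated in userspace with a toy model. In the sketch below an architecture may define arch_raw_cpu_ptr() first; otherwise a generic fallback is used, and raw_cpu_ptr() is always built on top of it by the generic layer. The offset variable and the fallback body are assumptions invented for the illustration (the real per-CPU machinery is not modelled), and __typeof__ is a GNU C extension.

```c
#include <stddef.h>

/* Toy stand-in for the running CPU's per-CPU offset, in bytes. */
static size_t toy_this_cpu_offset;

/*
 * An architecture header could define arch_raw_cpu_ptr() before this
 * point.  If it did not, fall back to plain pointer arithmetic on the
 * toy offset.
 */
#ifndef arch_raw_cpu_ptr
#define arch_raw_cpu_ptr(ptr)						\
	((__typeof__(ptr))((char *)(ptr) + toy_this_cpu_offset))
#endif

/*
 * Generic layer.  For now it only forwards to the arch hook, but generic
 * features could later be added here without touching any architecture.
 */
#define raw_cpu_ptr(ptr)	arch_raw_cpu_ptr(ptr)
```

Treating a two-element int array as "one copy per CPU" and setting the toy offset to sizeof(int) makes raw_cpu_ptr() on the first element resolve to the second one, which is the kind of translation the real accessor performs.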
    • T
      percpu: disallow archs from overriding SHIFT_PERCPU_PTR() · 6adc5cac
      Tejun Heo committed
      It has been about half a decade since all archs started using the
      dynamic percpu allocator and thus the same SHIFT_PERCPU_PTR()
      implementation.  There's no benefit in overriding SHIFT_PERCPU_PTR()
      anymore.
      
      Remove #ifndef around it to clarify that this is identical regardless
      of the arch.
      
      This patch doesn't cause any functional difference.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Christoph Lameter <cl@linux.com>
      6adc5cac
  21. 07 Jun, 2014 1 commit
    • H
      include/asm-generic/ioctl.h: fix _IOC_TYPECHECK sparse error · d55875f5
      Hans Verkuil committed
      When running sparse over drivers/media/v4l2-core/v4l2-ioctl.c I get these
      errors:
      
        drivers/media/v4l2-core/v4l2-ioctl.c:2043:9: error: bad integer constant expression
        drivers/media/v4l2-core/v4l2-ioctl.c:2044:9: error: bad integer constant expression
        drivers/media/v4l2-core/v4l2-ioctl.c:2045:9: error: bad integer constant expression
        drivers/media/v4l2-core/v4l2-ioctl.c:2046:9: error: bad integer constant expression
      
      etc.
      
      The root cause of that turns out to be in include/asm-generic/ioctl.h:
      
      #include <uapi/asm-generic/ioctl.h>
      
      /* provoke compile error for invalid uses of size argument */
      extern unsigned int __invalid_size_argument_for_IOC;
      #define _IOC_TYPECHECK(t) \
              ((sizeof(t) == sizeof(t[1]) && \
                sizeof(t) < (1 << _IOC_SIZEBITS)) ? \
                sizeof(t) : __invalid_size_argument_for_IOC)
      
      If it is defined as this (as is already done if __KERNEL__ is not defined):
      
        #define _IOC_TYPECHECK(t) (sizeof(t))
      
      then all is well with the world.
      
      This patch allows sparse to work correctly.
      Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d55875f5
  22. 06 Jun, 2014 1 commit
  23. 05 Jun, 2014 1 commit
    • M
      x86: define _PAGE_NUMA by reusing software bits on the PMD and PTE levels · c46a7c81
      Mel Gorman committed
      _PAGE_NUMA is currently an alias of _PAGE_PROTNONE to trap NUMA hinting
      faults on x86.  Care is taken such that _PAGE_NUMA is used only in
      situations where the VMA flags distinguish between NUMA hinting faults
      and prot_none faults.  This decision was x86-specific and conceptually
      difficult, requiring special casing to distinguish between PROTNONE
      and NUMA ptes based on context.
      
      Fundamentally, we only need the _PAGE_NUMA bit to tell the difference
      between an entry that is really unmapped and a page that is protected
      for NUMA hinting faults as if the PTE is not present then a fault will
      be trapped.
      
      Swap PTEs on x86-64 use the bits after _PAGE_GLOBAL for the offset.
      This patch shrinks the maximum possible swap size and uses the bit to
      uniquely distinguish between NUMA hinting ptes and swap ptes.
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Anvin <hpa@zytor.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Steven Noonan <steven@uplinklabs.net>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c46a7c81
  24. 21 May, 2014 5 commits
  25. 15 May, 2014 1 commit
    • J
      asm-generic: remove _STK_LIM_MAX · ffe6902b
      James Hogan committed
      _STK_LIM_MAX could be used to override the RLIMIT_STACK hard limit from
      an arch's include/uapi/asm-generic/resource.h file, but is no longer
      used since both parisc and metag removed the override. Therefore remove
      it entirely, setting the hard RLIMIT_STACK limit to RLIM_INFINITY
      directly in include/asm-generic/resource.h.
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: linux-arch@vger.kernel.org
      Cc: Helge Deller <deller@gmx.de>
      Cc: John David Anglin <dave.anglin@bell.net>
      ffe6902b