1. 19 3月, 2014 1 次提交
  2. 20 2月, 2014 2 次提交
    • P
      sparc32: make copy_to/from_user_page() usable from modular code · a56b072f
      Paul Gortmaker 提交于
      While copy_to/from_user_page() users are uncommon, there is one in
      drivers/staging/lustre/lustre/libcfs/linux/linux-curproc.c which leads
      to the following:
      
      ERROR: "sparc32_cachetlb_ops" [drivers/staging/lustre/lustre/libcfs/libcfs.ko] undefined!
      
      during routine allmodconfig build coverage.  The reason this happens
      is as follows:
      
      In arch/sparc/include/asm/cacheflush_32.h we have:
      
       #define flush_cache_page(vma,addr,pfn) \
              sparc32_cachetlb_ops->cache_page(vma, addr)
      
       #define copy_to_user_page(vma, page, vaddr, dst, src, len) \
              do {                                                    \
                      flush_cache_page(vma, vaddr, page_to_pfn(page));\
                      memcpy(dst, src, len);                          \
              } while (0)
       #define copy_from_user_page(vma, page, vaddr, dst, src, len) \
              do {                                                    \
                      flush_cache_page(vma, vaddr, page_to_pfn(page));\
                      memcpy(dst, src, len);                          \
              } while (0)
      
      However, sparc32_cachetlb_ops isn't exported and hence the error.
      Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a56b072f
    • P
      sparc32: fix build failure for arch_jump_label_transform · 4f6500ff
      Paul Gortmaker 提交于
      In arch/sparc/Kernel/Makefile, we see:
      
         obj-$(CONFIG_SPARC64)   += jump_label.o
      
      However, the Kconfig selects HAVE_ARCH_JUMP_LABEL unconditionally
      for all SPARC.  This in turn leads to the following failure when
      doing allmodconfig coverage builds:
      
      kernel/built-in.o: In function `__jump_label_update':
      jump_label.c:(.text+0x8560c): undefined reference to `arch_jump_label_transform'
      kernel/built-in.o: In function `arch_jump_label_transform_static':
      (.text+0x85cf4): undefined reference to `arch_jump_label_transform'
      make: *** [vmlinux] Error 1
      
      Change HAVE_ARCH_JUMP_LABEL to be conditional on SPARC64 so that it
      matches the Makefile.
      Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4f6500ff
  3. 29 1月, 2014 3 次提交
  4. 24 1月, 2014 1 次提交
  5. 22 1月, 2014 1 次提交
    • T
      memblock: make memblock_set_node() support different memblock_type · e7e8de59
      Tang Chen 提交于
      [sfr@canb.auug.org.au: fix powerpc build]
      Signed-off-by: NTang Chen <tangchen@cn.fujitsu.com>
      Reviewed-by: NZhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "Rafael J . Wysocki" <rjw@sisk.pl>
      Cc: Chen Tang <imtangchen@gmail.com>
      Cc: Gong Chen <gong.chen@linux.intel.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Larry Woodman <lwoodman@redhat.com>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Liu Jiang <jiang.liu@huawei.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Renninger <trenn@suse.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
      Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e7e8de59
  6. 19 1月, 2014 1 次提交
    • M
      net: introduce SO_BPF_EXTENSIONS · ea02f941
      Michal Sekletar 提交于
      For user space packet capturing libraries such as libpcap, there's
      currently only one way to check which BPF extensions are supported
      by the kernel, that is, commit aa1113d9 ("net: filter: return
      -EINVAL if BPF_S_ANC* operation is not supported"). For querying all
      extensions at once this might be rather inconvenient.
      
      Therefore, this patch introduces a new option which can be used as
      an argument for getsockopt(), and allows one to obtain information
      about which BPF extensions are supported by the current kernel.
      
      As David Miller suggests, we do not need to define any bits right
      now and status quo can just return 0 in order to state that this
      versions supports SKF_AD_PROTOCOL up to SKF_AD_PAY_OFFSET. Later
      additions to BPF extensions need to add their bits to the
      bpf_tell_extensions() function, as documented in the comment.
      Signed-off-by: NMichal Sekletar <msekleta@redhat.com>
      Cc: David Miller <davem@davemloft.net>
      Reviewed-by: NDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ea02f941
  7. 16 1月, 2014 1 次提交
  8. 12 1月, 2014 2 次提交
    • P
      arch: Introduce smp_load_acquire(), smp_store_release() · 47933ad4
      Peter Zijlstra 提交于
      A number of situations currently require the heavyweight smp_mb(),
      even though there is no need to order prior stores against later
      loads.  Many architectures have much cheaper ways to handle these
      situations, but the Linux kernel currently has no portable way
      to make use of them.
      
      This commit therefore supplies smp_load_acquire() and
      smp_store_release() to remedy this situation.  The new
      smp_load_acquire() primitive orders the specified load against
      any subsequent reads or writes, while the new smp_store_release()
      primitive orders the specifed store against any prior reads or
      writes.  These primitives allow array-based circular FIFOs to be
      implemented without an smp_mb(), and also allow a theoretical
      hole in rcu_assign_pointer() to be closed at no additional
      expense on most architectures.
      
      In addition, the RCU experience transitioning from explicit
      smp_read_barrier_depends() and smp_wmb() to rcu_dereference()
      and rcu_assign_pointer(), respectively resulted in substantial
      improvements in readability.  It therefore seems likely that
      replacing other explicit barriers with smp_load_acquire() and
      smp_store_release() will provide similar benefits.  It appears
      that roughly half of the explicit barriers in core kernel code
      might be so replaced.
      
      [Changelog by PaulMck]
      Reviewed-by: N"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Acked-by: NWill Deacon <will.deacon@arm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Michael Ellerman <michael@ellerman.id.au>
      Cc: Michael Neuling <mikey@neuling.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Victor Kaplansky <VICTORK@il.ibm.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Link: http://lkml.kernel.org/r/20131213150640.908486364@infradead.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      47933ad4
    • P
      arch: Clean up asm/barrier.h implementations using asm-generic/barrier.h · 93ea02bb
      Peter Zijlstra 提交于
      We're going to be adding a few new barrier primitives, and in order to
      avoid endless duplication make more agressive use of
      asm-generic/barrier.h.
      
      Change the asm-generic/barrier.h such that it allows partial barrier
      definitions and fills out the rest with defaults.
      
      There are a few architectures (m32r, m68k) that could probably
      do away with their barrier.h file entirely but are kept for now due to
      their unconventional nop() implementation.
      Suggested-by: NGeert Uytterhoeven <geert@linux-m68k.org>
      Reviewed-by: N"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Reviewed-by: NMathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Michael Ellerman <michael@ellerman.id.au>
      Cc: Michael Neuling <mikey@neuling.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Victor Kaplansky <VICTORK@il.ibm.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Link: http://lkml.kernel.org/r/20131213150640.846368594@infradead.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      93ea02bb
  9. 05 1月, 2014 3 次提交
  10. 03 1月, 2014 1 次提交
  11. 22 12月, 2013 1 次提交
    • Y
      PCI: Convert pcibios_resource_to_bus() to take a pci_bus, not a pci_dev · fc279850
      Yinghai Lu 提交于
      These interfaces:
      
        pcibios_resource_to_bus(struct pci_dev *dev, *bus_region, *resource)
        pcibios_bus_to_resource(struct pci_dev *dev, *resource, *bus_region)
      
      took a pci_dev, but they really depend only on the pci_bus.  And we want to
      use them in resource allocation paths where we have the bus but not a
      device, so this patch converts them to take the pci_bus instead of the
      pci_dev:
      
        pcibios_resource_to_bus(struct pci_bus *bus, *bus_region, *resource)
        pcibios_bus_to_resource(struct pci_bus *bus, *resource, *bus_region)
      
      In fact, with standard PCI-PCI bridges, they only depend on the host
      bridge, because that's the only place address translation occurs, but
      we aren't going that far yet.
      
      [bhelgaas: changelog]
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
      fc279850
  12. 19 12月, 2013 1 次提交
    • R
      mm: fix TLB flush race between migration, and change_protection_range · 20841405
      Rik van Riel 提交于
      There are a few subtle races, between change_protection_range (used by
      mprotect and change_prot_numa) on one side, and NUMA page migration and
      compaction on the other side.
      
      The basic race is that there is a time window between when the PTE gets
      made non-present (PROT_NONE or NUMA), and the TLB is flushed.
      
      During that time, a CPU may continue writing to the page.
      
      This is fine most of the time, however compaction or the NUMA migration
      code may come in, and migrate the page away.
      
      When that happens, the CPU may continue writing, through the cached
      translation, to what is no longer the current memory location of the
      process.
      
      This only affects x86, which has a somewhat optimistic pte_accessible.
      All other architectures appear to be safe, and will either always flush,
      or flush whenever there is a valid mapping, even with no permissions
      (SPARC).
      
      The basic race looks like this:
      
      CPU A			CPU B			CPU C
      
      						load TLB entry
      make entry PTE/PMD_NUMA
      			fault on entry
      						read/write old page
      			start migrating page
      			change PTE/PMD to new page
      						read/write old page [*]
      flush TLB
      						reload TLB from new entry
      						read/write new page
      						lose data
      
      [*] the old page may belong to a new user at this point!
      
      The obvious fix is to flush remote TLB entries, by making sure that
      pte_accessible aware of the fact that PROT_NONE and PROT_NUMA memory may
      still be accessible if there is a TLB flush pending for the mm.
      
      This should fix both NUMA migration and compaction.
      
      [mgorman@suse.de: fix build]
      Signed-off-by: NRik van Riel <riel@redhat.com>
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Cc: Alex Thorlton <athorlton@sgi.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      20841405
  13. 18 12月, 2013 1 次提交
  14. 12 12月, 2013 1 次提交
  15. 04 12月, 2013 1 次提交
  16. 19 11月, 2013 2 次提交
  17. 15 11月, 2013 6 次提交
  18. 14 11月, 2013 4 次提交
  19. 13 11月, 2013 7 次提交
    • E
      errno.h: remove "NFS" from descriptions in comments · 0ca43435
      Eric Sandeen 提交于
      glibc recently changed the error string for ESTALE to remove "NFS" -
      
      https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=96945714ec61951cc748da2b4b8a80cf02127ee9
      
      from: [ERR_REMAP (ESTALE)] = N_("Stale NFS file handle"),
      to:   [ERR_REMAP (ESTALE)] = N_("Stale file handle"),
      
      And some have expressed concern that the kernel's errno.h
      comments still refer to NFS.
      
      So make that change... note that this is a comment-only change,
      and has no functional difference.
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0ca43435
    • J
      mm/arch: use NUMA_NO_NODE · 40c3baa7
      Jianguo Wu 提交于
      Use more appropriate NUMA_NO_NODE instead of -1 in all archs' module_alloc()
      Signed-off-by: NJianguo Wu <wujianguo@huawei.com>
      Acked-by: NDavid Rientjes <rientjes@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      40c3baa7
    • D
      sparc64: Move to 64-bit PGDs and PMDs. · 2b77933c
      David S. Miller 提交于
      To make the page tables compact, we were using 32-bit PGDs and PMDs.
      We only had to support <= 43 bits of physical addresses so this was
      quite feasible.
      
      In order to support larger physical addresses we have to move to
      64-bit PGDs and PMDs.
      
      Most of the changes are straight-forward:
      
      1) {pgd,pmd}_t --> unsigned long
      
      2) Anything that tries to use plain "unsigned int" types with pgd/pmd
         values needs to be adjusted.  In particular things like "0U" become
         "0UL".
      
      3) {PGDIR,PMD}_BITS decrease by one.
      
      4) In the assembler page table walkers, use "ldxa" instead of "lduwa"
         and adjust the low bit masks to clear out the low 3 bits instead of
         just the low 2 bits during pgd/pmd address formation.
      
      Also, use PTRS_PER_PGD and PTRS_PER_PMD in the sizing of the
      swapper_{pg_dir,low_pmd_dir} arrays.
      
      This patch does not try to take advantage of having 64-bits in the
      PMDs to simplify the hugepage code, that will come in a subsequent
      change.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2b77933c
    • D
      sparc64: Move from 4MB to 8MB huge pages. · 37b3a8ff
      David S. Miller 提交于
      The impetus for this is that we would like to move to 64-bit PMDs and
      PGDs, but that would result in only supporting a 42-bit address space
      with the current page table layout.  It'd be nice to support at least
      43-bits.
      
      The reason we'd end up with only 42-bits after making PMDs and PGDs
      64-bit is that we only use half-page sized PTE tables in order to make
      PMDs line up to 4MB, the hardware huge page size we use.
      
      So what we do here is we make huge pages 8MB, and fabricate them using
      4MB hw TLB entries.
      
      Facilitate this by providing a "REAL_HPAGE_SHIFT" which is used in
      places that really need to operate on hardware 4MB pages.
      
      Use full pages (512 entries) for PTE tables, and adjust PMD_SHIFT,
      PGD_SHIFT, and the build time CPP test as needed.  Use a CPP test to
      make sure REAL_HPAGE_SHIFT and the _PAGE_SZHUGE_* we use match up.
      
      This makes the pgtable cache completely unused, so remove the code
      managing it and the state used in mm_context_t.  Now we have less
      spinlocks taken in the page table allocation path.
      
      The technique we use to fabricate the 8MB pages is to transfer bit 22
      from the missing virtual address into the PTEs physical address field.
      That takes care of the transparent huge pages case.
      
      For hugetlb, we fill things in at the PTE level and that code already
      puts the sub huge page physical bits into the PTEs, based upon the
      offset, so there is nothing special we need to do.  It all just works
      out.
      
      So, a small amount of complexity in the THP case, but this code is
      about to get much simpler when we move the 64-bit PMDs as we can move
      away from the fancy 32-bit huge PMD encoding and just put a real PTE
      value in there.
      
      With bug fixes and help from Bob Picco.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      37b3a8ff
    • D
      sparc64: Make PAGE_OFFSET variable. · b2d43834
      David S. Miller 提交于
      Choose PAGE_OFFSET dynamically based upon cpu type.
      
      Original UltraSPARC-I (spitfire) chips only supported a 44-bit
      virtual address space.
      
      Newer chips (T4 and later) support 52-bit virtual addresses
      and up to 47-bits of physical memory space.
      
      Therefore we have to adjust PAGE_SIZE dynamically based upon
      the capabilities of the chip.
      
      Note that this change alone does not allow us to support > 43-bit
      physical memory, to do that we need to re-arrange our page table
      support.  The current encodings of the pmd_t and pgd_t pointers
      restricts us to "32 + 11" == 43 bits.
      
      This change can waste quite a bit of memory for the various tables.
      In particular, a future change should work to size and allocate
      kern_linear_bitmap[] and sparc64_valid_addr_bitmap[] dynamically.
      This isn't easy as we really cannot take a TLB miss when accessing
      kern_linear_bitmap[].  We'd have to lock it into the TLB or similar.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Acked-by: NBob Picco <bob.picco@oracle.com>
      b2d43834
    • D
      sparc64: Fix inconsistent max-physical-address defines. · f998c9c0
      David S. Miller 提交于
      Some parts of the code use '41' others use '42', make them
      all use the same value.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Acked-by: NBob Picco <bob.picco@oracle.com>
      f998c9c0
    • D
      sparc64: Document the shift counts used to validate linear kernel addresses. · bb7b4353
      David S. Miller 提交于
      This way we can see exactly what they are derived from, and in particular
      how they would change if we were to use a different PAGE_OFFSET value.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Acked-by: NBob Picco <bob.picco@oracle.com>
      bb7b4353