1. 30 4月, 2013 10 次提交
  2. 26 4月, 2013 11 次提交
    • S
      s390/pci: use pci_scan_root_bus · 1c21351b
      Sebastian Ott 提交于
      The pci config space accessors on s390 are (now) smart enough to
      figure out if a pci function is available. So instead of calling
      pci_create_root_bus and then pci_scan_single_device for each
      available function just call pci_scan_root_bus and let the pci core
      do the scanning (via config reads on all possible functions) and
      device creation.
      Reviewed-by: NGerald Schaefer <gerald.schaefer@de.ibm.com>
      Signed-off-by: NSebastian Ott <sebott@linux.vnet.ibm.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      1c21351b
    • H
      s390: remove small stack config option · 581618f2
      Heiko Carstens 提交于
      We've seen repeatedly that 8KB stack size on 64 bit kernels
      is not sufficient.
      So simply remove the config option.
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      581618f2
    • M
      s390: system call path micro optimization · 61649881
      Martin Schwidefsky 提交于
      Add a pointer to the system call table to the thread_info structure.
      The TIF_31BIT bit is set or cleared by SET_PERSONALITY exactly once
      for the lifetime of a process. With the pointer to the correct system
      call table in thread_info the system call code in entry64.S path can
      drop the check for TIF_31BIT which saves a couple of instructions.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      61649881
    • M
      s390: lowcore stack pointer offsets · dc7ee00d
      Martin Schwidefsky 提交于
      Store the stack pointers in the lowcore for the kernel stack, the async
      stack and the panic stack with the offset required for the first user.
      This avoids an unnecessary add instruction on the system call path.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      dc7ee00d
    • J
      parisc: use spin_lock_irqsave/spin_unlock_irqrestore for PTE updates · bda079d3
      John David Anglin 提交于
      User applications running on SMP kernels have long suffered from instability
      and random segmentation faults.  This patch improves the situation although
      there is more work to be done.
      
      One of the problems is the various routines in pgtable.h that update page table
      entries use different locking mechanisms, or no lock at all (set_pte_at).  This
      change modifies the routines to all use the same lock pa_dbit_lock.  This lock
      is used for dirty bit updates in the interruption code. The patch also purges
      the TLB entries associated with the PTE to ensure that inconsistent values are
      not used after the page table entry is updated.  The UP and SMP code are now
      identical.
      
      The change also includes a minor update to the purge_tlb_entries function in
      cache.c to improve its efficiency.
      Signed-off-by: NJohn David Anglin <dave.anglin@bell.net>
      Cc: Helge Deller <deller@gmx.de>
      Signed-off-by: NHelge Deller <deller@gmx.de>
      bda079d3
    • H
      parisc: disable -mlong-calls compiler option for kernel modules · cf71130d
      Helge Deller 提交于
      CONFIG_MLONGCALLS was introduced in commit
      ec758f98 to overcome linker issues when linking
      huge linux kernels, e.g. with many modules linked in.
      
      But in the kernel module loader there is no support yet for the new relocation
      types, which is why modules built with -mlong-calls can't be loaded.
      Furthermore, for modules long calls are not really necessary, since we already
      use stub sections which resolve long distance calls.
      
      So, let's just disable this compiler option when compiling kernel modules.
      Signed-off-by: NHelge Deller <deller@gmx.de>
      cf71130d
    • W
      parisc: uaccess: fix compiler warnings caused by __put_user casting · 0f28b628
      Will Deacon 提交于
      When targetting 32-bit processors, __put_user emits a pair of stw
      instructions for the 8-byte case. If the type of __val is a pointer, the
      marshalling code casts it to the wider integer type of u64, resulting
      in the following compiler warnings:
      
        kernel/signal.c: In function 'copy_siginfo_to_user':
        kernel/signal.c:2752:11: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
        kernel/signal.c:2752:11: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
        [...]
      
      This patch fixes the warnings by removing the marshalling code and using
      the correct output modifiers in the __put_{user,kernel}_asm64 macros
      so that GCC will allocate the right registers without the need to
      extract the two words explicitly.
      
      Cc: Helge Deller <deller@gmx.de>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NHelge Deller <deller@gmx.de>
      0f28b628
    • J
      parisc: Change kunmap macro to static inline function · 87be2f88
      John David Anglin 提交于
      Change kunmap macro to static inline function to fix build error
      compiling drivers/base/dma-buf.c.
      
      Without the change, the following error can occur:
      
         CC      drivers/base/dma-buf.o
      drivers/base/dma-buf.c: In function 'dma_buf_kunmap':
      drivers/base/dma-buf.c:427:46:
      error: macro "kunmap" passed 3 arguments, but takes just 1
      
      I believe parisc is the only arch to implement kunmap using a macro.
      Signed-off-by: NJohn David Anglin <dave.anglin@bell.net>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Helge Deller <deller@gmx.de>
      Signed-off-by: NHelge Deller <deller@gmx.de>
      87be2f88
    • J
      parisc: Provide __ucmpdi2 to resolve undefined references in 32 bit builds. · ca0ad83d
      John David Anglin 提交于
      The Debian experimental linux source package (3.8.5-1) build fails
      with the following errors:
      ...
      MODPOST 2016 modules
      ERROR: "__ucmpdi2" [fs/btrfs/btrfs.ko] undefined!
      ERROR: "__ucmpdi2" [drivers/md/dm-verity.ko] undefined!
      
      The attached patch resolves this problem.  It is based on the s390
      implementation of ucmpdi2.c.
      Signed-off-by: NJohn David Anglin <dave.anglin@bell.net>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Signed-off-by: NHelge Deller <deller@gmx.de>
      ca0ad83d
    • A
      USB: OMAP: ISP1301 needs USB_PHY · c3c683ea
      Arnd Bergmann 提交于
      The Kconfig entry for USB_OMAP unconditionally selects USB_ISP1301,
      which is now only visible when USB_PHY is also enabled.
      
      This adds an appropriate dependency and enables USB_PHY in the omap1
      defconfig, avoiding these build warnings:
      
      warning: (USB_OHCI_HCD && USB_OMAP) selects ISP1301_OMAP which has unmet direct dependencies (USB_SUPPORT && USB_PHY && I2C && ARCH_OMAP_OTG)
      
      Also fix a Makefile typo while we're at it.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Cc: Felipe Balbi <balbi@ti.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c3c683ea
    • A
      USB: lpc32xx: ISP1301 needs USB_PHY · 64e98a79
      Arnd Bergmann 提交于
      The Kconfig entry for USB_LPC32XX unconditionally selects USB_ISP1301,
      which is now only visible when USB_PHY is also enabled.
      
      This adds an appropriate dependency and enables USB_PHY in the msm
      defconfig, avoiding these build errors:
      
      warning: (USB_LPC32XX) selects USB_ISP1301 which has unmet direct dependencies (USB_SUPPORT && USB_PHY && (USB || USB_GADGET) && I2C)
      drivers/built-in.o: In function `usb_hcd_nxp_probe':
      drivers/usb/host/ohci-nxp.c:224: undefined reference to `isp1301_get_client'
      drivers/built-in.o: In function `lpc32xx_udc_probe':
      drivers/usb/gadget/lpc32xx_udc.c:3071: undefined reference to `isp1301_get_client'
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Cc: Felipe Balbi <balbi@ti.com>
      Cc: Roland Stigge <stigge@antcom.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      64e98a79
  3. 25 4月, 2013 1 次提交
  4. 24 4月, 2013 2 次提交
    • J
      efi: Check EFI revision in setup_efi_vars · f697036b
      Josh Boyer 提交于
      We need to check the runtime sys_table for the EFI version the firmware
      specifies instead of just checking for a NULL QueryVariableInfo.  Older
      implementations of EFI don't have QueryVariableInfo but the runtime is
      a smaller structure, so the pointer to it may be pointing off into garbage.
      
      This is apparently the case with several Apple firmwares that support EFI
      1.10, and the current check causes them to no longer boot.  Fix based on
      a suggestion from Matthew Garrett.
      Signed-off-by: NJosh Boyer <jwboyer@redhat.com>
      Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
      f697036b
    • B
      x86, efi: Fix a build warning · 51f8fbba
      Borislav Petkov 提交于
      Fix this:
      
      arch/x86/boot/compressed/eboot.c: In function ‘setup_efi_vars’:
      arch/x86/boot/compressed/eboot.c:269:2: warning: passing argument 1 of ‘efi_call_phys’ makes pointer from integer without a cast [enabled by default]
      In file included from arch/x86/boot/compressed/eboot.c:12:0:
      /w/kernel/linux/arch/x86/include/asm/efi.h:8:33: note: expected ‘void *’ but argument is of type ‘long unsigned int’
      
      after cc5a080c ("efi: Pass boot services variable info to runtime
      code").
      Reported-by: NPaul Bolle <pebolle@tiscali.nl>
      Cc: Matthew Garrett <matthew.garrett@nebula.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
      51f8fbba
  5. 23 4月, 2013 11 次提交
  6. 20 4月, 2013 3 次提交
    • H
      x86, microcode: Verify the family before dispatching microcode patching · 74c3e3fc
      H. Peter Anvin 提交于
      For each CPU vendor that implements CPU microcode patching, there will
      be a minimum family for which this is implemented.  Verify this
      minimum level of support.
      
      This can be done in the dispatch function or early in the application
      functions.  Doing the latter turned out to be somewhat awkward because
      of the ineviable split between the BSP and the AP paths, and rather
      than pushing deep into the application functions, do this in
      the dispatch function.
      Reported-by: N"Bryan O'Donoghue" <bryan.odonoghue.lkml@nexus-software.ie>
      Suggested-by: NBorislav Petkov <bp@alien8.de>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Link: http://lkml.kernel.org/r/1366392183-4149-1-git-send-email-bryan.odonoghue.lkml@nexus-software.ie
      74c3e3fc
    • D
      sparc64: Fix race in TLB batch processing. · f36391d2
      David S. Miller 提交于
      As reported by Dave Kleikamp, when we emit cross calls to do batched
      TLB flush processing we have a race because we do not synchronize on
      the sibling cpus completing the cross call.
      
      So meanwhile the TLB batch can be reset (tb->tlb_nr set to zero, etc.)
      and either flushes are missed or flushes will flush the wrong
      addresses.
      
      Fix this by using generic infrastructure to synchonize on the
      completion of the cross call.
      
      This first required getting the flush_tlb_pending() call out from
      switch_to() which operates with locks held and interrupts disabled.
      The problem is that smp_call_function_many() cannot be invoked with
      IRQs disabled and this is explicitly checked for with WARN_ON_ONCE().
      
      We get the batch processing outside of locked IRQ disabled sections by
      using some ideas from the powerpc port. Namely, we only batch inside
      of arch_{enter,leave}_lazy_mmu_mode() calls.  If we're not in such a
      region, we flush TLBs synchronously.
      
      1) Get rid of xcall_flush_tlb_pending and per-cpu type
         implementations.
      
      2) Do TLB batch cross calls instead via:
      
      	smp_call_function_many()
      		tlb_pending_func()
      			__flush_tlb_pending()
      
      3) Batch only in lazy mmu sequences:
      
      	a) Add 'active' member to struct tlb_batch
      	b) Define __HAVE_ARCH_ENTER_LAZY_MMU_MODE
      	c) Set 'active' in arch_enter_lazy_mmu_mode()
      	d) Run batch and clear 'active' in arch_leave_lazy_mmu_mode()
      	e) Check 'active' in tlb_batch_add_one() and do a synchronous
                 flush if it's clear.
      
      4) Add infrastructure for synchronous TLB page flushes.
      
      	a) Implement __flush_tlb_page and per-cpu variants, patch
      	   as needed.
      	b) Likewise for xcall_flush_tlb_page.
      	c) Implement smp_flush_tlb_page() to invoke the cross-call.
      	d) Wire up global_flush_tlb_page() to the right routine based
                 upon CONFIG_SMP
      
      5) It turns out that singleton batches are very common, 2 out of every
         3 batch flushes have only a single entry in them.
      
         The batch flush waiting is very expensive, both because of the poll
         on sibling cpu completeion, as well as because passing the tlb batch
         pointer to the sibling cpus invokes a shared memory dereference.
      
         Therefore, in flush_tlb_pending(), if there is only one entry in
         the batch perform a completely asynchronous global_flush_tlb_page()
         instead.
      Reported-by: NDave Kleikamp <dave.kleikamp@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Acked-by: NDave Kleikamp <dave.kleikamp@oracle.com>
      f36391d2
    • S
      ARM: 7699/1: sched_clock: Add more notrace to prevent recursion · cea15092
      Stephen Boyd 提交于
      cyc_to_sched_clock() is called by sched_clock() and cyc_to_ns()
      is called by cyc_to_sched_clock(). I suspect that some compilers
      inline both of these functions into sched_clock() and so we've
      been getting away without having a notrace marking. It seems that
      my compiler isn't inlining cyc_to_sched_clock() though, so I'm
      hitting a recursion bug when I enable the function graph tracer,
      causing my system to crash. Marking these functions notrace fixes
      it. Technically cyc_to_ns() doesn't need the notrace because it's
      already marked inline, but let's just add it so that if we ever
      remove inline from that function it doesn't blow up.
      Signed-off-by: NStephen Boyd <sboyd@codeaurora.org>
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      cea15092
  7. 19 4月, 2013 2 次提交
    • W
      mutex: Back out architecture specific check for negative mutex count · cc189d25
      Waiman Long 提交于
      Linus suggested that probably all the supported architectures can
      allow a negative mutex count without incorrect behavior, so we can
      then back out the architecture specific change and allow the
      mutex count to go to any negative number. That should further
      reduce contention for non-x86 architecture.
      Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NWaiman Long <Waiman.Long@hp.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Chandramouleeswaran Aswin <aswin@hp.com>
      Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
      Cc: Norton Scott J <scott.norton@hp.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Clark Williams <williams@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1366226594-5506-5-git-send-email-Waiman.Long@hp.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      cc189d25
    • W
      mutex: Make more scalable by doing less atomic operations · 0dc8c730
      Waiman Long 提交于
      In the __mutex_lock_common() function, an initial entry into
      the lock slow path will cause two atomic_xchg instructions to be
      issued. Together with the atomic decrement in the fast path, a
      total of three atomic read-modify-write instructions will be
      issued in rapid succession. This can cause a lot of cache
      bouncing when many tasks are trying to acquire the mutex at the
      same time.
      
      This patch will reduce the number of atomic_xchg instructions
      used by checking the counter value first before issuing the
      instruction. The atomic_read() function is just a simple memory
      read. The atomic_xchg() function, on the other hand, can be up
      to 2 order of magnitude or even more in cost when compared with
      atomic_read(). By using atomic_read() to check the value first
      before calling atomic_xchg(), we can avoid a lot of unnecessary
      cache coherency traffic. The only downside with this change is
      that a task on the slow path will have a tiny bit less chance of
      getting the mutex when competing with another task in the fast
      path.
      
      The same is true for the atomic_cmpxchg() function in the
      mutex-spin-on-owner loop. So an atomic_read() is also performed
      before calling atomic_cmpxchg().
      
      The mutex locking and unlocking code for the x86 architecture
      can allow any negative number to be used in the mutex count to
      indicate that some tasks are waiting for the mutex. I am not so
      sure if that is the case for the other architectures. So the
      default is to avoid atomic_xchg() if the count has already been
      set to -1. For x86, the check is modified to include all
      negative numbers to cover a larger case.
      
      The following table shows the jobs per minutes (JPM) scalability
      data on an 8-node 80-core Westmere box with a 3.7.10 kernel. The
      numactl command is used to restrict the running of the
      high_systime workloads to 1/2/4/8 nodes with hyperthreading on
      and off.
      
      +-----------------+-----------+------------+----------+
      |  Configuration  | Mean JPM  |  Mean JPM  | % Change |
      |		  | w/o patch | with patch |	      |
      +-----------------+-----------------------------------+
      |		  |      User Range 1100 - 2000	      |
      +-----------------+-----------------------------------+
      | 8 nodes, HT on  |    36980   |   148590  | +301.8%  |
      | 8 nodes, HT off |    42799   |   145011  | +238.8%  |
      | 4 nodes, HT on  |    61318   |   118445  |  +51.1%  |
      | 4 nodes, HT off |   158481   |   158592  |   +0.1%  |
      | 2 nodes, HT on  |   180602   |   173967  |   -3.7%  |
      | 2 nodes, HT off |   198409   |   198073  |   -0.2%  |
      | 1 node , HT on  |   149042   |   147671  |   -0.9%  |
      | 1 node , HT off |   126036   |   126533  |   +0.4%  |
      +-----------------+-----------------------------------+
      |		  |       User Range 200 - 1000	      |
      +-----------------+-----------------------------------+
      | 8 nodes, HT on  |   41525    |   122349  | +194.6%  |
      | 8 nodes, HT off |   49866    |   124032  | +148.7%  |
      | 4 nodes, HT on  |   66409    |   106984  |  +61.1%  |
      | 4 nodes, HT off |  119880    |   130508  |   +8.9%  |
      | 2 nodes, HT on  |  138003    |   133948  |   -2.9%  |
      | 2 nodes, HT off |  132792    |   131997  |   -0.6%  |
      | 1 node , HT on  |  116593    |   115859  |   -0.6%  |
      | 1 node , HT off |  104499    |   104597  |   +0.1%  |
      +-----------------+------------+-----------+----------+
      
      At low user range 10-100, the JPM differences were within +/-1%.
      So they are not that interesting.
      
      AIM7 benchmark run has a pretty large run-to-run variance due to
      random nature of the subtests executed. So a difference of less
      than +-5% may not be really significant.
      
      This patch improves high_systime workload performance at 4 nodes
      and up by maintaining transaction rates without significant
      drop-off at high node count.  The patch has practically no
      impact on 1 and 2 nodes system.
      
      The table below shows the percentage time (as reported by perf
      record -a -s -g) spent on the __mutex_lock_slowpath() function
      by the high_systime workload at 1500 users for 2/4/8-node
      configurations with hyperthreading off.
      
      +---------------+-----------------+------------------+---------+
      | Configuration | %Time w/o patch | %Time with patch | %Change |
      +---------------+-----------------+------------------+---------+
      |    8 nodes    |      65.34%     |      0.69%       |  -99%   |
      |    4 nodes    |       8.70%	  |      1.02%	     |  -88%   |
      |    2 nodes    |       0.41%     |      0.32%       |  -22%   |
      +---------------+-----------------+------------------+---------+
      
      It is obvious that the dramatic performance improvement at 8
      nodes was due to the drastic cut in the time spent within the
      __mutex_lock_slowpath() function.
      
      The table below show the improvements in other AIM7 workloads
      (at 8 nodes, hyperthreading off).
      
      +--------------+---------------+----------------+-----------------+
      |   Workload   | mean % change | mean % change  | mean % change   |
      |              | 10-100 users  | 200-1000 users | 1100-2000 users |
      +--------------+---------------+----------------+-----------------+
      | alltests     |     +0.6%     |   +104.2%      |   +185.9%       |
      | five_sec     |     +1.9%     |     +0.9%      |     +0.9%       |
      | fserver      |     +1.4%     |     -7.7%      |     +5.1%       |
      | new_fserver  |     -0.5%     |     +3.2%      |     +3.1%       |
      | shared       |    +13.1%     |   +146.1%      |   +181.5%       |
      | short        |     +7.4%     |     +5.0%      |     +4.2%       |
      +--------------+---------------+----------------+-----------------+
      Signed-off-by: NWaiman Long <Waiman.Long@hp.com>
      Reviewed-by: NDavidlohr Bueso <davidlohr.bueso@hp.com>
      Reviewed-by: NRik van Riel <riel@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Chandramouleeswaran Aswin <aswin@hp.com>
      Cc: Norton: Scott J <scott.norton@hp.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Clark Williams <williams@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1366226594-5506-3-git-send-email-Waiman.Long@hp.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      0dc8c730