1. 09 3月, 2012 20 次提交
    • B
      powerpc: Rework lazy-interrupt handling · 7230c564
      Benjamin Herrenschmidt 提交于
      The current implementation of lazy interrupts handling has some
      issues that this tries to address.
      
      We don't do the various workarounds we need to do when re-enabling
      interrupts in some cases such as when returning from an interrupt
      and thus we may still lose or get delayed decrementer or doorbell
      interrupts.
      
      The current scheme also makes it much harder to handle the external
      "edge" interrupts provided by some BookE processors when using the
      EPR facility (External Proxy) and the Freescale Hypervisor.
      
      Additionally, we tend to keep interrupts hard disabled in a number
      of cases, such as decrementer interrupts, external interrupts, or
      when a masked decrementer interrupt is pending. This is sub-optimal.
      
      This is an attempt at fixing it all in one go by reworking the way
      we do the lazy interrupt disabling from the ground up.
      
      The base idea is to replace the "hard_enabled" field with a
      "irq_happened" field in which we store a bit mask of what interrupt
      occurred while soft-disabled.
      
      When re-enabling, either via arch_local_irq_restore() or when returning
      from an interrupt, we can now decide what to do by testing bits in that
      field.
      
      We then implement replaying of the missed interrupts either by
      re-using the existing exception frame (in exception exit case) or via
      the creation of a new one from an assembly trampoline (in the
      arch_local_irq_enable case).
      
      This removes the need to play with the decrementer to try to create
      fake interrupts, among others.
      
      In addition, this adds a few refinements:
      
       - We no longer  hard disable decrementer interrupts that occur
      while soft-disabled. We now simply bump the decrementer back to max
      (on BookS) or leave it stopped (on BookE) and continue with hard interrupts
      enabled, which means that we'll potentially get better sample quality from
      performance monitor interrupts.
      
       - Timer, decrementer and doorbell interrupts now hard-enable
      shortly after removing the source of the interrupt, which means
      they no longer run entirely hard disabled. Again, this will improve
      perf sample quality.
      
       - On Book3E 64-bit, we now make the performance monitor interrupt
      act as an NMI like Book3S (the necessary C code for that to work
      appear to already be present in the FSL perf code, notably calling
      nmi_enter instead of irq_enter). (This also fixes a bug where BookE
      perfmon interrupts could clobber r14 ... oops)
      
       - We could make "masked" decrementer interrupts act as NMIs when doing
      timer-based perf sampling to improve the sample quality.
      
      Signed-off-by-yet: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      ---
      
      v2:
      
      - Add hard-enable to decrementer, timer and doorbells
      - Fix CR clobber in masked irq handling on BookE
      - Make embedded perf interrupt act as an NMI
      - Add a PACA_HAPPENED_EE_EDGE for use by FSL if they want
        to retrigger an interrupt without preventing hard-enable
      
      v3:
      
       - Fix or vs. ori bug on Book3E
       - Fix enabling of interrupts for some exceptions on Book3E
      
      v4:
      
       - Fix resend of doorbells on return from interrupt on Book3E
      
      v5:
      
       - Rebased on top of my latest series, which involves some significant
      rework of some aspects of the patch.
      
      v6:
       - 32-bit compile fix
       - more compile fixes with various .config combos
       - factor out the asm code to soft-disable interrupts
       - remove the C wrapper around preempt_schedule_irq
      
      v7:
       - Fix a bug with hard irq state tracking on native power7
      7230c564
    • B
      powerpc: Replace mfmsr instructions with load from PACA kernel_msr field · d9ada91a
      Benjamin Herrenschmidt 提交于
      On 64-bit, the mfmsr instruction can be quite slow, slower
      than loading a field from the cache-hot PACA, which happens
      to already contain the value we want in most cases.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      d9ada91a
    • B
      powerpc: Fix 64-bit BookE FP unavailable exceptions · 9424fabf
      Benjamin Herrenschmidt 提交于
      We were using CR0.EQ after EXCEPTION_COMMON, hoping it still
      contained whether we came from userspace or kernel space.
      
      However, under some circumstances, EXCEPTION_COMMON will
      call C code and clobber non-volatile registers, so we really
      need to re-load the previous MSR from the stackframe and
      re-test.
      
      While there, invert the condition to make the fast path more
      obvious and remove the BUG_OPCODE which was a debugging
      leftover and call .ret_from_except as we should.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      9424fabf
    • B
      powerpc: Fix register clobbering when accumulating stolen time · 990118c8
      Benjamin Herrenschmidt 提交于
      When running under a hypervisor that supports stolen time accounting,
      we may call C code from the macro EXCEPTION_PROLOG_COMMON in the
      exception entry path, which clobbers CR0.
      
      However, the FPU and vector traps rely on CR0 indicating whether we
      are coming from userspace or kernel to decide what to do.
      
      So we need to restore that value after the C call
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      990118c8
    • B
      powerpc/xmon: Add display of soft & hard irq states · 7ac21cd4
      Benjamin Herrenschmidt 提交于
      Also use local_paca instead of get_paca() to avoid getting into
      the smp_processor_id() debugging code from the debugger
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      7ac21cd4
    • B
      powerpc: Add support for page fault retry and fatal signals · 9be72573
      Benjamin Herrenschmidt 提交于
      Other architectures such as x86 and ARM have been growing
      new support for features like retrying page faults after
      dropping the mm semaphore to break contention, or being
      able to return from a stuck page fault when a SIGKILL is
      pending.
      
      This refactors our implementation of do_page_fault() to
      move the error handling out of line in a way similar to
      x86 and adds support for those two features.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      9be72573
    • B
      powerpc: Disable interrupts in 64-bit kernel FP and vector faults · 9f2f79e3
      Benjamin Herrenschmidt 提交于
      If we get a floating point, altivec or vsx unavaible interrupt in
      kernel, we trigger a kernel error. There is no point preserving
      the interrupt state, in fact, that can even make debugging harder
      as the processor state might change (we may even preempt) between
      taking the exception and landing in a debugger.
      
      So just make those 3 disable interrupts unconditionally.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      ---
      
      v2: On BookE only disable when hitting the kernel unavailable
          path, otherwise it will fail to restore softe as
          fast_exception_return doesn't do it.
      9f2f79e3
    • B
      powerpc: Call do_page_fault() with interrupts off · a546498f
      Benjamin Herrenschmidt 提交于
      We currently turn interrupts back to their previous state before
      calling do_page_fault(). This can be annoying when debugging as
      a bad fault will potentially have lost some processor state before
      getting into the debugger.
      
      We also end up calling some generic code with interrupts enabled
      such as notify_page_fault() with interrupts enabled, which could
      be unexpected.
      
      This changes our code to behave more like other architectures,
      and make the assembly entry code call into do_page_faults() with
      interrupts disabled. They are conditionally re-enabled from
      within do_page_fault() in the same spot x86 does it.
      
      While there, add the might_sleep() test in the case of a successful
      trylock of the mmap semaphore, again like x86.
      
      Also fix a bug in the existing assembly where r12 (_MSR) could get
      clobbered by C calls (the DTL accounting in the exception common
      macro and DISABLE_INTS) in some cases.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      ---
      
      v2. Add the r12 clobber fix
      a546498f
    • B
      powerpc: Improve behaviour of irq tracing on 64-bit exception entry · 1b701179
      Benjamin Herrenschmidt 提交于
      Some exceptions would unconditionally disable interrupts on entry,
      which is fine, but calling lockdep every time not only adds more
      overhead than strictly needed, but also means we get quite a few
      "redudant" disable logged, which makes it hard to spot the really
      bad ones.
      
      So instead, split the macro used by the exception code into a
      normal one and a separate one used when CONFIG_TRACE_IRQFLAGS is
      enabled, and make the later skip th tracing if interrupts were
      already disabled.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      1b701179
    • B
      powerpc: Improve 64-bit syscall entry/exit · 1421ae0b
      Benjamin Herrenschmidt 提交于
      We unconditionally hard enable interrupts. This is unnecessary as
      syscalls are expected to always be called with interrupts enabled.
      
      While at it, we add a WARN_ON if that is not the case and
      CONFIG_TRACE_IRQFLAGS is enabled (we don't want to add overhead
      to the fast path when this is not set though).
      
      Thus let's remove the enabling (and associated irq tracing) from
      the syscall entry path. Also on Book3S, replace a few mfmsr
      instructions with loads of PACAMSR from the PACA, which should be
      faster & schedule better.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      1421ae0b
    • B
      powerpc: Rework runlatch code · fe1952fc
      Benjamin Herrenschmidt 提交于
      This moves the inlines into system.h and changes the runlatch
      code to use the thread local flags (non-atomic) rather than
      the TIF flags (atomic) to keep track of the latch state.
      
      The code to turn it back on in an asynchronous interrupt is
      now simplified and partially inlined.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      fe1952fc
    • B
      powerpc: Use the same interrupt prolog for perfmon as other interrupts · 7450f6f0
      Benjamin Herrenschmidt 提交于
      The perfmon interrupt is the sole user of a special variant of the
      interrupt prolog which differs from the one used by external and timer
      interrupts in that it saves the non-volatile GPRs and doesn't turn the
      runlatch on.
      
      The former is unnecessary and the later is arguably incorrect, so
      let's clean that up by using the same prolog. While at it we rename
      that prolog to use the _ASYNC prefix.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      7450f6f0
    • B
      powerpc: Remove legacy iSeries bits from assembly files · 4f8cf36f
      Benjamin Herrenschmidt 提交于
      This removes the various bits of assembly in the kernel entry,
      exception handling and SLB management code that were specific
      to running under the legacy iSeries hypervisor which is no
      longer supported.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      4f8cf36f
    • S
      powerpc: clean up vio.c · b0787660
      Stephen Rothwell 提交于
      This cleans up vio.c after the removal of the legacy iSeries platform.
      It also removes some no longer referenced include files.
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      b0787660
    • S
      driver-core: remove legacy iSeries hack · fcd6f762
      Stephen Rothwell 提交于
      The PowerPC legacy iSeries plateform is being removed along with the
      "one looney iseries driver", so this code can now be removed as well.
      
      cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      fcd6f762
    • S
      tty: powerpc: remove SERIAL_ICOM dependency on PPC_ISERIES · c17a9d4c
      Stephen Rothwell 提交于
      The PowerPC legacy iSeries platform is being removed so this is no
      longer selectable.
      
      Cc: Alan Cox <alan@linux.intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: linux-serial@vger.kernel.org
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      c17a9d4c
    • S
      tty: powerpc: remove hvc_iseries · b6680891
      Stephen Rothwell 提交于
      The PowerPC legacy iSeries platform is being removed, so this code is no
      longer needed.
      
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      b6680891
    • S
      powerpc: remove the legacy iSeries part of ibmvscsi · 7834799a
      Stephen Rothwell 提交于
      The PowerPC legacy iSeries platform is being removed and this code is
      no longer selectable.  There is more clean up that can be done, but this
      just gets the old code out of the way.
      
      Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
      Cc: Brian King <brking@linux.vnet.ibm.com>
      Cc: linux-scsi@vger.kernel.org
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      7834799a
    • S
      net: powerpc: remove the legacy iSeries ethernet driver · e92a6659
      Stephen Rothwell 提交于
      This driver is specific to the PowerPC legcay iSeries platform which is
      being removed.
      
      Cc: David Miller <davem@davemloft.net>
      Cc: <netdev@vger.kernel.org>
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      e92a6659
    • S
  2. 07 3月, 2012 8 次提交
  3. 27 2月, 2012 10 次提交
    • I
      carma-fpga: fix race between data dumping and DMA callback · 6c15d7af
      Ira Snyder 提交于
      When the system is under heavy load, we occasionally saw a problem where
      the system would get a legitimate interrupt when they should be
      disabled.
      
      This was caused by the data_dma_cb() DMA callback unconditionally
      re-enabling FPGA interrupts even when data dumping is disabled. When
      data dumping was re-enabled, the irq handler would fire while a DMA was
      in progress. The "BUG_ON(priv->inflight != NULL);" during the second
      invocation of the DMA callback caused the system to crash.
      
      To fix the issue, the priv->enabled boolean is moved under the
      protection of the priv->lock spinlock. The DMA callback checks the
      boolean to know whether to re-enable FPGA interrupts before it returns.
      
      Now that it is fixed, the driver keeps FPGA interrupts disabled when it
      expects that they are disabled, fixing the bug.
      Signed-off-by: NIra W. Snyder <iws@ovro.caltech.edu>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      6c15d7af
    • I
      carma-fpga: fix lockdep warning · 75ff85a8
      Ira Snyder 提交于
      Lockdep occasionally complains with the message:
      INFO: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
      
      This is caused by calling videobuf_dma_unmap() under spin_lock_irq(). To
      fix the warning, we drop the lock before unmapping and freeing the
      buffer.
      Signed-off-by: NIra W. Snyder <iws@ovro.caltech.edu>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      75ff85a8
    • M
      macintosh: Fix typo in mediabay.c · 6d45584f
      Masanari Iida 提交于
      Fix typo "unsuported" to "unsupported" in
      drivers/machintosh/mediabay.c
      
      Signed-off-by: Masanari Iida<standby24x7@gmail.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      6d45584f
    • D
      arch/powerpc/platforms/powernv/setup.c: included asm/xics.h twice · 0a167e0a
      Danny Kukawka 提交于
      arch/powerpc/platforms/powernv/setup.c: included 'asm/xics.h' twice,
      remove the duplicate.
      Signed-off-by: NDanny Kukawka <danny.kukawka@bisect.de>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      0a167e0a
    • D
      arch/powerpc/kvm/book3s_hv.c: included linux/sched.h twice · ed7e3d1c
      Danny Kukawka 提交于
      arch/powerpc/kvm/book3s_hv.c: included 'linux/sched.h' twice,
      remove the duplicate.
      Signed-off-by: NDanny Kukawka <danny.kukawka@bisect.de>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      ed7e3d1c
    • S
      powerpc: remove CONFIG_PPC_ISERIES from the architecture Kconfig files · 3d066d77
      Stephen Rothwell 提交于
      After this, we can remove the legacy iSeries code more easily.
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      3d066d77
    • B
      powerpc/mpic: Fix allocation of reverse-map for multi-ISU mpics · fe83364f
      Benjamin Herrenschmidt 提交于
      When using a multi-ISU MPIC, we can interrupts up to
      isu_size * MPIC_MAX_ISU, not just isu_size, so allocate
      the right size reverse map.
      
      Without this, the code will constantly fallback to
      a linear search.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      fe83364f
    • B
      f851013c
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 203738e5
      Linus Torvalds 提交于
      1) ICMP sockets leave err uninitialized but we try to return it for the
         unsupported MSG_OOB case, reported by Dave Jones.
      
      2) Add new Zaurus device ID entries, from Dave Jones.
      
      3) Pointer calculation in hso driver memset is wrong, from Dan
         Carpenter.
      
      4) ks8851_probe() checks unsigned value as negative, fix also from Dan
         Carpenter.
      
      5) Fix crashes in atl1c driver due to TX queue handling, from Eric
         Dumazet.  I anticipate some TX side locking fixes coming in the near
         future for this driver as well.
      
      6) The inline directive fix in Bluetooth which was breaking the build
         only with very new versions of GCC, from Johan Hedberg.
      
      7) Fix crashes in the ATP CLIP code due to ARP cleanups this merge
         window, reported by Meelis Roos and fixed by Eric Dumazet.
      
      8) JME driver doesn't flush RX FIFO correctly, from Guo-Fu Tseng.
      
      9) Some ip6_route_output() callers test the return value for NULL, but
         this never happens as the convention is to return a dst entry with
         dst->error set.  Fixes from RonQing Li.
      
      10) Logitech Harmony 900 should be handled by zaurus driver not
         cdc_ether, update white lists and black lists accordingly.  From
         Scott Talbert.
      
      11) Receiving from certain kinds of devices there won't be a MAC header,
         so there is no MAC header to fixup in the IPSEC code, and if we try
         to do it we'll crash.  Fix from Eric Dumazet.
      
      12) Port type array indexing off-by-one in mlx4 driver, fix from Yevgeny
         Petrilin.
      
      13) Fix regression in link-down handling in davinci_emac which causes
         all RX descriptors to be freed up and therefore RX to wedge
         completely, from Christian Riesch.
      
      14) It took two attempts, but ctnetlink soft lockups seem to be
         cured now, from Pablo Neira Ayuso.
      
      15) Endianness bug fix in ENIC driver, from Santosh Nayak.
      
      16) The long ago conversion of the PPP fragmentation code over to
         abstracted SKB list handling wasn't perfect, once we get an
         out of sequence SKB we don't flush the rest of them like we
         should.  From Ben McKeegan.
      
      17) Fix regression of ->ip_summed initialization in sfc driver.
         From Ben Hutchings.
      
      18) Bluetooth timeout mistakenly using msecs instead of jiffies,
         from Andrzej Kaczmarek.
      
      19) Using _sync variant of work cancellation results in deadlocks,
         use the non _sync variants instead.  From Andre Guedes.
      
      20) Bluetooth rfcomm code had reference counting problems leading
         to crashes, fix from Octavian Purdila.
      
      21) The conversion of netem over to classful qdisc handling added
         two bugs to netem_dequeue(), fixes from Eric Dumazet.
      
      22) Missing pci_iounmap() in ATM Solos driver.  Fix from Julia Lawall.
      
      23) b44_pci_exit() should not have __exit tag since it's invoked from
         non-__exit code.  From Nikola Pajkovsky.
      
      24) The conversion of the neighbour hash tables over to RCU added a
         race, fixed here by adding the necessary reread of tbl->nht, fix
         from Michel Machado.
      
      25) When we added VF (virtual function) attributes for network device
         dumps, this potentially bloats up the size of the dump of one
         network device such that the dump size is too large for the buffer
         allocated by properly written netlink applications.
      
         In particular, if you add 255 VFs to a network device, parts of
         GLIBC stop working.
      
         To fix this, we add an attribute that is used to turn on these
         extended portions of the network device dump.  Sophisticaed
         applications like 'ip' that want to see this stuff  will be changed
         to set the attribute, whereas things like GLIBC that don't care
         about VFs simply will not, and therefore won't be busted by the
         mere presence of VFs on a network device.
      
         Thanks to the tireless work of Greg Rose on this fix.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (53 commits)
        sfc: Fix assignment of ip_summed for pre-allocated skbs
        ppp: fix 'ppp_mp_reconstruct bad seq' errors
        enic: Fix endianness bug.
        gre: fix spelling in comments
        netfilter: ctnetlink: fix soft lockup when netlink adds new entries (v2)
        Revert "netfilter: ctnetlink: fix soft lockup when netlink adds new entries"
        davinci_emac: Do not free all rx dma descriptors during init
        mlx4_core: Fixing array indexes when setting port types
        phy: IC+101G and PHY_HAS_INTERRUPT flag
        netdev/phy/icplus: Correct broken phy_init code
        ipsec: be careful of non existing mac headers
        Move Logitech Harmony 900 from cdc_ether to zaurus
        hso: memsetting wrong data in hso_get_count()
        netfilter: ip6_route_output() never returns NULL.
        ethernet/broadcom: ip6_route_output() never returns NULL.
        ipv6: ip6_route_output() never returns NULL.
        jme: Fix FIFO flush issue
        atm: clip: remove clip_tbl
        ipv4: ping: Fix recvmsg MSG_OOB error handling.
        rtnetlink: Fix problem with buffer allocation
        ...
      203738e5
    • L
      Fix autofs compile without CONFIG_COMPAT · 3c761ea0
      Linus Torvalds 提交于
      The autofs compat handling fix caused a compile failure when
      CONFIG_COMPAT isn't defined.
      
      Instead of adding random #ifdef'fery in autofs, let's just make the
      compat helpers earlier to use: without CONFIG_COMPAT, is_compat_task()
      just hardcodes to zero.
      
      We could probably do something similar for a number of other cases where
      we have #ifdef's in code, but this is the low-hanging fruit.
      Reported-and-tested-by: NAndreas Schwab <schwab@linux-m68k.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3c761ea0
  4. 26 2月, 2012 2 次提交