1. 05 Dec, 2011 · 7 commits
    • x86, NMI: Add NMI IPI selftest · 99e8b9ca
      Committed by Don Zickus
      The previous patch modified the stop cpus path to use NMI
      instead of IRQ as the way to tell the other cpus to
      shut down.  There were some concerns that various machines may
      have problems with using an NMI IPI.
      
      This patch creates a selftest to check if NMI is working at
      boot. The idea is to help catch any issues before the machine
      panics and we learn the hard way.
      
      Loosely based on the locking-selftest.c file, this separate file
      runs a couple of simple tests and reports the results.  The
      output looks like:
      
      ...
      Brought up 4 CPUs
      ----------------
      | NMI testsuite:
      --------------------
        remote IPI:  ok  |
         local IPI:  ok  |
      --------------------
      Good, all   2 testcases passed! |
      ---------------------------------
      Total of 4 processors activated (21330.61 BogoMIPS).
      ...
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: seiji.aguchi@hds.com
      Cc: vgoyal@redhat.com
      Cc: mjg@redhat.com
      Cc: tony.luck@intel.com
      Cc: gong.chen@intel.com
      Cc: satoru.moriya@hds.com
      Cc: avi@redhat.com
      Cc: Andi Kleen <andi@firstfloor.org>
      Link: http://lkml.kernel.org/r/1318533267-18880-3-git-send-email-dzickus@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
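The selftest idea in the commit above (send an NMI IPI, then check that every target answered) can be sketched in user space. This is only an illustration under stated assumptions: `pending_mask`, `fake_nmi_handler()`, `run_ipi_test()` and `report()` are hypothetical names, no real NMI is sent, and the `broken` argument simply models CPUs whose handler never runs.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* One bit per CPU that still owes us a reply (hypothetical bookkeeping). */
static uint32_t pending_mask;

/* Stand-in for the NMI handler: the CPU acknowledges by clearing its bit. */
static void fake_nmi_handler(int cpu)
{
	pending_mask &= ~(UINT32_C(1) << cpu);
}

/*
 * "Send" an NMI to every CPU in targets. CPUs listed in broken never run
 * the handler, which models a machine where NMI IPIs do not arrive.
 * Returns true when every target acknowledged.
 */
static bool run_ipi_test(uint32_t targets, uint32_t broken)
{
	pending_mask = targets;
	for (int cpu = 0; cpu < 32; cpu++) {
		uint32_t bit = UINT32_C(1) << cpu;

		if ((targets & bit) && !(broken & bit))
			fake_nmi_handler(cpu);
	}
	return pending_mask == 0;
}

/* Print one result line, loosely in the style of the boot output. */
static void report(const char *name, bool ok)
{
	printf("%12s:  %s\n", name, ok ? "ok" : "FAILED");
}
```

A caller could run `report("remote IPI", run_ipi_test(0xe, 0))` for CPUs 1 to 3; a real test would also need a timeout while waiting for the mask to clear.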
    • x86, reboot: Use NMI instead of REBOOT_VECTOR to stop cpus · 3603a251
      Committed by Don Zickus
      A recent discussion started talking about the locking on the
      pstore fs and how it relates to the kmsg infrastructure.  We
      noticed it was possible for userspace to r/w to the pstore fs
      (grabbing the locks in the process) and block the panic path
      from r/w to the same fs.
      
      The reason was the cpu with the lock could be doing work while
      the crashing cpu is panic'ing.  Busting those spinlocks might
      cause those cpus to step on each other's data.  Fine, fair
      enough.
      
      It was suggested it would be nice to serialize the panic path
      (ie stop the other cpus) and have only one cpu running.  This
      would allow us to bust the spinlocks and not worry about another
      cpu stepping on the data.
      
      Of course, smp_send_stop() does this in the panic case.
      kmsg_dump() would have to be moved to be called after it.  Easy
      enough.
      
      The only problem is that on x86 the smp_send_stop() function uses
      the REBOOT_VECTOR.  Any cpu with irqs disabled (which pstore and
      its backend ERST would do) blocks this IPI and thus does not stop.
      This makes it difficult to reliably log data to the pstore fs.
      
      The patch below switches from the REBOOT_VECTOR to NMI (and
      mimics what kdump does).  Switching to NMI allows us to deliver
      the IPI when irqs are disabled, increasing the reliability of
      this function.
      
      However, Andi carefully noted that on some machines this
      approach does not work because of broken BIOSes or whatever.
      
      To help accommodate this, the next couple of patches will run a
      selftest and provide a knob to disable it.
      
      V2:
        uses atomic ops to serialize the cpu that shuts everyone down
      V3:
        comment cleanup
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: seiji.aguchi@hds.com
      Cc: vgoyal@redhat.com
      Cc: mjg@redhat.com
      Cc: tony.luck@intel.com
      Cc: gong.chen@intel.com
      Cc: satoru.moriya@hds.com
      Cc: avi@redhat.com
      Cc: Andi Kleen <andi@firstfloor.org>
      Link: http://lkml.kernel.org/r/1318533267-18880-2-git-send-email-dzickus@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
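The V2 note above mentions atomic ops to serialize the cpu that shuts everyone down. A minimal user-space sketch of that idea, using C11 atomics rather than kernel primitives, might look as follows. The names `stopping_cpu`, `claim_stop_role()` and `should_halt()` are assumptions made for this illustration and are not taken from the patch text.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* -1 means no CPU has claimed the shutdown role yet. */
static atomic_int stopping_cpu = -1;

/*
 * Returns true only for the first caller. Every later caller loses the
 * compare-and-swap and should wait to be stopped instead of sending
 * its own stop request.
 */
static bool claim_stop_role(int this_cpu)
{
	int expected = -1;

	return atomic_compare_exchange_strong(&stopping_cpu, &expected,
					      this_cpu);
}

/* A handler running on some CPU halts unless that CPU is the winner. */
static bool should_halt(int this_cpu)
{
	return atomic_load(&stopping_cpu) != this_cpu;
}
```

The compare-and-swap makes the election race-free: if two CPUs panic at the same time, exactly one of them sees the initial value and proceeds to stop the others.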
    • x86: Clean up the range of stack overflow checking · 467e6b7a
      Committed by Mitsuo Hayasaka
      The kernel stack overflow check tests whether the stack
      pointer points into the available kernel stack range, a test
      derived from the original overflow checking.
      
      The curbase address is always less than the low
      boundary of the available kernel stack. So, this patch removes the
      first condition that checks if the pointer is higher than
      curbase.
      Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
      Cc: yrl.pp-manager.tt@hitachi.com
      Cc: Randy Dunlap <rdunlap@xenotime.net>
      Link: http://lkml.kernel.org/r/20111129060845.11076.40916.stgit@ltc219.sdl.hitachi.co.jp
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
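Why the dropped condition is redundant can be shown with a small sketch. The sizes `STACK_SIZE` and `RESERVED_LOW` are hypothetical values picked for illustration, and the function names are not from the kernel source; the point is only that a lower bound of `curbase + RESERVED_LOW` already implies `sp >= curbase`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sizes, chosen only for illustration. */
#define STACK_SIZE   UINT64_C(8192) /* whole stack area starting at curbase */
#define RESERVED_LOW UINT64_C(256)  /* bookkeeping data plus margin at the bottom */

/* Range check that still carries the extra "sp >= curbase" condition. */
static bool in_range_verbose(uint64_t sp, uint64_t curbase)
{
	return sp >= curbase &&
	       sp >= curbase + RESERVED_LOW &&
	       sp <= curbase + STACK_SIZE;
}

/* Same check without the redundant first condition. */
static bool in_range(uint64_t sp, uint64_t curbase)
{
	return sp >= curbase + RESERVED_LOW &&
	       sp <= curbase + STACK_SIZE;
}
```

Both functions return the same result for every `sp`, because any value that passes the second comparison also passes the first.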
    • x86: Panic on detection of stack overflow · 55af7796
      Committed by Mitsuo Hayasaka
      Currently, a stack overflow only produces a message, which is
      not sufficient for systems that need high reliability. An
      overflow may corrupt data, and reading the corrupted data can
      cause further corruption unless the system stops.
      
      This patch adds the sysctl parameter
      kernel.panic_on_stackoverflow. When it is set, the kernel panics
      on detecting an overflow of the kernel, IRQ or exception stacks
      (the user stack is not checked). It is disabled by default.
      Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
      Cc: yrl.pp-manager.tt@hitachi.com
      Cc: Randy Dunlap <rdunlap@xenotime.net>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Link: http://lkml.kernel.org/r/20111129060836.11076.12323.stgit@ltc219.sdl.hitachi.co.jp
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: Check stack overflow in detail · 37fe6a42
      Committed by Mitsuo Hayasaka
      Currently, only the kernel stack is checked for overflow, which
      is not sufficient for systems that need high reliability. To
      improve this, the IRQ and exception stacks need to be checked
      as well.
      
      This patch checks all the stack types except the user stack and
      prints detailed stack information when free stack space drops
      below a certain limit.
      Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
      Cc: yrl.pp-manager.tt@hitachi.com
      Cc: Randy Dunlap <rdunlap@xenotime.net>
      Link: http://lkml.kernel.org/r/20111129060829.11076.51733.stgit@ltc219.sdl.hitachi.co.jp
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
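Checking several stack types, as the commit above describes, amounts to testing the stack pointer against more than one valid range. A rough sketch, assuming a hypothetical `struct stack_range` and made-up addresses in `example_ranges` (none of which come from the kernel source):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One usable stack range; both bounds are inclusive. */
struct stack_range {
	const char *name;
	uint64_t low;
	uint64_t high;
};

/* Made-up addresses, for illustration only. */
static const struct stack_range example_ranges[] = {
	{ "kernel",    0x1000,  0x2fff  },
	{ "irq",       0x8000,  0xbfff  },
	{ "exception", 0x10000, 0x10fff },
};

/*
 * Returns the name of the range that holds sp, or NULL when sp lies
 * outside every known range, which the caller treats as an overflow.
 */
static const char *find_stack(uint64_t sp, const struct stack_range *r,
			      size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (sp >= r[i].low && sp <= r[i].high)
			return r[i].name;
	}
	return NULL;
}
```

With only the first entry in the table, a pointer that is legitimately on the IRQ or exception stack could not be told apart from an overflow, which is the gap the commit closes.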
    • x86: Add user_mode_vm check in stack_overflow_check · 69682b62
      Committed by Mitsuo Hayasaka
      The kernel stack overflow is checked in stack_overflow_check(),
      which may wrongly detect an overflow if the stack pointer in
      user space points to the kernel stack, intentionally or
      accidentally. After such a misdetection an actual overflow is
      never reported, because the check uses WARN_ONCE().
      
      This patch adds a user_mode_vm() check before it to avoid this
      problem, bailing out early if the user stack is in use.
      Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
      Cc: yrl.pp-manager.tt@hitachi.com
      Cc: Randy Dunlap <rdunlap@xenotime.net>
      Link: http://lkml.kernel.org/r/20111129060821.11076.55315.stgit@ltc219.sdl.hitachi.co.jp
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
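The interaction between a one-shot warning and a false positive can be sketched as follows. Everything here is a simplified model: `STACK_BASE`, `GUARD`, `warned`, `reset_warning()` and `check_overflow()` are hypothetical names, and the `warned` flag merely stands in for the one-shot behaviour of WARN_ONCE().

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical layout: the lowest GUARD bytes of the kernel stack are the
 * zone in which a stack pointer counts as "about to overflow".
 */
#define STACK_BASE UINT64_C(0x1000)
#define GUARD      UINT64_C(0x100)

static bool warned; /* models the one-shot state behind WARN_ONCE() */

static void reset_warning(void)
{
	warned = false;
}

/*
 * Returns true when a warning is actually emitted. With skip_user set,
 * a stack pointer captured in user mode is ignored before any range test.
 */
static bool check_overflow(uint64_t sp, bool user_mode, bool skip_user)
{
	if (skip_user && user_mode)
		return false;
	if (sp < STACK_BASE || sp >= STACK_BASE + GUARD)
		return false; /* not in the low guard zone */
	if (warned)
		return false; /* the single warning is already spent */
	warned = true;
	return true;
}
```

Without the early return, a user-mode register value that happens to fall in the guard zone spends the only warning, and a later genuine overflow goes unreported. With it, the warning stays available for the kernel-mode case.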
    • x86: Fix boot failures on older AMD CPU's · 8e8da023
      Committed by Linus Torvalds
      People with old AMD chips are getting hung boots, because commit
      bcb80e53 ("x86, microcode, AMD: Add microcode revision to
      /proc/cpuinfo") moved the microcode detection too early into
      "early_init_amd()".
      
      At that point we are *so* early in the boot that the exception tables
      haven't even been set up yet, so the whole
      
      	rdmsr_safe(MSR_AMD64_PATCH_LEVEL, &c->microcode, &dummy);
      
      doesn't actually work: if the rdmsr does a GP fault (due to non-existent
      MSR register on older CPU's), we can't fix it up yet, and the boot fails.
      
      Fix it by simply moving the code to a slightly later point in the boot
      (init_amd() instead of early_init_amd()), since the kernel itself
      doesn't even really care about the microcode patchlevel at this point
      (or really ever: it's made available to user space in /proc/cpuinfo, and
      updated if you do a microcode load).
      Reported-tested-and-bisected-by: Larry Finger <Larry.Finger@lwfinger.net>
      Tested-by: Bob Tracy <rct@gherkin.frus.com>
      Acked-by: Borislav Petkov <borislav.petkov@amd.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 04 Dec, 2011 · 1 commit
    • xen/pm_idle: Make pm_idle be default_idle under Xen. · e5fd47bf
      Committed by Konrad Rzeszutek Wilk
      The idea behind commit d91ee586 ("cpuidle: replace xen access to x86
      pm_idle and default_idle") was to have one call - disable_cpuidle()
      which would make pm_idle not be molested by other code.  It disallows
      cpuidle_idle_call to be set to pm_idle (which is excellent).
      
      But in the select_idle_routine() and idle_setup(), the pm_idle can still
      be set to either: amd_e400_idle, mwait_idle or default_idle.  This
      depends on some CPU flags (MWAIT) and in AMD case on the type of CPU.
      
      In case of mwait_idle we can hit some instances where the hypervisor
      (Amazon EC2 specifically) sets the MWAIT and we get:
      
        Brought up 2 CPUs
        invalid opcode: 0000 [#1] SMP
      
        Pid: 0, comm: swapper Not tainted 3.1.0-0.rc6.git0.3.fc16.x86_64 #1
        RIP: e030:[<ffffffff81015d1d>]  [<ffffffff81015d1d>] mwait_idle+0x6f/0xb4
        ...
        Call Trace:
         [<ffffffff8100e2ed>] cpu_idle+0xae/0xe8
         [<ffffffff8149ee78>] cpu_bringup_and_idle+0xe/0x10
        RIP  [<ffffffff81015d1d>] mwait_idle+0x6f/0xb4
         RSP <ffff8801d28ddf10>
      
      In the case of amd_e400_idle we don't get such spectacular crashes, but we
      do end up making an MSR access which is trapped in the hypervisor, and then
      follow it up with a yield hypercall.  This means we end up going to the
      hypervisor twice instead of just once.
      
      The previous behavior before v3.0 was that pm_idle was set to
      default_idle regardless of select_idle_routine/idle_setup.
      
      We want to do that, but only for one specific case: Xen.  This patch
      does that.
      
      Fixes RH BZ #739499 and Ubuntu #881076
      Reported-by: Stefan Bader <stefan.bader@canonical.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
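The selection logic described in the commit above can be modelled with function pointers. This is a user-space sketch, not kernel code: the three idle functions are empty stand-ins that only borrow the names used in the message, `struct cpu_info`, `select_by_cpu()` and `select_idle()` are hypothetical, and the order in which the CPU flags are tested is an assumption.

```c
#include <stdbool.h>

typedef void (*idle_fn)(void);

/* Empty stand-ins named after the routines mentioned in the message. */
static void default_idle(void) {}
static void mwait_idle(void) {}
static void amd_e400_idle(void) {}

/* Hypothetical CPU description used only by this sketch. */
struct cpu_info {
	bool has_mwait;
	bool is_amd_e400;
};

/* Pick a routine from CPU features, as would happen on bare metal. */
static idle_fn select_by_cpu(const struct cpu_info *c)
{
	if (c->has_mwait)
		return mwait_idle;
	if (c->is_amd_e400)
		return amd_e400_idle;
	return default_idle;
}

/* Under Xen, ignore the CPU flags and always use default_idle. */
static idle_fn select_idle(const struct cpu_info *c, bool running_on_xen)
{
	return running_on_xen ? default_idle : select_by_cpu(c);
}
```

The override matters because the guest may see an MWAIT flag exposed by the hypervisor without being allowed to execute the instruction, which is the invalid-opcode crash shown in the trace.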
  3. 03 Dec, 2011 · 6 commits
    • Merge branch 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · af968e29
      Committed by Linus Torvalds
      * 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (21 commits)
        usb: ftdi_sio: add PID for Propox ISPcable III
        Revert "xHCI: reset-on-resume quirk for NEC uPD720200"
        xHCI: fix bug in xhci_clear_command_ring()
        usb: gadget: fsl_udc: fix dequeuing a request in progress
        usb: fsl_mxc_udc.c: Remove compile-time dependency of MX35 SoC type
        usb: fsl_mxc_udc.c: Fix build issue by including missing header file
        USB: fsl_udc_core: use usb_endpoint_xfer_isoc to judge ISO XFER
        usb: udc: Fix gadget driver's speed check in various UDC drivers
        usb: gadget: fix g_serial regression
        usb: renesas_usbhs: fixup driver speed
        usb: renesas_usbhs: fixup gadget.dev.driver when udc_stop.
        usb: renesas_usbhs: fixup signal the driver that cable was disconnected
        usb: renesas_usbhs: fixup device_register timing
        usb: musb: PM: fix context save/restore in suspend/resume path
        USB: linux-cdc-acm.inf: add support for the acm_ms gadget
        EHCI : Fix a regression in the ISO scheduler
        xHCI: reset-on-resume quirk for NEC uPD720200
        USB: whci-hcd: fix endian conversion in qset_clear()
        USB: usb-storage: unusual_devs entry for Kingston DT 101 G2
        usb: option: add SIMCom SIM5218
        ...
    • Merge branch 'staging-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · f9143eae
      Committed by Linus Torvalds
      * 'staging-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
        Staging: comedi: fix integer overflow in do_insnlist_ioctl()
        Revert "Staging: comedi: integer overflow in do_insnlist_ioctl()"
        Staging: comedi: integer overflow in do_insnlist_ioctl()
        Staging: comedi: fix signal handling in read and write
        Staging: comedi: fix mmap_count
        staging: comedi: fix oops for USB DAQ devices.
        staging: comedi: usbduxsigma: Fixed wrong range for the analogue channel.
        staging:rts_pstor:Complete scanning_done variable
        staging: usbip: bugfix for deadlock
    • Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs · ffb8fb54
      Committed by Linus Torvalds
      * 'for-linus' of git://oss.sgi.com/xfs/xfs:
        xfs: fix attr2 vs large data fork assert
        xfs: force buffer writeback before blocking on the ilock in inode reclaim
        xfs: validate acl count
    • 7ed89aed
    • Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · c2b5adb4
      Committed by Linus Torvalds
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
        vmwgfx: integer overflow in vmw_kms_update_layout_ioctl()
        drm/radeon/kms: fix 2D tiling CS support on EG/CM
        drm/radeon/kms: fix scanout of 2D tiled buffers on EG/CM
        drm: Fix lack of CRTC disable for drm_crtc_helper_set_config(.fb=NULL)
        drm/radeon/kms: add some new pci ids
        drm/radeon/kms: Skip ACPI call to ATIF when possible
        drm/radeon/kms: Hide debugging message
        drm/radeon/kms: add some loop timeouts in pageflip code
        drm/nv50/disp: silence compiler warning
        drm/nouveau: fix oopses caused by clear being called on unpopulated ttms
        drm/nouveau: Keep RAMIN heap within the channel.
        drm/nvd0/disp: fix sor dpms typo, preventing dpms on in some situations
        drm/nvc0/gr: fix TP init for transform feedback offset queries
        drm/nouveau: add dumb ioctl support
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 0efebaa7
      Committed by Linus Torvalds
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: hda - Fix S3/S4 problem on machines with VREF-pin mute-LED
        ALSA: hda_intel - revert a quirk that affect VIA chipsets
        ALSA: hda - Avoid touching mute-VREF pin for IDT codecs
        firmware: Sigma: Fix endianess issues
        firmware: Sigma: Skip header during CRC generation
        firmware: Sigma: Prevent out of bounds memory access
        ALSA: usb-audio - Support for Roland GAIA SH-01 Synthesizer
        ASoC: Supply dcs_codes for newer WM1811 revisions
        ASoC: Error out if we can't generate a LRCLK at all for WM8994
        ASoC: Correct name of Speyside Main Speaker widget
        ASoC: skip resume of soc-audio devices without codecs
        ASoC: cs42l51: Fix off-by-one for reg_cache_size
        ASoC: drop support for PlayPaq with WM8510
        ASoC: mpc8610: tell the CS4270 codec that it's the master
        ASoC: cs4720: use snd_soc_cache_sync()
        ASoC: SAMSUNG: Fix build error
        ASoC: max9877: Update register if either val or val2 is changed
        ASoC: Fix wrong define for AD1836_ADC_WORD_OFFSET
  4. 02 Dec, 2011 · 21 commits
  5. 01 Dec, 2011 · 5 commits