1. 31 Jul, 2010 22 commits
    • B
      x86/PCI: use host bridge _CRS info on ASRock ALiveSATA2-GLAN · 2491762c
      Bjorn Helgaas committed
      This DMI quirk turns on "pci=use_crs" for the ALiveSATA2-GLAN because
      amd_bus.c doesn't handle this system correctly.
      
      The system has a single HyperTransport I/O chain, but has two PCI host
      bridges to buses 00 and 80.  amd_bus.c learns the MMIO range associated
      with buses 00-ff and that this range is routed to the HT chain hosted at
      node 0, link 0:
      
          bus: [00, ff] on node 0 link 0
          bus: 00 index 1 [mem 0x80000000-0xfcffffffff]
      
      This includes the address space for both bus 00 and bus 80, and amd_bus.c
      assumes it's all routed to bus 00.
      
      We find device 80:01.0, which BIOS left in the middle of that space, but
      we don't find a bridge from bus 00 to bus 80, so we conclude that 80:01.0
      is unreachable from bus 00, and we move it from the original, working,
      address to something outside the bus 00 aperture, which does not work:
      
          pci 0000:80:01.0: reg 10: [mem 0xfebfc000-0xfebfffff 64bit]
          pci 0000:80:01.0: BAR 0: assigned [mem 0xfd00000000-0xfd00003fff 64bit]
      
      The BIOS told us everything we need to know to handle this correctly,
      so we're better off if we just pay attention, which lets us leave the
      80:01.0 device at the original, working, address:
      
          ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-7f])
          pci_root PNP0A03:00: host bridge window [mem 0x80000000-0xff37ffff]
          ACPI: PCI Root Bridge [PCI1] (domain 0000 [bus 80-ff])
          pci_root PNP0A08:00: host bridge window [mem 0xfebfc000-0xfebfffff]
      
      This was a regression between 2.6.33 and 2.6.34.  In 2.6.33, amd_bus.c
      was used only when we found multiple HT chains.  3e3da00c, which
      enabled amd_bus.c even on systems with a single HT chain, caused this
      failure.
      
      This quirk was written by Graham.  If we ever enable "pci=use_crs" for
      machines from 2006 or earlier, this quirk should be removed.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=16007
      
      Cc: stable@kernel.org
      Reported-by: Graham Ramsey <ramsey.graham@ntlworld.com>
      Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      2491762c
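The aperture reasoning in this commit message comes down to range containment. A minimal sketch, assuming a hypothetical `struct window` and `window_contains()` helper (neither is a kernel API); the addresses are the ones quoted in the log lines above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical inclusive address range; not a kernel structure. */
struct window {
    uint64_t start;
    uint64_t end;
};

/* True if the inclusive range [start, end] lies entirely inside w. */
static bool window_contains(const struct window *w, uint64_t start, uint64_t end)
{
    assert(start <= end);
    return start >= w->start && end <= w->end;
}

/* Range that amd_bus.c attributed to bus 00, per the commit message. */
static const struct window amd_bus_bus00 = { 0x80000000ULL, 0xfcffffffffULL };

/* _CRS window reported by the BIOS for the PCI1 root bridge (bus 80-ff). */
static const struct window crs_pci1 = { 0xfebfc000ULL, 0xfebfffffULL };
```

With these values, the BIOS-assigned BAR `0xfebfc000-0xfebfffff` sits inside the PCI1 `_CRS` window, while the reassigned BAR `0xfd00000000-0xfd00003fff` falls outside the range amd_bus.c attributed to bus 00.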
    • F
      PCI: remove unused HAVE_ARCH_PCI_SET_DMA_MAX_SEGMENT_{SIZE|BOUNDARY} · bfb51cd0
      FUJITA Tomonori committed
      In 2.6.34, we converted the PCI DMA API to the generic device
      model. The PCI DMA API is now just a wrapper around the DMA API.
      
      So we don't need HAVE_ARCH_PCI_SET_DMA_MAX_SEGMENT_SIZE or
      HAVE_ARCH_PCI_SET_DMA_SEGMENT_BOUNDARY (which let architectures
      provide their own implementations). Neither has been used anyway.
      Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      bfb51cd0
    • J
      PCI: disable mmio during bar sizing · 253d2e54
      Jacob Pan committed
      It is a known issue that MMIO decoding should be disabled while sizing
      PCI BARs. On certain platforms, host bridges and some other devices
      (PCI PIC) must be excluded. This patch is mainly based on Matthew
      Wilcox's patch at
      http://kerneltrap.org/mailarchive/linux-kernel/2007/9/13/258969.
      
      A new flag bit "mmio_always_on" is added to pci_dev. Devices whose MMIO
      decoding cannot be disabled during BAR sizing should have this bit set,
      preferably in their quirks.
      
      Without this patch, the Intel Moorestown platform graphics unit will be
      corrupted during BAR sizing.
      Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      253d2e54
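For context on why decoding matters here: BAR sizing writes all ones to the BAR and reads it back, so for a moment the register holds a bogus address that the device would otherwise decode. The sketch below covers only the arithmetic step for a 32-bit memory BAR; `mem_bar_size()` is a hypothetical helper, not the kernel's implementation.

```c
#include <assert.h>
#include <stdint.h>

/* The low four bits of a memory BAR are type/prefetch flags, not address bits. */
#define MEM_BAR_FLAG_BITS 0xfu

/*
 * Given the value read back from a 32-bit memory BAR after writing
 * 0xffffffff to it, return the size of the decoded region in bytes.
 */
static uint32_t mem_bar_size(uint32_t readback)
{
    uint32_t mask = readback & ~MEM_BAR_FLAG_BITS;

    assert(mask != 0); /* an all-zero readback means the BAR is not implemented */
    return ~mask + 1;
}
```

For example, a readback of `0xffffc004` corresponds to a 16 KiB region, the same size as the `0xfebfc000-0xfebfffff` BAR quoted in the first commit on this page.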
    • B
      PCI: MSI: Remove unsafe and unnecessary hardware access · fcd097f3
      Ben Hutchings committed
      During suspend on an SMP system, {read,write}_msi_msg_desc() may be
      called to mask and unmask interrupts on a device that is already in a
      reduced power state.  At this point memory-mapped registers including
      MSI-X tables are not accessible, and config space may not be fully
      functional either.
      
      While a device is in a reduced power state its interrupts are
      effectively masked and its MSI(-X) state will be restored when it is
      brought back to D0.  Therefore these functions can simply read and
      write msi_desc::msg for devices not in D0.
      
      Further, read_msi_msg_desc() should only ever be used to update a
      previously written message, so it can always read msi_desc::msg
      and never needs to touch the hardware.
      Tested-by: "Michael Chan" <mchan@broadcom.com>
      Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      fcd097f3
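The behaviour described in this commit message can be modelled in user space. This is a toy sketch under stated assumptions: `struct desc_model`, `model_write_msg()` and `model_read_msg()` are hypothetical names, with `cached` standing in for msi_desc::msg and `hw` for the device registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct msg {
    uint32_t address_lo;
    uint32_t address_hi;
    uint32_t data;
};

/* Toy model; not the kernel's struct msi_desc. */
struct desc_model {
    struct msg cached;   /* stands in for msi_desc::msg */
    struct msg hw;       /* stands in for the device's MSI registers */
    bool in_d0;          /* true when the device is fully powered */
    unsigned hw_writes;  /* counts simulated hardware accesses */
};

/* Always update the cached copy; touch "hardware" only when in D0. */
static void model_write_msg(struct desc_model *d, const struct msg *m)
{
    assert(d && m);
    if (d->in_d0) {
        d->hw = *m;
        d->hw_writes++;
    }
    d->cached = *m;
}

/* Reads never touch "hardware"; they return the last written message. */
static struct msg model_read_msg(const struct desc_model *d)
{
    assert(d);
    return d->cached;
}
```

A write made while the device is not in D0 is kept only in the cached copy, which is what the real code relies on being restored when the device returns to D0.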
    • M
      PCI: Default PCIe ASPM control to on and require !EMBEDDED to disable · ea5f9fc5
      Matthew Garrett committed
      The CONFIG_PCIEASPM option is confusing and potentially dangerous. ASPM is
      a hardware mediated feature rather than one under direct OS control, and
      even if the config option is disabled the system firmware may have turned
      on ASPM on various bits of hardware. This can cause problems later -
      various hardware that claims to support ASPM does a poor job of it and may
      hang or cause other difficulties. The kernel is able to recognise this in
      many cases and disable the ASPM functionality, but only if CONFIG_PCIEASPM
      is enabled.
      
      Given that in its default configuration this option will either leave the
      hardware as it was originally or disable hardware functionality that may
      cause problems, it should default to y. The only reason to disable it
      ought to be to reduce code size, so make it dependent on CONFIG_EMBEDDED.
      Signed-off-by: Matthew Garrett <mjg@redhat.com>
      Cc: lrodriguez@atheros.com
      Cc: maximlevitsky@gmail.com
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      ea5f9fc5
    • K
      PCI: kernel oops on access to pci proc file while hot-removal · 8cc2bfd8
      Kenji Kaneshige committed
      I encountered a problem where /proc/bus/pci/XX/YY is not removed even
      after the corresponding device is hot-removed, if the file is still
      open. In addition, accessing this file in this situation causes a
      kernel panic (see below).
      
      Because pci_proc_detach_device() doesn't call remove_proc_entry()
      if struct proc_dir_entry->count > 1, access to /proc/bus/pci/XX/YY
      would refer to a struct pci_dev that was already freed.
      
      Though I don't know why the check for proc_dir_entry->count was added,
      I don't think it is needed. Removing this check fixes the problem.
      
      Steps to reproduce
      ------------------
      # cd /sys/bus/pci/slots/2/
      # PROC_BUS_PCI_FILE=/proc/bus/pci/`awk -F: '{print $2"/"$3}' < address`.0
      # sleep 10000 < $PROC_BUS_PCI_FILE &
      # echo 0 > power
      # while true; do cat $PROC_BUS_PCI_FILE > /dev/null; done
      
      Oops Messages
      -------------
      BUG: unable to handle kernel NULL pointer dereference at 00000042
      IP: [<c05c82d5>] pci_user_read_config_dword+0x65/0xa0
      *pdpt = 000000002185e001 *pde = 0000000476a79067
      Oops: 0000 [#1] SMP
      last sysfs file: /sys/devices/pci0000:00/0000:00:1c.0/0000:10:00.0/local_cpus
      Modules linked in: autofs4 sunrpc cpufreq_ondemand acpi_cpufreq ipv6 dm_mirror dm_region_hash dm_log dm_mod e1000e i2c_i801 i2c_core iTCO_wdt igb sg pcspkr dca iTCO_vendor_support ext4 mbcache jbd2 sd_mod crc_t10dif lpfc mptsas scsi_transport_fc mptscsih mptbase scsi_tgt scsi_transport_sas [last unloaded: microcode]
      
      Pid: 2997, comm: cat Not tainted 2.6.34-kk #32 SB/PRIMEQUEST 1800E
      EIP: 0060:[<c05c82d5>] EFLAGS: 00010046 CPU: 19
      EIP is at pci_user_read_config_dword+0x65/0xa0
      EAX: 00000002 EBX: e44f1800 ECX: e144df14 EDX: 155668c7
      ESI: 00000087 EDI: 00000000 EBP: e144df40 ESP: e144df0c
       DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
      Process cat (pid: 2997, ti=e144c000 task=e26f2570 task.ti=e144c000)
      Stack:
       c09ceac0 c0570f72 ffffffff 08c57000 00000000 00001000 e44f1800 c05d2404
      <0> e144df40 00001000 00000000 00001000 08c57000 3093ae50 e420cb40 e358d5c0
      <0> c05d2300 fffffffb c054984f e144df9c 00008000 08c57000 e358d5c0 00008000
      Call Trace:
       [<c0570f72>] ? security_capable+0x22/0x30
       [<c05d2404>] ? proc_bus_pci_read+0x104/0x220
       [<c05d2300>] ? proc_bus_pci_read+0x0/0x220
       [<c054984f>] ? proc_reg_read+0x5f/0x90
       [<c05497f0>] ? proc_reg_read+0x0/0x90
       [<c050694d>] ? vfs_read+0x9d/0x190
       [<c04958f4>] ? audit_syscall_entry+0x204/0x230
       [<c0506a81>] ? sys_read+0x41/0x70
       [<c0402f1f>] ? sysenter_do_call+0x12/0x28
      Code: b4 26 00 00 00 00 b8 20 88 b1 c0 c7 44 24 08 ff ff ff ff e8 3e 52 22 00 f6 83 24 04 00 00 20 75 34 8b 43 08 8d 4c 24 08 8b 53 1c <8b> 70 40 89 4c 24 04 89 f9 c7 04 24 04 00 00 00 ff 16 89 c6 f0
      EIP: [<c05c82d5>] pci_user_read_config_dword+0x65/0xa0 SS:ESP 0068:e144df0c
      CR2: 0000000000000042
      Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      8cc2bfd8
    • K
      PCI: pci-sysfs: remove casts from void* · a3f5835a
      Kulikov Vasiliy committed
      Remove unnecessary casts from void*.
      Signed-off-by: Kulikov Vasiliy <segooon@gmail.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      a3f5835a
    • M
      ACPI: Disable ASPM if the platform won't provide _OSC control for PCIe · 852972ac
      Matthew Garrett committed
      The PCI SIG documentation for the _OSC OS/firmware handshaking interface
      states:
      
      "If the _OSC control method is absent from the scope of a host bridge
      device, then the operating system must not enable or attempt to use any
      features defined in this section for the hierarchy originated by the host
      bridge."
      
      The obvious interpretation of this is that the OS should not attempt to use
      PCIe hotplug, PME or AER - however, the specification also notes that an
      _OSC method is *required* for PCIe hierarchies, and experimental validation
      with An Alternative OS indicates that it doesn't use any PCIe functionality
      if the _OSC method is missing. That arguably means we shouldn't be using
      MSI or extended config space, but right now our problems seem to be limited
      to vendors being surprised when ASPM gets enabled on machines when other
      OSs refuse to do so. So, for now, let's just disable ASPM if the _OSC
      method doesn't exist or refuses to hand over PCIe capability control.
      Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
      Signed-off-by: Matthew Garrett <mjg@redhat.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      852972ac
    • Y
      PCI hotplug: make sure child bridges are enabled at hotplug time · 3f579c34
      Yinghai Lu committed
      Found one PCIe Module with several bridges built-in where a "cold"
      hotadd doesn't work.
      
      If we end up reassigning bridge windows at hotadd time, and have to loop
      through assigning new ranges, we won't end up enabling the child bridges
      because the first assignment pass already tried to enable them, which
      prevents __pci_bridge_assign_resource from updating the windows.
      
      So try to move enabling of child bridges to the end, and only do it
      once.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      3f579c34
    • P
      PCI hotplug: shpchp: Removed check for hotplug of display devices · 0ba10bc7
      Praveen Kalamegham committed
      Removed check to prevent hotplug of display devices within shpchp.
      Originally this was thought to have been required within the PCI
      Hotplug specification for some legacy devices.  However there is
      no such requirement in the most recent revision. The check prevents
      hotplug of not only display devices but also computational GPUs
      which require serviceability.
      Signed-off-by: Praveen Kalamegham <praveen@nextio.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      0ba10bc7
    • P
      PCI hotplug: pciehp: Fixed return value sign for pciehp_unconfigure_device · 01b666df
      Praveen Kalamegham committed
      pciehp_unconfigure_device() should return -EINVAL, not EINVAL.
      Signed-off-by: Praveen Kalamegham <praveen@nextio.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      01b666df
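Why the sign matters: kernel-style callers usually treat only negative return values as errors, so a positive EINVAL can pass for success. A minimal sketch with hypothetical stand-in functions (the `refuse` parameter and all three names are illustrative, not taken from pciehp):

```c
#include <assert.h>
#include <errno.h>

/* errno constants are positive, so returning one unnegated is not "< 0". */
static_assert(EINVAL > 0, "errno values are positive");

/* Hypothetical stand-ins for the function before and after the fix. */
static int unconfigure_before(int refuse)
{
    return refuse ? EINVAL : 0;
}

static int unconfigure_after(int refuse)
{
    return refuse ? -EINVAL : 0;
}

/* A typical caller check: only negative values count as failure. */
static int caller_detects_failure(int rc)
{
    return rc < 0;
}
```

With the positive value, `caller_detects_failure()` reports no failure even though the operation was refused.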
    • M
      PCI: Don't enable aspm before drivers have had a chance to veto it · 41cd766b
      Matthew Garrett committed
      The aspm code will currently set the configured aspm policy before drivers
      have had an opportunity to indicate that their hardware doesn't support it.
      Unfortunately, putting some hardware in L0s or L1 can result in the hardware
      no longer responding to any requests, even after aspm is disabled. It makes
      more sense to leave aspm policy at the BIOS defaults at initial setup time,
      reconfiguring it after pci_enable_device() is called. This allows the
      driver to blacklist individual devices beforehand.
      Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
      Signed-off-by: Matthew Garrett <mjg@redhat.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      41cd766b
    • K
      PCI: fix wrong memory address handling in MSI-X · 4302e0fb
      Kenji Kaneshige committed
      Use resource_size_t for the MMIO address instead of unsigned long.
      Otherwise, the upper 32 bits of the MMIO address are unexpectedly
      cleared on x86-32 with PAE.
      Acked-by: Matthew Wilcox <willy@linux.intel.com>
      Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      4302e0fb
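The truncation can be reproduced in user space by modelling the two types with fixed-width integers. This is a sketch under assumptions: `ulong32_t` models unsigned long on x86-32, `res_size_t` models a 64-bit resource_size_t as used with PAE, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t ulong32_t;   /* models unsigned long on x86-32 */
typedef uint64_t res_size_t;  /* models resource_size_t with PAE */

static_assert(sizeof(ulong32_t) < sizeof(res_size_t),
              "the narrow type cannot hold a full physical address");

/* Buggy pattern: the sum is squeezed into a 32-bit variable. */
static ulong32_t table_addr_truncating(res_size_t bar_start, uint32_t table_offset)
{
    return (ulong32_t)(bar_start + table_offset);
}

/* Fixed pattern: keep the full-width type end to end. */
static res_size_t table_addr_full(res_size_t bar_start, uint32_t table_offset)
{
    return bar_start + table_offset;
}
```

For a BAR placed above 4 GiB, such as `0xfd00000000`, the truncating version keeps only the low 32 bits of the address.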
    • J
      PCI: check return value of pci_enable_device() when enabling bridges · 2eb5ebd3
      Junchang Wang committed
      pci_enable_device() can fail. In that case, a printed warning would be
      more appropriate.
      Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
      Signed-off-by: Junchang Wang <junchangwang@gmail.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      2eb5ebd3
    • S
      PCI: sparse warning (trivial) · 7736a05a
      Stephen Hemminger committed
      Assigning zero where NULL should be used.
      Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      7736a05a
    • M
      x86/PCI: Add option to not assign BAR's if not already assigned · 7bd1c365
      Mike Habeck committed
      The Linux kernel assigns BARs that a BIOS did not assign, most likely
      to handle broken BIOSes that didn't enumerate the devices correctly.
      On UV the BIOS purposely doesn't assign I/O BARs for certain devices/
      drivers we know don't use them (examples, LSI SAS, Qlogic FC, ...).
      We purposely don't assign these I/O BARs because I/O Space is a very
      limited resource.  There is only 64k of I/O Space, and in a PCIe
      topology that space gets divided up into 4k chunks (this is due to
      the fact that a pci-to-pci bridge's I/O decoder is aligned at 4k)...
      Thus a system can have at most 16 cards with I/O BARs: (64k / 4k = 16)
      
      SGI needs to scale to >16 devices with I/O BARs.  So by not assigning
      I/O BARs on devices we know don't use them, we can do that (iff the
      kernel doesn't go and assign these BARs that the BIOS purposely didn't
      assign).
      
      This patch will not assign a resource to a device BAR if that BAR was
      not assigned by the BIOS, and the kernel cmdline option 'pci=nobar'
      was specified.   This patch is closely modeled after the 'pci=norom'
      option that currently exists in the tree.
      Signed-off-by: Mike Habeck <habeck@sgi.com>
      Signed-off-by: Mike Travis <travis@sgi.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      7bd1c365
    • T
      PCI: disable MSI on VIA K8M800 · 549e1561
      Tejun Heo committed
      MSI delivery from on-board ahci controller doesn't work on K8M800.  At
      this point, it's unclear whether the culprit is with the ahci
      controller or the host bridge.  Given the track record and considering
      the rather minimal impact of MSI, disabling it seems reasonable.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Rainer Hurtado Navarro <publio.escipion.el.africano@gmail.com>
      Cc: stable@kernel.org
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      549e1561
    • C
      PCI quirk: AMD 780: work around wrong vendor ID on APC bridge · aff61369
      Clemens Ladisch committed
      In all AMD 780 family northbridges, the vendor ID of the internal
      graphics PCI/PCI bridge reads not as AMD but as that of the mainboard
      vendor, because the hardware actually returns the value of the subsystem
      vendor ID (erratum 18).
      
      We currently have additional quirk entries for Asus and Acer, but it is
      likely that we will encounter more systems with other vendor IDs.
      
      Since we do not know in advance all possible vendor IDs, a better way to
      find the device is to declare the quirk on the host bridge, whose ID is
      always correct, and use that device as a stepping stone to find the PCI/
      PCI bridge, if present.
      Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
      Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      aff61369
    • D
      PCI: hotplug/shpchp_hpc: add parenthesis in SLOT_REG_RSVDZ_MASK · 3b8fdb75
      Dan Carpenter committed
      The SLOT_REG_RSVDZ_MASK macro is normally used like this:
      	slot_reg &= ~SLOT_REG_RSVDZ_MASK;
      The ~ operator has higher precedence than the | operator inside the
      macro, so the macro body needs parentheses.
      Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
      Signed-off-by: Dan Carpenter <error27@gmail.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      3b8fdb75
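The precedence problem is easy to reproduce in isolation. In this sketch the macro names and bit positions are illustrative assumptions, not copied from shpchp_hpc.c; a compiler may warn about the unparenthesized expression, which is the point of the example.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative masks: the same bits, without and with outer parentheses. */
#define MASK_UNSAFE  (1u << 15) | (7u << 21)
#define MASK_SAFE    ((1u << 15) | (7u << 21))

/* Parenthesized at the point of use, both macros name the same bits. */
static_assert((MASK_UNSAFE) == MASK_SAFE, "both macros describe the same bits");

/* Expands to (reg & ~(1u << 15)) | (7u << 21): bits 21-23 get SET. */
static uint32_t clear_unsafe(uint32_t reg)
{
    return reg & ~MASK_UNSAFE;
}

/* Expands to reg & ~((1u << 15) | (7u << 21)): all masked bits cleared. */
static uint32_t clear_safe(uint32_t reg)
{
    return reg & ~MASK_SAFE;
}
```

Starting from a register value of zero, the unsafe form returns `0x00e00000`: it sets the very bits it was meant to clear.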
    • L
      PCI aerdrv: fix annoying warnings · f6735590
      Linus Torvalds committed
      Some compilers generate the following warnings:
      
        In function 'aer_isr':
        warning: 'e_src.id' may be used uninitialized in this function
        warning: 'e_src.status' may be used uninitialized in this function
      
      Avoid status flag "int ret" and return constants instead, so that
      gcc sees the return value matching "it is initialized" better.
      Acked-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      f6735590
    • J
      x86/PCI: pci, fix section mismatch · 73cd3b43
      Jiri Slaby committed
      pcibios_scan_specific_bus calls pci_scan_bus_on_node which is
      __devinit. Mark pcibios_scan_specific_bus __devinit as well since
      all users are now __init or __devinit.
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      73cd3b43
    • A
      PCI: change device runtime PM settings for probe and remove · f3ec4f87
      Alan Stern committed
      This patch (as1388) changes the way the PCI core handles runtime PM
      settings when probing or unbinding drivers.  Now the core will make
      sure the device is enabled for runtime PM, with a usage count >= 1,
      when a driver is probed.  It does the same when calling a driver's
      remove method.
      
      If the driver wants to use runtime PM, all it has to do is call
      pm_runtime_put_noidle() near the end of its probe routine (to cancel
      the core's usage increment) and pm_runtime_get_noresume() near the
      start of its remove routine (to restore the usage count).  It does not
      need to mess around with setting the runtime state to enabled,
      disabled, active, or suspended.
      
      The patch updates e1000e and r8169, the only PCI drivers that already
      use the existing runtime PM interface.
      Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
      Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      f3ec4f87
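The usage-count bookkeeping described in this commit message can be followed with a toy counter. This is a user-space model under assumptions, not kernel code: `struct dev_model` and every function here are hypothetical stand-ins, and the exact sequence of calls inside the PCI core may differ from this simplification.

```c
#include <assert.h>

/* Toy model of a device's runtime PM usage counter. */
struct dev_model {
    int usage_count;
    int seen_in_probe;   /* count observed on entry to the driver's probe */
    int seen_in_remove;  /* count observed on entry to the driver's remove */
};

/* Stand-ins for pm_runtime_get_noresume() and pm_runtime_put_noidle(). */
static void model_get_noresume(struct dev_model *d)
{
    d->usage_count++;
}

static void model_put_noidle(struct dev_model *d)
{
    assert(d->usage_count > 0);
    d->usage_count--;
}

/* Driver side: the two calls the commit message asks for. */
static void driver_probe(struct dev_model *d)
{
    d->seen_in_probe = d->usage_count;
    model_put_noidle(d);      /* cancel the core's usage increment */
}

static void driver_remove(struct dev_model *d)
{
    d->seen_in_remove = d->usage_count;
    model_get_noresume(d);    /* restore the usage count */
}

/* Core side: hold a reference while each driver callback runs. */
static void core_probe(struct dev_model *d)
{
    model_get_noresume(d);
    driver_probe(d);
}

static void core_remove(struct dev_model *d)
{
    model_get_noresume(d);    /* keep the device active during remove */
    driver_remove(d);
    model_put_noidle(d);      /* drop the remove-time reference */
    model_put_noidle(d);      /* drop the reference taken in core_probe() */
}
```

In this model the driver callbacks always see a count of at least one, and the counter is balanced again once the driver is unbound.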
  2. 30 Jul, 2010 6 commits
    • L
      Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6 · a2dccdb2
      Linus Torvalds committed
      * 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
        [S390] etr: fix clock synchronization race
        [S390] Fix IRQ tracing in case of PER
      a2dccdb2
    • L
    • L
      Merge branch 'fix/hda' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6 · e271e872
      Linus Torvalds committed
      * 'fix/hda' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
        ALSA: hda - Add a PC-beep workaround for ASUS P5-V
        ALSA: hda - Assume PC-beep as default for Realtek
        ALSA: hda - Don't register beep input device when no beep is available
        ALSA: hda - Fix pin-detection of Nvidia HDMI
      e271e872
    • D
      CRED: Fix __task_cred()'s lockdep check and banner comment · 8f92054e
      David Howells committed
      Fix __task_cred()'s lockdep check by removing the following validation
      condition:
      
      	lockdep_tasklist_lock_is_held()
      
      as commit_creds() does not take the tasklist_lock, and nor do most of the
      functions that call it, so this check is pointless and it can prevent
      detection of the RCU lock not being held if the tasklist_lock is held.
      
      Instead, add the following validation condition:
      
      	task->exit_state >= 0
      
      to permit the access if the target task is dead and therefore unable to change
      its own credentials.
      
      Fix __task_cred()'s comment to:
      
       (1) discard the bit that says that the caller must prevent the target task
           from being deleted.  That shouldn't need saying.
      
       (2) Add a comment indicating the result of __task_cred() should not be passed
           directly to get_cred(), but rather that get_task_cred() should be used
           instead.
      
      Also put a note into the documentation to enforce this point there too.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      8f92054e
    • D
      CRED: Fix get_task_cred() and task_state() to not resurrect dead credentials · de09a977
      David Howells committed
      It's possible for get_task_cred() as it currently stands to 'corrupt' a set of
      credentials by incrementing their usage count after their replacement by the
      task being accessed.
      
      What happens is that get_task_cred() can race with commit_creds():
      
      	TASK_1			TASK_2			RCU_CLEANER
      	-->get_task_cred(TASK_2)
      	rcu_read_lock()
      	__cred = __task_cred(TASK_2)
      				-->commit_creds()
      				old_cred = TASK_2->real_cred
      				TASK_2->real_cred = ...
      				put_cred(old_cred)
      				  call_rcu(old_cred)
      		[__cred->usage == 0]
      	get_cred(__cred)
      		[__cred->usage == 1]
      	rcu_read_unlock()
      							-->put_cred_rcu()
      							[__cred->usage == 1]
      							panic()
      
      However, since a task's credentials are generally not changed very often, we can
      reasonably make use of a loop involving reading the creds pointer and using
      atomic_inc_not_zero() to attempt to increment it if it hasn't already hit zero.
      
      If successful, we can safely return the credentials in the knowledge that, even
      if the task we're accessing has released them, they haven't gone to the RCU
      cleanup code.
      
      We then change task_state() in procfs to use get_task_cred() rather than
      calling get_cred() on the result of __task_cred(), as that suffers from the
      same problem.
      
      Without this change, a BUG_ON in __put_cred() or in put_cred_rcu() can be
      tripped when it is noticed that the usage count is not zero as it ought to be,
      for example:
      
      kernel BUG at kernel/cred.c:168!
      invalid opcode: 0000 [#1] SMP
      last sysfs file: /sys/kernel/mm/ksm/run
      CPU 0
      Pid: 2436, comm: master Not tainted 2.6.33.3-85.fc13.x86_64 #1 0HR330/OptiPlex
      745
      RIP: 0010:[<ffffffff81069881>]  [<ffffffff81069881>] __put_cred+0xc/0x45
      RSP: 0018:ffff88019e7e9eb8  EFLAGS: 00010202
      RAX: 0000000000000001 RBX: ffff880161514480 RCX: 00000000ffffffff
      RDX: 00000000ffffffff RSI: ffff880140c690c0 RDI: ffff880140c690c0
      RBP: ffff88019e7e9eb8 R08: 00000000000000d0 R09: 0000000000000000
      R10: 0000000000000001 R11: 0000000000000040 R12: ffff880140c690c0
      R13: ffff88019e77aea0 R14: 00007fff336b0a5c R15: 0000000000000001
      FS:  00007f12f50d97c0(0000) GS:ffff880007400000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f8f461bc000 CR3: 00000001b26ce000 CR4: 00000000000006f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process master (pid: 2436, threadinfo ffff88019e7e8000, task ffff88019e77aea0)
      Stack:
       ffff88019e7e9ec8 ffffffff810698cd ffff88019e7e9ef8 ffffffff81069b45
      <0> ffff880161514180 ffff880161514480 ffff880161514180 0000000000000000
      <0> ffff88019e7e9f28 ffffffff8106aace 0000000000000001 0000000000000246
      Call Trace:
       [<ffffffff810698cd>] put_cred+0x13/0x15
       [<ffffffff81069b45>] commit_creds+0x16b/0x175
       [<ffffffff8106aace>] set_current_groups+0x47/0x4e
       [<ffffffff8106ac89>] sys_setgroups+0xf6/0x105
       [<ffffffff81009b02>] system_call_fastpath+0x16/0x1b
      Code: 48 8d 71 ff e8 7e 4e 15 00 85 c0 78 0b 8b 75 ec 48 89 df e8 ef 4a 15 00
      48 83 c4 18 5b c9 c3 55 8b 07 8b 07 48 89 e5 85 c0 74 04 <0f> 0b eb fe 65 48 8b
      04 25 00 cc 00 00 48 3b b8 58 04 00 00 75
      RIP  [<ffffffff81069881>] __put_cred+0xc/0x45
       RSP <ffff88019e7e9eb8>
      ---[ end trace df391256a100ebdd ]---
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      de09a977
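The increment-unless-zero step that the fix relies on can be sketched with C11 atomics. This is a user-space analogue, not the kernel's atomic_inc_not_zero(); `struct cred_model`, `inc_not_zero()` and `try_get_cred()` are hypothetical names, and the real get_task_cred() additionally re-reads the credentials pointer under RCU on each retry.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy credential object carrying only a reference count. */
struct cred_model {
    atomic_int usage;
};

/* Increment *v unless it is zero; return true if the increment happened. */
static bool inc_not_zero(atomic_int *v)
{
    int old = atomic_load(v);

    while (old != 0) {
        /* On failure, old is refreshed with the current value and we retry. */
        if (atomic_compare_exchange_weak(v, &old, old + 1))
            return true;
    }
    return false;
}

/* Take a reference only if the object has not already dropped to zero. */
static struct cred_model *try_get_cred(struct cred_model *c)
{
    assert(c != NULL);
    return inc_not_zero(&c->usage) ? c : NULL;
}
```

An object whose count has already reached zero is never resurrected: `try_get_cred()` returns NULL and leaves the count at zero, so the caller has to retry with a freshly read pointer.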
    • W
      watchdog: update MAINTAINERS entry · 230a5cef
      Wim Van Sebroeck committed
      Add mailing list and website to the watchdog MAINTAINERS entry.
      Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
      230a5cef
  3. 29 Jul, 2010 8 commits
  4. 28 Jul, 2010 4 commits