1. 16 11月, 2010 1 次提交
  2. 13 11月, 2010 1 次提交
    • B
      PCI: fix pci_bus_alloc_resource() hang, prefer positive decode · 82e3e767
      Bjorn Helgaas 提交于
      When a PCI bus has two resources with the same start/end, e.g.,
      
          pci_bus 0000:04: resource 2 [mem 0xd0000000-0xd7ffffff pref]
          pci_bus 0000:04: resource 7 [mem 0xd0000000-0xd7ffffff]
      
      the previous pci_bus_find_resource_prev() implementation would alternate
      between them forever:
      
          pci_bus_find_resource_prev(... [mem 0xd0000000-0xd7ffffff pref])
              returns [mem 0xd0000000-0xd7ffffff]
          pci_bus_find_resource_prev(... [mem 0xd0000000-0xd7ffffff])
              returns [mem 0xd0000000-0xd7ffffff pref]
          pci_bus_find_resource_prev(... [mem 0xd0000000-0xd7ffffff pref])
              returns [mem 0xd0000000-0xd7ffffff]
          ...
      
      This happened because there was no ordering between two resources with the
      same start and end.  A resource that had the same start and end as the
      cursor, but was not itself the cursor, was considered to be before the
      cursor.
      
      This patch fixes the hang by making a fixed ordering between any two
      resources.
      
      In addition, it tries to allocate from positively decoded regions before
      using any subtractively decoded resources.  This means we will use a
      positive decode region before a subtractive decode one, even if it means
      using a smaller address.
      
      Reference: https://bugzilla.kernel.org/show_bug.cgi?id=22062Reported-by: NBorislav Petkov <bp@amd64.org>
      Tested-by: NBorislav Petkov <bp@amd64.org>
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NBjorn Helgaas <bjorn.helgaas@hp.com>
      Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      82e3e767
  3. 12 11月, 2010 3 次提交
    • J
      PCI: read current power state at enable time · 97c145f7
      Jesse Barnes 提交于
      When we enable a PCI device, we avoid doing a lot of the initial setup
      work if the device's enable count is non-zero.  If we don't fetch the
      power state though, we may later fail to set up MSI due to the unknown
      status.  So pick it up before we short circuit the rest due to a
      pre-existing enable or mismatched enable/disable pair (as happens with
      VGA devices, which are special in a special way).
      Tested-by: NJesse Brandeburg <jesse.brandeburg@gmail.com>
      Reported-by: NDave Airlie <airlied@linux.ie>
      Tested-by: NDave Airlie <airlied@linux.ie>
      Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      97c145f7
    • M
      PCI: fix size checks for mmap() on /proc/bus/pci files · 3b519e4e
      Martin Wilck 提交于
      The checks for valid mmaps of PCI resources made through /proc/bus/pci files
      that were introduced in 9eff02e2 have several
      problems:
      
      1. mmap() calls on /proc/bus/pci files are made with real file offsets > 0,
      whereas under /sys/bus/pci/devices, the start of the resource corresponds
      to offset 0. This may lead to false negatives in pci_mmap_fits(), which
      implicitly assumes the /sys/bus/pci/devices layout.
      
      2. The loop in proc_bus_pci_mmap doesn't skip empty resouces. This leads
      to false positives, because pci_mmap_fits() doesn't treat empty resources
      correctly (the calculated size is 1 << (8*sizeof(resource_size_t)-PAGE_SHIFT)
      in this case!).
      
      3. If a user maps resources with BAR > 0, pci_mmap_fits will emit bogus
      WARNINGS for the first resources that don't fit until the correct one is found.
      
      On many controllers the first 2-4 BARs are used, and the others are empty.
      In this case, an mmap attempt will first fail on the non-empty BARs
      (including the "right" BAR because of 1.) and emit bogus WARNINGS because
      of 3., and finally succeed on the first empty BAR because of 2.
      This is certainly not the intended behaviour.
      
      This patch addresses all 3 issues.
      Updated with an enum type for the additional parameter for pci_mmap_fits().
      
      Cc: stable@kernel.org
      Signed-off-by: NMartin Wilck <martin.wilck@ts.fujitsu.com>
      Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      3b519e4e
    • S
      PCI hotplug: ibmphp: Add check to prevent reading beyond mapped area · ac3abf2c
      Steven Rostedt 提交于
      While testing various randconfigs with ktest.pl, I hit the following panic:
      
      BUG: unable to handle kernel paging request at f7e54b03
      IP: [<c0d63409>] ibmphp_access_ebda+0x101/0x19bb
      
      Adding printks, I found that the loop that reads the ebda blocks
      can move out of the mapped section.
      
      ibmphp_access_ebda: start=f7e44c00 size=5120 end=f7e46000
      ibmphp_access_ebda: io_mem=f7e44d80 offset=384
      ibmphp_access_ebda: io_mem=f7e54b03 offset=65283
      
      The start of the iomap was at f7e44c00 and had a size of 5120,
      making the end f7e46000. We start with an offset of 0x180 or
      384, giving the first read at 0xf7e44d80. Reading that location
      yields 65283, which is much bigger than the 5120 that was allocated
      and makes the next read at f7e54b03 which is outside the mapped area.
      
      Perhaps this is a bug in the driver, or buggy hardware, but this patch
      is more about not crashing my box on start up and just giving a warning
      if it detects this error.
      
      This patch at least lets my box boot with just a warning.
      
      Cc: Chandru Siddalingappa <chandru@linux.vnet.ibm.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      ac3abf2c
  4. 09 11月, 2010 2 次提交
  5. 28 10月, 2010 1 次提交
  6. 27 10月, 2010 1 次提交
    • B
      PCI: allocate bus resources from the top down · b126b470
      Bjorn Helgaas 提交于
      Allocate space from the highest-address PCI bus resource first, then work
      downward.
      
      Previously, we looked for space in PCI host bridge windows in the order
      we discovered the windows.  For example, given the following windows
      (discovered via an ACPI _CRS method):
      
          pci_root PNP0A03:00: host bridge window [mem 0x000a0000-0x000bffff]
          pci_root PNP0A03:00: host bridge window [mem 0x000c0000-0x000effff]
          pci_root PNP0A03:00: host bridge window [mem 0x000f0000-0x000fffff]
          pci_root PNP0A03:00: host bridge window [mem 0xbff00000-0xf7ffffff]
          pci_root PNP0A03:00: host bridge window [mem 0xff980000-0xff980fff]
          pci_root PNP0A03:00: host bridge window [mem 0xff97c000-0xff97ffff]
          pci_root PNP0A03:00: host bridge window [mem 0xfed20000-0xfed9ffff]
      
      we attempted to allocate from [mem 0x000a0000-0x000bffff] first, then
      [mem 0x000c0000-0x000effff], and so on.
      
      With this patch, we allocate from [mem 0xff980000-0xff980fff] first, then
      [mem 0xff97c000-0xff97ffff], [mem 0xfed20000-0xfed9ffff], etc.
      
      Allocating top-down follows Windows practice, so we're less likely to
      trip over BIOS defects in the _CRS description.
      
      On the machine above (a Dell T3500), the [mem 0xbff00000-0xbfffffff] region
      doesn't actually work and is likely a BIOS defect.  The symptom is that we
      move the AHCI controller to 0xbff00000, which leads to "Boot has failed,
      sleeping forever," a BUG in ahci_stop_engine(), or some other boot failure.
      
      Reference: https://bugzilla.kernel.org/show_bug.cgi?id=16228#c43
      Reference: https://bugzilla.redhat.com/show_bug.cgi?id=620313
      Reference: https://bugzilla.redhat.com/show_bug.cgi?id=629933Reported-by: NBrian Bloniarz <phunge0@hotmail.com>
      Reported-and-tested-by: NStefan Becker <chemobejk@gmail.com>
      Reported-by: NDenys Vlasenko <dvlasenk@redhat.com>
      Signed-off-by: NBjorn Helgaas <bjorn.helgaas@hp.com>
      Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      b126b470
  7. 18 10月, 2010 12 次提交
  8. 16 10月, 2010 6 次提交
    • N
      PCI: add quirk for non-symmetric-mode irq routing to versions 0 and 4 of the MCP55 northbridge · 66db60ea
      Neil Horman 提交于
      A long time ago I worked on a RHEL5 bug in which kdump hung during boot
      on a set of systems.  The systems hung because they never received timer
      interrupts during calibrate_delay.  These systems also all had Opteron
      processors on a hypertransport bus, bridged to a pci bus via an Nvidia
      MCP55 northbridge chip.  After much wrangling I managed to learn from
      Nvidia that they have an undocumented register in some versions of that
      chip which control how legacy interrupts are send to the cpu complex
      when the ioapic isn't active.  Nvidia defaults this register to only
      send legacy interrupts to the BSP, so if kdump happens to boot on an AP,
      we never get timer interrupts and boom.  I had initially used this quirk
      as a workaround, with my intent being to move apic initalization to an
      earlier point in the boot process, so the setting of the register would
      be irrelevant.  Given the work involved in doing that however, the
      fragile nature of the apic initalization code, and the fact that, over
      the 2 years since we found this bug, the MCP55 is the only chip which
      seems to have this issue, I've figure at this point its likely safer to
      just carry the quirk around.  By setting the referenced bits in this
      hidden register, interrupts will be broadcast to all cpus when the
      ioapic isn't active on the above described systems.
      Acked-by: NSimon Horman <horms@verge.net.au>
      Acked-by: NVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      66db60ea
    • R
      PCI/PCIe/AER: Disable native AER service if BIOS has precedence · b22c3d82
      Rafael J. Wysocki 提交于
      There is a design issue related to PCIe AER and _OSC that the BIOS
      may be asked to grant control of the AER service even if some
      Hardware Error Source Table (HEST) entries contain information
      meaning that the BIOS really should control it.  Namely,
      pcie_port_acpi_setup() calls pcie_aer_get_firmware_first() that
      determines whether or not the AER service should be controlled by
      the BIOS on the basis of the HEST information for the given PCIe
      port.  The BIOS is asked to grant control of the AER service for
      a PCIe Root Complex if pcie_aer_get_firmware_first() returns 'false'
      for at least one root port in that complex, even if all of the other
      root ports' HEST entries have the FIRMWARE_FIRST flag set (and none
      of them has the GLOBAL flag set).  However, if the AER service is
      controlled by the kernel, that may interfere with the BIOS' handling
      of the error sources having the FIRMWARE_FIRST flag.  Moreover,
      there may be PCIe endpoints that have the FIRMWARE_FIRST flag set in
      HEST and are attached to the root ports in question, in which case it
      also may be unsafe to ask the BIOS for control of the AER service.
      
      For this reason, introduce a function checking if there's at least
      one PCIe-related HEST entry with the FIRMWARE_FIRST flag set and
      disable the native AER service altogether if this function returns
      'true'.
      Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>
      Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      b22c3d82
    • T
      PCI hotplug: ibmphp-hpc: semaphore cleanup · 5a37f138
      Thomas Gleixner 提交于
      Get rid of init_MUTEX[_LOCKED]() and use sema_init() instead.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      5a37f138
    • B
      PCI: aerdrv: fix uninitialized variable warning · 50c1126e
      Bill Pemberton 提交于
      quiet the warning about use of uninitialized e_src in
      aer_isr()  e_src is initialized by get_e_source()
      Signed-off-by: NBill Pemberton <wfp5p@virginia.edu>
      Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      50c1126e
    • A
      PCI: kill BKL in /proc/pci · 991f7395
      Arnd Bergmann 提交于
      All operations in the pci procfs ioctl functions are
      atomic, so no lock is needed here.
      
      Also add a compat_ioctl method, since all the commands
      are compatible in 32 bit mode.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: linux-pci@vger.kernel.org
      Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      991f7395
    • J
      PCI: Adjust confusing if indentation in pcie_get_readrq · 93e75fab
      Julia Lawall 提交于
      Indent the branch of an if.
      
      The semantic match that finds this problem is as follows:
      (http://coccinelle.lip6.fr/)
      
      // <smpl>
      @r disable braces4@
      position p1,p2;
      statement S1,S2;
      @@
      
      (
      if (...) { ... }
      |
      if (...) S1@p1 S2@p2
      )
      
      @script:python@
      p1 << r.p1;
      p2 << r.p2;
      @@
      
      if (p1[0].column == p2[0].column):
        cocci.print_main("branch",p1)
        cocci.print_secs("after",p2)
      // </smpl>
      Signed-off-by: NJulia Lawall <julia@diku.dk>
      Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      93e75fab
  9. 15 10月, 2010 1 次提交
    • A
      llseek: automatically add .llseek fop · 6038f373
      Arnd Bergmann 提交于
      All file_operations should get a .llseek operation so we can make
      nonseekable_open the default for future file operations without a
      .llseek pointer.
      
      The three cases that we can automatically detect are no_llseek, seq_lseek
      and default_llseek. For cases where we can we can automatically prove that
      the file offset is always ignored, we use noop_llseek, which maintains
      the current behavior of not returning an error from a seek.
      
      New drivers should normally not use noop_llseek but instead use no_llseek
      and call nonseekable_open at open time.  Existing drivers can be converted
      to do the same when the maintainer knows for certain that no user code
      relies on calling seek on the device file.
      
      The generated code is often incorrectly indented and right now contains
      comments that clarify for each added line why a specific variant was
      chosen. In the version that gets submitted upstream, the comments will
      be gone and I will manually fix the indentation, because there does not
      seem to be a way to do that using coccinelle.
      
      Some amount of new code is currently sitting in linux-next that should get
      the same modifications, which I will do at the end of the merge window.
      
      Many thanks to Julia Lawall for helping me learn to write a semantic
      patch that does all this.
      
      ===== begin semantic patch =====
      // This adds an llseek= method to all file operations,
      // as a preparation for making no_llseek the default.
      //
      // The rules are
      // - use no_llseek explicitly if we do nonseekable_open
      // - use seq_lseek for sequential files
      // - use default_llseek if we know we access f_pos
      // - use noop_llseek if we know we don't access f_pos,
      //   but we still want to allow users to call lseek
      //
      @ open1 exists @
      identifier nested_open;
      @@
      nested_open(...)
      {
      <+...
      nonseekable_open(...)
      ...+>
      }
      
      @ open exists@
      identifier open_f;
      identifier i, f;
      identifier open1.nested_open;
      @@
      int open_f(struct inode *i, struct file *f)
      {
      <+...
      (
      nonseekable_open(...)
      |
      nested_open(...)
      )
      ...+>
      }
      
      @ read disable optional_qualifier exists @
      identifier read_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      expression E;
      identifier func;
      @@
      ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
      {
      <+...
      (
         *off = E
      |
         *off += E
      |
         func(..., off, ...)
      |
         E = *off
      )
      ...+>
      }
      
      @ read_no_fpos disable optional_qualifier exists @
      identifier read_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      @@
      ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
      {
      ... when != off
      }
      
      @ write @
      identifier write_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      expression E;
      identifier func;
      @@
      ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
      {
      <+...
      (
        *off = E
      |
        *off += E
      |
        func(..., off, ...)
      |
        E = *off
      )
      ...+>
      }
      
      @ write_no_fpos @
      identifier write_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      @@
      ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
      {
      ... when != off
      }
      
      @ fops0 @
      identifier fops;
      @@
      struct file_operations fops = {
       ...
      };
      
      @ has_llseek depends on fops0 @
      identifier fops0.fops;
      identifier llseek_f;
      @@
      struct file_operations fops = {
      ...
       .llseek = llseek_f,
      ...
      };
      
      @ has_read depends on fops0 @
      identifier fops0.fops;
      identifier read_f;
      @@
      struct file_operations fops = {
      ...
       .read = read_f,
      ...
      };
      
      @ has_write depends on fops0 @
      identifier fops0.fops;
      identifier write_f;
      @@
      struct file_operations fops = {
      ...
       .write = write_f,
      ...
      };
      
      @ has_open depends on fops0 @
      identifier fops0.fops;
      identifier open_f;
      @@
      struct file_operations fops = {
      ...
       .open = open_f,
      ...
      };
      
      // use no_llseek if we call nonseekable_open
      ////////////////////////////////////////////
      @ nonseekable1 depends on !has_llseek && has_open @
      identifier fops0.fops;
      identifier nso ~= "nonseekable_open";
      @@
      struct file_operations fops = {
      ...  .open = nso, ...
      +.llseek = no_llseek, /* nonseekable */
      };
      
      @ nonseekable2 depends on !has_llseek @
      identifier fops0.fops;
      identifier open.open_f;
      @@
      struct file_operations fops = {
      ...  .open = open_f, ...
      +.llseek = no_llseek, /* open uses nonseekable */
      };
      
      // use seq_lseek for sequential files
      /////////////////////////////////////
      @ seq depends on !has_llseek @
      identifier fops0.fops;
      identifier sr ~= "seq_read";
      @@
      struct file_operations fops = {
      ...  .read = sr, ...
      +.llseek = seq_lseek, /* we have seq_read */
      };
      
      // use default_llseek if there is a readdir
      ///////////////////////////////////////////
      @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier readdir_e;
      @@
      // any other fop is used that changes pos
      struct file_operations fops = {
      ... .readdir = readdir_e, ...
      +.llseek = default_llseek, /* readdir is present */
      };
      
      // use default_llseek if at least one of read/write touches f_pos
      /////////////////////////////////////////////////////////////////
      @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read.read_f;
      @@
      // read fops use offset
      struct file_operations fops = {
      ... .read = read_f, ...
      +.llseek = default_llseek, /* read accesses f_pos */
      };
      
      @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier write.write_f;
      @@
      // write fops use offset
      struct file_operations fops = {
      ... .write = write_f, ...
      +	.llseek = default_llseek, /* write accesses f_pos */
      };
      
      // Use noop_llseek if neither read nor write accesses f_pos
      ///////////////////////////////////////////////////////////
      
      @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read_no_fpos.read_f;
      identifier write_no_fpos.write_f;
      @@
      // write fops use offset
      struct file_operations fops = {
      ...
       .write = write_f,
       .read = read_f,
      ...
      +.llseek = noop_llseek, /* read and write both use no f_pos */
      };
      
      @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier write_no_fpos.write_f;
      @@
      struct file_operations fops = {
      ... .write = write_f, ...
      +.llseek = noop_llseek, /* write uses no f_pos */
      };
      
      @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read_no_fpos.read_f;
      @@
      struct file_operations fops = {
      ... .read = read_f, ...
      +.llseek = noop_llseek, /* read uses no f_pos */
      };
      
      @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      @@
      struct file_operations fops = {
      ...
      +.llseek = noop_llseek, /* no read or write fn */
      };
      ===== End semantic patch =====
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Cc: Julia Lawall <julia@diku.dk>
      Cc: Christoph Hellwig <hch@infradead.org>
      6038f373
  10. 12 10月, 2010 11 次提交
  11. 05 10月, 2010 1 次提交
    • A
      drivers: autoconvert trivial BKL users to private mutex · 613655fa
      Arnd Bergmann 提交于
      All these files use the big kernel lock in a trivial
      way to serialize their private file operations,
      typically resulting from an earlier semi-automatic
      pushdown from VFS.
      
      None of these drivers appears to want to lock against
      other code, and they all use the BKL as the top-level
      lock in their file operations, meaning that there
      is no lock-order inversion problem.
      
      Consequently, we can remove the BKL completely,
      replacing it with a per-file mutex in every case.
      Using a scripted approach means we can avoid
      typos.
      
      These drivers do not seem to be under active
      maintainance from my brief investigation. Apologies
      to those maintainers that I have missed.
      
      file=$1
      name=$2
      if grep -q lock_kernel ${file} ; then
          if grep -q 'include.*linux.mutex.h' ${file} ; then
                  sed -i '/include.*<linux\/smp_lock.h>/d' ${file}
          else
                  sed -i 's/include.*<linux\/smp_lock.h>.*$/include <linux\/mutex.h>/g' ${file}
          fi
          sed -i ${file} \
              -e "/^#include.*linux.mutex.h/,$ {
                      1,/^\(static\|int\|long\)/ {
                           /^\(static\|int\|long\)/istatic DEFINE_MUTEX(${name}_mutex);
      
      } }"  \
          -e "s/\(un\)*lock_kernel\>[ ]*()/mutex_\1lock(\&${name}_mutex)/g" \
          -e '/[      ]*cycle_kernel_lock();/d'
      else
          sed -i -e '/include.*\<smp_lock.h\>/d' ${file}  \
                      -e '/cycle_kernel_lock()/d'
      fi
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      613655fa