1. 10 Jan, 2008 1 commit
  2. 09 Jan, 2008 8 commits
  3. 07 Jan, 2008 2 commits
    • CPU hotplug: fix cpu_is_offline() on !CONFIG_HOTPLUG_CPU · a263898f
      Ingo Molnar committed
      make randconfig bootup testing found that the cpufreq code
      crashes on bootup, if the powernow-k8 driver is enabled and
      if maxcpus=1 passed on the boot line to a !CONFIG_HOTPLUG_CPU
      kernel.
      
      First lockdep found out that there's an inconsistent unlock
      sequence:
      
       =====================================
       [ BUG: bad unlock balance detected! ]
       -------------------------------------
       swapper/1 is trying to release lock (&per_cpu(cpu_policy_rwsem, cpu)) at:
       [<ffffffff806ffd8e>] unlock_policy_rwsem_write+0x3c/0x42
       but there are no more locks to release!
      
      Call Trace:
       [<ffffffff806ffd8e>] unlock_policy_rwsem_write+0x3c/0x42
       [<ffffffff80251c29>] print_unlock_inbalance_bug+0x104/0x12c
       [<ffffffff80252f3a>] mark_held_locks+0x56/0x94
       [<ffffffff806ffd8e>] unlock_policy_rwsem_write+0x3c/0x42
       [<ffffffff807008b6>] cpufreq_add_dev+0x2a8/0x5c4
       ...
      
      then shortly afterwards the cpufreq code crashed on an assert:
      
       ------------[ cut here ]------------
       kernel BUG at drivers/cpufreq/cpufreq.c:1068!
       invalid opcode: 0000 [1] SMP
       [...]
       Call Trace:
        [<ffffffff805145d6>] sysdev_driver_unregister+0x5b/0x91
        [<ffffffff806ff520>] cpufreq_register_driver+0x15d/0x1a2
        [<ffffffff80cc0596>] powernowk8_init+0x86/0x94
       [...]
       ---[ end trace 1e9219be2b4431de ]---
      
      The bug was caused by the maxcpus=1 bootup, which brought up the
      secondary core as !cpu_online() but also as !cpu_is_offline(), because
      on !CONFIG_HOTPLUG_CPU cpu_is_offline() is always 0 (include/linux/cpu.h):
      
        /* CPUs don't go offline once they're online w/o CONFIG_HOTPLUG_CPU */
        static inline int cpu_is_offline(int cpu) { return 0; }
      
      but the cpufreq code uses cpu_online() and cpu_is_offline() in
      a mixed way - the low-level drivers use cpu_online(), while
      the cpufreq core uses cpu_is_offline(). This opened up the
      possibility to add the non-initialized sysdev device of the
      secondary core:
      
       cpufreq-core: trying to register driver powernow-k8
       cpufreq-core: adding CPU 0
       powernow-k8: BIOS error - no PSB or ACPI _PSS objects
       cpufreq-core: initialization failed
       cpufreq-core: adding CPU 1
       cpufreq-core: initialization failed
      
      which then blew up. The fix is to make cpu_is_offline() always
      the negation of cpu_online(). With that fix applied the kernel
      boots up fine without crashing:
      
       Calling initcall 0xffffffff80cc0510: powernowk8_init+0x0/0x94()
       powernow-k8: Found 1 AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ processors (1 cpu cores) (version 2.20.00)
       powernow-k8: BIOS error - no PSB or ACPI _PSS objects
       initcall 0xffffffff80cc0510: powernowk8_init+0x0/0x94() returned -19.
       initcall 0xffffffff80cc0510 ran for 19 msecs: powernowk8_init+0x0/0x94()
       Calling initcall 0xffffffff80cc328f: init_lapic_nmi_sysfs+0x0/0x39()
      
      We could fix this by making CPU enumeration aware of max_cpus, but that
      would be more fragile IMO, and the cpu_online(cpu) != cpu_is_offline(cpu)
      possibility was quite confusing and a continuous source of bugs too.
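
The mismatch and the fix described above can be modelled in a few lines of user-space C. This is only an illustrative sketch, not the kernel patch: the plain array standing in for the online map, NR_CPUS of 2, the `_old` suffix, and the helpers core_would_add_cpu() and check_cpu_state_invariant() are all assumptions made for the example. Only the two cpu_is_offline() bodies follow what the message states.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 2

/* Hypothetical stand-in for the kernel's online map: only CPU 0 is up,
 * as after a maxcpus=1 boot on a dual-core machine. */
static bool cpu_online_map[NR_CPUS] = { true, false };

static int cpu_online(int cpu)
{
    return cpu_online_map[cpu];
}

/* Pre-fix !CONFIG_HOTPLUG_CPU variant quoted in the message: always 0. */
static int cpu_is_offline_old(int cpu)
{
    (void)cpu;
    return 0;
}

/* Post-fix behaviour as described: always the negation of cpu_online(). */
static int cpu_is_offline(int cpu)
{
    return !cpu_online(cpu);
}

/* Hypothetical model of the cpufreq core check: returns 1 if the core
 * would go on to register the given CPU, 0 if it bails out early. */
static int core_would_add_cpu(int cpu, int (*is_offline)(int))
{
    if (is_offline(cpu))
        return 0;
    return 1;
}

/* With the fix, the two predicates can never agree for any CPU. */
static void check_cpu_state_invariant(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        assert(cpu_online(cpu) != cpu_is_offline(cpu));
}
```

In this model, core_would_add_cpu(1, cpu_is_offline_old) lets the never-booted CPU 1 through, while core_would_add_cpu(1, cpu_is_offline) rejects it, which is the difference the fix is meant to make.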
      
      Most distributions have kernels with CPU hotplug enabled, so this bug
      remained hidden for a long time.
      
      Bug forensics:
      
      The broken cpu_is_offline() API variant was introduced via:
      
       commit a59d2e4e6977e7b94e003c96a41f07e96cddc340
       Author: Rusty Russell <rusty@rustcorp.com.au>
       Date:   Mon Mar 8 06:06:03 2004 -0800
      
           [PATCH] minor cleanups for hotplug CPUs
      
      ( this predates linux-2.6.git, this commit is available from Thomas's
        historic git tree. )
      
      Then 1.5 years later the cpufreq code made use of it:
      
       commit c32b6b8e
       Author: Ashok Raj <ashok.raj@intel.com>
       Date:   Sun Oct 30 14:59:54 2005 -0800
      
           [PATCH] create and destroy cpufreq sysfs entries based on cpu notifiers
      
       +       if (cpu_is_offline(cpu))
       +               return 0;
      
      which is a correct use of the subtly broken new API. v2.6.15 then
      shipped with this bug included.
      
      then it took two more years for random-kernel qa to hit it.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Revert "scsi: revert "[SCSI] Get rid of scsi_cmnd->done"" · 7b3d9545
      Linus Torvalds committed
      This reverts commit ac40532e, which gets
      us back the original cleanup of 6f5391c2.
      
      It turns out that the bug that was triggered by that commit was
      apparently not actually triggered by that commit at all, and just the
      testing conditions had changed enough to make it appear to be due to it.
      
      The real problem seems to have been found by Peter Osterlund:
      
        "pktcdvd sets it [block device size] when opening the /dev/pktcdvd
         device, but when the drive is later opened as /dev/scd0, there is
         nothing that sets it back.  (Btw, 40944 is possible if the disk is a
         CDRW that was formatted with "cdrwtool -m 10236".)
      
         The problem is that pktcdvd opens the cd device in non-blocking mode
         when pktsetup is run, and doesn't close it again until pktsetup -d is
         run.  The effect is that if you meanwhile open the cd device,
         blkdev.c:do_open() doesn't call bd_set_size() because
         bdev->bd_openers is non-zero."
      
      In particular, to repeat the bug (regardless of whether commit
      6f5391c2 is applied or not):
      
        " 1. Start with an empty drive.
          2. pktsetup 0 /dev/scd0
          3. Insert a CD containing an isofs filesystem.
          4. mount /dev/pktcdvd/0 /mnt/tmp
          5. umount /mnt/tmp
          6. Press the eject button.
          7. Insert a DVD containing a non-writable filesystem.
          8. mount /dev/scd0 /mnt/tmp
          9. find /mnt/tmp -type f -print0 | xargs -0 sha1sum >/dev/null
          10. If the DVD contains data beyond the physical size of a CD, you
              get I/O errors in the terminal, and dmesg reports lots of
              "attempt to access beyond end of device" errors."
      
      which in turn is because the nested open after the media change won't
      cause the size to be set properly (because the original open still holds
      the block device, and we only do the bd_set_size() when we don't have
      other people holding the device open).
      
      The proper fix for that is probably to just do something like
      
      	bdev->bd_inode->i_size = (loff_t)get_capacity(disk)<<9;
      
      in fs/block_dev.c:do_open() even for the cases where we're not the
      original opener (but *not* call bd_set_size(), since that will also
      change the block size of the device).
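
The stale-size problem and the suggested remedy can be sketched with a small user-space model. Everything here is a simplified assumption for illustration: the struct layouts, the model_loff_t typedef, the block_size field, the fixed value bd_set_size() writes into it, and the do_open_model() helper with its refresh_on_nested_open flag are not the kernel's definitions. Only the `get_capacity(disk) << 9` assignment and the bd_openers check come from the text above.

```c
#include <assert.h>
#include <stdint.h>

typedef long long model_loff_t;     /* stand-in for the kernel's loff_t */

struct gendisk { uint64_t capacity; };          /* in 512-byte sectors */
struct inode { model_loff_t i_size; };          /* size in bytes */
struct block_device {
    struct inode *bd_inode;
    int bd_openers;                             /* number of current openers */
    unsigned block_size;
};

static uint64_t get_capacity(const struct gendisk *disk)
{
    return disk->capacity;
}

/* Simplified: sets the size and also resets the block size, which is why
 * the message says a nested open should not call it. */
static void bd_set_size(struct block_device *bdev, model_loff_t size)
{
    bdev->bd_inode->i_size = size;
    bdev->block_size = 512;         /* placeholder value for the model */
}

/* Hypothetical model of the open path. With refresh_on_nested_open == 0
 * it mirrors the described behaviour (size set only by the first opener);
 * with 1 it applies the suggested i_size refresh for nested opens. */
static void do_open_model(struct block_device *bdev,
                          const struct gendisk *disk,
                          int refresh_on_nested_open)
{
    assert(bdev && bdev->bd_inode && disk);
    if (bdev->bd_openers == 0)
        bd_set_size(bdev, (model_loff_t)get_capacity(disk) << 9);
    else if (refresh_on_nested_open)
        bdev->bd_inode->i_size = (model_loff_t)get_capacity(disk) << 9;
    bdev->bd_openers++;
}
```

In the model, a second open after the media has changed leaves i_size at the old value unless the refresh flag is set, and the refresh path leaves block_size untouched.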
      
      Cc: Peter Osterlund <petero2@telia.com>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 04 Jan, 2008 1 commit
  5. 03 Jan, 2008 3 commits
  6. 02 Jan, 2008 3 commits
  7. 28 Dec, 2007 1 commit
  8. 27 Dec, 2007 5 commits
  9. 24 Dec, 2007 1 commit
  10. 21 Dec, 2007 2 commits
  11. 20 Dec, 2007 3 commits
  12. 19 Dec, 2007 5 commits
  13. 18 Dec, 2007 5 commits
    • block: let elv_register() return void · 2fdd82bd
      Adrian Bunk committed
      elv_register() always returns 0, and there isn't anything it does where
      it should return an error (the only error condition is so grave that
      it's handled with a BUG_ON).
      Signed-off-by: Adrian Bunk <bunk@kernel.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • quicklist: Set tlb->need_flush if pages are remaining in quicklist 0 · 421d9919
      Christoph Lameter committed
      This ensures that the quicklists are drained. Otherwise draining may only
      occur when the processor reaches an idle state.
      
      Fixes fatal leakage of pgd_t's on 2.6.22 and later.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Reported-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Revert "hugetlb: Add hugetlb_dynamic_pool sysctl" · 368d2c63
      Nishanth Aravamudan committed
      This reverts commit 54f9f80d ("hugetlb:
      Add hugetlb_dynamic_pool sysctl")
      
      Given the new sysctl nr_overcommit_hugepages, the boolean dynamic pool
      sysctl is not needed, as its semantics can be expressed by 0 in the
      overcommit sysctl (no dynamic pool) and non-0 in the overcommit sysctl
      (pool enabled).
      
      (Needed in 2.6.24 since it reverts a post-2.6.23 userspace-visible change)
      Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
      Acked-by: Adam Litke <agl@us.ibm.com>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Cc: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hugetlb: introduce nr_overcommit_hugepages sysctl · d1c3fb1f
      Nishanth Aravamudan committed
      While examining the code to support /proc/sys/vm/hugetlb_dynamic_pool, I
      became convinced that having a boolean sysctl was insufficient:
      
      1) To support per-node control of hugepages, I have previously submitted
      patches to add a sysfs attribute related to nr_hugepages. However, with
      a boolean global value and per-mount quota enforcement constraining the
      dynamic pool, adding corresponding control of the dynamic pool on a
      per-node basis seems inconsistent to me.
      
      2) Administration of the hugetlb dynamic pool with multiple hugetlbfs
      mount points is, arguably, more arduous than it needs to be. Each quota
      would need to be set separately, and the sum would need to be monitored.
      
      To ease the administration, and to help make the way for per-node
      control of the static & dynamic hugepage pool, I added a separate
      sysctl, nr_overcommit_hugepages. This value serves as a high watermark
      for the overall hugepage pool, while nr_hugepages serves as a low
      watermark. The boolean sysctl can then be removed, as the condition
      
      	nr_overcommit_hugepages > 0
      
      indicates the same administrative setting as
      
      	hugetlb_dynamic_pool == 1
      
      Quotas still serve as local enforcement of the size of the pool on a
      per-mount basis.
      
      A few caveats:
      
      1) There is a race whereby the global surplus huge page counter is
      incremented before a hugepage has been allocated. Another process could
      then try to grow the pool, and fail to convert a surplus huge page to a normal
      huge page and instead allocate a fresh huge page. I believe this is
      benign, as no memory is leaked (the actual pages are still tracked
      correctly) and the counters won't go out of sync.
      
      2) Shrinking the static pool while a surplus is in effect will allow the
      number of surplus huge pages to exceed the overcommit value. As long as
      this condition holds, however, no more surplus huge pages will be
      allowed on the system until one of the two sysctls is increased
      sufficiently, or the surplus huge pages go out of use and are freed.
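
The relationship between the two sysctls and caveat 2 can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the kernel's implementation: the struct hugepage_pool layout and both helper functions are hypothetical, and the model reads nr_overcommit_hugepages as a cap on the number of surplus pages, following the wording of caveat 2.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the pool counters named in the message. */
struct hugepage_pool {
    unsigned long nr_hugepages;             /* static pool size */
    unsigned long nr_overcommit_hugepages;  /* assumed cap on surplus pages */
    unsigned long surplus_huge_pages;       /* pages beyond the static pool */
};

/* The condition the message gives as equivalent to
 * hugetlb_dynamic_pool == 1. */
static bool dynamic_pool_enabled(const struct hugepage_pool *p)
{
    return p->nr_overcommit_hugepages > 0;
}

/* Try to grow the pool by one surplus page. Refuses while the surplus
 * count is at or above the overcommit value, which also covers the
 * caveat-2 case where the surplus already exceeds it. */
static bool alloc_surplus_page(struct hugepage_pool *p)
{
    assert(p);
    if (p->surplus_huge_pages >= p->nr_overcommit_hugepages)
        return false;
    p->surplus_huge_pages++;
    return true;
}
```

With nr_overcommit_hugepages set to 0 the model never hands out a surplus page, matching the "no dynamic pool" setting. If the surplus count is pushed above the cap, as in caveat 2, further allocations fail until the cap is raised or the surplus drops.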
      
      Successfully tested on x86_64 with the current libhugetlbfs snapshot,
      modified to use the new sysctl.
      Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
      Acked-by: Adam Litke <agl@us.ibm.com>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Cc: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • apm_event{,info}_t are userspace types · 8d936626
      Adam Jackson committed
      These types define the size of data read from /dev/apm_bios.  They should
      not be hidden behind #ifdef __KERNEL__.
      
      This is killing my xserver compile, apm_event_t is used in the xserver
      source.
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>