1. 09 Jan, 2008 7 commits
    • [XFRM]: xfrm_algo_clone() allocates too much memory · 0f99be0d
      Eric Dumazet authored
      alg_key_len is the length of the key in bits, not in bytes.
      
      The best way to fix this is to move the alg_len() function from
      net/xfrm/xfrm_user.c to include/net/xfrm.h, and to use it in
      xfrm_algo_clone().
      
      alg_len() is renamed to xfrm_alg_len() because it is now globally visible.
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
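
The bits-versus-bytes point in the XFRM commit above can be sketched as follows. This is a minimal userspace illustration, not the kernel code: `struct demo_algo`, `demo_alg_len()` and `demo_algo_clone()` are hypothetical stand-ins, and the struct layout is an assumption made for the example. The only thing taken from the commit message is that the key length is stored in bits, so the clone size must round it up to whole bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for an algorithm descriptor (layout is an assumption). */
struct demo_algo {
	char alg_name[64];
	unsigned int alg_key_len;  /* key length in BITS, not bytes */
	unsigned char alg_key[];   /* key material follows the header */
};

/* Total size in bytes: header plus the key rounded up to whole bytes. */
static size_t demo_alg_len(const struct demo_algo *alg)
{
	return sizeof(*alg) + (alg->alg_key_len + 7) / 8;
}

/* Clone using the byte length; using alg_key_len directly would
 * allocate and copy roughly eight times more key bytes than exist. */
static struct demo_algo *demo_algo_clone(const struct demo_algo *orig)
{
	assert(orig != NULL);
	size_t len = demo_alg_len(orig);
	struct demo_algo *copy = malloc(len);

	if (copy)
		memcpy(copy, orig, len);
	return copy;
}
```

For a 128-bit key the helper yields the header size plus 16 bytes, whereas treating the bit count as a byte count would request the header plus 128 bytes.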
    • [NET]: Clone the sk_buff 'iif' field in __skb_clone() · 02f1c89d
      Paul Moore authored
      Both NetLabel and SELinux (other LSMs may grow to use it as well) rely
      on the 'iif' field to determine the receiving network interface of
      inbound packets.  Unfortunately, at present this field is not
      preserved across a skb clone operation which can lead to garbage
      values if the cloned skb is sent back through the network stack.  This
      patch corrects this problem by properly copying the 'iif' field in
      __skb_clone() and removing the 'iif' field assignment from
      skb_act_clone() since it is no longer needed.
      
      Also, while we are here, put the assignments in the same order as the
      offsets to reduce cacheline bounces.
      Signed-off-by: Paul Moore <paul.moore@hp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NET]: Add NAPI_STATE_DISABLE. · a0a46196
      David S. Miller authored
      Create a bit to signal that a napi_disable() is in progress.
      
      This sets up infrastructure such that net_rx_action() can generically
      break out of the ->poll() loop on a NAPI context that has a pending
      napi_disable() yet is being bombed with packets (and thus would
      otherwise poll endlessly and not allow the napi_disable() to finish).
      
      Now, what napi_disable() does is first set the NAPI_STATE_DISABLE bit
      (to indicate that a disable is pending), then it polls for the
      NAPI_STATE_SCHED bit, and once the NAPI_STATE_SCHED bit is acquired
      the NAPI_STATE_DISABLE bit is cleared.  Here, the test_and_set_bit()
      provides the necessary memory barrier between the various bitops.
      
      napi_schedule_prep() now tests for a pending disable as its first
      action and won't try to obtain the NAPI_STATE_SCHED bit if a disable
      is pending.
      
      As a result, we can remove the netif_running() check in
      netif_rx_schedule_prep() because the NAPI disable pending state serves
      this purpose.  And, it does so in a NAPI centric manner which is what
      we really want.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NET]: Do not grab device reference when scheduling a NAPI poll. · bdb95b17
      David S. Miller authored
      It is pointless, because everything that can make a device go away
      will do a napi_disable() first.
      
      The main impetus behind this is that now we can legally do a NAPI
      completion in generic code like net_rx_action() which a following
      changeset needs to do.  net_rx_action() can only perform actions
      in NAPI centric ways, because there may be a one to many mapping
      between NAPI contexts and network devices (SKY2 is one example).
      
      We also want to get rid of this because it's an extra atomic in the
      NAPI paths, and also because it is one of the last instances where the
      NAPI interfaces care about net devices.
      
      The one remaining netdev detail the NAPI stuff cares about is the
      netif_running() check which will be killed off in a subsequent
      changeset.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [SCTP]: Fix the name of the authentication event. · f691724c
      Vlad Yasevich authored
      The event should be called SCTP_AUTHENTICATION_INDICATION.
      Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • pl2303: Fix mode switching regression · bf5e5834
      Alan Cox authored
      Cleaning out all the incorrect 'no change made' checks for termios
      settings showed up a problem with the PL2303. The hardware here seems to
      lose sync and bits if you tell it to make no changes. This shows up with
      a real world application.
      
      To fix this the driver check for meaningful hardware changes is restored
      but doing the tests correctly and as a tty layer function so it doesn't
      get duplicated wrongly everywhere if other drivers turn out to need it.
      Signed-off-by: Alan Cox <alan@redhat.com>
      Tested-by: Mirko Parthey <mirko.parthey@informatik.tu-chemnitz.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • KEYS: fix macro · 5b7741b3
      Sebastian Siewior authored
      Commit 664cceb0 changed the parameters of
      the function make_key_ref().  The macros that are used in case CONFIG_KEY
      is not defined did not change.
      
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
      Acked-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 07 Jan, 2008 2 commits
    • CPU hotplug: fix cpu_is_offline() on !CONFIG_HOTPLUG_CPU · a263898f
      Ingo Molnar authored
      make randconfig bootup testing found that the cpufreq code
      crashes on bootup, if the powernow-k8 driver is enabled and
      if maxcpus=1 passed on the boot line to a !CONFIG_HOTPLUG_CPU
      kernel.
      
      First lockdep found out that there's an inconsistent unlock
      sequence:
      
       =====================================
       [ BUG: bad unlock balance detected! ]
       -------------------------------------
       swapper/1 is trying to release lock (&per_cpu(cpu_policy_rwsem, cpu)) at:
       [<ffffffff806ffd8e>] unlock_policy_rwsem_write+0x3c/0x42
       but there are no more locks to release!
      
      Call Trace:
       [<ffffffff806ffd8e>] unlock_policy_rwsem_write+0x3c/0x42
       [<ffffffff80251c29>] print_unlock_inbalance_bug+0x104/0x12c
       [<ffffffff80252f3a>] mark_held_locks+0x56/0x94
       [<ffffffff806ffd8e>] unlock_policy_rwsem_write+0x3c/0x42
       [<ffffffff807008b6>] cpufreq_add_dev+0x2a8/0x5c4
       ...
      
      then shortly afterwards the cpufreq code crashed on an assert:
      
       ------------[ cut here ]------------
       kernel BUG at drivers/cpufreq/cpufreq.c:1068!
       invalid opcode: 0000 [1] SMP
       [...]
       Call Trace:
        [<ffffffff805145d6>] sysdev_driver_unregister+0x5b/0x91
        [<ffffffff806ff520>] cpufreq_register_driver+0x15d/0x1a2
        [<ffffffff80cc0596>] powernowk8_init+0x86/0x94
       [...]
       ---[ end trace 1e9219be2b4431de ]---
      
      The bug was caused by the maxcpus=1 bootup, which brought up the
      secondary core as !cpu_online() but not cpu_is_offline() either,
      because on !CONFIG_HOTPLUG_CPU cpu_is_offline() is always 0 (include/linux/cpu.h):
      
        /* CPUs don't go offline once they're online w/o CONFIG_HOTPLUG_CPU */
        static inline int cpu_is_offline(int cpu) { return 0; }
      
      but the cpufreq code uses cpu_online() and cpu_is_offline() in
      a mixed way - the low-level drivers use cpu_online(), while
      the cpufreq core uses cpu_is_offline(). This opened up the
      possibility to add the non-initialized sysdev device of the
      secondary core:
      
       cpufreq-core: trying to register driver powernow-k8
       cpufreq-core: adding CPU 0
       powernow-k8: BIOS error - no PSB or ACPI _PSS objects
       cpufreq-core: initialization failed
       cpufreq-core: adding CPU 1
       cpufreq-core: initialization failed
      
      which then blew up. The fix is to make cpu_is_offline() always
      the negation of cpu_online(). With that fix applied the kernel
      boots up fine without crashing:
      
       Calling initcall 0xffffffff80cc0510: powernowk8_init+0x0/0x94()
       powernow-k8: Found 1 AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ processors (1 cpu cores) (version 2.20.00)
       powernow-k8: BIOS error - no PSB or ACPI _PSS objects
       initcall 0xffffffff80cc0510: powernowk8_init+0x0/0x94() returned -19.
       initcall 0xffffffff80cc0510 ran for 19 msecs: powernowk8_init+0x0/0x94()
       Calling initcall 0xffffffff80cc328f: init_lapic_nmi_sysfs+0x0/0x39()
      
      We could fix this by making CPU enumeration aware of max_cpus, but that
      would be more fragile IMO, and the cpu_online(cpu) != cpu_is_offline(cpu)
      possibility was quite confusing and a continuous source of bugs too.
      
      Most distributions have kernels with CPU hotplug enabled, so this bug
      remained hidden for a long time.
      
      Bug forensics:
      
      The broken cpu_is_offline() API variant was introduced via:
      
       commit a59d2e4e6977e7b94e003c96a41f07e96cddc340
       Author: Rusty Russell <rusty@rustcorp.com.au>
       Date:   Mon Mar 8 06:06:03 2004 -0800
      
           [PATCH] minor cleanups for hotplug CPUs
      
      ( this predates linux-2.6.git, this commit is available from Thomas's
        historic git tree. )
      
      Then 1.5 years later the cpufreq code made use of it:
      
       commit c32b6b8e
       Author: Ashok Raj <ashok.raj@intel.com>
       Date:   Sun Oct 30 14:59:54 2005 -0800
      
           [PATCH] create and destroy cpufreq sysfs entries based on cpu notifiers
      
       +       if (cpu_is_offline(cpu))
       +               return 0;
      
      which is a correct use of the subtly broken new API. v2.6.15 then
      shipped with this bug included.
      
      Then it took two more years for random-kernel QA to hit it.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
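
The fix described in the CPU hotplug commit above, making cpu_is_offline() always the negation of cpu_online(), can be sketched like this. It is a userspace illustration only: the `demo_*` names, the bitmask representation and `DEMO_NR_CPUS` are assumptions for the example, not the kernel's cpumask API. The "old" variant mirrors the always-0 stub quoted in the commit message.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_NR_CPUS 8

/* Bit n is set when CPU n has been brought online (hypothetical model). */
static unsigned long demo_online_mask;

static void demo_set_cpu_online(int cpu)
{
	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	demo_online_mask |= 1UL << cpu;
}

static bool demo_cpu_online(int cpu)
{
	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	return (demo_online_mask >> cpu) & 1UL;
}

/* Old behaviour without hotplug support: never reports offline, so a CPU
 * that was never brought up is neither "online" nor "offline". */
static bool demo_cpu_is_offline_old(int cpu)
{
	(void)cpu;
	return false;
}

/* Fixed behaviour: the two predicates are always complementary. */
static bool demo_cpu_is_offline(int cpu)
{
	return !demo_cpu_online(cpu);
}
```

With only CPU 0 online (the maxcpus=1 case), the old stub reports CPU 1 as not offline even though it is not online, while the fixed helper reports it as offline, so code that mixes the two predicates sees a consistent answer.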
    • Revert "scsi: revert "[SCSI] Get rid of scsi_cmnd->done"" · 7b3d9545
      Linus Torvalds authored
      This reverts commit ac40532e, which gets
      us back the original cleanup of 6f5391c2.
      
      It turns out that the bug that was triggered by that commit was
      apparently not actually triggered by that commit at all, and just the
      testing conditions had changed enough to make it appear to be due to it.
      
      The real problem seems to have been found by Peter Osterlund:
      
        "pktcdvd sets it [block device size] when opening the /dev/pktcdvd
         device, but when the drive is later opened as /dev/scd0, there is
         nothing that sets it back.  (Btw, 40944 is possible if the disk is a
         CDRW that was formatted with "cdrwtool -m 10236".)
      
         The problem is that pktcdvd opens the cd device in non-blocking mode
         when pktsetup is run, and doesn't close it again until pktsetup -d is
         run.  The effect is that if you meanwhile open the cd device,
         blkdev.c:do_open() doesn't call bd_set_size() because
         bdev->bd_openers is non-zero."
      
      In particular, to repeat the bug (regardless of whether commit
      6f5391c2 is applied or not):
      
        " 1. Start with an empty drive.
          2. pktsetup 0 /dev/scd0
          3. Insert a CD containing an isofs filesystem.
          4. mount /dev/pktcdvd/0 /mnt/tmp
          5. umount /mnt/tmp
          6. Press the eject button.
          7. Insert a DVD containing a non-writable filesystem.
          8. mount /dev/scd0 /mnt/tmp
          9. find /mnt/tmp -type f -print0 | xargs -0 sha1sum >/dev/null
          10. If the DVD contains data beyond the physical size of a CD, you
              get I/O errors in the terminal, and dmesg reports lots of
              "attempt to access beyond end of device" errors."
      
      which in turn is because the nested open after the media change won't
      cause the size to be set properly (because the original open still holds
      the block device, and we only do the bd_set_size() when we don't have
      other people holding the device open).
      
      The proper fix for that is probably to just do something like
      
      	bdev->bd_inode->i_size = (loff_t)get_capacity(disk)<<9;
      
      in fs/block_dev.c:do_open() even for the cases where we're not the
      original opener (but *not* call bd_set_size(), since that will also
      change the block size of the device).
      
      Cc: Peter Osterlund <petero2@telia.com>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
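
The one-line size update suggested in the revert message above shifts a capacity in 512-byte sectors left by 9 to get bytes, and casts before shifting. A small sketch of why the cast comes first, assuming the sector count may be held in a 32-bit type; the `demo_*` function names are hypothetical and this is plain userspace arithmetic, not the block layer:

```c
#include <assert.h>
#include <stdint.h>

/* Widen first, then shift: correct for any 32-bit sector count. */
static int64_t demo_size_bytes(uint32_t sectors)
{
	return (int64_t)sectors << 9;   /* 512 bytes per sector */
}

/* Shift in 32 bits: wraps once the device reaches 4 GiB. */
static uint32_t demo_size_bytes_truncated(uint32_t sectors)
{
	return sectors << 9;
}
```

For small media both helpers agree, but at 8388608 sectors (4 GiB) the truncated version wraps to zero while the widened version returns the full byte count.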
  3. 04 Jan, 2008 1 commit
  4. 03 Jan, 2008 3 commits
  5. 02 Jan, 2008 3 commits
  6. 28 Dec, 2007 1 commit
  7. 27 Dec, 2007 5 commits
  8. 24 Dec, 2007 1 commit
  9. 21 Dec, 2007 2 commits
  10. 20 Dec, 2007 3 commits
  11. 19 Dec, 2007 5 commits
  12. 18 Dec, 2007 7 commits
    • block: let elv_register() return void · 2fdd82bd
      Adrian Bunk authored
      elv_register() always returns 0, and there isn't anything it does where
      it should return an error (the only error condition is so grave that
      it's handled with a BUG_ON).
      Signed-off-by: Adrian Bunk <bunk@kernel.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • quicklist: Set tlb->need_flush if pages are remaining in quicklist 0 · 421d9919
      Christoph Lameter authored
      This ensures that the quicklists are drained. Otherwise draining may only
      occur when the processor reaches an idle state.
      
      Fixes fatal leakage of pgd_t's on 2.6.22 and later.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Reported-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Revert "hugetlb: Add hugetlb_dynamic_pool sysctl" · 368d2c63
      Nishanth Aravamudan authored
      This reverts commit 54f9f80d ("hugetlb:
      Add hugetlb_dynamic_pool sysctl")
      
      Given the new sysctl nr_overcommit_hugepages, the boolean dynamic pool
      sysctl is not needed, as its semantics can be expressed by 0 in the
      overcommit sysctl (no dynamic pool) and non-0 in the overcommit sysctl
      (pool enabled).
      
      (Needed in 2.6.24 since it reverts a post-2.6.23 userspace-visible change)
      Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
      Acked-by: Adam Litke <agl@us.ibm.com>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Cc: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hugetlb: introduce nr_overcommit_hugepages sysctl · d1c3fb1f
      Nishanth Aravamudan authored
      While examining the code to support /proc/sys/vm/hugetlb_dynamic_pool, I
      became convinced that having a boolean sysctl was insufficient:
      
      1) To support per-node control of hugepages, I have previously submitted
      patches to add a sysfs attribute related to nr_hugepages. However, with
      a boolean global value and per-mount quota enforcement constraining the
      dynamic pool, adding corresponding control of the dynamic pool on a
      per-node basis seems inconsistent to me.
      
      2) Administration of the hugetlb dynamic pool with multiple hugetlbfs
      mount points is, arguably, more arduous than it needs to be. Each quota
      would need to be set separately, and the sum would need to be monitored.
      
      To ease the administration, and to help make the way for per-node
      control of the static & dynamic hugepage pool, I added a separate
      sysctl, nr_overcommit_hugepages. This value serves as a high watermark
      for the overall hugepage pool, while nr_hugepages serves as a low
      watermark. The boolean sysctl can then be removed, as the condition
      
      	nr_overcommit_hugepages > 0
      
      indicates the same administrative setting as
      
      	hugetlb_dynamic_pool == 1
      
      Quotas still serve as local enforcement of the size of the pool on a
      per-mount basis.
      
      A few caveats:
      
      1) There is a race whereby the global surplus huge page counter is
      incremented before a hugepage has been allocated. Another process could
      then try to grow the pool, and fail to convert a surplus huge page to a
      normal huge page and instead allocate a fresh huge page. I believe this is
      benign, as no memory is leaked (the actual pages are still tracked
      correctly) and the counters won't go out of sync.
      
      2) Shrinking the static pool while a surplus is in effect will allow the
      number of surplus huge pages to exceed the overcommit value. As long as
      this condition holds, however, no more surplus huge pages will be
      allowed on the system until one of the two sysctls is increased
      sufficiently, or the surplus huge pages go out of use and are freed.
      
      Successfully tested on x86_64 with the current libhugetlbfs snapshot,
      modified to use the new sysctl.
      Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
      Acked-by: Adam Litke <agl@us.ibm.com>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Cc: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • apm_event{,info}_t are userspace types · 8d936626
      Adam Jackson authored
      These types define the size of data read from /dev/apm_bios.  They should
      not be hidden behind #ifdef __KERNEL__.
      
      This is killing my xserver compile, apm_event_t is used in the xserver
      source.
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • alpha: build fixes · 9548b209
      Ivan Kokshaysky authored
      This fixes some of the alpha-specific build problems, except a) modpost
      warning about COMMON symbol "saved_config" and b) nasty final link
      failure with gcc-4.x, -Os and scsi-disk driver configured built-in
      (due to jump table in .rodata referencing discarded .exit.text).
      
      - build failure with gcc-4.2.x: fix up casts in cia_io* routines to avoid
        warnings ('discards qualifiers from pointer target type'), which are
        failures, thanks to -Werror;
      - modpost warnings: add missing __init qualifier for titan and marvel;
        for non-generic build, move machine vectors from .data to .data.init.refok
        section;
      - unbreak CPU-specific optimization: rearrange cpuflags-y assignments
        so that extended -mcpu value (ev56, pca56, ev67) overrides basic
        one (ev5, ev6) and not vice versa.
      Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Richard Henderson <rth@twiddle.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fix headers_install · 75527135
      Andrew Morton authored
      make[3]: *** No rule to make target `/usr/src/devel/include/linux/ticable.h', needed by `/usr/src/devel/usr/include/linux/ticable.h'.  Stop.
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>