1. 24 Feb 2009 (2 commits)
    • bootmem: reorder interface functions and add a missing one · 2d0aae41
      Tejun Heo authored
      Impact: cleanup and addition of missing interface wrapper
      
      The interface functions in bootmem.h were ordered in a not so orderly
      manner.  Reorder them such that
      
      * functions allocating the same area group together -
        i.e. the alloc_bootmem group and the alloc_bootmem_low group.
      
      * functions w/o node parameter come before the ones w/ node parameter.
      
      * nopanic variants are immediately below their panicky counterparts.
      
      While at it, add alloc_bootmem_pages_node_nopanic() which was missing.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Johannes Weiner <hannes@saeurebad.de>
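      As a rough sketch, the missing wrapper plausibly mirrors the existing
      alloc_bootmem_pages_node() macro but routes to the nopanic backend; the
      goal argument below is an assumption based on the sibling wrappers:

        #include <linux/bootmem.h>

        /* nopanic variant: returns NULL on failure instead of panicking */
        #define alloc_bootmem_pages_node_nopanic(pgdat, x)              \
                __alloc_bootmem_node_nopanic((pgdat), (x), PAGE_SIZE,   \
                                             __pa(MAX_DMA_ADDRESS))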
    • bootmem: clean up arch-specific bootmem wrapping · c1329375
      Tejun Heo authored
      Impact: cleaner and consistent bootmem wrapping
      
      By setting CONFIG_HAVE_ARCH_BOOTMEM_NODE, archs can define
      arch-specific wrappers for bootmem allocation.  However, this is done
      a bit strangely in that only the high level convenience macros can be
      changed while lower level, but still exported, interface functions
      can't be wrapped.  This is not only messy but also leads to the strange
      situation where alloc_bootmem() does what the arch wants it to do while
      the equivalent __alloc_bootmem() call doesn't, although the two should be
      usable interchangeably.
      
      This patch updates bootmem so that archs can override / wrap the
      backend function, alloc_bootmem_core(), instead of the high-level
      interface functions, allowing simpler and more consistent wrapping.  Also,
      HAVE_ARCH_BOOTMEM_NODE is renamed to HAVE_ARCH_BOOTMEM.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Johannes Weiner <hannes@saeurebad.de>
  2. 20 Feb 2009 (6 commits)
    • percpu: implement new dynamic percpu allocator · fbf59bc9
      Tejun Heo authored
      Impact: new scalable dynamic percpu allocator which allows dynamic
              percpu areas to be accessed the same way as static ones
      
      Implement scalable dynamic percpu allocator which can be used for both
      static and dynamic percpu areas.  This will allow static and dynamic
      areas to share faster direct access methods.  This feature is optional
      and enabled only when CONFIG_HAVE_DYNAMIC_PER_CPU_AREA is defined by
      arch.  Please read the comment at the top of mm/percpu.c for details.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
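      As a rough usage sketch (assuming the alloc_percpu()/per_cpu_ptr()/
      free_percpu() interface; the struct and field names are illustrative),
      dynamically allocated percpu data is handled like this:

        #include <linux/percpu.h>
        #include <linux/cpumask.h>
        #include <linux/smp.h>
        #include <linux/errno.h>

        struct pkt_stats {
                unsigned long packets;
                unsigned long bytes;
        };

        static struct pkt_stats *stats;         /* one copy per possible CPU */

        static int stats_demo(void)
        {
                unsigned long total = 0;
                int cpu;

                stats = alloc_percpu(struct pkt_stats);
                if (!stats)
                        return -ENOMEM;

                /* fast path: bump only this CPU's copy */
                per_cpu_ptr(stats, get_cpu())->packets++;
                put_cpu();

                /* slow-path reader sums all per-CPU copies */
                for_each_possible_cpu(cpu)
                        total += per_cpu_ptr(stats, cpu)->packets;

                free_percpu(stats);
                return 0;
        }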
    • vmalloc: add un/map_kernel_range_noflush() · 8fc48985
      Tejun Heo authored
      Impact: two more public map/unmap functions
      
      Implement map_kernel_range_noflush() and unmap_kernel_range_noflush().
      These functions respectively map and unmap an address range in the kernel
      VM area but don't do any vcache or TLB flushing.  They will be used by the
      new percpu allocator.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
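      A hedged sketch of how a caller might use the pair, assuming the
      signatures map_kernel_range_noflush(addr, size, prot, pages) and
      unmap_kernel_range_noflush(addr, size); the flushing shown only in
      comments stays the caller's responsibility, which is the whole point
      of the _noflush variants:

        #include <linux/vmalloc.h>
        #include <linux/mm.h>

        static int map_area(unsigned long addr, struct page **pages, int nr)
        {
                int ret;

                /* map without any cache/TLB flushing */
                ret = map_kernel_range_noflush(addr,
                                               (unsigned long)nr << PAGE_SHIFT,
                                               PAGE_KERNEL, pages);
                if (ret < 0)
                        return ret;

                /* caller batches flush_cache_vmap()/TLB flushes here */
                return 0;
        }

        static void unmap_area(unsigned long addr, int nr)
        {
                unmap_kernel_range_noflush(addr,
                                           (unsigned long)nr << PAGE_SHIFT);
                /* TLB flush of the range happens later, when the caller decides */
        }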
    • vmalloc: implement vm_area_register_early() · f0aa6617
      Tejun Heo authored
      Impact: allow multiple early vm areas
      
      There are places where a kernel VM area needs to be allocated before
      vmalloc is initialized.  This is done by allocating a static vm_struct,
      initializing several fields and linking it to vmlist; later, vmalloc
      initialization picks these up from vmlist.  This is currently done
      manually, and if there is more than one such area, there is no defined
      way to arbitrate who gets which address.
      
      This patch implements vm_area_register_early(), which takes a vm_struct
      with flags and size initialized, assigns an address to it and puts it on
      the vmlist.  This way, multiple early vm areas can determine which
      addresses they should use.  The only current user - alpha mm init - is
      converted to use it.
      Signed-off-by: Tejun Heo <tj@kernel.org>
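      A small sketch of the intended use, assuming the signature
      vm_area_register_early(struct vm_struct *vm, size_t align); the size and
      alignment below are illustrative.  The caller fills in flags and size and
      gets the assigned address back in vm->addr:

        #include <linux/vmalloc.h>
        #include <linux/mm.h>
        #include <linux/init.h>

        static struct vm_struct early_area;

        void __init reserve_early_vm_area(void)
        {
                early_area.flags = VM_ALLOC;
                early_area.size  = 16 * PAGE_SIZE;      /* illustrative size */

                /* assigns an address and links the area onto vmlist */
                vm_area_register_early(&early_area, PAGE_SIZE);

                /* early_area.addr now holds an address that won't collide
                 * with other early areas registered the same way */
        }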
    • percpu: kill percpu_alloc() and friends · f2a8205c
      Tejun Heo authored
      Impact: kill unused functions
      
      percpu_alloc() and its friends never saw much action.  It was supposed
      to replace the cpu-mask unaware __alloc_percpu() but that never happened,
      and in fact __percpu_alloc_mask() itself never really grew a proper
      up/down handling interface either (no exported interface for
      populate/depopulate).
      
      percpu allocation is about to go through a major reimplementation and
      there's no reason to carry this unused interface around.  Replace it
      with __alloc_percpu() and free_percpu().
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • alloc_percpu: add align argument to __alloc_percpu. · 313e458f
      Rusty Russell authored
      This prepares for a real __alloc_percpu, by adding an alignment argument.
      Only one place uses __alloc_percpu directly, and that's for a string.
      
      tj: af_inet also uses __alloc_percpu(), update it.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Jens Axboe <axboe@kernel.dk>
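      A hedged sketch of a call site after this change, assuming the
      two-argument form __alloc_percpu(size, align); the odd-sized buffer is
      only there to show why the caller now states its alignment explicitly:

        #include <linux/percpu.h>
        #include <linux/types.h>
        #include <linux/errno.h>

        static u64 *sums;       /* per-CPU buffer of three u64 counters */

        static int sums_init(void)
        {
                /* raw interface: size and alignment are both explicit now */
                sums = __alloc_percpu(3 * sizeof(u64), __alignof__(u64));
                return sums ? 0 : -ENOMEM;
        }

        static void sums_exit(void)
        {
                free_percpu(sums);
        }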
    • alloc_percpu: change percpu_ptr to per_cpu_ptr · b36128c8
      Rusty Russell authored
      Impact: cleanup
      
      There are two allocated per-cpu accessor macros with almost identical
      spelling.  The original and far more popular is per_cpu_ptr (44
      files), so change over the other 4 files.
      
      tj: kill percpu_ptr() and update UP too
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: mingo@redhat.com
      Cc: lenb@kernel.org
      Cc: cpufreq@vger.kernel.org
      Signed-off-by: Tejun Heo <tj@kernel.org>
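      The practical effect is a one-for-one rename at call sites; a hedged
      before/after sketch (the counter variable is illustrative):

        #include <linux/percpu.h>
        #include <linux/cpumask.h>

        static void reset_counters(unsigned long *counters)
        {
                int cpu;

                for_each_possible_cpu(cpu) {
                        /* was: *percpu_ptr(counters, cpu) = 0; */
                        *per_cpu_ptr(counters, cpu) = 0;
                }
        }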
  3. 10 Feb 2009 (1 commit)
    • elf: add ELF_CORE_COPY_KERNEL_REGS() · 6cd61c0b
      Tejun Heo authored
      ELF core dump is used for both userland core dumps and kernel crash
      dumps.  Depending on the architecture, registers might need to be
      accessed differently for userland and for the kernel.  Allow
      architectures to define ELF_CORE_COPY_KERNEL_REGS() and use a different
      operation for the kernel register dump.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
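      A hedged sketch of the fallback pattern this enables: generic code can
      use a helper along these lines, and an arch that needs special handling
      simply defines ELF_CORE_COPY_KERNEL_REGS() (the helper name here is
      illustrative):

        #include <linux/elfcore.h>
        #include <linux/ptrace.h>

        static inline void copy_kernel_regs(elf_gregset_t *elfregs,
                                            struct pt_regs *regs)
        {
        #ifdef ELF_CORE_COPY_KERNEL_REGS
                /* arch wants kernel registers fetched differently */
                ELF_CORE_COPY_KERNEL_REGS((*elfregs), regs);
        #else
                /* fall back to the same copy used for userland dumps */
                elf_core_copy_regs(elfregs, regs);
        #endif
        }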
  4. 09 Feb 2009 (2 commits)
  5. 08 Feb 2009 (1 commit)
  6. 06 Feb 2009 (3 commits)
    • wait: prevent exclusive waiter starvation · 777c6c5f
      Johannes Weiner authored
      With exclusive waiters, every process woken up through the wait queue must
      ensure that the next waiter down the line is woken when it has finished.
      
      Interruptible waiters don't do that when aborting due to a signal.  And if
      an aborting waiter is concurrently woken up through the waitqueue, no one
      will ever wake up the next waiter.
      
      This has been observed with __wait_on_bit_lock() used by
      lock_page_killable(): the first contender on the queue was aborting when
      the actual lock holder woke it up concurrently.  The aborted contender
      didn't acquire the lock and therefore never did an unlock followed by
      waking up the next waiter.
      
      Add abort_exclusive_wait() which removes the process' wait descriptor from
      the waitqueue, iff still queued, or wakes up the next waiter otherwise.
      It does so under the waitqueue lock.  Racing with a wake up means the
      aborting process is either already woken (removed from the queue) and will
      wake up the next waiter, or it will remove itself from the queue and the
      concurrent wake up will apply to the next waiter after it.
      
      Use abort_exclusive_wait() in __wait_event_interruptible_exclusive() and
      __wait_on_bit_lock() when they were interrupted by other means than a wake
      up through the queue.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Reported-by: Chris Mason <chris.mason@oracle.com>
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Mentored-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Cc: Chuck Lever <cel@citi.umich.edu>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: <stable@kernel.org>		["after some testing"]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
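      For context, a hedged sketch of a typical exclusive, interruptible wait
      loop; the hazard described above lives in the signal-abort branch, where
      the waiter may already have consumed an exclusive wakeup and must not let
      it vanish (the condition flag and queue are illustrative):

        #include <linux/wait.h>
        #include <linux/sched.h>
        #include <linux/errno.h>

        static int wait_for_flag(wait_queue_head_t *wq, int *flag)
        {
                DEFINE_WAIT(wait);
                int err = 0;

                for (;;) {
                        prepare_to_wait_exclusive(wq, &wait, TASK_INTERRUPTIBLE);
                        if (*flag)
                                break;
                        if (signal_pending(current)) {
                                /* abort path: this waiter may already have
                                 * been woken exclusively, so the wakeup must
                                 * be passed on -- the job abort_exclusive_wait()
                                 * now does */
                                err = -ERESTARTSYS;
                                break;
                        }
                        schedule();
                }
                finish_wait(wq, &wait);
                return err;
        }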
    • fbmem: don't call copy_from/to_user() with mutex held · 1f5e31d7
      Andrea Righi authored
      Avoid calling copy_from/to_user() with fb_info->lock mutex held in fbmem
      ioctl().
      
      fb_mmap() is called with mm->mmap_sem (A) held and also acquires
      fb_info->lock (B); fb_ioctl() takes fb_info->lock (B) and does
      copy_from/to_user(), which might acquire mm->mmap_sem (A), causing a
      deadlock.
      
      NOTE: this doesn't push fb_info->lock down into each driver's own
      fb_ioctl(), so there are still potential deadlocks elsewhere.
      Signed-off-by: Andrea Righi <righi.andrea@gmail.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Johannes Weiner <hannes@saeurebad.de>
      Cc: Krzysztof Helt <krzysztof.h1@wp.pl>
      Cc: Harvey Harrison <harvey.harrison@gmail.com>
      Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
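      The fix pattern, sketched loosely (the ioctl handler and structure are
      illustrative, not the actual fbmem code): snapshot shared state under
      fb_info->lock, and do the user-space copy with no mutex held so it cannot
      deadlock against mmap_sem:

        #include <linux/fb.h>
        #include <linux/mutex.h>
        #include <linux/uaccess.h>
        #include <linux/errno.h>

        static int example_get_var(struct fb_info *info,
                                   struct fb_var_screeninfo __user *argp)
        {
                struct fb_var_screeninfo var;

                mutex_lock(&info->lock);
                var = info->var;                /* snapshot under the lock */
                mutex_unlock(&info->lock);

                /* user copy happens lock-free */
                return copy_to_user(argp, &var, sizeof(var)) ? -EFAULT : 0;
        }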
    • generic swap(): don't return a value from swap() · ac7b9004
      Peter Zijlstra authored
      The swap() macro is accidentally returning the value of its first argument.
      Change it into a doesn't-return-anything macro before someone goes and
      relies upon this behaviour.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wu Fengguang <wfg@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
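      The statement form it was changed to is, in essence, the classic
      do/while(0) macro (a sketch of the shape, matching what the log
      describes):

        /* swaps the values of a and b without yielding a value itself */
        #define swap(a, b) \
                do { typeof(a) __tmp = (a); (a) = (b); (b) = __tmp; } while (0)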
  7. 05 Feb 2009 (1 commit)
  8. 03 Feb 2009 (6 commits)
    • sched: add missing kernel-doc in sched.h · 35626129
      Randy Dunlap authored
      Add kernel-doc notation for @lock:
      
      include/linux/sched.h:457: No description found for parameter 'lock'
      Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
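      For reference, kernel-doc expects every documented struct member to carry
      an @name: line; a minimal illustrative example (not the actual sched.h
      structure the warning refers to):

        #include <linux/spinlock.h>

        /**
         * struct demo_counter - illustrative structure for the notation
         * @lock:  protects @count
         * @count: number of active users
         */
        struct demo_counter {
                spinlock_t lock;
                int count;
        };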
    • libata: implement HORKAGE_1_5_GBPS and apply it to WD My Book · 9062712f
      Tejun Heo authored
      3Gbps is often much more prone to transmission failures.  It's usually
      okay to let EH handle a speed-down after transmission failures, but some
      WD My Book drives completely shut down after certain transmission
      failures, and after that only power cycling can revive them.  Combined
      with the fact that external drives often end up with a cable assembly
      which is longer than usual and more likely to have an intervening gender
      changer, this makes these drives very likely to shut down under certain
      configurations, virtually rendering them unusable.
      
      This patch implements HORKAGE_1_5_GBPS and applies it to WD My Book
      such that 1.5Gbps is forced once the device is identified.
      
      Please take a look at the following bz for related reports.
      
        http://bugzilla.kernel.org/show_bug.cgi?id=9913
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
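      Horkages of this kind are normally applied through libata's device
      blacklist table; a hedged sketch of such an entry (the struct shape and
      the model glob are illustrative, not the exact internals or string the
      patch uses):

        #include <linux/libata.h>

        /* mirrors the shape of libata-core.c's internal blacklist table */
        struct blacklist_entry {
                const char      *model_num;     /* model string (glob) */
                const char      *model_rev;     /* firmware revision (glob) */
                unsigned long   horkage;        /* quirk flags to apply */
        };

        static const struct blacklist_entry device_blacklist[] = {
                /* illustrative model glob: force 1.5Gbps for these drives */
                { "WDC WD*My Book*", NULL, ATA_HORKAGE_1_5_GBPS },
        };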
    • libata: clear dev->ering in smarter way · 99cf610a
      Tejun Heo authored
      dev->ering used to be cleared together with the rest of ata_device in
      ata_dev_init() which is called whenever a probing event occurs.
      dev->ering is about to be used to track probing failures so it needs
      to remain persistent over multiple probing events.  This patch
      achieves this by doing the following.
      
      * Instead of CLEAR_OFFSET, define CLEAR_BEGIN and CLEAR_END and only
        clear between BEGIN and END.  ering is moved after END.  The split
        of the persistent area is to allow hotter items to remain at the head.
      
      * ering is explicitly cleared on ata_dev_disable() and when device
        attach succeeds.  So, ering is persistent through a device's
        lifetime (unless explicitly cleared, of course) and also through
        periods in between disablement of an attached device and successful
        detection of the next one.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
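      The BEGIN/END trick itself is generic; a self-contained sketch with
      illustrative field names (not the real ata_device layout):

        #include <linux/stddef.h>
        #include <linux/string.h>

        struct probe_state {
                unsigned int    class;          /* hot, re-probed state ...   */
                unsigned long   n_sectors;      /* ... cleared on every probe */
                /* -- everything below survives re-probing -- */
                int             probe_failures; /* stand-in for ering */
        };

        #define PS_CLEAR_BEGIN  offsetof(struct probe_state, class)
        #define PS_CLEAR_END    offsetof(struct probe_state, probe_failures)

        static void probe_state_reinit(struct probe_state *ps)
        {
                /* wipe only the volatile head; the persistent tail survives */
                memset((char *)ps + PS_CLEAR_BEGIN, 0,
                       PS_CLEAR_END - PS_CLEAR_BEGIN);
        }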
    • ide/libata: fix ata_id_is_cfa() (take 4) · 2999b58b
      Sergei Shtylyov authored
      When checking for CFA feature set support, ata_id_is_cfa() tests bit 2 in
      word 82 of the identify data instead of word 83; it also checks the ATA/PI
      version support in word 80 (which the CompactFlash specifications have as
      reserved), which hasn't the slightest chance of working on modern CF cards
      that don't have 0x848A in word 0...
      Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
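      A hedged sketch of the corrected check as described above: the
      traditional CF signature lives in identify word 0, and CFA feature set
      support is bit 2 of word 83 rather than word 82 (the word-validity
      checks real code performs are omitted here):

        #include <linux/types.h>

        static inline int id_is_cfa(const u16 *id)
        {
                if (id[0] == 0x848A)            /* traditional CompactFlash */
                        return 1;
                return id[83] & (1 << 2);       /* CFA feature set supported */
        }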
    • modules: Use a better scheme for refcounting · 720eba31
      Eric Dumazet authored
      Current refcounting for modules (done if CONFIG_MODULE_UNLOAD=y) is
      using a lot of memory.
      
      Each 'struct module' contains an [NR_CPUS] array of full cache lines.
      
      This patch uses existing infrastructure (percpu_modalloc() &
      percpu_modfree()) to allocate percpu space for the refcount storage.
      
      Instead of wasting NR_CPUS*128 bytes (on i386), we now use
      nr_cpu_ids*sizeof(local_t) bytes.
      
      On a typical distro, where NR_CPUS=8, shipping 2000 modules, we reduce
      the size of module files by about 2 Mbytes (1 Kb per module).
      
      Instead of having all refcounters in the same memory node - with TLB
      misses because of vmalloc() - this new implementation permits better
      NUMA properties, since each CPU will use storage on its preferred node,
      thanks to percpu storage.
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
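      Conceptually the refcount becomes a set of per-CPU local_t counters that
      are summed on read; a hedged sketch of the idea (not the exact module.c
      code), assuming a percpu allocation such as the one percpu_modalloc()
      provides:

        #include <linux/percpu.h>
        #include <linux/cpumask.h>
        #include <linux/smp.h>
        #include <asm/local.h>

        struct ref { local_t count; };

        static struct ref *refptr;      /* one local_t per possible CPU */

        static void ref_get(void)
        {
                /* cheap, contention-free increment on the local CPU's copy */
                local_inc(&per_cpu_ptr(refptr, get_cpu())->count);
                put_cpu();
        }

        static unsigned long ref_read(void)
        {
                unsigned long total = 0;
                int cpu;

                for_each_possible_cpu(cpu)
                        total += local_read(&per_cpu_ptr(refptr, cpu)->count);
                return total;
        }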
    • net: Fix userland breakage wrt. linux/if_tunnel.h · 0afd4a21
      David S. Miller authored
      Reported by Andrew Walrond <andrew@walrond.org>
      
      Changeset c19e654d ("gre: Add netlink interface") added an include of
      linux/ip.h to linux/if_tunnel.h.
      
      We can't really let that get exposed to userspace
      because this conflicts with types defined in netinet/ip.h
      which userland is almost certainly going to have included
      either explicitly or implicitly.
      
      So guard this include with a __KERNEL__ ifdef.
      Signed-off-by: David S. Miller <davem@davemloft.net>
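      The guard is the usual one; a sketch of what the top of the header ends
      up looking like (the guard name and surrounding includes are
      illustrative):

        #ifndef _IF_TUNNEL_H_
        #define _IF_TUNNEL_H_

        #include <linux/types.h>

        #ifdef __KERNEL__
        #include <linux/ip.h>   /* kernel-only: clashes with netinet/ip.h */
        #endif

        /* ... tunnel ioctl structures visible to userspace follow ... */

        #endif /* _IF_TUNNEL_H_ */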
  9. 02 Feb 2009 (3 commits)
  10. 31 Jan 2009 (15 commits)