1. 09 Jan, 2006 7 commits
    • [PATCH] RCU signal handling · e56d0903
      Ingo Molnar committed
      RCU tasklist_lock and RCU signal handling: send signals RCU-read-locked
      instead of tasklist_lock read-locked.  This is a scalability improvement on
      SMP and a preemption-latency improvement under PREEMPT_RCU.
      Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Acked-by: William Irwin <wli@holomorphy.com>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Change maxaligned_in_smp alignment macros to internodealigned_in_smp macros · 22fc6ecc
      Ravikiran G Thirumalai committed
      ____cacheline_maxaligned_in_smp is currently used to align critical structures
      and avoid false sharing.  It uses the per-arch L1_CACHE_SHIFT_MAX, which people
      find useless.
      
      However, we have been using ____cacheline_maxaligned_in_smp to align
      structures on the internode cacheline size.  As per Andi's suggestion, the
      following patch kills ____cacheline_maxaligned_in_smp and introduces
      INTERNODE_CACHE_SHIFT, which defaults to L1_CACHE_SHIFT for all arches.
      Arches needing L3/internode cacheline alignment can define
      INTERNODE_CACHE_SHIFT in the arch asm/cache.h.  The patch replaces
      ____cacheline_maxaligned_in_smp with ____cacheline_internodealigned_in_smp.
      
      With this patch, L1_CACHE_SHIFT_MAX can be killed.
      Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
      Signed-off-by: Shai Fultheim <shai@scalex86.org>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] cpusets: swap migration interface · 45b07ef3
      Paul Jackson committed
      Add a boolean "memory_migrate" to each cpuset, represented by a file
      containing "0" or "1" in each directory below /dev/cpuset.
      
      It defaults to false (file contains "0").  It can be set true by writing
      "1" to the file.
      
      If true, then anytime that a task is attached to the cpuset so marked, the
      pages of that task will be moved to that cpuset, preserving, to the extent
      practical, the cpuset-relative placement of the pages.
      
      Also, anytime that a cpuset so marked has its memory placement changed (by
      writing to its "mems" file), the tasks in that cpuset will have their pages
      moved to the cpuset's new nodes, preserving, to the extent practical, the
      cpuset-relative placement of the moved pages.
      Signed-off-by: Paul Jackson <pj@sgi.com>
      Cc: Christoph Lameter <christoph@lameter.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
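The "memory_migrate" flag described above is an ordinary file holding "0" or "1", so toggling it can be sketched in a few lines of Python. This is a minimal sketch, not part of the patch: the helper names `set_memory_migrate` and `get_memory_migrate` are hypothetical, and the cpuset directory is passed in by the caller rather than assumed.

```python
from pathlib import Path


def set_memory_migrate(cpuset_dir, enabled):
    """Write "1" or "0" to the cpuset's memory_migrate file.

    cpuset_dir is a directory below /dev/cpuset (hypothetical helper,
    not an interface defined by the patch).
    """
    flag = Path(cpuset_dir) / "memory_migrate"
    flag.write_text("1" if enabled else "0")


def get_memory_migrate(cpuset_dir):
    """Return True if the cpuset's memory_migrate file contains "1"."""
    flag = Path(cpuset_dir) / "memory_migrate"
    return flag.read_text().strip() == "1"
```

On a system with cpusets mounted, a call might look like `set_memory_migrate("/dev/cpuset/batch", True)`, where `batch` is an invented cpuset name.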
    • [PATCH] Swap Migration V5: sys_migrate_pages interface · 39743889
      Christoph Lameter committed
      sys_migrate_pages implementation using swap based page migration
      
      This is the original API proposed by Ray Bryant in his posts during the first
      half of 2005 on linux-mm@kvack.org and linux-kernel@vger.kernel.org.
      
      The intent of sys_migrate_pages is to migrate the memory of a process.  A
      process may have migrated to another node, while its memory was allocated
      optimally for the prior context.  sys_migrate_pages allows shifting the
      memory to the new node.
      
      sys_migrate_pages is also useful for manually moving a process's memory if
      the process's available memory nodes have changed through cpuset operations.
      Paul Jackson is working on a mechanism that will migrate memory automatically
      if the cpuset of a process is changed.  However, a user may decide
      to manually control the migration.
      
      This implementation is put into the policy layer since it uses concepts and
      functions that are also needed for mbind and friends.  The patch also provides
      a do_migrate_pages function that may be useful for cpusets to automatically
      move memory.  sys_migrate_pages does not modify policies in contrast to Ray's
      implementation.
      
      The current code here is based on the swap based page migration capability and
      thus is not able to preserve the physical layout relative to its containing
      nodeset (which may be a cpuset).  When direct page migration becomes available,
      the implementation needs to be changed to do an isomorphic move of pages
      between different nodesets.  The current implementation simply evicts all
      pages in the source nodeset that are not in the target nodeset.
      
      Patch supports ia64, i386 and x86_64.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
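The eviction rule stated in the message (evict pages on source nodes that are not also target nodes) amounts to a set difference over node IDs. The following toy model illustrates only that rule; the function names and the page-to-node dictionary are assumptions made for illustration and do not mirror kernel data structures.

```python
def nodes_to_evict(source_nodes, target_nodes):
    """Nodes whose pages get evicted: in the source set but not the target set."""
    return set(source_nodes) - set(target_nodes)


def pages_to_evict(page_to_node, source_nodes, target_nodes):
    """Return the pages (sorted) that currently sit on a node to be evicted.

    page_to_node maps an arbitrary page label to the node it resides on;
    this mapping is a stand-in invented for the example.
    """
    evict = nodes_to_evict(source_nodes, target_nodes)
    return sorted(page for page, node in page_to_node.items() if node in evict)
```

With source nodes {0, 1} and target nodes {1, 2}, only pages on node 0 are evicted; pages already on node 1 stay where they are.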
    • [PATCH] add schedule_on_each_cpu() · 15316ba8
      Christoph Lameter committed
      swap migration's isolate_lru_page() currently uses an IPI to notify other
      processors that the lru caches need to be drained if the page cannot be
      found on the LRU.  The IPI may interrupt a processor that is in the middle
      of processing lru requests and cause a race condition.
      
      This patch introduces a new function schedule_on_each_cpu() that uses
      keventd to run the LRU draining on each processor.  Processors disable
      preemption when dealing with the LRU caches (these are per processor) and thus
      executing LRU draining from another process is safe.
      
      Thanks to Lee Schermerhorn <lee.schermerhorn@hp.com> for finding this race
      condition.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Make high and batch sizes of per_cpu_pagelists configurable · 8ad4b1fb
      Rohit Seth committed
      Recently there has been a lot of traffic about the right values for the batch
      and high water marks of per_cpu_pagelists.  This patch makes these two
      variables configurable through a /proc interface.
      
      A new tunable /proc/sys/vm/percpu_pagelist_fraction is added.  This entry
      controls the fraction of pages at most in each zone that are allocated for
      each per cpu page list.  The min value for this is 8.  It means that we
      don't allow more than 1/8th of pages in each zone to be allocated in any
      single per_cpu_pagelist.
      
      The batch value of each per cpu pagelist is also updated as a result.  It
      is set to pcp->high/4.  The upper limit of batch is (PAGE_SHIFT * 8).
      Signed-off-by: Rohit Seth <rohit.seth@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
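The arithmetic in this message can be worked through with a small Python sketch. It follows the rules as stated above (high is the zone's pages divided by the fraction, batch is high/4 capped at PAGE_SHIFT * 8, minimum fraction 8). The function name `pcp_limits`, the default `PAGE_SHIFT` of 12 (4 KiB pages), and the floor of 1 on batch are assumptions of the sketch, not taken from the message.

```python
PAGE_SHIFT = 12   # assumption: 4 KiB pages
MIN_FRACTION = 8  # from the message: the minimum allowed fraction value


def pcp_limits(zone_pages, fraction, page_shift=PAGE_SHIFT):
    """Return (high, batch) for one per-cpu pagelist of a zone.

    high  = zone_pages / fraction
    batch = high / 4, capped at page_shift * 8
    The floor of 1 on batch is an assumption of this sketch.
    """
    if fraction < MIN_FRACTION:
        raise ValueError("percpu_pagelist_fraction must be at least 8")
    high = zone_pages // fraction
    batch = min(max(1, high // 4), page_shift * 8)
    return high, batch
```

For a zone of 262144 pages (1 GiB with 4 KiB pages) and a fraction of 8, high comes to 32768 pages and batch hits the cap of 96.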
    • [PATCH] drop-pagecache · 9d0243bc
      Andrew Morton committed
      Add /proc/sys/vm/drop_caches.  When written to, this will cause the kernel to
      discard as much pagecache and/or reclaimable slab objects as it can.  This
      operation requires root permissions.
      
      It won't drop dirty data, so the user should run `sync' first.
      
      Caveats:
      
      a) Holds inode_lock for exorbitant amounts of time.
      
      b) Needs to be taught about NUMA nodes: propagate these all the way through
         so the discarding can be controlled on a per-node basis.
      
      This is a debugging feature: useful for getting consistent results between
      filesystem benchmarks.  We could possibly put it under a config option, but
      it's less than 300 bytes.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
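The "sync first, then write" sequence recommended above can be sketched in Python as follows. The message does not list the values to write; the constants below (1 for pagecache, 2 for slab objects, 3 for both) follow the kernel's sysctl documentation for vm.drop_caches and should be read as an assumption relative to this commit. The helper name `drop_caches` and its `path` parameter are hypothetical, the latter added so the function can be pointed at a scratch file.

```python
import os

DROP_PAGECACHE = 1  # assumption: value per the kernel's vm sysctl documentation
DROP_SLAB = 2       # assumption: value per the kernel's vm sysctl documentation


def drop_caches(mode=DROP_PAGECACHE | DROP_SLAB, path="/proc/sys/vm/drop_caches"):
    """Flush dirty data, then ask the kernel to discard clean caches.

    Writing to the real /proc file requires root permissions.
    """
    os.sync()  # dirty data is not dropped, so write it back first
    with open(path, "w") as f:
        f.write(f"{mode}\n")
```

Between benchmark runs, a root process would call `drop_caches()` with the default path.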
  2. 07 Jan, 2006 16 commits
  3. 05 Jan, 2006 4 commits
    • [PATCH] kobject_uevent CONFIG_NET=n fix · f743ca5e
      akpm@osdl.org committed
      lib/lib.a(kobject_uevent.o)(.text+0x25f): In function `kobject_uevent':
      : undefined reference to `__alloc_skb'
      lib/lib.a(kobject_uevent.o)(.text+0x2a1): In function `kobject_uevent':
      : undefined reference to `skb_over_panic'
      lib/lib.a(kobject_uevent.o)(.text+0x31d): In function `kobject_uevent':
      : undefined reference to `skb_over_panic'
      lib/lib.a(kobject_uevent.o)(.text+0x356): In function `kobject_uevent':
      : undefined reference to `netlink_broadcast'
      lib/lib.a(kobject_uevent.o)(.init.text+0x9): In function `kobject_uevent_init':
      : undefined reference to `netlink_kernel_create'
      make: *** [.tmp_vmlinux1] Error 1
      
      Netlink is unconditionally enabled if CONFIG_NET, so that's OK.
      
      kobject_uevent.o is compiled even if !CONFIG_HOTPLUG, which is lazy.
      
      Let's compound the sin.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • [PATCH] driver core: replace "hotplug" by "uevent" · 312c004d
      Kay Sievers committed
      Leave the overloaded "hotplug" word to subsystems which are handling
      real devices. The driver core does not "plug" anything, it just exports
      the state to userspace and generates events.
      Signed-off-by: Kay Sievers <kay.sievers@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • [PATCH] add uevent_helper control in /sys/kernel/ · 0f76e5ac
      Kay Sievers committed
      This deprecates the /proc/sys/kernel/hotplug file, as all
      this stuff should be in /sys some day, right? :)
      In /sys/kernel/ we now have uevent_seqnum and uevent_helper.
      The seqnum is no longer used by udev, as the version for this
      kernel depends on netlink, whose events will never get
      out of order.
      
      Recent udev versions disable the /sbin/hotplug helper with
      an init script, because it leads to OOM on big boxes by running
      hundreds of shells in parallel. It should be done now by:
        echo "" > /sys/kernel/uevent_helper
      
      (Note that "-n" does not work, because neither proc nor sysfs
      supports truncate().)
      Signed-off-by: Kay Sievers <kay.sievers@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
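The `echo ""` command above works because it writes a single newline rather than relying on truncation. A rough Python equivalent is sketched below; the helper name `disable_uevent_helper` and its `path` parameter are hypothetical. The file is opened without O_TRUNC and exactly one newline byte is written, mirroring what the shell command sends.

```python
import os


def disable_uevent_helper(path="/sys/kernel/uevent_helper"):
    """Clear the uevent helper by writing a lone newline, like `echo ""`.

    Returns the number of bytes written. The file is opened write-only
    without O_TRUNC, since the message notes truncate() is not supported.
    """
    fd = os.open(path, os.O_WRONLY)
    try:
        return os.write(fd, b"\n")
    finally:
        os.close(fd)
```

Writing to the real sysfs file requires root; an init script equivalent would simply call `disable_uevent_helper()`.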
    • [PATCH] remove CONFIG_KOBJECT_UEVENT option · 0296b228
      Kay Sievers committed
      It makes zero sense to have hotplug, but not the netlink
      events enabled today. Remove this option and merge the
      kobject_uevent.h header into the kobject.h header file.
      Signed-off-by: Kay Sievers <kay.sievers@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  4. 03 Jan, 2006 2 commits
  5. 01 Jan, 2006 1 commit
  6. 31 Dec, 2005 2 commits
    • [PATCH] Fix false old value return of sysctl · 82c9df82
      Yi Yang committed
      For the sysctl syscall, if the user wants to get the old value of a
      sysctl entry and set a new value for it in the same syscall, the old
      value is always overwritten by the new value if the sysctl entry is of
      string type and the user sets its strategy to sysctl_string.  The
      issue lies in the read and write being performed twice when the strategy
      is set to sysctl_string, because the general strategy sysctl_string
      always returns 0 on success.
      
      Strategy routines such as sysctl_jiffies and sysctl_jiffies_ms return 1
      because they read and write the sysctl entry themselves.
      
      The strategy routine sysctl_string returns 0 although it actually reads
      and writes the sysctl entry.
      
      According to my analysis, if a strategy routine reads and writes, it
      should return 1; if it only does some necessary checks without reading
      or writing, as sysctl_intvec does, it should return 0.
      Signed-off-by: Yi Yang <yang.y.yi@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
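The return-value convention described in this message can be illustrated with a toy Python model. It is not kernel code: `strategy_string`, `do_sysctl`, and the dictionary-based entry are invented for the example, and the `fixed` flag simply switches between the old return value (0) and the corrected one (1).

```python
def strategy_string(entry, want_old, new_value, fixed):
    """Toy string strategy: copy out the old value, then store the new one.

    Returns (rc, old). rc == 0 means "generic code should still read/write";
    rc == 1 means "the strategy already did the read and write".
    """
    old = entry["data"] if want_old else None
    if new_value is not None:
        entry["data"] = new_value
    return (1 if fixed else 0), old


def do_sysctl(entry, want_old, new_value, fixed):
    """Toy dispatcher: run the strategy, then fall back to generic read/write."""
    rc, old = strategy_string(entry, want_old, new_value, fixed)
    if rc > 0:
        return old  # strategy handled everything
    # Generic handling runs again on an entry the strategy already modified.
    if want_old:
        old = entry["data"]  # this is now the new value
    if new_value is not None:
        entry["data"] = new_value
    return old
```

With `fixed=False` the caller gets the new value back as the "old" one, which is the symptom reported; with `fixed=True` the genuine old value is returned.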
    • sysctl: don't overflow the user-supplied buffer with '\0' · 8febdd85
      Linus Torvalds committed
      If the string was too long to fit in the user-supplied buffer,
      the sysctl layer would zero-terminate it by writing past the
      end of the buffer. Don't do that.
      
      Noticed by Yi Yang <yang.y.yi@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
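The class of bug described here, writing a terminator one byte past a full buffer, can be shown with a bounded copy in Python. This is a sketch of one possible approach under stated assumptions, not the code from the fix: `copy_string_bounded` is a hypothetical name, a `bytearray` stands in for the user-supplied buffer, and the choice to leave a truncated string unterminated is an assumption of the example.

```python
def copy_string_bounded(user_buf, value):
    """Copy value into user_buf without ever writing past its end.

    user_buf is a bytearray standing in for the user-supplied buffer.
    A terminating zero byte is written only if there is room for it.
    Returns the number of string bytes copied.
    """
    if len(user_buf) == 0:
        return 0
    n = min(len(value), len(user_buf))
    user_buf[:n] = value[:n]   # same-length slice assignment: size unchanged
    if n < len(user_buf):
        user_buf[n] = 0        # terminator fits inside the buffer
    return n
```

A six-byte string copied into a four-byte buffer fills exactly four bytes and touches nothing beyond them.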
  7. 25 Dec, 2005 1 commit
  8. 21 Dec, 2005 1 commit
  9. 13 Dec, 2005 6 commits