1. 04 Jul, 2006 1 commit
  2. 03 Jul, 2006 1 commit
  3. 01 Jul, 2006 1 commit
    • C
      [PATCH] zoned vm counters: create vmstat.c/.h from page_alloc.c/.h · f6ac2354
      Christoph Lameter authored
      NOTE: ZVC are *not* the lightweight event counters.  ZVCs are reliable whereas
      event counters do not need to be.
      
      Zone based VM statistics are necessary to be able to determine what the state
      of memory in one zone is.  In a NUMA system this can be helpful for local
      reclaim and other memory optimizations that may be able to shift VM load in
      order to get more balanced memory use.
      
      It is also useful to know how the computing load affects the memory
      allocations on various zones.  This patchset allows the retrieval of that data
      from userspace.
      
      The patchset introduces a framework for counters that is a cross between the
      existing page_stats --which are simply global counters split per cpu-- and the
      approach of deferred incremental updates implemented for nr_pagecache.
      
      Small per cpu 8 bit counters are added to struct zone.  If the counter exceeds
      certain thresholds then the counters are accumulated in an array of
      atomic_long in the zone and in a global array that sums up all zone values.
      The small 8 bit counters are next to the per cpu page pointers and so they
      will be high in the cpu cache when pages are allocated and freed.
      
      Access to VM counter information for a zone and for the whole machine is then
      possible by simply indexing an array (Thanks to Nick Piggin for pointing out
      that approach).  Access to the total number of pages of various types no
      longer requires summing up all per cpu counters.
      
      Benefits of this patchset right now:
      
      - Ability for UP and SMP configuration to determine how memory
        is balanced between the DMA, NORMAL and HIGHMEM zones.
      
      - loops over all processors are avoided in writeback and
        reclaim paths. We can avoid caching the writeback information
        because the needed information is directly accessible.
      
      - Special handling for nr_pagecache removed.
      
      - zone_reclaim_interval vanishes since VM stats can now determine
        when local reclaim is worthwhile.
      
      - Fast inline per node page state determination.
      
      - Accurate counters in /sys/devices/system/node/node*/meminfo. The current
        counters simply count which processor allocated a page somewhere
        and guesstimate based on that. So the counters were not useful to show
        the actual distribution of page use on a specific zone.
      
      - The swap_prefetch patch requires per node statistics in order to
        figure out when processors of a node can prefetch. This patch provides
        some of the needed numbers.
      
      - Detailed VM counters available in more /proc and /sys status files.
      
      References to earlier discussions:
      V1 http://marc.theaimsgroup.com/?l=linux-kernel&m=113511649910826&w=2
      V2 http://marc.theaimsgroup.com/?l=linux-kernel&m=114980851924230&w=2
      V3 http://marc.theaimsgroup.com/?l=linux-kernel&m=115014697910351&w=2
      V4 http://marc.theaimsgroup.com/?l=linux-kernel&m=115024767318740&w=2
      
      Performance tests with AIM7 did not show any regressions.  Seems to be a tad
      faster even.  Tested on ia64/NUMA.  Builds fine on i386, SMP / UP.  Includes
      fixes for s390/arm/uml arch code.
      
      This patch:
      
      Move counter code from page_alloc.c/page-flags.h to vmstat.c/h.
      
      Create vmstat.c/vmstat.h by separating the counter code and the proc
      functions.
      
      Move the vm_stat_text array before zoneinfo_show.
      
      [akpm@osdl.org: s390 build fix]
      [akpm@osdl.org: HOTPLUG_CPU build fix]
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      f6ac2354
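The counter scheme described in this commit message (small per cpu deltas that are folded into per zone and global atomic totals once a threshold is crossed) can be illustrated with a minimal userspace sketch. It is not the kernel code: `zone_sketch`, `mod_zone_state_sketch`, `STAT_THRESHOLD_SKETCH`, the item names and the threshold value are all hypothetical, and C11 atomics stand in for the kernel's `atomic_long_t`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NR_CPUS_SKETCH        4   /* hypothetical CPU count */
#define NR_ITEMS_SKETCH       2   /* hypothetical number of counter types */
#define STAT_THRESHOLD_SKETCH 32  /* assumed fold threshold, not from the patch */

enum stat_item_sketch { NR_MAPPED_SKETCH, NR_PAGECACHE_SKETCH };

struct zone_sketch {
    /* Small per cpu differentials, one 8 bit slot per counter type. */
    int8_t diff[NR_CPUS_SKETCH][NR_ITEMS_SKETCH];
    /* Per zone totals; reading one is a plain array index. */
    atomic_long stat[NR_ITEMS_SKETCH];
};

/* Global totals that sum up all zones. */
static atomic_long global_stat_sketch[NR_ITEMS_SKETCH];

/* Apply a +1/-1 update on behalf of `cpu`; fold when the delta gets large. */
static void mod_zone_state_sketch(struct zone_sketch *z, int cpu,
                                  int item, int delta)
{
    assert(delta == 1 || delta == -1); /* keeps the sum inside int8_t range */

    int x = z->diff[cpu][item] + delta;

    if (x > STAT_THRESHOLD_SKETCH || x < -STAT_THRESHOLD_SKETCH) {
        atomic_fetch_add(&z->stat[item], x);
        atomic_fetch_add(&global_stat_sketch[item], x);
        x = 0;
    }
    z->diff[cpu][item] = (int8_t)x;
}

/* Readers never loop over CPUs; they only index the arrays. */
static long zone_state_sketch(struct zone_sketch *z, int item)
{
    return atomic_load(&z->stat[item]);
}

static long global_state_sketch(int item)
{
    return atomic_load(&global_stat_sketch[item]);
}
```

In this sketch a reader can lag the true value by up to the threshold per CPU, which is the price paid for keeping the hot update path on a single small per cpu slot.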
  4. 30 Jun, 2006 1 commit
    • C
      [AF_UNIX]: Datagram getpeersec · 877ce7c1
      Catherine Zhang authored
      This patch implements an API whereby an application can determine the
      label of its peer's Unix datagram sockets via the auxiliary data mechanism of
      recvmsg.
      
      Patch purpose:
      
      This patch enables a security-aware application to retrieve the
      security context of the peer of a Unix datagram socket.  The application
      can then use this security context to determine the security context for
      processing on behalf of the peer who sent the packet.
      
      Patch design and implementation:
      
      The design and implementation is very similar to the UDP case for INET
      sockets.  Basically we build upon the existing Unix domain socket API for
      retrieving user credentials.  Linux offers the API for obtaining user
      credentials via ancillary messages (i.e., out of band/control messages
      that are bundled together with a normal message).  To retrieve the security
      context, the application first tells the kernel that it wants it by
      setting the SO_PASSSEC option via setsockopt.  Then the application
      retrieves the security context using the auxiliary data mechanism.
      
      An example server application for a Unix datagram socket should look like this:
      
      toggle = 1;
      toggle_len = sizeof(toggle);
      
      /* Ask the kernel to attach the peer's security context to each datagram. */
      setsockopt(sockfd, SOL_SOCKET, SO_PASSSEC, &toggle, toggle_len);
      recvmsg(sockfd, &msg_hdr, 0);
      if (msg_hdr.msg_controllen > sizeof(struct cmsghdr)) {
          cmsg_hdr = CMSG_FIRSTHDR(&msg_hdr);
          if (cmsg_hdr->cmsg_len <= CMSG_LEN(sizeof(scontext)) &&
              cmsg_hdr->cmsg_level == SOL_SOCKET &&
              cmsg_hdr->cmsg_type == SCM_SECURITY) {
              /* Copy only the payload bytes that were actually delivered. */
              memcpy(&scontext, CMSG_DATA(cmsg_hdr),
                     cmsg_hdr->cmsg_len - CMSG_LEN(0));
          }
      }
      
      sock_setsockopt is enhanced with a new socket option SO_PASSSEC to allow
      a server socket to receive the security context of the peer.
      
      Testing:
      
      We have tested the patch by setting up Unix datagram client and server
      applications.  We verified that the server can retrieve the security context
      using the auxiliary data mechanism of recvmsg.
      Signed-off-by: Catherine Zhang <cxzhang@watson.ibm.com>
      Acked-by: James Morris <jmorris@namei.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      877ce7c1
  5. 29 Jun, 2006 6 commits
  6. 26 Jun, 2006 1 commit
  7. 23 Jun, 2006 1 commit
    • J
      [PATCH] adjust handle_IRQ_event() return type · 908dcecd
      Jan Beulich authored
      Correct the return type of handle_IRQ_event() (inconsistency noticed during
      Xen development), and remove redundant declarations.  The return type
      adjustment required breaking out the definition of irqreturn_t into a
      separate header, in order to satisfy current include order dependencies.
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Ian Molton <spyro@f2s.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Hirokazu Takata <takata.hirokazu@renesas.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      908dcecd
  8. 21 Jun, 2006 1 commit
  9. 14 Jun, 2006 1 commit
  10. 09 Jun, 2006 1 commit
  11. 06 Jun, 2006 1 commit
  12. 16 May, 2006 1 commit
  13. 08 May, 2006 1 commit
  14. 04 May, 2006 2 commits
  15. 29 Apr, 2006 1 commit
  16. 28 Apr, 2006 2 commits
  17. 27 Apr, 2006 1 commit
  18. 26 Apr, 2006 1 commit
  19. 11 Apr, 2006 2 commits
  20. 01 Apr, 2006 1 commit
  21. 28 Mar, 2006 2 commits
  22. 27 Mar, 2006 3 commits
  23. 26 Mar, 2006 1 commit
    • D
      [PATCH] POLLRDHUP/EPOLLRDHUP handling for half-closed devices notifications · f348d70a
      Davide Libenzi authored
      Implement half-closed device notification by adding a new POLLRDHUP
      (and its alias EPOLLRDHUP) bit to the existing poll/select sets.  Since
      there was concern about changing the existing POLLHUP handling, which does
      not correctly report half-closed devices, this implementation leaves the
      current POLLHUP reporting unchanged and simply adds a new bit that is set in
      the few places where it makes sense.  The same thing was discussed and
      conceptually agreed on quite some time ago:
      
      http://lkml.org/lkml/2003/7/12/116
      
      Since this new event bit is added to the existing Linux poll infrastructure,
      even the existing poll/select system calls will be able to use it.  As for
      the existing POLLHUP handling, the patch leaves it as is.  The
      pollrdhup-2.6.16.rc5-0.10.diff defines POLLRDHUP for all the existing
      archs and sets the bit in the six relevant files.  The other attached diff
      is the simple change required to sys/epoll.h to add the EPOLLRDHUP
      definition.
      
      There is "a stupid program" to test POLLRDHUP delivery here:
      
       http://www.xmailserver.org/pollrdhup-test.c
      
      It tests poll(2), but since the delivery is the same, epoll(2) will work
      equally well.
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Michael Kerrisk <mtk-manpages@gmx.net>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      f348d70a
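As a rough illustration of the POLLRDHUP behaviour described in this commit message, the sketch below polls one end of a Unix stream socketpair after the other end has shut down its writing side. This is not the test program linked in the commit message; `peer_half_closed` and `half_close_demo` are hypothetical names, the fallback value for `POLLRDHUP` is an assumption that holds on most Linux architectures, and the code assumes a Linux kernel that includes this change.

```c
#define _GNU_SOURCE           /* glibc exposes POLLRDHUP only with this */
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef POLLRDHUP
#define POLLRDHUP 0x2000      /* assumption: value used by most Linux archs */
#endif

/* Returns 1 if the peer shut down its writing side, 0 if not, -1 on error. */
static int peer_half_closed(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN | POLLRDHUP };
    int n = poll(&pfd, 1, timeout_ms);

    if (n < 0)
        return -1;
    return (n > 0 && (pfd.revents & POLLRDHUP)) ? 1 : 0;
}

/* Hypothetical demo: half-close one end and observe the new bit. */
static void half_close_demo(void)
{
    int sv[2];
    int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
    assert(rc == 0);

    rc = peer_half_closed(sv[0], 0);
    assert(rc == 0);                 /* nothing has happened yet */

    rc = shutdown(sv[1], SHUT_WR);   /* peer stops writing */
    assert(rc == 0);

    rc = peer_half_closed(sv[0], 100);
    assert(rc == 1);                 /* reader is told about the half-close */
    (void)rc;

    close(sv[0]);
    close(sv[1]);
}
```

The reader can still write to the socket after seeing POLLRDHUP, which is the point of reporting the half-closed state separately from POLLHUP.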
  24. 24 Mar, 2006 1 commit
  25. 23 Mar, 2006 2 commits
    • A
      [PATCH] more for_each_cpu() conversions · 394e3902
      Andrew Morton authored
      When we stop allocating percpu memory for not-possible CPUs we must not touch
      the percpu data for not-possible CPUs at all.  The correct way of doing this
      is to test cpu_possible() or to use for_each_cpu().
      
      This patch is a kernel-wide sweep of all instances of NR_CPUS.  I found very
      few instances of this bug, if any.  But the patch converts lots of open-coded
      tests to use the preferred helper macros.
      
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Acked-by: Kyle McMartin <kyle@parisc-linux.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Christian Zankel <chris@zankel.net>
      Cc: Philippe Elie <phil.el@wanadoo.fr>
      Cc: Nathan Scott <nathans@sgi.com>
      Cc: Jens Axboe <axboe@suse.de>
      Cc: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      394e3902
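The difference between an open-coded NR_CPUS loop and iterating only over possible CPUs can be shown with a small userspace sketch. Everything here is hypothetical and only mimics the shape of the kernel helpers: `possible_mask_sketch` stands in for the possible-CPU map, and a NULL slot stands in for percpu memory that was never allocated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 8

/* Hypothetical possible-CPU map: only CPUs 0, 1 and 3 can ever exist. */
static uint32_t possible_mask_sketch = 0x0b;

static int cpu_possible_sketch(int cpu)
{
    return (possible_mask_sketch >> cpu) & 1u;
}

/* Visit only possible CPUs; not-possible CPUs are skipped entirely. */
#define for_each_cpu_sketch(cpu)                          \
    for ((cpu) = 0; (cpu) < NR_CPUS_SKETCH; (cpu)++)      \
        if (!cpu_possible_sketch(cpu)) continue; else

/* Sum a percpu counter whose slots exist only for possible CPUs. */
static long sum_percpu_sketch(const long *const *percpu)
{
    long sum = 0;
    int cpu;

    for_each_cpu_sketch(cpu) {
        assert(percpu[cpu] != NULL); /* possible CPUs always have a slot */
        sum += *percpu[cpu];
    }
    return sum;
}
```

A plain `for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)` loop over the same array would dereference the NULL slots, which is the class of bug the sweep is guarding against.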
    • N
      [PATCH] atomic: add_unless cmpxchg optimise · 0b2fcfdb
      Nick Piggin authored
      Without branch hints, the gcc-4 I tested unrolls the loop for the very
      unlikely case where it repeats due to cmpxchg failure.
      
      Improve this for architectures with a native cas/cmpxchg.  LL/SC archs
      should try to implement this natively.
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      0b2fcfdb
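The shape of a cmpxchg-based add_unless loop with branch hints can be sketched in userspace C11 as follows. This is an illustration rather than the kernel macro: `add_unless_sketch`, `get_ref_unless_zero_sketch` and the `likely_sketch`/`unlikely_sketch` macros are hypothetical names, and `__builtin_expect` is a GCC/Clang extension that falls back to a no-op elsewhere.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#if defined(__GNUC__)
#define likely_sketch(x)   __builtin_expect(!!(x), 1)
#define unlikely_sketch(x) __builtin_expect(!!(x), 0)
#else
#define likely_sketch(x)   (x)
#define unlikely_sketch(x) (x)
#endif

/* Add `a` to *v unless *v equals `u`; returns true if the add happened. */
static bool add_unless_sketch(atomic_int *v, int a, int u)
{
    int c = atomic_load(v);

    for (;;) {
        if (unlikely_sketch(c == u))
            return false;
        /* On failure, c is reloaded with the current value of *v. */
        if (likely_sketch(atomic_compare_exchange_strong(v, &c, c + a)))
            return true;
    }
}

/* Hypothetical use: take a reference only if the count is still non-zero. */
static bool get_ref_unless_zero_sketch(atomic_int *refcount)
{
    bool ok = add_unless_sketch(refcount, 1, 0);

    assert(!ok || atomic_load(refcount) > 0);
    return ok;
}
```

The hints mark the compare-and-swap success path as the expected one, so the compiler has no reason to optimise for the retry case.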
  26. 22 Mar, 2006 1 commit
  27. 07 Mar, 2006 1 commit
  28. 21 Feb, 2006 1 commit