1. 08 12月, 2006 40 次提交
    • M
      [PATCH] kprobes: enable booster on the preemptible kernel · b4c6c34a
      Masami Hiramatsu 提交于
      When we are unregistering a kprobe-booster, we can't release its
      instruction buffer immediately on the preemptive kernel, because some
      processes might be preempted on the buffer.  The freeze_processes() and
      thaw_processes() functions can clean most of processes up from the buffer.
      There are still some non-frozen threads who have the PF_NOFREEZE flag.  If
      those threads are sleeping (not preempted) at the known place outside the
      buffer, we can ensure safety of freeing.
      
      However, the processing of this check routine takes a long time.  So, this
      patch introduces the garbage collection mechanism of insn_slot.  It also
      introduces the "dirty" flag to free_insn_slot because of efficiency.
      
      The "clean" instruction slots (dirty flag is cleared) are released
      immediately.  But the "dirty" slots which are used by boosted kprobes, are
      marked as garbages.  collect_garbage_slots() will be invoked to release
      "dirty" slots if there are more than INSNS_PER_PAGE garbage slots or if
      there are no unused slots.
      
      Cc: "Keshavamurthy, Anil S" <anil.s.keshavamurthy@intel.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: "bibo,mao" <bibo.mao@intel.com>
      Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
      Cc: Yumiko Sugita <yumiko.sugita.yf@hitachi.com>
      Cc: Satoshi Oshima <soshima@redhat.com>
      Cc: Hideo Aoki <haoki@redhat.com>
      Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Acked-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      b4c6c34a
    • M
      [PATCH] elf: Always define elf_addr_t in linux/elf.h · 386d9a7e
      Magnus Damm 提交于
      Define elf_addr_t in linux/elf.h.  The size of the type is determined using
      ELF_CLASS.  This allows us to remove the defines that today are spread all
      over .c and .h files.
      Signed-off-by: NMagnus Damm <magnus@valinux.co.jp>
      Cc: Daniel Jacobowitz <drow@false.org>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Jakub Jelinek <jakub@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      386d9a7e
    • S
      [PATCH] Fix the size limit of compat space msgsize · 651971cb
      suzuki 提交于
      Currently we allocate 64k space on the user stack and use it the msgbuf for
      sys_{msgrcv,msgsnd} for compat and the results are later copied in user [
      by copy_in_user].  This patch introduces helper routines for
      sys_{msgrcv,msgsnd} as below:
      
      do_msgsnd() : Accepts the mtype and user space ptr to the buffer along with
      the msqid and msgflg.
      
      do_msgrcv() : Accepts a kernel space ptr to mtype and a userspace ptr to
      the buffer.  The mtype has to be copied back the user space msgbuf by the
      caller.
      
      These changes avoid the need to allocate the msgsize on the userspace (
      thus removing the size limt ) and the overhead of an extra copy_in_user().
      Signed-off-by: NSuzuki K P <suzuki@in.ibm.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      651971cb
    • T
      [PATCH] ktime: Fix signed / unsigned mismatch in ktime_to_ns · cfd18934
      Thomas Gleixner 提交于
      The 32 bit implementation of ktime_to_ns returns unsigned value, while the
      64 bit version correctly returns an signed value.  There is no current user
      affected by this, but it has to be fixed, as ktime values can be negative.
      Pointed-out-by: NHelmut Duregger <Helmut.Duregger@student.uibk.ac.at>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      cfd18934
    • A
      [PATCH] remove HASH_HIGHMEM · 04903664
      Andrew Morton 提交于
      It has no users and it's doubtful that we'll need it again.
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      04903664
    • P
      [PATCH] debug: workqueue locking sanity · d5abe669
      Peter Zijlstra 提交于
      Workqueue functions should not leak locks, assert so, printing the
      last function ran.
      
      Use macros in lockdep.h to avoid include dependency pains.
      
      [akpm@osdl.org: build fix]
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      d5abe669
    • I
      [PATCH] sleep profiling · ece8a684
      Ingo Molnar 提交于
      Implement prof=sleep profiling.  TASK_UNINTERRUPTIBLE sleeps will be taken
      as a profile hit, and every millisecond spent sleeping causes a profile-hit
      for the call site that initiated the sleep.
      
      Sample readprofile output on i386:
      
         306 ps2_sendbyte                               1.3973
         432 call_usermodehelper_keys                   1.9548
         484 ps2_command                                0.6453
         790 __driver_attach                            4.7879
        1593 msleep                                    44.2500
        3976 sync_buffer                               64.1290
        4076 do_lookup                                 12.4648
        8587 sync_page                                122.6714
       20820 total                                      0.0067
      
      (NOTE: architectures need to check whether get_wchan() can be called from
      deep within the wakeup path.)
      
      akpm: we need to mark more functions __sched.  lock_sock(), msleep(), others..
      
      akpm: the contention in do_lookup() is a surprise.  Presumably doing disk
      reads for directory contents while holding i_mutex.
      
      [akpm@osdl.org: various fixes]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      ece8a684
    • P
      [PATCH] lockdep: name some old style locks · 6cfd76a2
      Peter Zijlstra 提交于
      Name some of the remaning 'old_style_spin_init' locks
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      6cfd76a2
    • A
      [PATCH] ext4: uninline large functions · 8984d137
      Andrew Morton 提交于
      Saves nearly 4kbytes on x86.
      
      Cc: Arnaldo Carvalho de Melo <acme@mandriva.com>
      Cc: <linux-ext4@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      8984d137
    • A
      [PATCH] ext3: uninline large functions · 3a229b39
      Andrew Morton 提交于
      Saves nearly 4kbytes on x86.
      
      Cc: Arnaldo Carvalho de Melo <acme@mandriva.com>
      Cc: <linux-ext4@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      3a229b39
    • P
      [PATCH] Exar quad port serial · e0980daf
      Paul B Schroeder 提交于
      This is on our "Envoy" boxes which we have, according to the documentation, an
      "Exar ST16C554/554D Quad UART with 16-byte Fifo's".  The box also has two
      other "on-board" serial ports and a modem chip.
      
      The two on-board serial UARTs were being detected along with the first two
      Exar UARTs.  The last two Exar UARTs were not showing up and neither was the
      modem.
      
      This patch was the only way I could the kernel to see beyond the standard four
      serial ports and get all four of the Exar UARTs to show up.
      
      [akpm@osdl.org: build fix]
      Signed-off-by: NPaul B Schroeder <pschroeder@uplogix.com>
      Cc: Lennart Sorensen <lsorense@csclub.uwaterloo.ca>
      Acked-by: NAlan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      e0980daf
    • A
      [PATCH] Compile-time check re world-writeable module params · 9774a1f5
      Alexey Dobriyan 提交于
      One of the mistakes a module_param() user can make is to supply default
      value of module parameter as the last argument.  module_param() accepts
      permissions instead.  If default value is, say, 3 (-------wx), parameter
      becomes world-writeable.
      
      So far, the only remedy was to apply grep(1) and read drivers submitted
      to -mm. BTDT.
      
      With this patch applied, compiler will finally do some job.
      
      *) bounds checking on permissions
      *) world-writeable bit checking on permissions
      *) compile breakage if checks trigger
      
      First version of this check (only "& 2" part) directly caught 4 out of 7
      places during my last grep.
      
          Subject: Neverending module_param() bugs
          [X] drivers/acpi/sbs.c:101:module_param(capacity_mode, int, CAPACITY_UNIT);
          [X] drivers/acpi/sbs.c:102:module_param(update_mode, int, UPDATE_MODE);
          [ ] drivers/acpi/sbs.c:103:module_param(update_info_mode, int, UPDATE_INFO_MODE);
          [ ] drivers/acpi/sbs.c:104:module_param(update_time, int, UPDATE_TIME);
          [ ] drivers/acpi/sbs.c:105:module_param(update_time2, int, UPDATE_TIME2);
          [X] drivers/char/watchdog/sbc8360.c:203:module_param(timeout, int, 27);
          [X] drivers/media/video/tuner-simple.c:13:module_param(offset, int, 0666);
      Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      9774a1f5
    • O
      [PATCH] taskstats: cleanup ->signal->stats allocation · 34ec1234
      Oleg Nesterov 提交于
      Allocate ->signal->stats on demand in taskstats_exit(), this allows us to
      remove taskstats_tgid_alloc() (the last non-trivial inline) from taskstat's
      public interface.
      Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Cc: Shailabh Nagar <nagar@watson.ibm.com>
      Cc: Jay Lan <jlan@engr.sgi.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      34ec1234
    • O
      [PATCH] taskstats: cleanup do_exit() path · 115085ea
      Oleg Nesterov 提交于
      do_exit:
      	taskstats_exit_alloc()
      	...
      	taskstats_exit_send()
      	taskstats_exit_free()
      
      I think this is not good, let it be a single function exported to the core
      kernel, taskstats_exit(), which does alloc + send + free itself.
      Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Cc: Shailabh Nagar <nagar@watson.ibm.com>
      Cc: Jay Lan <jlan@engr.sgi.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      115085ea
    • A
      [PATCH] probe_kernel_address() needs to do set_fs() · 20aa7b21
      Andrew Morton 提交于
      probe_kernel_address() purports to be generic, only it forgot to select
      KERNEL_DS, so it presently won't work right on all architectures.
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      20aa7b21
    • R
      [PATCH] parport_pc: Add support for OX16PCI952 parallel port · c140e110
      Ryan Underwood 提交于
      Add support for the parallel port (implemented as separate PCI function) on
      the Oxford Semiconductor OX16PCI952.
      Signed-off-by: NRyan Underwood <nemesis@icequake.net>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      c140e110
    • J
      [PATCH] pull in necessary header files for cdev.h · 5ec68b2e
      Jan Engelhardt 提交于
      linux/cdev.h uses struct kobject and other structs and should therefore
      include them.  Currently, a module either needs to add the missing includes
      itself, or, in case a module includes other headers already, needs to put
      <linux/cdev.h> last, which goes against a alphabetically-sorted include
      list.
      Signed-off-by: NJan Engelhardt <jengelh@gmx.de>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      5ec68b2e
    • I
      [PATCH] SysRq-X: show blocked tasks · e59e2ae2
      Ingo Molnar 提交于
      Add SysRq-X support: show blocked (TASK_UNINTERRUPTIBLE) tasks only.
      
      Useful for debugging IO stalls.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      e59e2ae2
    • M
      [PATCH] fuse: add DESTROY operation · 0ec7ca41
      Miklos Szeredi 提交于
      Add a DESTROY operation for block device based filesystems.  With the help of
      this operation, such a filesystem can flush dirty data to the device
      synchronously before the umount returns.
      
      This is needed in situations where the filesystem is assumed to be clean
      immediately after unmount (e.g.  ejecting removable media).
      Signed-off-by: NMiklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      0ec7ca41
    • M
      [PATCH] fuse: add bmap support · b2d2272f
      Miklos Szeredi 提交于
      Add support for the BMAP operation for block device based filesystems.  This
      is needed to support swap-files and lilo.
      Signed-off-by: NMiklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      b2d2272f
    • M
      [PATCH] fuse: update userspace interface to version 7.8 · e9168c18
      Miklos Szeredi 提交于
      Add a flag to the RELEASE message which specifies that a FLUSH operation
      should be performed as well.  This interface update is needed for the FreeBSD
      port, and doesn't actually touch the Linux implementation at all.
      
      Also rename the unused 'flush_flags' in the FLUSH message to 'unused'.
      Signed-off-by: NMiklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      e9168c18
    • J
      [PATCH] constify inode accessors · 48ed214d
      Jan Engelhardt 提交于
      Change the signature of i_size_read(), IMINOR() and IMAJOR() because they,
      or the functions they call, will never modify the argument.
      Signed-off-by: NJan Engelhardt <jengelh@gmx.de>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      48ed214d
    • P
      [PATCH] serial uartlite driver · 238b8721
      Peter Korsgaard 提交于
      Add a driver for the Xilinx uartlite serial controller used in boards with
      the PPC405 core in the Xilinx V2P/V4 fpgas.
      
      The hardware is very simple (baudrate/start/stopbits fixed and no break
      support).  See the datasheet for details:
      
      	http://www.xilinx.com/bvdocs/ipcenter/data_sheet/opb_uartlite.pdf
      
      See http://thread.gmane.org/gmane.linux.serial/1237/ for the email thread.
      Signed-off-by: NPeter Korsgaard <jacmet@sunsite.dk>
      Acked-by: NOlof Johansson <olof@lixom.net>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      238b8721
    • M
      [PATCH] cciss: add support for 1024 logical volumes · 799202cb
      Mike Miller 提交于
      Add the support for a large number of logical volumes.  We will soon have
      hardware that support up to 1024 logical volumes.
      Signed-off-by: NMike Miller <mike.miller@hp.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      799202cb
    • R
      [PATCH] Support for freezeable workqueues · 341a5958
      Rafael J. Wysocki 提交于
      Make it possible to create a workqueue the worker thread of which will be
      frozen during suspend, along with other kernel threads.
      Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>
      Acked-by: NPavel Machek <pavel@ucw.cz>
      Cc: Nigel Cunningham <nigel@suspend2.net>
      Cc: David Chinner <dgc@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      341a5958
    • R
      [PATCH] swsusp: Untangle thaw_processes · a9b6f562
      Rafael J. Wysocki 提交于
      Move the loop from thaw_processes() to a separate function and call it
      independently for kernel threads and user space processes so that the order
      of thawing tasks is clearly visible.
      
      Drop thaw_kernel_threads() which is never used.
      Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Nigel Cunningham <nigel@suspend2.net>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      a9b6f562
    • N
      [PATCH] swsusp: thaw userspace and kernel space separately · ff39593a
      Nigel Cunningham 提交于
      Modify process thawing so that we can thaw kernel space without thawing
      userspace, and thaw kernelspace first.  This will be useful in later
      patches, where I intend to get swsusp thawing kernel threads only before
      seeking to free memory.
      Signed-off-by: NNigel Cunningham <nigel@suspend2.net>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      ff39593a
    • N
      [PATCH] Add include/linux/freezer.h and move definitions from sched.h · 7dfb7103
      Nigel Cunningham 提交于
      Move process freezing functions from include/linux/sched.h to freezer.h, so
      that modifications to the freezer or the kernel configuration don't require
      recompiling just about everything.
      
      [akpm@osdl.org: fix ueagle driver]
      Signed-off-by: NNigel Cunningham <nigel@suspend2.net>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Pavel Machek <pavel@ucw.cz>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      7dfb7103
    • R
      [PATCH] swsusp: Improve handling of highmem · 8357376d
      Rafael J. Wysocki 提交于
      Currently swsusp saves the contents of highmem pages by copying them to the
      normal zone which is quite inefficient (eg.  it requires two normal pages
      to be used for saving one highmem page).  This may be improved by using
      highmem for saving the contents of saveable highmem pages.
      
      Namely, during the suspend phase of the suspend-resume cycle we try to
      allocate as many free highmem pages as there are saveable highmem pages.
      If there are not enough highmem image pages to store the contents of all of
      the saveable highmem pages, some of them will be stored in the "normal"
      memory.  Next, we allocate as many free "normal" pages as needed to store
      the (remaining) image data.  We use a memory bitmap to mark the allocated
      free pages (ie.  highmem as well as "normal" image pages).
      
      Now, we use another memory bitmap to mark all of the saveable pages
      (highmem as well as "normal") and the contents of the saveable pages are
      copied into the image pages.  Then, the second bitmap is used to save the
      pfns corresponding to the saveable pages and the first one is used to save
      their data.
      
      During the resume phase the pfns of the pages that were saveable during the
      suspend are loaded from the image and used to mark the "unsafe" page
      frames.  Next, we try to allocate as many free highmem page frames as to
      load all of the image data that had been in the highmem before the suspend
      and we allocate so many free "normal" page frames that the total number of
      allocated free pages (highmem and "normal") is equal to the size of the
      image.  While doing this we have to make sure that there will be some extra
      free "normal" and "safe" page frames for two lists of PBEs constructed
      later.
      
      Now, the image data are loaded, if possible, into their "original" page
      frames.  The image data that cannot be written into their "original" page
      frames are loaded into "safe" page frames and their "original" kernel
      virtual addresses, as well as the addresses of the "safe" pages containing
      their copies, are stored in one of two lists of PBEs.
      
      One list of PBEs is for the copies of "normal" suspend pages (ie.  "normal"
      pages that were saveable during the suspend) and it is used in the same way
      as previously (ie.  by the architecture-dependent parts of swsusp).  The
      other list of PBEs is for the copies of highmem suspend pages.  The pages
      in this list are restored (in a reversible way) right before the
      arch-dependent code is called.
      Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>
      Cc: Pavel Machek <pavel@ucw.cz>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      8357376d
    • R
      [PATCH] swsusp: use block device offsets to identify swap locations · 3aef83e0
      Rafael J. Wysocki 提交于
      Make swsusp use block device offsets instead of swap offsets to identify swap
      locations and make it use the same code paths for writing as well as for
      reading data.
      
      This allows us to use the same code for handling swap files and swap
      partitions and to simplify the code, eg.  by dropping rw_swap_page_sync().
      Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>
      Cc: Pavel Machek <pavel@ucw.cz>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      3aef83e0
    • R
      [PATCH] swsusp: use partition device and offset to identify swap areas · 915bae9e
      Rafael J. Wysocki 提交于
      The Linux kernel handles swap files almost in the same way as it handles swap
      partitions and there are only two differences between these two types of swap
      areas:
      
      (1) swap files need not be contiguous,
      
      (2) the header of a swap file is not in the first block of the partition
          that holds it.  From the swsusp's point of view (1) is not a problem,
          because it is already taken care of by the swap-handling code, but (2) has
          to be taken into consideration.
      
      In principle the location of a swap file's header may be determined with the
      help of appropriate filesystem driver.  Unfortunately, however, it requires
      the filesystem holding the swap file to be mounted, and if this filesystem is
      journaled, it cannot be mounted during a resume from disk.  For this reason we
      need some other means by which swap areas can be identified.
      
      For example, to identify a swap area we can use the partition that holds the
      area and the offset from the beginning of this partition at which the swap
      header is located.
      
      The following patch allows swsusp to identify swap areas this way.  It changes
      swap_type_of() so that it takes an additional argument representing an offset
      of the swap header within the partition represented by its first argument.
      Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>
      Acked-by: NPavel Machek <pavel@ucw.cz>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      915bae9e
    • N
      [PATCH] radix-tree: RCU lockless readside · 7cf9c2c7
      Nick Piggin 提交于
      Make radix tree lookups safe to be performed without locks.  Readers are
      protected against nodes being deleted by using RCU based freeing.  Readers
      are protected against new node insertion by using memory barriers to ensure
      the node itself will be properly written before it is visible in the radix
      tree.
      
      Each radix tree node keeps a record of their height (above leaf nodes).
      This height does not change after insertion -- when the radix tree is
      extended, higher nodes are only inserted in the top.  So a lookup can take
      the pointer to what is *now* the root node, and traverse down it even if
      the tree is concurrently extended and this node becomes a subtree of a new
      root.
      
      "Direct" pointers (tree height of 0, where root->rnode points directly to
      the data item) are handled by using the low bit of the pointer to signal
      whether rnode is a direct pointer or a pointer to a radix tree node.
      
      When a reader wants to traverse the next branch, they will take a copy of
      the pointer.  This pointer will be either NULL (and the branch is empty) or
      non-NULL (and will point to a valid node).
      
      [akpm@osdl.org: cleanups]
      [Lee.Schermerhorn@hp.com: bugfixes, comments, simplifications]
      [clameter@sgi.com: build fix]
      Signed-off-by: NNick Piggin <npiggin@suse.de>
      Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
      Signed-off-by: NLee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Christoph Lameter <clameter@engr.sgi.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      7cf9c2c7
    • A
      [PATCH] Save some bytes in struct mm_struct · 36de6437
      Arnaldo Carvalho de Melo 提交于
      Before:
      [acme@newtoy net-2.6.20]$ pahole --cacheline 32 kernel/sched.o mm_struct
      
      /* include2/asm/processor.h:542 */
      struct mm_struct {
              struct vm_area_struct *    mmap;                 /*     0     4 */
              struct rb_root             mm_rb;                /*     4     4 */
              struct vm_area_struct *    mmap_cache;           /*     8     4 */
              long unsigned int          (*get_unmapped_area)(); /*    12     4 */
              void                       (*unmap_area)();      /*    16     4 */
              long unsigned int          mmap_base;            /*    20     4 */
              long unsigned int          task_size;            /*    24     4 */
              long unsigned int          cached_hole_size;     /*    28     4 */
              /* ---------- cacheline 1 boundary ---------- */
              long unsigned int          free_area_cache;      /*    32     4 */
              pgd_t *                    pgd;                  /*    36     4 */
              atomic_t                   mm_users;             /*    40     4 */
              atomic_t                   mm_count;             /*    44     4 */
              int                        map_count;            /*    48     4 */
              struct rw_semaphore        mmap_sem;             /*    52    64 */
              spinlock_t                 page_table_lock;      /*   116    40 */
              struct list_head           mmlist;               /*   156     8 */
              mm_counter_t               _file_rss;            /*   164     4 */
              mm_counter_t               _anon_rss;            /*   168     4 */
              long unsigned int          hiwater_rss;          /*   172     4 */
              long unsigned int          hiwater_vm;           /*   176     4 */
              long unsigned int          total_vm;             /*   180     4 */
              long unsigned int          locked_vm;            /*   184     4 */
              long unsigned int          shared_vm;            /*   188     4 */
              /* ---------- cacheline 6 boundary ---------- */
              long unsigned int          exec_vm;              /*   192     4 */
              long unsigned int          stack_vm;             /*   196     4 */
              long unsigned int          reserved_vm;          /*   200     4 */
              long unsigned int          def_flags;            /*   204     4 */
              long unsigned int          nr_ptes;              /*   208     4 */
              long unsigned int          start_code;           /*   212     4 */
              long unsigned int          end_code;             /*   216     4 */
              long unsigned int          start_data;           /*   220     4 */
              /* ---------- cacheline 7 boundary ---------- */
              long unsigned int          end_data;             /*   224     4 */
              long unsigned int          start_brk;            /*   228     4 */
              long unsigned int          brk;                  /*   232     4 */
              long unsigned int          start_stack;          /*   236     4 */
              long unsigned int          arg_start;            /*   240     4 */
              long unsigned int          arg_end;              /*   244     4 */
              long unsigned int          env_start;            /*   248     4 */
              long unsigned int          env_end;              /*   252     4 */
              /* ---------- cacheline 8 boundary ---------- */
              long unsigned int          saved_auxv[44];       /*   256   176 */
              unsigned int               dumpable:2;           /*   432     4 */
              cpumask_t                  cpu_vm_mask;          /*   436     4 */
              mm_context_t               context;              /*   440    68 */
              long unsigned int          swap_token_time;      /*   508     4 */
              /* ---------- cacheline 16 boundary ---------- */
              char                       recent_pagein;        /*   512     1 */
      
              /* XXX 3 bytes hole, try to pack */
      
              int                        core_waiters;         /*   516     4 */
              struct completion *        core_startup_done;    /*   520     4 */
              struct completion          core_done;            /*   524    52 */
              rwlock_t                   ioctx_list_lock;      /*   576    36 */
              struct kioctx *            ioctx_list;           /*   612     4 */
      }; /* size: 616, sum members: 613, holes: 1, sum holes: 3, cachelines: 20,
            last cacheline: 8 bytes */
      
      After:
      
      [acme@newtoy net-2.6.20]$ pahole --cacheline 32 kernel/sched.o mm_struct
      /* include2/asm/processor.h:542 */
      struct mm_struct {
              struct vm_area_struct *    mmap;                 /*     0     4 */
              struct rb_root             mm_rb;                /*     4     4 */
              struct vm_area_struct *    mmap_cache;           /*     8     4 */
              long unsigned int          (*get_unmapped_area)(); /*    12     4 */
              void                       (*unmap_area)();      /*    16     4 */
              long unsigned int          mmap_base;            /*    20     4 */
              long unsigned int          task_size;            /*    24     4 */
              long unsigned int          cached_hole_size;     /*    28     4 */
              /* ---------- cacheline 1 boundary ---------- */
              long unsigned int          free_area_cache;      /*    32     4 */
              pgd_t *                    pgd;                  /*    36     4 */
              atomic_t                   mm_users;             /*    40     4 */
              atomic_t                   mm_count;             /*    44     4 */
              int                        map_count;            /*    48     4 */
              struct rw_semaphore        mmap_sem;             /*    52    64 */
              spinlock_t                 page_table_lock;      /*   116    40 */
              struct list_head           mmlist;               /*   156     8 */
              mm_counter_t               _file_rss;            /*   164     4 */
              mm_counter_t               _anon_rss;            /*   168     4 */
              long unsigned int          hiwater_rss;          /*   172     4 */
              long unsigned int          hiwater_vm;           /*   176     4 */
              long unsigned int          total_vm;             /*   180     4 */
              long unsigned int          locked_vm;            /*   184     4 */
              long unsigned int          shared_vm;            /*   188     4 */
              /* ---------- cacheline 6 boundary ---------- */
              long unsigned int          exec_vm;              /*   192     4 */
              long unsigned int          stack_vm;             /*   196     4 */
              long unsigned int          reserved_vm;          /*   200     4 */
              long unsigned int          def_flags;            /*   204     4 */
              long unsigned int          nr_ptes;              /*   208     4 */
              long unsigned int          start_code;           /*   212     4 */
              long unsigned int          end_code;             /*   216     4 */
              long unsigned int          start_data;           /*   220     4 */
              /* ---------- cacheline 7 boundary ---------- */
              long unsigned int          end_data;             /*   224     4 */
              long unsigned int          start_brk;            /*   228     4 */
              long unsigned int          brk;                  /*   232     4 */
              long unsigned int          start_stack;          /*   236     4 */
              long unsigned int          arg_start;            /*   240     4 */
              long unsigned int          arg_end;              /*   244     4 */
              long unsigned int          env_start;            /*   248     4 */
              long unsigned int          env_end;              /*   252     4 */
              /* ---------- cacheline 8 boundary ---------- */
              long unsigned int          saved_auxv[44];       /*   256   176 */
              cpumask_t                  cpu_vm_mask;          /*   432     4 */
              mm_context_t               context;              /*   436    68 */
              long unsigned int          swap_token_time;      /*   504     4 */
              char                       recent_pagein;        /*   508     1 */
              unsigned char              dumpable:2;           /*   509     1 */
      
              /* XXX 2 bytes hole, try to pack */
      
              int                        core_waiters;         /*   512     4 */
              struct completion *        core_startup_done;    /*   516     4 */
              struct completion          core_done;            /*   520    52 */
              rwlock_t                   ioctx_list_lock;      /*   572    36 */
              struct kioctx *            ioctx_list;           /*   608     4 */
      }; /* size: 612, sum members: 610, holes: 1, sum holes: 2, cachelines: 20,
            last cacheline: 4 bytes */
      
      [acme@newtoy net-2.6.20]$ codiff -V /tmp/sched.o.before kernel/sched.o
      /pub/scm/linux/kernel/git/acme/net-2.6.20/kernel/sched.c:
        struct mm_struct |   -4
          dumpable:2;
           from: unsigned int          /*   432(30)    4(2) */
           to:   unsigned char         /*   509(6)     1(2) */
      < SNIP other offset changes >
       1 struct changed
      [acme@newtoy net-2.6.20]$
      
      I'm not aware of any problem about using 2 byte wide bitfields where
      previously a 4 byte wide one was, holler if there is any, I wouldn't be
      surprised, bitfields are things from hell.
      
      For the curious, 432(30) means: at offset 432 from the struct start, at
      offset 30 in the bitfield (yeah, it comes backwards, hellish, huh?) ditto
      for 509(6), while 4(2) and 1(2) means "struct field size(bitfield size)".
      
      Now we have a 2 bytes hole and are using only 4 bytes of the last 32
      bytes cacheline, any takers? :-)
      Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      36de6437
    • A
      [PATCH] mm: make compound page destructor handling explicit · 33f2ef89
      Andy Whitcroft 提交于
      Currently we we use the lru head link of the second page of a compound page
      to hold its destructor.  This was ok when it was purely an internal
      implmentation detail.  However, hugetlbfs overrides this destructor
      violating the layering.  Abstract this out as explicit calls, also
      introduce a type for the callback function allowing them to be type
      checked.  For each callback we pre-declare the function, causing a type
      error on definition rather than on use elsewhere.
      
      [akpm@osdl.org: cleanups]
      Signed-off-by: NAndy Whitcroft <apw@shadowen.org>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      33f2ef89
    • A
      [PATCH] slab: deprecate kmem_cache_t · 1b1cec4b
      Andrew Morton 提交于
      Cc: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      1b1cec4b
    • C
      [PATCH] slab: remove kmem_cache_t · e18b890b
      Christoph Lameter 提交于
      Replace all uses of kmem_cache_t with struct kmem_cache.
      
      The patch was generated using the following script:
      
      	#!/bin/sh
      	#
      	# Replace one string by another in all the kernel sources.
      	#
      
      	set -e
      
      	for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
      		quilt add $file
      		sed -e "1,\$s/$1/$2/g" $file >/tmp/$$
      		mv /tmp/$$ $file
      		quilt refresh
      	done
      
      The script was run like this
      
      	sh replace kmem_cache_t "struct kmem_cache"
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      e18b890b
    • C
      [PATCH] slab: remove SLAB_DMA · 441e143e
      Christoph Lameter 提交于
      SLAB_DMA is an alias of GFP_DMA. This is the last one so we
      remove the leftover comment too.
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      441e143e
    • C
      [PATCH] slab: remove SLAB_KERNEL · e94b1766
      Christoph Lameter 提交于
      SLAB_KERNEL is an alias of GFP_KERNEL.
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      e94b1766
    • C
      [PATCH] slab: remove SLAB_ATOMIC · 54e6ecb2
      Christoph Lameter 提交于
      SLAB_ATOMIC is an alias of GFP_ATOMIC
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      54e6ecb2
    • C
      [PATCH] slab: remove SLAB_USER · f7267c0c
      Christoph Lameter 提交于
      SLAB_USER is an alias of GFP_USER
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      f7267c0c