1. 25 11月, 2010 1 次提交
    • M
      cgroups: make swap accounting default behavior configurable · a42c390c
      Michal Hocko 提交于
      Swap accounting can be configured by CONFIG_CGROUP_MEM_RES_CTLR_SWAP
      configuration option and then it is turned on by default.  There is a boot
      option (noswapaccount) which can disable this feature.
      
      This makes it hard for distributors to enable the configuration option as
      this feature leads to a bigger memory consumption and this is a no-go for
      general purpose distribution kernel.  On the other hand swap accounting
      may be very usuful for some workloads.
      
      This patch adds a new configuration option which controls the default
      behavior (CGROUP_MEM_RES_CTLR_SWAP_ENABLED).  If the option is selected
      then the feature is turned on by default.
      
      It also adds a new boot parameter swapaccount[=1|0] which enhances the
      original noswapaccount parameter semantic by means of enable/disable logic
      (defaults to 1 if no value is provided to be still consistent with
      noswapaccount).
      
      The default behavior is unchanged (if CONFIG_CGROUP_MEM_RES_CTLR_SWAP is
      enabled then CONFIG_CGROUP_MEM_RES_CTLR_SWAP_ENABLED is enabled as well)
      Signed-off-by: NMichal Hocko <mhocko@suse.cz>
      Acked-by: NDaisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
      Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a42c390c
  2. 18 11月, 2010 1 次提交
  3. 28 10月, 2010 6 次提交
  4. 27 10月, 2010 1 次提交
  5. 23 10月, 2010 2 次提交
    • A
      SYSFS: Allow boot time switching between deprecated and modern sysfs layout · e52eec13
      Andi Kleen 提交于
      I have some systems which need legacy sysfs due to old tools that are
      making assumptions that a directory can never be a symlink to another
      directory, and it's a big hazzle to compile separate kernels for them.
      
      This patch turns CONFIG_SYSFS_DEPRECATED into a run time option
      that can be switched on/off the kernel command line. This way
      the same binary can be used in both cases with just a option
      on the command line.
      
      The old CONFIG_SYSFS_DEPRECATED_V2 option is still there to set
      the default. I kept the weird name to not break existing
      config files.
      
      Also the compat code can be still completely disabled by undefining
      CONFIG_SYSFS_DEPRECATED_SWITCH -- just the optimizer takes
      care of this now instead of lots of ifdefs. This makes the code
      look nicer.
      
      v2: This is an updated version on top of Kay's patch to only
      handle the block devices. I tested it on my old systems
      and that seems to work.
      
      Cc: axboe@kernel.dk
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
      e52eec13
    • K
      driver core: remove CONFIG_SYSFS_DEPRECATED_V2 but keep it for block devices · 39aba963
      Kay Sievers 提交于
      This patch removes the old CONFIG_SYSFS_DEPRECATED_V2 config option,
      but it keeps the logic around to handle block devices in the old manner
      as some people like to run new kernel versions on old (pre 2007/2008)
      distros.
      Signed-off-by: NKay Sievers <kay.sievers@vrfy.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: "James E.J. Bottomley" <James.Bottomley@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jaroslav Kysela <perex@perex.cz>
      Cc: Takashi Iwai <tiwai@suse.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
      
      39aba963
  6. 21 10月, 2010 1 次提交
    • A
      BKL: introduce CONFIG_BKL. · 6de5bd12
      Arnd Bergmann 提交于
      With all the patches we have queued in the BKL removal tree, only a
      few dozen modules are left that actually rely on the BKL, and even
      there are lots of low-hanging fruit. We need to decide what to do
      about them, this patch illustrates one of the options:
      
      Every user of the BKL is marked as 'depends on BKL' in Kconfig,
      and the CONFIG_BKL becomes a user-visible option. If it gets
      disabled, no BKL using module can be built any more and the BKL
      code itself is compiled out.
      
      The one exception is file locking, which is practically always
      enabled and does a 'select BKL' instead. This effectively forces
      CONFIG_BKL to be enabled until we have solved the fs/lockd
      mess and can apply the patch that removes the BKL from fs/locks.c.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      6de5bd12
  7. 19 10月, 2010 2 次提交
  8. 12 10月, 2010 1 次提交
  9. 04 10月, 2010 1 次提交
  10. 29 9月, 2010 1 次提交
    • H
      initramfs: fix initramfs size calculation · ffe8018c
      Hendrik Brueckner 提交于
      The size of a built-in initramfs is calculated in init/initramfs.c by
      "__initramfs_end - __initramfs_start".  Those symbols are defined in the
      linker script include/asm-generic/vmlinux.lds.h:
      
      #define INIT_RAM_FS                                                     \
              . = ALIGN(PAGE_SIZE);                                           \
              VMLINUX_SYMBOL(__initramfs_start) = .;                          \
              *(.init.ramfs)                                                  \
              VMLINUX_SYMBOL(__initramfs_end) = .;
      
      If the initramfs file has an odd number of bytes, the "__initramfs_end"
      symbol points to an odd address, for example, the symbols in the
      System.map might look like:
      
          0000000000572000 T __initramfs_start
          00000000005bcd05 T __initramfs_end	  <-- odd address
      
      At least on s390 this causes a problem:
      
      Certain s390 instructions, especially instructions for loading addresses
      (larl) or branch addresses must be on even addresses.  The compiler loads
      the symbol addresses with the "larl" instruction.  This instruction sets
      the last bit to 0 and, therefore, for odd size files, the calculated size
      is one byte less than it should be:
      
          0000000000540a9c <populate_rootfs>:
            540a9c:     eb cf f0 78 00 24       stmg    %r12,%r15,120(%r15),
            540aa2:     c0 10 00 01 8a af       larl    %r1,572000 <__initramfs_start>
            540aa8:     c0 c0 00 03 e1 2e       larl    %r12,5bcd04 <initramfs_end>
                                                        (Instead of  5bcd05)
            ...
            540abe:     1b c1                   sr      %r12,%r1
      
      To fix the problem, this patch introduces the global variable
      __initramfs_size, which is calculated in the "usr/initramfs_data.S" file.
      The populate_rootfs() function can then use the start marker of the
      .init.ramfs section and the value of __initramfs_size for loading the
      initramfs.  Because the start marker and size is sufficient, the
      __initramfs_end symbol is no longer needed and is removed.
      Signed-off-by: NMichael Holzheu <holzheu@linux.vnet.ibm.com>
      Signed-off-by: NHendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Reviewed-by: NWANG Cong <xiyou.wangcong@gmail.com>
      Acked-by: NMichal Marek <mmarek@suse.cz>
      Acked-by: N"H. Peter Anvin" <hpa@zytor.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NMichal Marek <mmarek@suse.cz>
      ffe8018c
  11. 17 9月, 2010 2 次提交
    • C
      NFS: Use super.c for NFSROOT mount option parsing · 56463e50
      Chuck Lever 提交于
      Replace duplicate code in NFSROOT for mounting an NFS server on '/'
      with logic that uses the existing mainline text-based logic in the NFS
      client.
      
      Add documenting comments where appropriate.
      
      Note that this means NFSROOT mounts now use the same default settings
      as v2/v3 mounts done via mount(2) from user space.
      
        vers=3,tcp,rsize=<negotiated default>,wsize=<negotiated default>
      
      As before, however, no version/protocol negotiation with the server is
      done.
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      56463e50
    • J
      do_mounts: only enable PARTUUID for CONFIG_BLOCK · 6d0aed7a
      Jens Axboe 提交于
      When CONFIG_BLOCK is not enabled:
      
      init/do_mounts.c:71: error: implicit declaration of function 'dev_to_part'
      init/do_mounts.c:71: warning: initialization makes pointer from integer without a cast
      init/do_mounts.c:73: error: dereferencing pointer to incomplete type
      init/do_mounts.c:76: error: dereferencing pointer to incomplete type
      init/do_mounts.c:76: error: dereferencing pointer to incomplete type
      init/do_mounts.c:102: error: implicit declaration of function 'part_pack_uuid'
      init/do_mounts.c:104: error: 'block_class' undeclared (first use in this function)
      Reported-by: NRandy Dunlap <randy.dunlap@oracle.com>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      6d0aed7a
  12. 16 9月, 2010 2 次提交
  13. 15 9月, 2010 1 次提交
    • W
      init: add support for root devices specified by partition UUID · b5af921e
      Will Drewry 提交于
      This is the third patch in a series which adds support for
      storing partition metadata, optionally, off of the hd_struct.
      
      One major use for that data is being able to resolve partition
      by other identities than just the index on a block device.  Device
      enumeration varies by platform and there's a benefit to being able
      to use something like EFI GPT's GUIDs to determine the correct
      block device and partition to mount as the root.
      
      This change adds that support to root= by adding support for
      the following syntax:
      
        root=PARTUUID=hex-uuid
      Signed-off-by: NWill Drewry <wad@chromium.org>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      b5af921e
  14. 23 8月, 2010 1 次提交
  15. 20 8月, 2010 3 次提交
    • P
      rcu: Add a TINY_PREEMPT_RCU · a57eb940
      Paul E. McKenney 提交于
      Implement a small-memory-footprint uniprocessor-only implementation of
      preemptible RCU.  This implementation uses but a single blocked-tasks
      list rather than the combinatorial number used per leaf rcu_node by
      TREE_PREEMPT_RCU, which reduces memory consumption and greatly simplifies
      processing.  This version also takes advantage of uniprocessor execution
      to accelerate grace periods in the case where there are no readers.
      
      The general design is otherwise broadly similar to that of TREE_PREEMPT_RCU.
      
      This implementation is a step towards having RCU implementation driven
      off of the SMP and PREEMPT kernel configuration variables, which can
      happen once this implementation has accumulated sufficient experience.
      
      Removed ACCESS_ONCE() from __rcu_read_unlock() and added barrier() as
      suggested by Steve Rostedt in order to avoid the compiler-reordering
      issue noted by Mathieu Desnoyers (http://lkml.org/lkml/2010/8/16/183).
      
      As can be seen below, CONFIG_TINY_PREEMPT_RCU represents almost 5Kbyte
      savings compared to CONFIG_TREE_PREEMPT_RCU.  Of course, for non-real-time
      workloads, CONFIG_TINY_RCU is even better.
      
      	CONFIG_TREE_PREEMPT_RCU
      
      	   text	   data	    bss	    dec	   filename
      	     13	      0	      0	     13	   kernel/rcupdate.o
      	   6170	    825	     28	   7023	   kernel/rcutree.o
      				   ----
      				   7026    Total
      
      	CONFIG_TINY_PREEMPT_RCU
      
      	   text	   data	    bss	    dec	   filename
      	     13	      0	      0	     13	   kernel/rcupdate.o
      	   2081	     81	      8	   2170	   kernel/rcutiny.o
      				   ----
      				   2183    Total
      
      	CONFIG_TINY_RCU (non-preemptible)
      
      	   text	   data	    bss	    dec	   filename
      	     13	      0	      0	     13	   kernel/rcupdate.o
      	    719	     25	      0	    744	   kernel/rcutiny.o
      				    ---
      				    757    Total
      Requested-by: NLoïc Minier <loic.minier@canonical.com>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      a57eb940
    • P
      rcu: Fix RCU_FANOUT help message · 4d87ffad
      Paul E. McKenney 提交于
      Commit cf244dc0 added a fourth level to the TREE_RCU hierarchy,
      but the RCU_FANOUT help message still said "cube root".  This commit
      fixes this to "fourth root" and also emphasizes that production
      systems are well-served by the default.  (Stress-testing RCU itself
      uses small RCU_FANOUT values in order to test large-system code paths
      on small(er) systems.)
      Located-by: NJohn Kacur <jkacur@redhat.com>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      4d87ffad
    • P
      rcu: restrict TREE_RCU to SMP builds with !PREEMPT · 687d7a96
      Paul E. McKenney 提交于
      Because both TINY_RCU and TREE_PREEMPT_RCU have been in mainline for
      several releases, it is time to restrict the use of TREE_RCU to SMP
      non-preemptible systems.  This reduces testing/validation effort.  This
      commit is a first step towards driving the selection of RCU implementation
      directly off of the SMP and PREEMPT configuration parameters.
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      687d7a96
  16. 18 8月, 2010 1 次提交
    • D
      Make do_execve() take a const filename pointer · d7627467
      David Howells 提交于
      Make do_execve() take a const filename pointer so that kernel_execve() compiles
      correctly on ARM:
      
      arch/arm/kernel/sys_arm.c:88: warning: passing argument 1 of 'do_execve' discards qualifiers from pointer target type
      
      This also requires the argv and envp arguments to be consted twice, once for
      the pointer array and once for the strings the array points to.  This is
      because do_execve() passes a pointer to the filename (now const) to
      copy_strings_kernel().  A simpler alternative would be to cast the filename
      pointer in do_execve() when it's passed to copy_strings_kernel().
      
      do_execve() may not change any of the strings it is passed as part of the argv
      or envp lists as they are some of them in .rodata, so marking these strings as
      const should be fine.
      
      Further kernel_execve() and sys_execve() need to be changed to match.
      
      This has been test built on x86_64, frv, arm and mips.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Tested-by: NRalf Baechle <ralf@linux-mips.org>
      Acked-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d7627467
  17. 11 8月, 2010 2 次提交
  18. 10 8月, 2010 2 次提交
  19. 01 8月, 2010 1 次提交
    • S
      workqueue: mark init_workqueues() as early_initcall() · 6ee0578b
      Suresh Siddha 提交于
      Mark init_workqueues() as early_initcall() and thus it will be initialized
      before smp bringup. init_workqueues() registers for the hotcpu notifier
      and thus it should cope with the processors that are brought online after
      the workqueues are initialized.
      
      x86 smp bringup code uses workqueues and uses a workaround for the
      cold boot process (as the workqueues are initialized post smp_init()).
      Marking init_workqueues() as early_initcall() will pave the way for
      cleaning up this code.
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      6ee0578b
  20. 28 7月, 2010 3 次提交
  21. 23 7月, 2010 1 次提交
  22. 09 7月, 2010 1 次提交
  23. 30 6月, 2010 1 次提交
  24. 29 6月, 2010 1 次提交
  25. 28 6月, 2010 1 次提交
    • T
      percpu: allow limited allocation before slab is online · 099a19d9
      Tejun Heo 提交于
      This patch updates percpu allocator such that it can serve limited
      amount of allocation before slab comes online.  This is primarily to
      allow slab to depend on working percpu allocator.
      
      Two parameters, PERCPU_DYNAMIC_EARLY_SIZE and SLOTS, determine how
      much memory space and allocation map slots are reserved.  If this
      reserved area is exhausted, WARN_ON_ONCE() will trigger and allocation
      will fail till slab comes online.
      
      The following changes are made to implement early alloc.
      
      * pcpu_mem_alloc() now checks slab_is_available()
      
      * Chunks are allocated using pcpu_mem_alloc()
      
      * Init paths make sure ai->dyn_size is at least as large as
        PERCPU_DYNAMIC_EARLY_SIZE.
      
      * Initial alloc maps are allocated in __initdata and copied to
        kmalloc'd areas once slab is online.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      099a19d9