1. 25 Sep, 2012 1 commit
    • cputime: Gather time/stats accounting config options into a single menu · 391dc69c
      Frederic Weisbecker committed
      This debloats the general config menu a bit and makes these
      config options easier to find.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
  2. 17 Aug, 2012 1 commit
    • cputime: Generalize CONFIG_VIRT_CPU_ACCOUNTING · b952741c
      Frederic Weisbecker committed
      S390, ia64 and powerpc all define their own version
      of CONFIG_VIRT_CPU_ACCOUNTING. Generalize the config
      and its description to a single place to avoid
      duplication.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
  3. 15 Aug, 2012 1 commit
  4. 01 Aug, 2012 3 commits
    • mm/hotplug: correctly setup fallback zonelists when creating new pgdat · 9adb62a5
      Jiang Liu committed
      When hotadd_new_pgdat() is called to create new pgdat for a new node, a
      fallback zonelist should be created for the new node.  There's code to try
      to achieve that in hotadd_new_pgdat() as below:
      
      	/*
      	 * The node we allocated has no zone fallback lists. For avoiding
      	 * to access not-initialized zonelist, build here.
      	 */
      	mutex_lock(&zonelists_mutex);
      	build_all_zonelists(pgdat, NULL);
      	mutex_unlock(&zonelists_mutex);
      
      But it doesn't work as expected.  When hotadd_new_pgdat() is called, the
      new node is still in offline state because node_set_online(nid) hasn't
      been called yet.  And build_all_zonelists() only builds zonelists for
      online nodes as:
      
              for_each_online_node(nid) {
                      pg_data_t *pgdat = NODE_DATA(nid);
      
                      build_zonelists(pgdat);
                      build_zonelist_cache(pgdat);
              }
      
      We hope to create a zonelist for the new pgdat, but that doesn't happen.  So
      add a new parameter "pgdat" to build_all_zonelists() to build zonelists for
      the new pgdat too.
      Signed-off-by: Jiang Liu <liuj97@gmail.com>
      Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Keping Chen <chenkeping@huawei.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
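The failure mode described in the mm/hotplug commit can be illustrated with a toy model. This is only a sketch under stated assumptions: `toy_pgdat`, `toy_nodes` and both `toy_build_all_zonelists_*` functions are hypothetical stand-ins, not the kernel's data structures or its actual fix. It shows why iterating over online nodes skips a freshly allocated node, and how passing the new pgdat explicitly covers it.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_NODES 8

/* Toy stand-in for pg_data_t: only tracks whether zonelists were built. */
struct toy_pgdat {
    bool online;
    bool zonelists_built;
};

static struct toy_pgdat toy_nodes[TOY_MAX_NODES];

/* Old behaviour: only nodes already marked online get zonelists. */
static void toy_build_all_zonelists_old(void)
{
    for (int nid = 0; nid < TOY_MAX_NODES; nid++)
        if (toy_nodes[nid].online)
            toy_nodes[nid].zonelists_built = true;
}

/* New behaviour: also build zonelists for an explicitly passed pgdat,
 * even though that node has not been set online yet. */
static void toy_build_all_zonelists_new(struct toy_pgdat *pgdat)
{
    if (pgdat != NULL && !pgdat->online)
        pgdat->zonelists_built = true;
    toy_build_all_zonelists_old();
}
```

Calling the old variant while node 3 is still offline leaves its zonelists unbuilt; calling the new variant with `&toy_nodes[3]` builds them.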
    • memcg: rename config variables · c255a458
      Andrew Morton committed
      Sanity:
      
      CONFIG_CGROUP_MEM_RES_CTLR -> CONFIG_MEMCG
      CONFIG_CGROUP_MEM_RES_CTLR_SWAP -> CONFIG_MEMCG_SWAP
      CONFIG_CGROUP_MEM_RES_CTLR_SWAP_ENABLED -> CONFIG_MEMCG_SWAP_ENABLED
      CONFIG_CGROUP_MEM_RES_CTLR_KMEM -> CONFIG_MEMCG_KMEM
      
      [mhocko@suse.cz: fix missed bits]
      Cc: Glauber Costa <glommer@parallels.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/hugetlb: add new HugeTLB cgroup · 2bc64a20
      Aneesh Kumar K.V committed
      Implement a new controller that allows us to control HugeTLB allocations.
      The extension allows limiting HugeTLB usage per control group and
      enforces the controller limit during page fault.  Since HugeTLB doesn't
      support page reclaim, enforcing the limit at page fault time implies that
      the application will get a SIGBUS signal if it tries to access HugeTLB pages
      beyond its limit.  This requires the application to know beforehand how
      many HugeTLB pages it would require for its use.
      
      The charge/uncharge calls will be added to HugeTLB code in a later patch.
      Support for cgroup removal will be added in later patches.
      
      [akpm@linux-foundation.org: s/CONFIG_CGROUP_HUGETLB_RES_CTLR/CONFIG_MEMCG_HUGETLB/g]
      [akpm@linux-foundation.org: s/CONFIG_MEMCG_HUGETLB/CONFIG_CGROUP_HUGETLB/g]
      Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Reviewed-by: Michal Hocko <mhocko@suse.cz>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
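The limit enforcement described in the HugeTLB cgroup commit can be sketched as a simple counter with a hard cap. This is a minimal illustration, assuming hypothetical names (`toy_hugetlb_cgroup`, `toy_hugetlb_charge`, `toy_hugetlb_uncharge`); it is not the controller's real interface. Because there is no reclaim to fall back on, a charge that would exceed the limit simply fails, which is the point where the faulting application would receive SIGBUS.

```c
#include <errno.h>

/* Hypothetical per-cgroup accounting state, counted in huge pages.
 * Invariant: usage_pages <= limit_pages. */
struct toy_hugetlb_cgroup {
    unsigned long limit_pages;
    unsigned long usage_pages;
};

/* Try to account nr_pages at fault time. Returns 0 on success or
 * -ENOMEM when the request would exceed the limit. */
int toy_hugetlb_charge(struct toy_hugetlb_cgroup *cg, unsigned long nr_pages)
{
    if (nr_pages > cg->limit_pages - cg->usage_pages)
        return -ENOMEM;
    cg->usage_pages += nr_pages;
    return 0;
}

/* Give pages back; clamps at zero so the toy counter cannot underflow. */
void toy_hugetlb_uncharge(struct toy_hugetlb_cgroup *cg, unsigned long nr_pages)
{
    if (nr_pages > cg->usage_pages)
        nr_pages = cg->usage_pages;
    cg->usage_pages -= nr_pages;
}
```

With a limit of two pages, the third charge fails until an earlier page is uncharged, which is why the application has to size its HugeTLB needs up front.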
  5. 27 Jul, 2012 1 commit
  6. 23 Jul, 2012 1 commit
    • switch fput to task_work_add · 4a9d4b02
      Al Viro committed
      ... and schedule_work() for interrupt/kernel_thread callers
      (and yes, now it *is* OK to call from interrupt).
      
      We are guaranteed that __fput() will be done before we return
      to userland (or exit).  Note that for fput() from a kernel
      thread we get an async behaviour; it's almost always OK, but
      sometimes you might need to have __fput() completed before
      you do anything else.  There are two mechanisms for that -
      a general barrier (flush_delayed_fput()) and explicit
      __fput_sync().  Both should be used with care (as was the
      case for fput() from kernel threads all along).  See comments
      in fs/file_table.c for details.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
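The deferral described in the fput commit can be pictured with a toy per-task work list. This is a sketch only: `toy_file`, `toy_task`, `toy_fput` and `toy_run_task_work` are hypothetical names, and the fixed-size array is an assumption made to keep the example small. The final release is queued when the last reference is dropped and runs later, at the point that stands in for "before we return to userland".

```c
#include <stddef.h>

#define TOY_MAX_WORK 16

struct toy_file {
    int refcount;
    int released;   /* set once the final release has run */
};

/* Toy task with a small list of files awaiting their final release. */
struct toy_task {
    struct toy_file *pending[TOY_MAX_WORK];
    size_t nr_pending;
};

/* Stand-in for the final release work. */
static void toy_final_fput(struct toy_file *f)
{
    f->released = 1;
}

/* Drop a reference; on the last one, defer the release to the task.
 * If the toy list is full, fall back to releasing synchronously. */
void toy_fput(struct toy_task *t, struct toy_file *f)
{
    if (--f->refcount > 0)
        return;
    if (t->nr_pending == TOY_MAX_WORK) {
        toy_final_fput(f);
        return;
    }
    t->pending[t->nr_pending++] = f;
}

/* Run all deferred releases, as would happen on the way back to userland. */
void toy_run_task_work(struct toy_task *t)
{
    for (size_t i = 0; i < t->nr_pending; i++)
        toy_final_fput(t->pending[i]);
    t->nr_pending = 0;
}
```

Between the last `toy_fput()` and `toy_run_task_work()` the file is not yet released, which mirrors the asynchronous window the commit message warns about for kernel threads.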
  7. 10 Jul, 2012 1 commit
  8. 08 Jun, 2012 2 commits
    • init: Drop initcall level output · 19efb72f
      Borislav Petkov committed
      9fb48c74 ("params: add 3rd arg to option handler callback
      signature") added similar lines to dmesg:
      
      initlevel:0=early, 4 registered initcalls
      initlevel:1=core, 31 registered initcalls
      initlevel:2=postcore, 11 registered initcalls
      initlevel:3=arch, 7 registered initcalls
      initlevel:4=subsys, 40 registered initcalls
      initlevel:5=fs, 30 registered initcalls
      initlevel:6=device, 250 registered initcalls
      initlevel:7=late, 35 registered initcalls
      
      but they don't contain any info for the general user staring at dmesg.
      I'm very doubtful the count of initcalls registered per level helps
      anyone so drop that output completely.
      
      Cc: Jim Cromie <jim.cromie@gmail.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jason Baron <jbaron@redhat.com>
      Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • module_param: stop double-calling parameters. · ae82fdb1
      Rusty Russell committed
      Commit 026cee00 "params:
      <level>_initcall-like kernel parameters" set old-style module
      parameters to level 0.  And we call those level 0 calls where we used
      to, early in start_kernel().
      
      We also loop through the initcall levels and call the levelled
      module_params before the corresponding initcall.  Unfortunately level
      0 is early_init(), so we call the standard module_param calls twice.
      
      (Turns out most things don't care, but at least ubi.mtd does).
      
      Change the level to -1 for standard module_param calls.
      Reported-by: Benoît Thébaudeau <benoit.thebaudeau@advansee.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: stable@kernel.org
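The double call described in the module_param commit can be reproduced in a toy model. This is a sketch under stated assumptions: `toy_param`, `toy_parse_args` and `toy_boot` are hypothetical, and the model only assumes that an early pass handles standard parameters and that one further pass runs for each of the eight initcall levels (0 to 7).

```c
/* Toy parameter: the level it is registered at and how often it was called. */
struct toy_param {
    int level;
    int calls;
};

/* Call every parameter whose level lies within [min_level, max_level]. */
static void toy_parse_args(struct toy_param *params, int n,
                           int min_level, int max_level)
{
    for (int i = 0; i < n; i++)
        if (params[i].level >= min_level && params[i].level <= max_level)
            params[i].calls++;
}

/* Simulate boot: one early pass for standard parameters registered at
 * std_level, then one pass per initcall level 0..7. */
void toy_boot(struct toy_param *params, int n, int std_level)
{
    toy_parse_args(params, n, std_level, std_level);
    for (int level = 0; level <= 7; level++)
        toy_parse_args(params, n, level, level);
}
```

A standard parameter registered at level 0 is hit by both the early pass and the level 0 pass; registering it at -1 leaves only the early pass.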
  9. 06 Jun, 2012 1 commit
    • x86-64/efi: Use EFI to deal with platform wall clock · bacef661
      Jan Beulich committed
      Other than ix86, x86-64 on EFI so far didn't set the
      {g,s}et_wallclock accessors to the EFI routines, thus
      incorrectly using raw RTC accesses instead.
      
      Simply removing the #ifdef around the respective code isn't
      enough, however: While so far early get-time calls were done in
      physical mode, this doesn't work properly for x86-64, as virtual
      addresses would still need to be set up for all runtime regions
      (which wasn't the case on the system I have access to), so
      instead the patch moves the call to efi_enter_virtual_mode()
      ahead (which in turn allows to drop all code related to calling
      efi-get-time in physical mode).
      
      Additionally the earlier calling of efi_set_executable()
      requires the CPA code to cope, i.e. during early boot it must be
      avoided to call cpa_flush_array(), as the first thing this
      function does is a BUG_ON(irqs_disabled()).
      
      Also make the two EFI functions in question here static -
      they're not being referenced elsewhere.
      Signed-off-by: Jan Beulich <jbeulich@suse.com>
      Tested-by: Matt Fleming <matt.fleming@intel.com>
      Acked-by: Matthew Garrett <mjg@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/4FBFBF5F020000780008637F@nat28.tlf.novell.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  10. 01 Jun, 2012 2 commits
  11. 22 May, 2012 2 commits
    • timers: Fixup the Kconfig consolidation fallout · 764e0da1
      Thomas Gleixner committed
      Sigh, I missed to check which architecture Kconfig files actually
      include the core Kconfig file. There are a few which did not. So we
      broke them.
      
      Instead of adding the includes to those, we are better off to move the
      include to init/Kconfig like we did already with irqs and others.
      
      This does not change anything for the architectures using the old
      style periodic timer mode. It just solves the build wreckage there.
      
      For those architectures which use the clock events infrastructure it
      moves the include of the core Kconfig file to "General setup" which is
      a way more logical place than having it at random locations specified
      by the architecture specific Kconfigs.
      Reported-by: Ingo Molnar <mingo@kernel.org>
      Cc: Anna-Maria Gleixner <anna-maria@glx-um.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • Fix blocking allocations called very early during bootup · 31a67102
      Linus Torvalds committed
      During early boot, when the scheduler hasn't really been fully set up,
      we really can't do blocking allocations because with certain (dubious)
      configurations the "might_resched()" calls can actually result in
      scheduling events.
      
      We could just make such users always use GFP_ATOMIC, but quite often the
      code that does the allocation isn't really aware of the fact that the
      scheduler isn't up yet, and forcing that kind of random knowledge on the
      initialization code is just annoying and not good for anybody.
      
      And we actually have the 'gfp_allowed_mask' exactly for this reason:
      it's just that the kernel init sequence happens to set it to allow
      blocking allocations much too early.
      
      So move the 'gfp_allowed_mask' initialization from 'start_kernel()'
      (which is some of the earliest init code, and runs with preemption
      disabled for good reasons) into 'kernel_init()'.  kernel_init() is run
      in the newly created thread that will become the 'init' process, as
      opposed to the early startup code that runs within the context of what
      will be the first idle thread.
      
      So by the time we reach 'kernel_init()', we know that the scheduler must
      be at least limping along, because we've already scheduled from the idle
      thread into the init thread.
      Reported-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
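The role of 'gfp_allowed_mask' in the commit above can be shown with a toy flag set. This is a sketch only: the `TOY_GFP_*` flags, their values, `toy_may_block` and `toy_kernel_init` are hypothetical and do not match the kernel's real flag definitions. The idea is that every allocation request is ANDed with a global mask, so a caller asking for a blocking allocation is silently downgraded until the mask is opened up.

```c
typedef unsigned int toy_gfp_t;

/* Hypothetical flag bits; real kernel values differ. */
#define TOY_GFP_WAIT 0x1u   /* allocation may sleep */
#define TOY_GFP_IO   0x2u   /* allocation may start I/O */
#define TOY_GFP_FS   0x4u   /* allocation may call into the filesystem */
#define TOY_GFP_HIGH 0x8u   /* high-priority allocation */

#define TOY_GFP_ALL       (TOY_GFP_WAIT | TOY_GFP_IO | TOY_GFP_FS | TOY_GFP_HIGH)
#define TOY_GFP_KERNEL    (TOY_GFP_WAIT | TOY_GFP_IO | TOY_GFP_FS)
#define TOY_GFP_BOOT_MASK (TOY_GFP_ALL & ~(TOY_GFP_WAIT | TOY_GFP_IO | TOY_GFP_FS))

/* Starts restricted: blocking behaviour is masked out during early boot. */
static toy_gfp_t toy_gfp_allowed_mask = TOY_GFP_BOOT_MASK;

/* Would this request be allowed to sleep after applying the global mask? */
int toy_may_block(toy_gfp_t gfp)
{
    return ((gfp & toy_gfp_allowed_mask) & TOY_GFP_WAIT) != 0;
}

/* Stand-in for the point in init where the scheduler is known to run. */
void toy_kernel_init(void)
{
    toy_gfp_allowed_mask = TOY_GFP_ALL;
}
```

The caller keeps passing `TOY_GFP_KERNEL` throughout; only the moment at which the mask is widened decides whether the request may block, which is the knob the commit moves later in the boot sequence.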
  12. 16 May, 2012 12 commits
  13. 06 May, 2012 1 commit
    • init: don't try mounting device as nfs root unless type fully matches · 377485f6
      Sasha Levin committed
      Currently, we'll try mounting any device whose major device number is
      UNNAMED_MAJOR as NFS root.  This would happen for non-NFS devices as
      well (such as 9p devices) but it wouldn't cause any issues since
      mounting the device as NFS would fail quickly and the code proceeded to
      doing the proper mount:
      
             [  101.522716] VFS: Unable to mount root fs via NFS, trying floppy.
             [  101.534499] VFS: Mounted root (9p filesystem) on device 0:18.
      
      Commit 6829a048102a ("NFS: Retry mounting NFSROOT") introduced retries
      when mounting NFS root, which means that now we don't immediately fail
      and instead it takes an additional 90+ seconds until we stop retrying,
      which has revealed the issue this patch fixes.
      
      This meant that it would take an additional 90 seconds to boot when
      we're not using a device type which gets detected in order before NFS.
      
      This patch modifies the NFS type check to require device type to be
      'Root_NFS' instead of requiring the device to have an UNNAMED_MAJOR
      major.  This makes boot process cleaner since we now won't go through
      the NFS mounting code at all when the device isn't an NFS root
      ("/dev/nfs").
      Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
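The difference between the old and new NFS root checks can be sketched with toy device numbers. The `TOY_*` macros and both functions are hypothetical; the 20-bit minor field, an unnamed major of 0 and a `Root_NFS` minor of 255 are assumptions based on my understanding of the kernel headers of that era, and 0:18 is the 9p device taken from the log lines quoted above.

```c
/* Toy device-number encoding (assumption: 20 minor bits). */
#define TOY_MINORBITS 20
#define TOY_MKDEV(ma, mi) (((unsigned int)(ma) << TOY_MINORBITS) | (unsigned int)(mi))
#define TOY_MAJOR(dev)    ((unsigned int)(dev) >> TOY_MINORBITS)

#define TOY_UNNAMED_MAJOR 0u
#define TOY_ROOT_NFS      TOY_MKDEV(TOY_UNNAMED_MAJOR, 255)

/* Old check: any device with the unnamed major is tried as NFS root. */
int toy_old_try_nfs(unsigned int root_dev)
{
    return TOY_MAJOR(root_dev) == TOY_UNNAMED_MAJOR;
}

/* New check: only the exact NFS root device qualifies. */
int toy_new_try_nfs(unsigned int root_dev)
{
    return root_dev == TOY_ROOT_NFS;
}
```

For the 9p device `TOY_MKDEV(0, 18)` the old check says yes and the new one says no, so the NFS retry loop is never entered for it.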
  14. 05 May, 2012 2 commits
    • init_task: Replace CONFIG_HAVE_GENERIC_INIT_TASK · a6359d1e
      Thomas Gleixner committed
      Now that all archs except ia64 are converted, replace the config and
      let the ia64 select CONFIG_ARCH_INIT_TASK
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20120503085035.867948914@linutronix.de
    • init_task: Create generic init_task instance · a4a2eb49
      Thomas Gleixner committed
      All archs define init_task in the same way (except ia64, but there is
      no particular reason why ia64 cannot use the common version). Create a
      generic instance so all archs can be converted over.
      
      The config switch is temporary and will be removed when all archs are
      converted over.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Chen Liqin <liqin.chen@sunplusct.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: David Howells <dhowells@redhat.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: James E.J. Bottomley <jejb@parisc-linux.org>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: http://lkml.kernel.org/r/20120503085034.092585287@linutronix.de
  15. 01 May, 2012 1 commit
    • params: add 3rd arg to option handler callback signature · 9fb48c74
      Jim Cromie committed
      Add a 3rd arg, named "doing", to unknown-options callbacks invoked
      from parse_args(). The arg is passed as:
      
        "Booting kernel" from start_kernel(),
        initcall_level_names[i] from do_initcall_level(),
        mod->name from load_module(), via parse_args(), parse_one()
      
      parse_args() already has the "name" parameter, which is renamed to
      "doing" to better reflect current uses 1,2 above.  parse_args() passes
      it to an altered parse_one(), which now passes it down into the
      unknown option handler callbacks.
      
      The mod->name will be needed to handle dyndbg for loadable modules,
      since params passed by modprobe are not qualified (they do not have a
      "$modname." prefix), and by the time the unknown-param callback is
      called, the module name is not otherwise available.
      
      Minor tweaks:
      
      Add param-name to parse_one's pr_debug(); the current message doesn't
      identify the param being handled.
      
      Add a pr_info to print current level and level_name of the initcall,
      and number of registered initcalls at that level.  This adds 7 lines
      to dmesg output, like:
      
         initlevel:6=device, 172 registered initcalls
      
      Drop "parameters" from initcall_level_names[], it's unhelpful in the
      pr_info() added above.  This array is passed into parse_args() by
      do_initcall_level().
      
      CC: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
      Acked-by: Jason Baron <jbaron@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
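The three-argument callback described in the params commit can be sketched as follows. Only the idea of a `doing` label passed down to the unknown-option handler comes from the commit message; `toy_unknown_fn`, `toy_record_unknown`, `toy_parse_one` and `toy_last_doing` are hypothetical names, and the toy parser assumes it knows no parameters at all.

```c
#include <stddef.h>

/* Callback shape from the commit: param, value, and a "doing" label. */
typedef int (*toy_unknown_fn)(char *param, char *val, const char *doing);

/* Remembers the most recent label seen by the sample handler below. */
static const char *toy_last_doing;

/* Sample unknown-option handler: records who was parsing and accepts. */
int toy_record_unknown(char *param, char *val, const char *doing)
{
    (void)param;
    (void)val;
    toy_last_doing = doing;
    return 0;
}

/* Hypothetical stand-in for parse_one(): every option is "unknown", so it
 * is handed to the callback together with the "doing" label. */
int toy_parse_one(char *param, char *val, const char *doing,
                  toy_unknown_fn handle_unknown)
{
    if (handle_unknown == NULL)
        return -1; /* nobody to handle the unknown option */
    return handle_unknown(param, val, doing);
}
```

A handler for a loadable module could use the label (the module name in that case) to qualify an otherwise unprefixed parameter, which is the use the commit message anticipates for dyndbg.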
  16. 26 Apr, 2012 1 commit
  17. 25 Apr, 2012 3 commits
    • init: fix bug where environment vars can't be passed via boot args · a99cd112
      Chris Metcalf committed
      Commit 026cee00 had the side-effect of dropping the '=' from
      the unknown boot arguments that are passed to init as environment
      variables.  This is because parse_args() puts a NUL in the string
      where the '=' was when it passes the "param" and "val" pointers
      to the parsing subfunctions.  Previously, unknown_bootoption() was
      the last parse_args() subfunction to run, and it carefully put back
      the '=' character.  Now the ignore_unknown_bootoption() is the last
      one to run, and it wasn't doing the necessary repair, so the
      envp params ended up with the embedded NUL and were no longer
      seen as valid environment variables by init.
      Tested-by: Woody Suwalski <terraluna977@gmail.com>
      Acked-by: Pawel Moll <pawel.moll@arm.com>
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
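The split-and-repair behaviour described in the environment variable commit can be sketched in a few lines. Both helpers are hypothetical (`toy_split_param`, `toy_repair_env_string`); they only model the mechanism in the commit message: the parser overwrites '=' with a NUL to separate name and value, and the last handler has to put the '=' back before the string is passed to init as an environment entry.

```c
#include <string.h>

/* Split "name=value" in place: overwrite '=' with NUL and return a pointer
 * to the value, or NULL when the argument has no '='. */
char *toy_split_param(char *arg)
{
    char *eq = strchr(arg, '=');

    if (eq == NULL)
        return NULL;
    *eq = '\0';
    return eq + 1;
}

/* Undo the split so the buffer is a valid "name=value" entry again.
 * Only repairs when val directly follows param's terminating NUL. */
void toy_repair_env_string(char *param, char *val)
{
    if (val != NULL && val == param + strlen(param) + 1)
        val[-1] = '=';
}
```

Skipping the repair step leaves the buffer reading as just the name, because the embedded NUL ends the string early; that is the state in which init no longer saw a valid environment variable.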
    • rcu: Reduce cache-miss initialization latencies for large systems · 8932a63d
      Paul E. McKenney committed
      Commit #0209f649 (rcu: limit rcu_node leaf-level fanout) set an upper
      limit of 16 on the leaf-level fanout for the rcu_node tree.  This was
      needed to reduce lock contention that was induced by the synchronization
      of scheduling-clock interrupts, which was in turn needed to improve
      energy efficiency for moderate-sized lightly loaded servers.
      
      However, reducing the leaf-level fanout means that there are more
      leaf-level rcu_node structures in the tree, which in turn means that
      RCU's grace-period initialization incurs more cache misses.  This is
      not a problem on moderate-sized servers with only a few tens of CPUs,
      but becomes a major source of real-time latency spikes on systems with
      many hundreds of CPUs.  In addition, the workloads running on these large
      systems tend to be CPU-bound, which eliminates the energy-efficiency
      advantages of synchronizing scheduling-clock interrupts.  Therefore,
      these systems need maximal values for the rcu_node leaf-level fanout.
      
      This commit addresses this problem by introducing a new kernel parameter
      named RCU_FANOUT_LEAF that directly controls the leaf-level fanout.
      This parameter defaults to 16 to handle the common case of moderate-sized,
      lightly loaded servers, but may be set higher on larger systems.
      Reported-by: Mike Galbraith <efault@gmx.de>
      Reported-by: Dimitri Sivanich <sivanich@sgi.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
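The trade-off in the RCU commit comes down to simple arithmetic: the number of leaf-level rcu_node structures is roughly the CPU count divided by the leaf fanout, rounded up. The helper below is a hypothetical illustration of that calculation (`toy_rcu_leaf_nodes` is not a kernel function), and the CPU counts used with it are made-up examples.

```c
/* Number of leaf nodes needed to cover nr_cpus CPUs when each leaf
 * handles at most fanout_leaf CPUs (ceiling division). */
unsigned int toy_rcu_leaf_nodes(unsigned int nr_cpus, unsigned int fanout_leaf)
{
    if (fanout_leaf == 0)
        return 0; /* invalid fanout; avoid dividing by zero */
    return (nr_cpus + fanout_leaf - 1) / fanout_leaf;
}
```

Under this model a 4096-CPU system needs 256 leaves at a fanout of 16 but only 64 at a fanout of 64, so grace-period initialization touches a quarter as many leaf structures.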
    • rcu: Clarify help text for RCU_BOOST_PRIO · c9336643
      Paul E. McKenney committed
      The old text confused real-time applications with real-time threads, so
      that you pretty much needed to understand how this kernel configuration
      parameter worked to understand the help text.  This commit therefore
      attempts to make the help text human-readable.
      Reported-by: Jörn Engel <joern@purestorage.com>
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
  18. 20 Apr, 2012 1 commit
  19. 08 Apr, 2012 1 commit
  20. 01 Apr, 2012 2 commits