1. 20 12月, 2012 1 次提交
  2. 02 11月, 2012 1 次提交
  3. 13 10月, 2012 1 次提交
    • A
      infrastructure for saner ret_from_kernel_thread semantics · a74fb73c
      Al Viro 提交于
      * allow kernel_execve() leave the actual return to userland to
      caller (selected by CONFIG_GENERIC_KERNEL_EXECVE).  Callers
      updated accordingly.
      * architecture that does select GENERIC_KERNEL_EXECVE in its
      Kconfig should have its ret_from_kernel_thread() do this:
      	call schedule_tail
      	call the callback left for it by copy_thread(); if it ever
      returns, that's because it has just done successful kernel_execve()
      	jump to return from syscall
      IOW, its only difference from ret_from_fork() is that it does call the
      callback.
      * such an architecture should also get rid of ret_from_kernel_execve()
      and __ARCH_WANT_KERNEL_EXECVE
      
      This is the last part of infrastructure patches in that area - from
      that point on work on different architectures can live independently.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      a74fb73c
  4. 12 10月, 2012 1 次提交
    • A
      make sure that we always have a return path from kernel_execve() · d6b21238
      Al Viro 提交于
      The only place where kernel_execve() is called without a way to
      return to the caller of kernel_thread() callback is kernel_post().
      Reorganize kernel_init()/kernel_post() - instead of the former
      calling the latter in the end (and getting freed by it), have the
      latter *begin* with calling the former (and turn the latter into
      kernel_thread() callback, of course).
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      d6b21238
  5. 09 10月, 2012 1 次提交
  6. 30 9月, 2012 2 次提交
  7. 15 8月, 2012 1 次提交
  8. 01 8月, 2012 1 次提交
    • J
      mm/hotplug: correctly setup fallback zonelists when creating new pgdat · 9adb62a5
      Jiang Liu 提交于
      When hotadd_new_pgdat() is called to create new pgdat for a new node, a
      fallback zonelist should be created for the new node.  There's code to try
      to achieve that in hotadd_new_pgdat() as below:
      
      	/*
      	 * The node we allocated has no zone fallback lists. For avoiding
      	 * to access not-initialized zonelist, build here.
      	 */
      	mutex_lock(&zonelists_mutex);
      	build_all_zonelists(pgdat, NULL);
      	mutex_unlock(&zonelists_mutex);
      
      But it doesn't work as expected.  When hotadd_new_pgdat() is called, the
      new node is still in offline state because node_set_online(nid) hasn't
      been called yet.  And build_all_zonelists() only builds zonelists for
      online nodes as:
      
              for_each_online_node(nid) {
                      pg_data_t *pgdat = NODE_DATA(nid);
      
                      build_zonelists(pgdat);
                      build_zonelist_cache(pgdat);
              }
      
      Though we hope to create zonelist for the new pgdat, but it doesn't.  So
      add a new parameter "pgdat" the build_all_zonelists() to build pgdat for
      the new pgdat too.
      Signed-off-by: NJiang Liu <liuj97@gmail.com>
      Signed-off-by: NXishi Qiu <qiuxishi@huawei.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Keping Chen <chenkeping@huawei.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9adb62a5
  9. 27 7月, 2012 1 次提交
  10. 23 7月, 2012 1 次提交
    • A
      switch fput to task_work_add · 4a9d4b02
      Al Viro 提交于
      ... and schedule_work() for interrupt/kernel_thread callers
      (and yes, now it *is* OK to call from interrupt).
      
      We are guaranteed that __fput() will be done before we return
      to userland (or exit).  Note that for fput() from a kernel
      thread we get an async behaviour; it's almost always OK, but
      sometimes you might need to have __fput() completed before
      you do anything else.  There are two mechanisms for that -
      a general barrier (flush_delayed_fput()) and explicit
      __fput_sync().  Both should be used with care (as was the
      case for fput() from kernel threads all along).  See comments
      in fs/file_table.c for details.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      4a9d4b02
  11. 08 6月, 2012 2 次提交
    • B
      init: Drop initcall level output · 19efb72f
      Borislav Petkov 提交于
      9fb48c74 ("params: add 3rd arg to option handler callback
      signature") added similar lines to dmesg:
      
      initlevel:0=early, 4 registered initcalls
      initlevel:1=core, 31 registered initcalls
      initlevel:2=postcore, 11 registered initcalls
      initlevel:3=arch, 7 registered initcalls
      initlevel:4=subsys, 40 registered initcalls
      initlevel:5=fs, 30 registered initcalls
      initlevel:6=device, 250 registered initcalls
      initlevel:7=late, 35 registered initcalls
      
      but they don't contain any info for the general user staring at dmesg.
      I'm very doubtful the count of initcalls registered per level helps
      anyone so drop that output completely.
      
      Cc: Jim Cromie <jim.cromie@gmail.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jason Baron <jbaron@redhat.com>
      Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      19efb72f
    • R
      module_param: stop double-calling parameters. · ae82fdb1
      Rusty Russell 提交于
      Commit 026cee00 "params:
      <level>_initcall-like kernel parameters" set old-style module
      parameters to level 0.  And we call those level 0 calls where we used
      to, early in start_kernel().
      
      We also loop through the initcall levels and call the levelled
      module_params before the corresponding initcall.  Unfortunately level
      0 is early_init(), so we call the standard module_param calls twice.
      
      (Turns out most things don't care, but at least ubi.mtd does).
      
      Change the level to -1 for standard module_param calls.
      Reported-by: NBenoît Thébaudeau <benoit.thebaudeau@advansee.com>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Cc: stable@kernel.org
      ae82fdb1
  12. 06 6月, 2012 1 次提交
    • J
      x86-64/efi: Use EFI to deal with platform wall clock · bacef661
      Jan Beulich 提交于
      Other than ix86, x86-64 on EFI so far didn't set the
      {g,s}et_wallclock accessors to the EFI routines, thus
      incorrectly using raw RTC accesses instead.
      
      Simply removing the #ifdef around the respective code isn't
      enough, however: While so far early get-time calls were done in
      physical mode, this doesn't work properly for x86-64, as virtual
      addresses would still need to be set up for all runtime regions
      (which wasn't the case on the system I have access to), so
      instead the patch moves the call to efi_enter_virtual_mode()
      ahead (which in turn allows to drop all code related to calling
      efi-get-time in physical mode).
      
      Additionally the earlier calling of efi_set_executable()
      requires the CPA code to cope, i.e. during early boot it must be
      avoided to call cpa_flush_array(), as the first thing this
      function does is a BUG_ON(irqs_disabled()).
      
      Also make the two EFI functions in question here static -
      they're not being referenced elsewhere.
      Signed-off-by: NJan Beulich <jbeulich@suse.com>
      Tested-by: NMatt Fleming <matt.fleming@intel.com>
      Acked-by: NMatthew Garrett <mjg@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/4FBFBF5F020000780008637F@nat28.tlf.novell.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      bacef661
  13. 22 5月, 2012 1 次提交
    • L
      Fix blocking allocations called very early during bootup · 31a67102
      Linus Torvalds 提交于
      During early boot, when the scheduler hasn't really been fully set up,
      we really can't do blocking allocations because with certain (dubious)
      configurations the "might_resched()" calls can actually result in
      scheduling events.
      
      We could just make such users always use GFP_ATOMIC, but quite often the
      code that does the allocation isn't really aware of the fact that the
      scheduler isn't up yet, and forcing that kind of random knowledge on the
      initialization code is just annoying and not good for anybody.
      
      And we actually have a the 'gfp_allowed_mask' exactly for this reason:
      it's just that the kernel init sequence happens to set it to allow
      blocking allocations much too early.
      
      So move the 'gfp_allowed_mask' initialization from 'start_kernel()'
      (which is some of the earliest init code, and runs with preemption
      disabled for good reasons) into 'kernel_init()'.  kernel_init() is run
      in the newly created thread that will become the 'init' process, as
      opposed to the early startup code that runs within the context of what
      will be the first idle thread.
      
      So by the time we reach 'kernel_init()', we know that the scheduler must
      be at least limping along, because we've already scheduled from the idle
      thread into the init thread.
      Reported-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      31a67102
  14. 01 5月, 2012 1 次提交
    • J
      params: add 3rd arg to option handler callback signature · 9fb48c74
      Jim Cromie 提交于
      Add a 3rd arg, named "doing", to unknown-options callbacks invoked
      from parse_args(). The arg is passed as:
      
        "Booting kernel" from start_kernel(),
        initcall_level_names[i] from do_initcall_level(),
        mod->name from load_module(), via parse_args(), parse_one()
      
      parse_args() already has the "name" parameter, which is renamed to
      "doing" to better reflect current uses 1,2 above.  parse_args() passes
      it to an altered parse_one(), which now passes it down into the
      unknown option handler callbacks.
      
      The mod->name will be needed to handle dyndbg for loadable modules,
      since params passed by modprobe are not qualified (they do not have a
      "$modname." prefix), and by the time the unknown-param callback is
      called, the module name is not otherwise available.
      
      Minor tweaks:
      
      Add param-name to parse_one's pr_debug(), current message doesnt
      identify the param being handled, add it.
      
      Add a pr_info to print current level and level_name of the initcall,
      and number of registered initcalls at that level.  This adds 7 lines
      to dmesg output, like:
      
         initlevel:6=device, 172 registered initcalls
      
      Drop "parameters" from initcall_level_names[], its unhelpful in the
      pr_info() added above.  This array is passed into parse_args() by
      do_initcall_level().
      
      CC: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: NJim Cromie <jim.cromie@gmail.com>
      Acked-by: NJason Baron <jbaron@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9fb48c74
  15. 25 4月, 2012 1 次提交
  16. 29 3月, 2012 1 次提交
  17. 26 3月, 2012 1 次提交
    • P
      params: <level>_initcall-like kernel parameters · 026cee00
      Pawel Moll 提交于
      This patch adds a set of macros that can be used to declare
      kernel parameters to be parsed _before_ initcalls at a chosen
      level are executed.  We rename the now-unused "flags" field of
      struct kernel_param as the level.  It's signed, for when we
      use this for early params as well, in future.
      
      Linker macro collating init calls had to be modified in order
      to add additional symbols between levels that are later used
      by the init code to split the calls into blocks.
      Signed-off-by: NPawel Moll <pawel.moll@arm.com>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      026cee00
  18. 15 3月, 2012 1 次提交
  19. 01 3月, 2012 1 次提交
  20. 13 1月, 2012 1 次提交
  21. 06 12月, 2011 2 次提交
  22. 26 10月, 2011 2 次提交
  23. 30 9月, 2011 1 次提交
    • W
      bootup: move 'usermodehelper_enable()' a little earlier · b0f84374
      wangyanqing 提交于
      Commit d5767c53 ("bootup: move 'usermodehelper_enable()' to the end
      of do_basic_setup()") moved 'usermodehelper_enable()' to end of
      do_basic_setup() to after the initcalls.  But then I get failed to let
      uvesafb work on my computer, and lose the splash boot.
      
      So maybe we could start usermodehelper_enable a little early to make
      some task work that need eary init with the help of user mode.
      
      [ I would *really* prefer that initcalls not call into user space - even
        the real 'init' hasn't been execve'd yet, after all! But for uvesafb
        it really does look like we don't have much choice.
      
        I considered doing this when we mount the root filesystem, but
        depending on config options that is in multiple places.  We could do
        the usermode helper enable as a rootfs_initcall()..
      
        So I'm just using wang yanqing's trivial patch.  It's not wonderful,
        but it's simple and should work.  We should revisit this some day,
        though.      - Linus ]
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b0f84374
  24. 29 9月, 2011 1 次提交
    • L
      bootup: move 'usermodehelper_enable()' to the end of do_basic_setup() · d5767c53
      Linus Torvalds 提交于
      Doing it just before starting to call into cpu_idle() made a sick kind
      of sense only because the original bug we fixed (see commit
      288d5abe: "Boot up with usermodehelper disabled") was about problems
      with some scheduler data structures not being initialized, and they had
      better be initialized at that point.
      
      But it really didn't make any other conceptual sense, and doing it after
      the initial "schedule()" call for the idle thread actually opened up a
      race: what if the main initialization thread did everything without
      needing to sleep, and got all the way into user land too? Without
      actually having scheduled back to the idle thread?
      
      Now, in normal circumstances that doesn't ever happen, but it looks like
      Richard Cochran triggered exactly that on his ARM IXP4xx machines:
      
        "I have some ARM IXP4xx based machines that use the two on chip MAC
         ports (aka NPEs).  The NPE needs a firmware in order to function.
         Ever since the following commit [that 288d5abe one], it is no
         longer possible to bring up the interfaces during the init scripts."
      
      with a call trace showing an ioctl coming from user space. Richard says:
      
        "The init is busybox, and the startup script does mount, syslogd, and
         then ifup, so that all can go by quickly."
      
      The fix is to move the usermodehelper_enable() into the main 'init'
      thread, and just put it after we've done all our initcalls.  By then,
      everything really should be up, but we've obviously not actually started
      the user-mode portion of init yet.
      Reported-and-tested-by: NRichard Cochran <richardcochran@gmail.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d5767c53
  25. 22 9月, 2011 1 次提交
  26. 04 8月, 2011 2 次提交
  27. 17 6月, 2011 1 次提交
  28. 30 5月, 2011 1 次提交
    • L
      mm: Fix boot crash in mm_alloc() · 6345d24d
      Linus Torvalds 提交于
      Thomas Gleixner reports that we now have a boot crash triggered by
      CONFIG_CPUMASK_OFFSTACK=y:
      
          BUG: unable to handle kernel NULL pointer dereference at   (null)
          IP: [<c11ae035>] find_next_bit+0x55/0xb0
          Call Trace:
           [<c11addda>] cpumask_any_but+0x2a/0x70
           [<c102396b>] flush_tlb_mm+0x2b/0x80
           [<c1022705>] pud_populate+0x35/0x50
           [<c10227ba>] pgd_alloc+0x9a/0xf0
           [<c103a3fc>] mm_init+0xec/0x120
           [<c103a7a3>] mm_alloc+0x53/0xd0
      
      which was introduced by commit de03c72c ("mm: convert
      mm->cpu_vm_cpumask into cpumask_var_t"), and is due to wrong ordering of
      mm_init() vs mm_init_cpumask
      
      Thomas wrote a patch to just fix the ordering of initialization, but I
      hate the new double allocation in the fork path, so I ended up instead
      doing some more radical surgery to clean it all up.
      Reported-by: NThomas Gleixner <tglx@linutronix.de>
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6345d24d
  29. 25 5月, 2011 2 次提交
    • M
      printk: allocate kernel log buffer earlier · 162a7e75
      Mike Travis 提交于
      On larger systems, because of the numerous ACPI, Bootmem and EFI messages,
      the static log buffer overflows before the larger one specified by the
      log_buf_len param is allocated.  Minimize the overflow by allocating the
      new log buffer as soon as possible.
      
      On kernels without memblock, a later call to setup_log_buf from
      kernel/init.c is the fallback.
      
      [akpm@linux-foundation.org: coding-style fixes]
      [akpm@linux-foundation.org: fix CONFIG_PRINTK=n build]
      Signed-off-by: NMike Travis <travis@sgi.com>
      Cc: Yinghai Lu <yhlu.kernel@gmail.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Jack Steiner <steiner@sgi.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      162a7e75
    • K
      mm: convert mm->cpu_vm_cpumask into cpumask_var_t · de03c72c
      KOSAKI Motohiro 提交于
      cpumask_t is very big struct and cpu_vm_mask is placed wrong position.
      It might lead to reduce cache hit ratio.
      
      This patch has two change.
      1) Move the place of cpumask into last of mm_struct. Because usually cpumask
         is accessed only front bits when the system has cpu-hotplug capability
      2) Convert cpu_vm_mask into cpumask_var_t. It may help to reduce memory
         footprint if cpumask_size() will use nr_cpumask_bits properly in future.
      
      In addition, this patch change the name of cpu_vm_mask with cpu_vm_mask_var.
      It may help to detect out of tree cpu_vm_mask users.
      
      This patch has no functional change.
      
      [akpm@linux-foundation.org: build fix]
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      de03c72c
  30. 20 5月, 2011 1 次提交
  31. 24 3月, 2011 1 次提交
  32. 23 3月, 2011 1 次提交
  33. 20 1月, 2011 1 次提交
    • T
      lockdep: Move early boot local IRQ enable/disable status to init/main.c · 2ce802f6
      Tejun Heo 提交于
      During early boot, local IRQ is disabled until IRQ subsystem is
      properly initialized.  During this time, no one should enable
      local IRQ and some operations which usually are not allowed with
      IRQ disabled, e.g. operations which might sleep or require
      communications with other processors, are allowed.
      
      lockdep tracked this with early_boot_irqs_off/on() callbacks.
      As other subsystems need this information too, move it to
      init/main.c and make it generally available.  While at it,
      toggle the boolean to early_boot_irqs_disabled instead of
      enabled so that it can be initialized with %false and %true
      indicates the exceptional condition.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NPekka Enberg <penberg@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      LKML-Reference: <20110120110635.GB6036@htj.dyndns.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2ce802f6
  34. 24 12月, 2010 1 次提交
    • T
      init: don't call flush_scheduled_work() from do_initcalls() · ee4569a3
      Tejun Heo 提交于
      The call to flush_scheduled_work() in do_initcalls() is there to make
      sure all works queued to system_wq by initcalls finish before the init
      sections are dropped.
      
      However, the call doesn't make much sense at this point - there
      already are multiple different workqueues and different subsystems are
      free to create and use their own.  Ordering requirements are and
      should be expressed explicitly.
      
      Drop the call to prepare for the deprecation and removal of
      flush_scheduled_work().
      
      Andrew suggested adding sanity check where the workqueue code checks
      whether any pending or running work has the work function in the init
      text section.  However, checking this for running works requires the
      worker to keep track of the current function being executed, and
      checking only the pending works will miss most cases.  As a violation
      will almost always be caught by the usual page fault mechanism, I
      don't think it would be worthwhile to make the workqueue code track
      extra state just for this.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      ee4569a3