1. 31 Jan, 2013 1 commit
    • efi: Make 'efi_enabled' a function to query EFI facilities · 83e68189
      Matt Fleming committed
      Originally 'efi_enabled' indicated whether a kernel was booted from
      EFI firmware. Over time its semantics have changed, and it now
      indicates whether or not we are booted on an EFI machine with
      bit-native firmware, e.g. 64-bit kernel with 64-bit firmware.
      
      The immediate motivation for this patch is the bug report at,
      
          https://bugs.launchpad.net/ubuntu-cdimage/+bug/1040557
      
      which details how running a platform driver on an EFI machine that is
      designed to run under BIOS can cause the machine to become
      bricked. Also, the following report,
      
          https://bugzilla.kernel.org/show_bug.cgi?id=47121
      
      details how running said driver can also cause Machine Check
      Exceptions. Drivers need a new means of detecting whether they're
      running on an EFI machine, as sadly the expression,
      
          if (!efi_enabled)
      
      hasn't been a sufficient condition for quite some time.
      
      Users actually want to query 'efi_enabled' for different reasons -
      what they really want access to is the list of available EFI
      facilities.
      
      For instance, the x86 reboot code needs to know whether it can invoke
      the ResetSystem() function provided by the EFI runtime services, while
      the ACPI OSL code wants to know whether the EFI config tables were
      mapped successfully. There are also checks in some of the platform
      driver code to simply see if they're running on an EFI machine (which
      would make it a bad idea to do BIOS-y things).
      
      This patch is a prereq for the samsung-laptop fix patch.
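
The idea of querying individual EFI facilities instead of a single flag can be sketched as a small bitmask model. This is a userland illustration only, not the kernel code: the facility names, their bit positions, and the helper efi_set_facility() are assumptions chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative facility bits; names and values are assumptions for this sketch. */
enum efi_facility {
    EFI_BOOT,             /* machine was booted via EFI firmware */
    EFI_CONFIG_TABLES,    /* EFI config tables were mapped successfully */
    EFI_RUNTIME_SERVICES, /* runtime services such as ResetSystem() are usable */
    EFI_FACILITY_COUNT
};

static unsigned long efi_facilities; /* one bit per facility */

/* Hypothetical helper: record that a facility became available during boot. */
void efi_set_facility(enum efi_facility f)
{
    assert(f < EFI_FACILITY_COUNT);
    efi_facilities |= 1UL << f;
}

/* Callers ask about one specific facility instead of testing a global flag. */
bool efi_enabled(enum efi_facility f)
{
    assert(f < EFI_FACILITY_COUNT);
    return (efi_facilities >> f) & 1UL;
}

/* A BIOS-only platform driver should refuse to run on any EFI boot. */
bool bios_driver_may_load(void)
{
    return !efi_enabled(EFI_BOOT);
}

/* The reboot path only cares whether runtime services can be invoked. */
bool can_reset_via_efi(void)
{
    return efi_enabled(EFI_RUNTIME_SERVICES);
}
```

With separate bits, a machine that booted via EFI but failed to set up runtime services still keeps BIOS-only drivers away, while the reboot code falls back to another method.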
      
      Cc: David Airlie <airlied@linux.ie>
      Cc: Corentin Chary <corentincj@iksaif.net>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Dave Jiang <dave.jiang@intel.com>
      Cc: Olof Johansson <olof@lixom.net>
      Cc: Peter Jones <pjones@redhat.com>
      Cc: Colin Ian King <colin.king@canonical.com>
      Cc: Steve Langasek <steve.langasek@canonical.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Konrad Rzeszutek Wilk <konrad@kernel.org>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Matt Fleming <matt.fleming@intel.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
  2. 19 Jan, 2013 1 commit
    • init, block: try to load default elevator module early during boot · bb813f4c
      Tejun Heo committed
      This patch adds default module loading and uses it to load the default
      block elevator.  During boot, it's called right after initramfs or
      initrd is made available and right before control is passed to
      userland.  This ensures that as long as the modules are available in
      the usual places in initramfs, initrd or the root filesystem, the
      default modules are loaded as soon as possible.
      
      This will replace the on-demand elevator module loading from elevator
      init path.
      
      v2: Fixed build breakage when !CONFIG_BLOCK.  Reported by kbuild test
          robot.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Alex Riesen <raa.lkml@gmail.com>
      Cc: Fengguang We <fengguang.wu@intel.com>
  3. 26 Dec, 2012 1 commit
    • Ensure that kernel_init_freeable() is not inlined into non __init code · f80b0c90
      Vineet Gupta committed
      Commit d6b21238 "make sure that we always have a return path from
      kernel_execve()" reshuffled kernel_init()/init_post() to ensure that
      kernel_execve() has a caller to return to.
      
      It removed __init annotation for kernel_init() and introduced/calls a
      new routine kernel_init_freeable(). Latter however is inlined by any
      reasonable compiler (ARC gcc 4.4 in this case), causing slight code
      bloat.
      
      This patch forces kernel_init_freeable() as noinline reducing the .text
      
      bloat-o-meter vmlinux vmlinux_new
      add/remove: 1/0 grow/shrink: 0/1 up/down: 374/-334 (40)
      function                        old     new   delta
      kernel_init_freeable              -     374    +374 (.init.text)
      kernel_init                     628     294    -334 (.text)
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  4. 20 Dec, 2012 1 commit
  5. 16 Dec, 2012 1 commit
  6. 13 Dec, 2012 1 commit
  7. 19 Nov, 2012 1 commit
    • pidns: Consolidate initialzation of special init task state · 1c4042c2
      Eric W. Biederman committed
      Instead of setting child_reaper and SIGNAL_UNKILLABLE one way
      for the system init process, and another way for pid namespace
      init processes, test pid->nr == 1 and use the same code for both.
      
      For the global init this results in SIGNAL_UNKILLABLE being set
      much earlier in the initialization process.
      
      This is a small cleanup and it paves the way for allowing unshare and
      enter of the pid namespace as that path like our global init also will
      not set CLONE_NEWPID.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
  8. 02 Nov, 2012 1 commit
  9. 30 Oct, 2012 1 commit
    • x86-64/efi: Use EFI to deal with platform wall clock (again) · bd52276f
      Jan Beulich committed
      Unlike ix86, x86-64 on EFI so far didn't set the
      {g,s}et_wallclock accessors to the EFI routines, thus
      incorrectly using raw RTC accesses instead.
      
      Simply removing the #ifdef around the respective code isn't
      enough, however: While so far early get-time calls were done in
      physical mode, this doesn't work properly for x86-64, as virtual
      addresses would still need to be set up for all runtime regions
      (which wasn't the case on the system I have access to), so
      instead the patch moves the call to efi_enter_virtual_mode()
      ahead (which in turn allows to drop all code related to calling
      efi-get-time in physical mode).
      
      Additionally the earlier calling of efi_set_executable()
      requires the CPA code to cope, i.e. during early boot it must be
      avoided to call cpa_flush_array(), as the first thing this
      function does is a BUG_ON(irqs_disabled()).
      
      Also make the two EFI functions in question here static -
      they're not being referenced elsewhere.
      
      History:
      
          This commit was originally merged as bacef661 ("x86-64/efi:
          Use EFI to deal with platform wall clock") but it resulted in some
          ASUS machines no longer booting due to a firmware bug, and so was
          reverted in f026cfa8. A pre-emptive fix for the buggy ASUS
          firmware was merged in 03a1c254975e ("x86, efi: 1:1 pagetable
          mapping for virtual EFI calls") so now this patch can be
          reapplied.
      Signed-off-by: Jan Beulich <jbeulich@suse.com>
      Tested-by: Matt Fleming <matt.fleming@intel.com>
      Acked-by: Matthew Garrett <mjg@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: Matt Fleming <matt.fleming@intel.com> [added commit history]
  10. 13 Oct, 2012 1 commit
    • infrastructure for saner ret_from_kernel_thread semantics · a74fb73c
      Al Viro committed
      * allow kernel_execve() leave the actual return to userland to
      caller (selected by CONFIG_GENERIC_KERNEL_EXECVE).  Callers
      updated accordingly.
      * architecture that does select GENERIC_KERNEL_EXECVE in its
      Kconfig should have its ret_from_kernel_thread() do this:
      	call schedule_tail
      	call the callback left for it by copy_thread(); if it ever
      returns, that's because it has just done successful kernel_execve()
      	jump to return from syscall
      IOW, its only difference from ret_from_fork() is that it does call the
      callback.
      * such an architecture should also get rid of ret_from_kernel_execve()
      and __ARCH_WANT_KERNEL_EXECVE
      
      This is the last part of infrastructure patches in that area - from
      that point on work on different architectures can live independently.
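
The control flow described for ret_from_kernel_thread() can be modelled in plain C as a three-step sequence. This is only an illustrative sketch: the real routine is per-architecture assembly, and the trace structure and the *_model names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical trace of the steps taken, so their order can be inspected. */
enum step { STEP_SCHEDULE_TAIL, STEP_CALLBACK, STEP_RETURN_TO_USER };

struct trace {
    enum step steps[8];
    size_t n;
};

void record(struct trace *t, enum step s)
{
    assert(t->n < 8);
    t->steps[t->n++] = s;
}

/* Stand-in for the callback that copy_thread() left for the new thread. */
typedef void (*thread_fn)(struct trace *);

/* ret_from_fork(): finish the context switch, then return from the syscall. */
void ret_from_fork_model(struct trace *t)
{
    record(t, STEP_SCHEDULE_TAIL);
    record(t, STEP_RETURN_TO_USER);
}

/* ret_from_kernel_thread(): identical, except that it calls the callback. */
void ret_from_kernel_thread_model(struct trace *t, thread_fn fn)
{
    record(t, STEP_SCHEDULE_TAIL);
    fn(t); /* returns only after a successful kernel_execve() */
    record(t, STEP_RETURN_TO_USER);
}

/* Example callback: pretends that kernel_execve() succeeded. */
void exec_callback_model(struct trace *t)
{
    record(t, STEP_CALLBACK);
}
```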
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  11. 12 Oct, 2012 1 commit
    • make sure that we always have a return path from kernel_execve() · d6b21238
      Al Viro committed
      The only place where kernel_execve() is called without a way to
      return to the caller of kernel_thread() callback is kernel_post().
      Reorganize kernel_init()/kernel_post() - instead of the former
      calling the latter in the end (and getting freed by it), have the
      latter *begin* with calling the former (and turn the latter into
      kernel_thread() callback, of course).
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  12. 09 Oct, 2012 1 commit
  13. 30 Sep, 2012 2 commits
  14. 15 Aug, 2012 1 commit
  15. 01 Aug, 2012 1 commit
    • mm/hotplug: correctly setup fallback zonelists when creating new pgdat · 9adb62a5
      Jiang Liu committed
      When hotadd_new_pgdat() is called to create new pgdat for a new node, a
      fallback zonelist should be created for the new node.  There's code to try
      to achieve that in hotadd_new_pgdat() as below:
      
      	/*
      	 * The node we allocated has no zone fallback lists. For avoiding
      	 * to access not-initialized zonelist, build here.
      	 */
      	mutex_lock(&zonelists_mutex);
      	build_all_zonelists(pgdat, NULL);
      	mutex_unlock(&zonelists_mutex);
      
      But it doesn't work as expected.  When hotadd_new_pgdat() is called, the
      new node is still in offline state because node_set_online(nid) hasn't
      been called yet.  And build_all_zonelists() only builds zonelists for
      online nodes as:
      
              for_each_online_node(nid) {
                      pg_data_t *pgdat = NODE_DATA(nid);
      
                      build_zonelists(pgdat);
                      build_zonelist_cache(pgdat);
              }
      
      So although we intend to create a zonelist for the new pgdat, none is
      built.  Add a new parameter "pgdat" to build_all_zonelists() so that
      zonelists are built for the new pgdat too.
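
The fix can be illustrated with a small userland model: the builder handles the explicitly passed node first when it is still offline, then walks the online nodes as before. The struct, the node array, and the *_model names are assumptions for this sketch and do not reproduce the kernel's data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 4

/* Simplified stand-in for pg_data_t. */
struct pgdat_model {
    bool online;         /* would be set by node_set_online() */
    bool zonelist_built; /* stands in for the real zonelist contents */
};

struct pgdat_model nodes[MAX_NODES];

void build_zonelists_model(struct pgdat_model *pgdat)
{
    assert(pgdat != NULL);
    pgdat->zonelist_built = true;
}

/*
 * 'self' is the pgdat being hot-added, or NULL when no hot-add is in
 * progress. Because that node is not online yet, the loop below would
 * skip it, so it is handled explicitly first.
 */
void build_all_zonelists_model(struct pgdat_model *self)
{
    if (self != NULL && !self->online)
        build_zonelists_model(self);

    for (size_t nid = 0; nid < MAX_NODES; nid++) {
        if (nodes[nid].online)
            build_zonelists_model(&nodes[nid]);
    }
}
```

Passing NULL keeps the old behaviour for callers that are not adding a node.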
      Signed-off-by: Jiang Liu <liuj97@gmail.com>
      Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Keping Chen <chenkeping@huawei.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  16. 27 Jul, 2012 1 commit
  17. 23 Jul, 2012 1 commit
    • switch fput to task_work_add · 4a9d4b02
      Al Viro committed
      ... and schedule_work() for interrupt/kernel_thread callers
      (and yes, now it *is* OK to call from interrupt).
      
      We are guaranteed that __fput() will be done before we return
      to userland (or exit).  Note that for fput() from a kernel
      thread we get an async behaviour; it's almost always OK, but
      sometimes you might need to have __fput() completed before
      you do anything else.  There are two mechanisms for that -
      a general barrier (flush_delayed_fput()) and explicit
      __fput_sync().  Both should be used with care (as was the
      case for fput() from kernel threads all along).  See comments
      in fs/file_table.c for details.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  18. 08 Jun, 2012 2 commits
    • init: Drop initcall level output · 19efb72f
      Borislav Petkov committed
      9fb48c74 ("params: add 3rd arg to option handler callback
      signature") added similar lines to dmesg:
      
      initlevel:0=early, 4 registered initcalls
      initlevel:1=core, 31 registered initcalls
      initlevel:2=postcore, 11 registered initcalls
      initlevel:3=arch, 7 registered initcalls
      initlevel:4=subsys, 40 registered initcalls
      initlevel:5=fs, 30 registered initcalls
      initlevel:6=device, 250 registered initcalls
      initlevel:7=late, 35 registered initcalls
      
      but they don't contain any info for the general user staring at dmesg.
      I'm very doubtful the count of initcalls registered per level helps
      anyone so drop that output completely.
      
      Cc: Jim Cromie <jim.cromie@gmail.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jason Baron <jbaron@redhat.com>
      Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • module_param: stop double-calling parameters. · ae82fdb1
      Rusty Russell committed
      Commit 026cee00 "params:
      <level>_initcall-like kernel parameters" set old-style module
      parameters to level 0.  And we call those level 0 calls where we used
      to, early in start_kernel().
      
      We also loop through the initcall levels and call the levelled
      module_params before the corresponding initcall.  Unfortunately level
      0 is early_init(), so we call the standard module_param calls twice.
      
      (Turns out most things don't care, but at least ubi.mtd does).
      
      Change the level to -1 for standard module_param calls.
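
The double call and its fix can be reproduced with a small model in which each parameter carries a level and parsing selects a level range. This is a sketch under assumptions: struct param_model, parse_params() and boot_model() are hypothetical names, and only the level values (-1 for standard parameters, 0 to 7 for initcall levels) come from the description above.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_INITCALL_LEVELS 8 /* levels 0 (early) through 7 (late) */

struct param_model {
    const char *name;
    int level;        /* level tag of the parameter */
    int times_parsed; /* how often its setter would have run */
};

/* Parse every parameter whose level lies in [min_level, max_level]. */
void parse_params(struct param_model *params, size_t n,
                  int min_level, int max_level)
{
    assert(min_level <= max_level);
    for (size_t i = 0; i < n; i++) {
        if (params[i].level >= min_level && params[i].level <= max_level)
            params[i].times_parsed++;
    }
}

/*
 * 'standard_level' is the level used for the early pass over standard
 * module_param entries: 0 models the buggy scheme, -1 the fixed one.
 */
void boot_model(struct param_model *params, size_t n, int standard_level)
{
    /* Early in start_kernel(): the standard parameters. */
    parse_params(params, n, standard_level, standard_level);

    /* Before each initcall level: the parameters tagged with that level. */
    for (int level = 0; level < NUM_INITCALL_LEVELS; level++)
        parse_params(params, n, level, level);
}
```

With level 0 the standard parameter matches both the early pass and the first loop iteration; with -1 it can never match a loop iteration, so it is parsed once.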
      Reported-by: Benoît Thébaudeau <benoit.thebaudeau@advansee.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: stable@kernel.org
  19. 06 Jun, 2012 1 commit
    • x86-64/efi: Use EFI to deal with platform wall clock · bacef661
      Jan Beulich committed
      Unlike ix86, x86-64 on EFI so far didn't set the
      {g,s}et_wallclock accessors to the EFI routines, thus
      incorrectly using raw RTC accesses instead.
      
      Simply removing the #ifdef around the respective code isn't
      enough, however: While so far early get-time calls were done in
      physical mode, this doesn't work properly for x86-64, as virtual
      addresses would still need to be set up for all runtime regions
      (which wasn't the case on the system I have access to), so
      instead the patch moves the call to efi_enter_virtual_mode()
      ahead (which in turn allows to drop all code related to calling
      efi-get-time in physical mode).
      
      Additionally the earlier calling of efi_set_executable()
      requires the CPA code to cope, i.e. during early boot it must be
      avoided to call cpa_flush_array(), as the first thing this
      function does is a BUG_ON(irqs_disabled()).
      
      Also make the two EFI functions in question here static -
      they're not being referenced elsewhere.
      Signed-off-by: Jan Beulich <jbeulich@suse.com>
      Tested-by: Matt Fleming <matt.fleming@intel.com>
      Acked-by: Matthew Garrett <mjg@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/4FBFBF5F020000780008637F@nat28.tlf.novell.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  20. 22 May, 2012 1 commit
    • Fix blocking allocations called very early during bootup · 31a67102
      Linus Torvalds committed
      During early boot, when the scheduler hasn't really been fully set up,
      we really can't do blocking allocations because with certain (dubious)
      configurations the "might_resched()" calls can actually result in
      scheduling events.
      
      We could just make such users always use GFP_ATOMIC, but quite often the
      code that does the allocation isn't really aware of the fact that the
      scheduler isn't up yet, and forcing that kind of random knowledge on the
      initialization code is just annoying and not good for anybody.
      
      And we actually have the 'gfp_allowed_mask' exactly for this reason:
      it's just that the kernel init sequence happens to set it to allow
      blocking allocations much too early.
      
      So move the 'gfp_allowed_mask' initialization from 'start_kernel()'
      (which is some of the earliest init code, and runs with preemption
      disabled for good reasons) into 'kernel_init()'.  kernel_init() is run
      in the newly created thread that will become the 'init' process, as
      opposed to the early startup code that runs within the context of what
      will be the first idle thread.
      
      So by the time we reach 'kernel_init()', we know that the scheduler must
      be at least limping along, because we've already scheduled from the idle
      thread into the init thread.
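
How a global mask keeps early allocations from blocking can be sketched as follows. The flag values and the *_MODEL / *_model names are illustrative assumptions, not the kernel's definitions; only the idea of masking requested flags with 'gfp_allowed_mask' and widening the mask in kernel_init() is taken from the text.

```c
#include <assert.h>

/* Illustrative flag bits; the kernel's real values differ. */
#define GFP_WAIT_MODEL 0x1u /* allocation may sleep */
#define GFP_IO_MODEL   0x2u /* allocation may start I/O */
#define GFP_FS_MODEL   0x4u /* allocation may call into the filesystem */
#define GFP_HIGH_MODEL 0x8u /* allocation may use emergency reserves */

#define GFP_ALL_MODEL \
    (GFP_WAIT_MODEL | GFP_IO_MODEL | GFP_FS_MODEL | GFP_HIGH_MODEL)

/* During early boot, anything that could block is masked out. */
#define GFP_BOOT_MASK_MODEL \
    (GFP_ALL_MODEL & ~(GFP_WAIT_MODEL | GFP_IO_MODEL | GFP_FS_MODEL))

unsigned int gfp_allowed_mask = GFP_BOOT_MASK_MODEL;

/* Every allocation filters its requested flags through the global mask. */
unsigned int effective_gfp_model(unsigned int requested)
{
    return requested & gfp_allowed_mask;
}

/* Runs in the init thread, i.e. once the scheduler is known to work. */
void kernel_init_model(void)
{
    gfp_allowed_mask = GFP_ALL_MODEL;
}
```

The calling code does not need to know how far boot has progressed: the same request is silently downgraded before kernel_init_model() runs and honoured in full afterwards.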
      Reported-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  21. 01 May, 2012 1 commit
    • params: add 3rd arg to option handler callback signature · 9fb48c74
      Jim Cromie committed
      Add a 3rd arg, named "doing", to unknown-options callbacks invoked
      from parse_args(). The arg is passed as:
      
        "Booting kernel" from start_kernel(),
        initcall_level_names[i] from do_initcall_level(),
        mod->name from load_module(), via parse_args(), parse_one()
      
      parse_args() already has the "name" parameter, which is renamed to
      "doing" to better reflect current uses 1,2 above.  parse_args() passes
      it to an altered parse_one(), which now passes it down into the
      unknown option handler callbacks.
      
      The mod->name will be needed to handle dyndbg for loadable modules,
      since params passed by modprobe are not qualified (they do not have a
      "$modname." prefix), and by the time the unknown-param callback is
      called, the module name is not otherwise available.
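
A minimal sketch of a three-argument unknown-option callback, assuming a simplified parser that knows a single parameter. The names parse_one_model() and report_unknown(), the const-qualified argument types, and the message format are hypothetical; the point is only that the handler receives the "doing" context (for example a module name) along with the parameter.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Callback for parameters the parser does not recognise. */
typedef int (*unknown_fn)(const char *param, const char *val,
                          const char *doing);

/* Simplified parser: knows only "debug", hands everything else on. */
int parse_one_model(const char *param, const char *val,
                    const char *doing, unknown_fn handle_unknown)
{
    assert(param != NULL && doing != NULL);
    if (strcmp(param, "debug") == 0)
        return 0;
    return handle_unknown(param, val, doing);
}

char last_message[128];

/* Example handler: qualifies its report with the caller's context. */
int report_unknown(const char *param, const char *val, const char *doing)
{
    snprintf(last_message, sizeof last_message,
             "%s: unknown parameter '%s=%s'", doing, param, val ? val : "");
    return -1;
}
```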
      
      Minor tweaks:
      
      Add param-name to parse_one's pr_debug(); the current message doesn't
      identify the param being handled.
      
      Add a pr_info to print current level and level_name of the initcall,
      and number of registered initcalls at that level.  This adds 7 lines
      to dmesg output, like:
      
         initlevel:6=device, 172 registered initcalls
      
      Drop "parameters" from initcall_level_names[], it's unhelpful in the
      pr_info() added above.  This array is passed into parse_args() by
      do_initcall_level().
      
      CC: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
      Acked-by: Jason Baron <jbaron@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  22. 25 Apr, 2012 1 commit
  23. 29 Mar, 2012 1 commit
  24. 26 Mar, 2012 1 commit
    • params: <level>_initcall-like kernel parameters · 026cee00
      Pawel Moll committed
      This patch adds a set of macros that can be used to declare
      kernel parameters to be parsed _before_ initcalls at a chosen
      level are executed.  We rename the now-unused "flags" field of
      struct kernel_param as the level.  It's signed, for when we
      use this for early params as well, in future.
      
      Linker macro collating init calls had to be modified in order
      to add additional symbols between levels that are later used
      by the init code to split the calls into blocks.
      Signed-off-by: Pawel Moll <pawel.moll@arm.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
  25. 15 Mar, 2012 1 commit
  26. 01 Mar, 2012 1 commit
  27. 13 Jan, 2012 1 commit
  28. 06 Dec, 2011 2 commits
  29. 26 Oct, 2011 2 commits
  30. 30 Sep, 2011 1 commit
    • bootup: move 'usermodehelper_enable()' a little earlier · b0f84374
      wangyanqing committed
      Commit d5767c53 ("bootup: move 'usermodehelper_enable()' to the end
      of do_basic_setup()") moved 'usermodehelper_enable()' to the end of
      do_basic_setup(), after the initcalls.  With that change uvesafb no
      longer works on my computer, and I lose the boot splash.
      
      So maybe we could call usermodehelper_enable() a little earlier, so that
      tasks that need early init with the help of user mode keep working.
      
      [ I would *really* prefer that initcalls not call into user space - even
        the real 'init' hasn't been execve'd yet, after all! But for uvesafb
        it really does look like we don't have much choice.
      
        I considered doing this when we mount the root filesystem, but
        depending on config options that is in multiple places.  We could do
        the usermode helper enable as a rootfs_initcall()..
      
        So I'm just using wang yanqing's trivial patch.  It's not wonderful,
        but it's simple and should work.  We should revisit this some day,
        though.      - Linus ]
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  31. 29 Sep, 2011 1 commit
    • bootup: move 'usermodehelper_enable()' to the end of do_basic_setup() · d5767c53
      Linus Torvalds committed
      Doing it just before starting to call into cpu_idle() made a sick kind
      of sense only because the original bug we fixed (see commit
      288d5abe: "Boot up with usermodehelper disabled") was about problems
      with some scheduler data structures not being initialized, and they had
      better be initialized at that point.
      
      But it really didn't make any other conceptual sense, and doing it after
      the initial "schedule()" call for the idle thread actually opened up a
      race: what if the main initialization thread did everything without
      needing to sleep, and got all the way into user land too? Without
      actually having scheduled back to the idle thread?
      
      Now, in normal circumstances that doesn't ever happen, but it looks like
      Richard Cochran triggered exactly that on his ARM IXP4xx machines:
      
        "I have some ARM IXP4xx based machines that use the two on chip MAC
         ports (aka NPEs).  The NPE needs a firmware in order to function.
         Ever since the following commit [that 288d5abe one], it is no
         longer possible to bring up the interfaces during the init scripts."
      
      with a call trace showing an ioctl coming from user space. Richard says:
      
        "The init is busybox, and the startup script does mount, syslogd, and
         then ifup, so that all can go by quickly."
      
      The fix is to move the usermodehelper_enable() into the main 'init'
      thread, and just put it after we've done all our initcalls.  By then,
      everything really should be up, but we've obviously not actually started
      the user-mode portion of init yet.
      Reported-and-tested-by: Richard Cochran <richardcochran@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  32. 22 Sep, 2011 1 commit
  33. 04 Aug, 2011 2 commits
  34. 17 Jun, 2011 1 commit
  35. 30 May, 2011 1 commit
    • mm: Fix boot crash in mm_alloc() · 6345d24d
      Linus Torvalds committed
      Thomas Gleixner reports that we now have a boot crash triggered by
      CONFIG_CPUMASK_OFFSTACK=y:
      
          BUG: unable to handle kernel NULL pointer dereference at   (null)
          IP: [<c11ae035>] find_next_bit+0x55/0xb0
          Call Trace:
           [<c11addda>] cpumask_any_but+0x2a/0x70
           [<c102396b>] flush_tlb_mm+0x2b/0x80
           [<c1022705>] pud_populate+0x35/0x50
           [<c10227ba>] pgd_alloc+0x9a/0xf0
           [<c103a3fc>] mm_init+0xec/0x120
           [<c103a7a3>] mm_alloc+0x53/0xd0
      
      which was introduced by commit de03c72c ("mm: convert
      mm->cpu_vm_cpumask into cpumask_var_t"), and is due to wrong ordering of
      mm_init() vs mm_init_cpumask().
      
      Thomas wrote a patch to just fix the ordering of initialization, but I
      hate the new double allocation in the fork path, so I ended up instead
      doing some more radical surgery to clean it all up.
      Reported-by: Thomas Gleixner <tglx@linutronix.de>
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>