1. 07 Jan, 2012 1 commit
  2. 05 Jan, 2012 1 commit
    • C
      NFS: Retry mounting NFSROOT · 43717c7d
      Chuck Lever committed
      Lukas Razik <linux@razik.name> reports that on his SPARC system,
      booting with an NFS root file system stopped working after commit
      56463e50 "NFS: Use super.c for NFSROOT mount option parsing."
      
      We found that the network switch to which Lukas' client was attached
      was delaying access to the LAN after the client's NIC driver reported
      that its link was up.  The delay was longer than the timeouts used in
      the NFS client during mounting.
      
      NFSROOT worked for Lukas before commit 56463e50 because in those
      kernels, the client's first operation was an rpcbind request to
      determine which port the NFS server was listening on.  When that
      request failed after a long timeout, the client simply selected the
      default NFS port (2049).  By that time the switch was allowing access
      to the LAN, and the mount succeeded.
      
      Neither of these client behaviors is desirable, so reverting 56463e50
      is really not a choice.  Instead, introduce a mechanism that retries
      the NFSROOT mount request several times.  This is the same tactic that
      normal user space NFS mounts employ to overcome server and network
      delays.
      Signed-off-by: Lukas Razik <linux@razik.name>
      [ cel: match kernel coding style, add proper patch description ]
      [ cel: add exponential back-off ]
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Tested-by: Lukas Razik <linux@razik.name>
      Cc: stable@kernel.org # > 2.6.38
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      43717c7d
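The retry tactic described in the NFSROOT commit above can be sketched in user-space C. This is a minimal illustration, not the kernel patch itself: the constants, `retry_with_backoff`, and the `fake_*` helpers are hypothetical names, and the delay values are assumptions chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative limits (assumed); the values in the actual patch may differ. */
#define RETRY_TIMEOUT_MIN 5   /* seconds to wait before the first retry */
#define RETRY_TIMEOUT_MAX 30  /* cap on the delay between retries */
#define RETRY_MAX         5   /* retries allowed after the initial attempt */

typedef int  (*attempt_fn)(void *ctx);                 /* returns 0 on success */
typedef void (*sleep_fn)(unsigned int seconds, void *ctx);

/* Try once, then retry with a doubling, capped delay.
 * Returns 1 if some attempt succeeded, 0 if every attempt failed. */
int retry_with_backoff(attempt_fn attempt, sleep_fn wait, void *ctx)
{
    unsigned int timeout = RETRY_TIMEOUT_MIN;
    int try;

    assert(attempt != NULL && wait != NULL);

    for (try = 1; ; try++) {
        if (attempt(ctx) == 0)
            return 1;
        if (try > RETRY_MAX)
            break;
        wait(timeout, ctx);          /* give the network time to come up */
        timeout <<= 1;               /* exponential back-off */
        if (timeout > RETRY_TIMEOUT_MAX)
            timeout = RETRY_TIMEOUT_MAX;
    }
    return 0;
}

/* Hypothetical stand-in for a switch that blocks traffic for a while. */
struct fake_link {
    int failures_left;     /* attempts that will still fail */
    unsigned int slept;    /* total seconds "slept" so far */
};

int fake_mount(void *ctx)
{
    struct fake_link *link = ctx;

    if (link->failures_left > 0) {
        link->failures_left--;
        return -1;
    }
    return 0;
}

void fake_sleep(unsigned int seconds, void *ctx)
{
    struct fake_link *link = ctx;

    link->slept += seconds;  /* record the delay instead of really sleeping */
}
```

With these assumed limits, a link that rejects the first three attempts is reached on the fourth, after waits of 5, 10, and 20 seconds. Passing the sleep function as a parameter keeps the sketch testable without real delays.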
  3. 04 Jan, 2012 1 commit
  4. 13 Dec, 2011 1 commit
  5. 12 Dec, 2011 1 commit
  6. 06 Dec, 2011 3 commits
  7. 03 Nov, 2011 3 commits
    • W
      sysctl: make CONFIG_SYSCTL_SYSCALL default to n · c736de60
      WANG Cong committed
      When I tried to send a patch to remove it, Andi told me we still need to
      keep compatibility with old libc, so we can't remove this completely.  So
      just make it default to n and remove the entry from
      feature-removal-schedule.txt.
      Signed-off-by: WANG Cong <amwang@redhat.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c736de60
    • W
      init: add root=PARTUUID=UUID/PARTNROFF=%d support · 79975f13
      Will Drewry committed
      Expand root=PARTUUID=UUID syntax to support selecting a root partition by
      integer offset from a known, unique partition.  This approach provides
      similar properties to specifying a device and partition number, but using
      the UUID as the unique path prior to evaluating the offset.
      
      For example,
        root=PARTUUID=99DE9194-FC15-4223-9192-FC243948F88B/PARTNROFF=1
      selects the partition with UUID 99DE.., then selects the next
      partition.
      
      This change is motivated by a particular usecase in Chromium OS where the
      bootloader can easily determine what partition it is on (by UUID) but
      doesn't perform general partition table walking.
      
      That said, support for this model provides a direct mechanism for the user
      to modify the root partition to boot without specifically needing to
      extract each UUID or update the bootloader explicitly when the root
      partition UUID is changed (if it is recreated to be larger, for instance).
      Pinning to a /boot-style partition UUID allows arbitrary root
      partition reconfiguration/modification with slightly less ambiguity than
      just [dev][partition] and less stringency than the specific root partition
      UUID.
      
      [sfr@canb.auug.org.au: fix init sections warning]
      Signed-off-by: Will Drewry <wad@chromium.org>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Randy Dunlap <rdunlap@xenotime.net>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      79975f13
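The root=PARTUUID=UUID/PARTNROFF=%d syntax described in the commit above can be illustrated with a small user-space parser. This is a sketch under stated assumptions, not the kernel's parsing code: `parse_partuuid` and `UUID_STR_LEN` are hypothetical names, and the strictness about trailing characters is a choice made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define UUID_STR_LEN 36  /* 8-4-4-4-12 hex digits plus four dashes */

/* Parse "PARTUUID=<uuid>[/PARTNROFF=<n>]".
 * On success, copies the UUID text into uuid, stores the partition offset
 * (0 when no PARTNROFF suffix is given) in *offset, and returns 0.
 * Returns -1 if the string is malformed. */
int parse_partuuid(const char *spec, char uuid[UUID_STR_LEN + 1], int *offset)
{
    static const char prefix[] = "PARTUUID=";
    const char *p, *slash;
    size_t len;
    char trailing;

    assert(spec != NULL && uuid != NULL && offset != NULL);

    if (strncmp(spec, prefix, sizeof(prefix) - 1) != 0)
        return -1;
    p = spec + sizeof(prefix) - 1;

    /* The UUID runs up to the optional '/' separator. */
    slash = strchr(p, '/');
    len = slash ? (size_t)(slash - p) : strlen(p);
    if (len != UUID_STR_LEN)
        return -1;
    memcpy(uuid, p, len);
    uuid[len] = '\0';

    *offset = 0;
    if (slash == NULL)
        return 0;

    /* The extra %c only converts if something follows the integer,
     * so a result of exactly 1 means "an integer and nothing else". */
    if (sscanf(slash + 1, "PARTNROFF=%d%c", offset, &trailing) != 1)
        return -1;
    return 0;
}
```

For the example string in the commit message, this yields the UUID 99DE9194-FC15-4223-9192-FC243948F88B and an offset of 1, meaning "the partition after the one with this UUID".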
    • N
      init/do_mounts_rd.c: fix ramdisk identification for padded cramfs · f919b923
      Neil Armstrong committed
      When a cramfs ramdisk padded with 512 bytes is given to the kernel, the
      current identify_ramdisk_image function fails to identify it.
      
      Tested with a padded cramfs image on an ARM based board.
      Signed-off-by: Neil Armstrong <narmstrong@neotion.com>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Davidlohr Bueso <dave@gnu.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f919b923
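The padded-cramfs case from the commit above can be sketched as a magic-number probe at two offsets. This is an assumption-laden illustration rather than the code in identify_ramdisk_image: `find_cramfs_superblock`, `cramfs_magic_at`, and `CRAMFS_PAD` are hypothetical names, and the probe only checks the native-endian cramfs magic value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CRAMFS_MAGIC 0x28cd3d45u  /* cramfs superblock magic number */
#define CRAMFS_PAD   512          /* padding size mentioned in the commit */

/* Returns 1 if a native-endian cramfs magic sits at byte offset off. */
static int cramfs_magic_at(const unsigned char *buf, size_t len, size_t off)
{
    uint32_t magic;

    if (off > len || len - off < sizeof(magic))
        return 0;
    memcpy(&magic, buf + off, sizeof(magic));  /* avoids unaligned reads */
    return magic == CRAMFS_MAGIC;
}

/* Returns the superblock offset (0 or CRAMFS_PAD), or -1 if neither
 * location holds the cramfs magic. */
long find_cramfs_superblock(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);

    if (cramfs_magic_at(buf, len, 0))
        return 0;
    if (cramfs_magic_at(buf, len, CRAMFS_PAD))
        return CRAMFS_PAD;
    return -1;
}
```

Checking offset 0 first keeps unpadded images working as before; the second probe is what lets an image with 512 bytes of leading padding be recognized.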
  8. 26 Oct, 2011 2 commits
  9. 30 Sep, 2011 1 commit
    • W
      bootup: move 'usermodehelper_enable()' a little earlier · b0f84374
      wangyanqing committed
      Commit d5767c53 ("bootup: move 'usermodehelper_enable()' to the end
      of do_basic_setup()") moved 'usermodehelper_enable()' to the end of
      do_basic_setup(), after the initcalls.  But since then uvesafb no
      longer works on my computer, and I lose the boot splash.
      
      So maybe we could call usermodehelper_enable() a little earlier, so
      that tasks which need early init with the help of user mode can work.
      
      [ I would *really* prefer that initcalls not call into user space - even
        the real 'init' hasn't been execve'd yet, after all! But for uvesafb
        it really does look like we don't have much choice.
      
        I considered doing this when we mount the root filesystem, but
        depending on config options that is in multiple places.  We could do
        the usermode helper enable as a rootfs_initcall()..
      
        So I'm just using wang yanqing's trivial patch.  It's not wonderful,
        but it's simple and should work.  We should revisit this some day,
        though.      - Linus ]
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b0f84374
  10. 29 Sep, 2011 2 commits
    • P
      rcu: Drive configuration directly from SMP and PREEMPT · 8008e129
      Paul E. McKenney committed
      This commit eliminates the possibility of running TREE_PREEMPT_RCU
      when SMP=n and of running TINY_RCU when PREEMPT=y.  People who really
      want these combinations can hand-edit init/Kconfig, but eliminating
      them as choices for production systems reduces the amount of testing
      required.  It will also allow cutting out a few #ifdefs.
      
      Note that running TREE_RCU and TINY_RCU on single-CPU systems using
      SMP-built kernels is still supported.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      8008e129
    • L
      bootup: move 'usermodehelper_enable()' to the end of do_basic_setup() · d5767c53
      Linus Torvalds committed
      Doing it just before starting to call into cpu_idle() made a sick kind
      of sense only because the original bug we fixed (see commit
      288d5abe: "Boot up with usermodehelper disabled") was about problems
      with some scheduler data structures not being initialized, and they had
      better be initialized at that point.
      
      But it really didn't make any other conceptual sense, and doing it after
      the initial "schedule()" call for the idle thread actually opened up a
      race: what if the main initialization thread did everything without
      needing to sleep, and got all the way into user land too? Without
      actually having scheduled back to the idle thread?
      
      Now, in normal circumstances that doesn't ever happen, but it looks like
      Richard Cochran triggered exactly that on his ARM IXP4xx machines:
      
        "I have some ARM IXP4xx based machines that use the two on chip MAC
         ports (aka NPEs).  The NPE needs a firmware in order to function.
         Ever since the following commit [that 288d5abe one], it is no
         longer possible to bring up the interfaces during the init scripts."
      
      with a call trace showing an ioctl coming from user space. Richard says:
      
        "The init is busybox, and the startup script does mount, syslogd, and
         then ifup, so that all can go by quickly."
      
      The fix is to move the usermodehelper_enable() into the main 'init'
      thread, and just put it after we've done all our initcalls.  By then,
      everything really should be up, but we've obviously not actually started
      the user-mode portion of init yet.
      Reported-and-tested-by: Richard Cochran <richardcochran@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d5767c53
  11. 22 Sep, 2011 1 commit
  12. 14 Aug, 2011 1 commit
  13. 04 Aug, 2011 2 commits
  14. 26 Jul, 2011 2 commits
  15. 23 Jun, 2011 1 commit
    • R
      Fix CPU spinlock lockups on secondary CPU bringup · 1b19ca9f
      Russell King committed
      Secondary CPU bringup typically calls calibrate_delay() during its
      initialization.  However, calibrate_delay() modifies a global variable
      (loops_per_jiffy) used for udelay() and __delay().
      
      A side effect of 71c696b1 ("calibrate: extract fall-back calculation
      into own helper") introduced in the 2.6.39 merge window means that we
      end up with a substantial period where loops_per_jiffy is zero.  This
      causes the spinlock debugging code to malfunction:
      
      	u64 loops = loops_per_jiffy * HZ;
      	for (;;) {
      		for (i = 0; i < loops; i++) {
      			if (arch_spin_trylock(&lock->raw_lock))
      				return;
      			__delay(1);
      		}
      		...
      	}
      
      by never calling arch_spin_trylock() - resulting in the CPU locking
      up in an infinite loop inside __spin_lock_debug().
      
      Work around this by writing to loops_per_jiffy only once we have
      completed all the calibration decisions.
      Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: <stable@kernel.org> (2.6.39-stable)
      --
      Better solutions (such as omitting the calibration for secondary CPUs,
      or arranging for calibrate_delay() to return the LPJ value and leave
      it to the caller to decide where to store it) are a possibility, but
      would be much more invasive into each architecture.
      
      I think this is the best solution for -rc and stable, but it should be
      revisited for the next merge window.
      
       init/calibrate.c |   14 ++++++++------
       1 files changed, 8 insertions(+), 6 deletions(-)
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1b19ca9f
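The workaround in the commit above (write loops_per_jiffy only after all calibration decisions are made) can be sketched as "compute in a local, publish once". This is a single-threaded stand-in, not the code in init/calibrate.c: `spin_budget`, `calibrate_publish_once`, `fake_measure`, `seen_during_measure`, and `DEMO_HZ` are hypothetical names, and the numbers are made up.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_HZ 100  /* assumed tick rate, for illustration only */

/* Inner-loop bound as computed by the spinlock debugging code quoted in the
 * commit message: with lpj == 0 the bound is 0, so the trylock in the loop
 * body is never attempted. */
unsigned long long spin_budget(unsigned long lpj)
{
    return (unsigned long long)lpj * DEMO_HZ;
}

/* Value of the shared variable observed while the measurement was running. */
unsigned long seen_during_measure;

/* Hypothetical measurement: peeks at the shared value the way a concurrent
 * udelay() or spinlock user might, then returns a made-up estimate. */
unsigned long fake_measure(const unsigned long *shared)
{
    seen_during_measure = *shared;
    return 5000;
}

/* Run the measurement against a local result and store it in the shared
 * variable exactly once, so readers never see a transient zero. */
unsigned long calibrate_publish_once(unsigned long *shared,
                                     unsigned long (*measure)(const unsigned long *))
{
    unsigned long lpj;

    assert(shared != NULL && measure != NULL);

    lpj = measure(shared);  /* intermediate work stays in the local */
    *shared = lpj;          /* single write at the very end */
    return lpj;
}
```

In this sketch the shared value stays at its previous, non-zero setting for the whole duration of the measurement, which is the property the commit message relies on to keep the spin loop bound non-zero.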
  16. 17 Jun, 2011 1 commit
  17. 16 Jun, 2011 3 commits
  18. 09 Jun, 2011 2 commits
  19. 07 Jun, 2011 1 commit
  20. 30 May, 2011 1 commit
    • L
      mm: Fix boot crash in mm_alloc() · 6345d24d
      Linus Torvalds committed
      Thomas Gleixner reports that we now have a boot crash triggered by
      CONFIG_CPUMASK_OFFSTACK=y:
      
          BUG: unable to handle kernel NULL pointer dereference at   (null)
          IP: [<c11ae035>] find_next_bit+0x55/0xb0
          Call Trace:
           [<c11addda>] cpumask_any_but+0x2a/0x70
           [<c102396b>] flush_tlb_mm+0x2b/0x80
           [<c1022705>] pud_populate+0x35/0x50
           [<c10227ba>] pgd_alloc+0x9a/0xf0
           [<c103a3fc>] mm_init+0xec/0x120
           [<c103a7a3>] mm_alloc+0x53/0xd0
      
      which was introduced by commit de03c72c ("mm: convert
      mm->cpu_vm_cpumask into cpumask_var_t"), and is due to wrong ordering of
      mm_init() vs mm_init_cpumask().
      
      Thomas wrote a patch to just fix the ordering of initialization, but I
      hate the new double allocation in the fork path, so I ended up instead
      doing some more radical surgery to clean it all up.
      Reported-by: Thomas Gleixner <tglx@linutronix.de>
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6345d24d
  21. 27 May, 2011 1 commit
  22. 25 May, 2011 3 commits
    • M
      printk: allocate kernel log buffer earlier · 162a7e75
      Mike Travis committed
      On larger systems, because of the numerous ACPI, Bootmem and EFI messages,
      the static log buffer overflows before the larger one specified by the
      log_buf_len param is allocated.  Minimize the overflow by allocating the
      new log buffer as soon as possible.
      
      On kernels without memblock, a later call to setup_log_buf from
      kernel/init.c is the fallback.
      
      [akpm@linux-foundation.org: coding-style fixes]
      [akpm@linux-foundation.org: fix CONFIG_PRINTK=n build]
      Signed-off-by: Mike Travis <travis@sgi.com>
      Cc: Yinghai Lu <yhlu.kernel@gmail.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Jack Steiner <steiner@sgi.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      162a7e75
    • A
      init/calibrate.c: fix for critical bogoMIPS intermittent calculation failure · d2b46313
      Andrew Worsley committed
      A fix to the TSC (Time Stamp Counter) based bogoMIPS calculation used on
      secondary CPUs which has two faults:
      
      1: Not handling wrapping of the lower 32 bits of the TSC counter on
         32bit kernel - perhaps TSC is not reset by a warm reset?
      
      2: TSC and jiffies are not incrementing together properly.  Either
         jiffies increment too quickly, or the Time Stamp Counter isn't
         incremented during an SMI but the real time clock is and jiffies are
         incremented.
      
      Case 1 can result in a factor of 16 too large a value which makes udelay()
      values too small and can cause mysterious driver errors.  Case 2 appears
      to give smaller 10-15% errors after averaging but enough to cause
      occasional failures on my own board.
      
      I have tested this code on my own branch and attach patch suitable for
      current kernel code.  See below for examples of the failures and how the
      fix handles these situations now.
      
      I reported this issue earlier here:
           Intermittent problem with BogoMIPs calculation on Intel AP CPUs -
      http://marc.info/?l=linux-kernel&m=129947246316875&w=4
      
      I suspect this issue has been seen by others but as it is intermittent and
      bogoMIPS for secondary CPUs are no longer printed out it might have been
      difficult to identify this as the cause.  Perhaps these unresolved issues,
      although quite old, might be relevant as possibly this fault has been
      around for a while.  In particular Case 1 may only be relevant to 32bit
      kernels on newer HW (most people run 64bit kernels?).  Case 2 is less
      dramatic since the earlier fix in this area and also intermittent.
      
         Re: bogomips discrepancy on Intel Core2 Quad CPU -
      http://marc.info/?l=linux-kernel&m=118929277524298&w=4
         slow system and bogus bogomips  -
      http://marc.info/?l=linux-kernel&m=116791286716107&w=4
         Re: Re: [RFC-PATCH] clocksource: update lpj if clocksource has -
      http://marc.info/?l=linux-kernel&m=128952775819467&w=4
      
      This issue is masked a little by commit feae3203 ("timers, init:
      Limit the number of per cpu calibration bootup messages") which only
      prints out the first bogoMIPS value making it much harder to notice other
      values differing.  Perhaps it should be changed to only suppress them when
      they are similar values?
      
      Here are some outputs showing faults occurring and the new code handling
      them properly.  See my earlier message for examples of the original
      failure.
      
          Case 1:   A Time Stamp Counter wrap:
      ...
      Calibrating delay loop (skipped), value calculated using timer
      frequency.. 6332.70 BogoMIPS (lpj=31663540)
      ....
      calibrate_delay_direct() timer_rate_max=31666493
      timer_rate_min=31666151 pre_start=4170369255 pre_end=4202035539
      calibrate_delay_direct() timer_rate_max=2425955274
      timer_rate_min=2425954941 pre_start=4265368533 pre_end=2396356387
      calibrate_delay_direct() ignoring timer_rate as we had a TSC wrap
      around start=4265368581 >=post_end=2396356511
      calibrate_delay_direct() timer_rate_max=31666274
      timer_rate_min=31665942 pre_start=2440373374 pre_end=2472039515
      calibrate_delay_direct() timer_rate_max=31666492
      timer_rate_min=31666160 pre_start=2535372139 pre_end=2567038422
      calibrate_delay_direct() timer_rate_max=31666455
      timer_rate_min=31666207 pre_start=2630371084 pre_end=2662037415
      Calibrating delay using timer specific routine.. 6333.28 BogoMIPS (lpj=31666428)
      Total of 2 processors activated (12665.99 BogoMIPS).
      ....
      
          Case 2:  Something (presumably the SMM interrupt?) causing a
      very low increase in the TSC counter for the DELAY_CALIBRATION_TICKS
      increase in jiffies
      ...
      Calibrating delay loop (skipped), value calculated using timer
      frequency.. 6333.25 BogoMIPS (lpj=31666270)
      ...
      calibrate_delay_direct() timer_rate_max=31666483
      timer_rate_min=31666074 pre_start=4199536526 pre_end=4231202809
      calibrate_delay_direct() timer_rate_max=864348 timer_rate_min=864016
      pre_start=2405343672 pre_end=2406207897
      calibrate_delay_direct() timer_rate_max=31666483
      timer_rate_min=31666179 pre_start=2469540464 pre_end=2501206823
      calibrate_delay_direct() timer_rate_max=31666511
      timer_rate_min=31666122 pre_start=2564539400 pre_end=2596205712
      calibrate_delay_direct() timer_rate_max=31666084
      timer_rate_min=31665685 pre_start=2659538782 pre_end=2691204657
      calibrate_delay_direct() dropping min bogoMips estimate 1 = 864348
      Calibrating delay using timer specific routine.. 6333.27 BogoMIPS (lpj=31666390)
      Total of 2 processors activated (12666.53 BogoMIPS).
      ...
      
      After 70 boots I saw 2 variations <1% slip through
      
      [akpm@linux-foundation.org: coding-style fixes]
      [akpm@linux-foundation.org: fix straggly printk mess]
      Signed-off-by: Andrew Worsley <amworsley@gmail.com>
      Reviewed-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d2b46313
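The two faults described in the bogoMIPS commit above suggest two checks: discard a sample whose 32-bit counter wrapped, and discard an estimate that disagrees badly with the rest before averaging. The sketch below shows one way to do that; it is not the code in calibrate_delay_direct(). `sample_wrapped` and `robust_rate` are hypothetical names, and the 1/8 tolerance is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Case 1: a sample is unusable if the 32-bit counter did not advance
 * between the first and last read, which indicates a wrap. */
int sample_wrapped(uint32_t start, uint32_t post_end)
{
    return start >= post_end;
}

/* Case 2: average the estimates after discarding outliers.
 * While the spread exceeds 1/8 of the maximum (assumed tolerance), drop
 * whichever extreme lies farther from the mean of the other samples.
 * Reorders the array in place. Returns 0 when there are no samples. */
unsigned long robust_rate(unsigned long *rate, size_t n)
{
    assert(rate != NULL || n == 0);

    while (n > 1) {
        size_t i, imin = 0, imax = 0, drop;
        unsigned long long sum = 0, mean_wo_max, mean_wo_min;

        for (i = 0; i < n; i++) {
            sum += rate[i];
            if (rate[i] < rate[imin])
                imin = i;
            if (rate[i] > rate[imax])
                imax = i;
        }
        if (rate[imax] - rate[imin] <= rate[imax] / 8)
            return (unsigned long)(sum / n);  /* samples agree: average them */

        mean_wo_max = (sum - rate[imax]) / (n - 1);
        mean_wo_min = (sum - rate[imin]) / (n - 1);
        drop = (rate[imax] - mean_wo_max > mean_wo_min - rate[imin])
                   ? imax : imin;

        rate[drop] = rate[n - 1];  /* overwrite the outlier with the last entry */
        n--;
    }
    return n ? rate[0] : 0;
}
```

Fed the five timer_rate_max values from the Case 2 log (31666483, 864348, 31666483, 31666511, 31666084), this sketch drops 864348 and averages the remaining four to 31666390, the lpj value printed in that log. The start and post_end values from the Case 1 log (4265368581 and 2396356511) are flagged as a wrap.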
    • K
      mm: convert mm->cpu_vm_cpumask into cpumask_var_t · de03c72c
      KOSAKI Motohiro committed
      cpumask_t is a very big struct, and cpu_vm_mask is placed in the wrong
      position, which might reduce the cache hit ratio.
      
      This patch makes two changes:
      1) Move the cpumask to the end of mm_struct, because usually only the
         front bits of the cpumask are accessed when the system has
         cpu-hotplug capability.
      2) Convert cpu_vm_mask into cpumask_var_t.  It may help to reduce the
         memory footprint if cpumask_size() uses nr_cpumask_bits properly in
         the future.
      
      In addition, this patch renames cpu_vm_mask to cpu_vm_mask_var, which
      may help to detect out-of-tree cpu_vm_mask users.
      
      This patch has no functional change.
      
      [akpm@linux-foundation.org: build fix]
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      de03c72c
  23. 23 May, 2011 1 commit
    • L
      Give up on pushing CC_OPTIMIZE_FOR_SIZE · 281dc5c5
      Linus Torvalds committed
      I still happen to believe that I$ miss costs are a major thing, but
      sadly, -Os doesn't seem to be the solution.  With or without it, gcc
      will miss some obvious code size improvements, and with it enabled gcc
      will sometimes make choices that aren't good even with high I$ miss
      ratios.
      
      For example, with -Os, gcc on x86 will turn a 20-byte constant memcpy
      into a "rep movsl".  While I sincerely hope that x86 CPU's will some day
      do a good job at that, they certainly don't do it yet, and the cost is
      higher than a L1 I$ miss would be.
      
      Some day I hope we can re-enable this.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      281dc5c5
  24. 21 May, 2011 1 commit
  25. 20 May, 2011 1 commit
  26. 11 May, 2011 1 commit
  27. 06 May, 2011 1 commit