1. 05 Sep, 2013 3 commits
  2. 30 Aug, 2013 1 commit
  3. 27 Aug, 2013 2 commits
    • powerpc: Work around gcc miscompilation of __pa() on 64-bit · bdbc29c1
      Committed by Paul Mackerras
      On 64-bit, __pa(&static_var) gets miscompiled by recent versions of
      gcc as something like:
      
              addis 3,2,.LANCHOR1+4611686018427387904@toc@ha
              addi 3,3,.LANCHOR1+4611686018427387904@toc@l
      
      This ends up effectively ignoring the offset, since its bottom 32 bits
      are zero, and means that the result of __pa() still has 0xC in the top
      nibble.  This happens with gcc 4.8.1, at least.
      
      To work around this, for 64-bit we make __pa() use an AND operator,
      and for symmetry, we make __va() use an OR operator.  Using an AND
      operator rather than a subtraction ends up with slightly shorter code
      since it can be done with a single clrldi instruction, whereas it
      takes three instructions to form the constant (-PAGE_OFFSET) and add
      it on.  (Note that MEMORY_START is always 0 on 64-bit.)
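
      The equivalence behind this workaround can be checked in plain C.
      This is only a sketch: the PAGE_OFFSET value, the mask, and the helper
      names are assumptions chosen to match the description above (0xC in
      the top nibble, MEMORY_START equal to 0), not the kernel macros.

```c
#include <stdint.h>

/* Assumed: the 64-bit linear mapping starts at 0xC000000000000000. */
#define PAGE_OFFSET 0xC000000000000000ULL
/* Assumed mask: clears the top nibble, which one clrldi can do. */
#define PA_MASK     0x0FFFFFFFFFFFFFFFULL

/* Subtraction form: correct in C, but the form gcc folded into a relocation. */
static inline uint64_t pa_by_sub(uint64_t va) { return va - PAGE_OFFSET; }

/* AND form: same result for linear-map addresses. */
static inline uint64_t pa_by_and(uint64_t va) { return va & PA_MASK; }

/* OR form for the reverse direction. */
static inline uint64_t va_by_or(uint64_t pa) { return pa | PAGE_OFFSET; }

/* -PAGE_OFFSET modulo 2^64: the addend seen in the miscompiled relocation. */
static inline uint64_t neg_page_offset(void) { return 0ULL - PAGE_OFFSET; }
```

      For a linear-map address such as 0xC000000001234000, pa_by_sub() and
      pa_by_and() both give 0x1234000.  neg_page_offset() is
      4611686018427387904 (2^62), whose bottom 32 bits are all zero.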
      
      CC: <stable@vger.kernel.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc: Don't Oops when accessing /proc/powerpc/lparcfg without hypervisor · f5f6cbb6
      Committed by Benjamin Herrenschmidt
      /proc/powerpc/lparcfg is an ancient facility (though still actively used)
      which gives access to some information about the partition when
      running underneath a PAPR compliant hypervisor.
      
      It makes no sense on non-pseries machines. However, currently, not only
      can it be created on these if the kernel has pseries support, but accessing
      it on such a machine will crash due to trying to do hypervisor calls.
      
      In fact, it should also not do HV calls on older pseries that didn't have
      a hypervisor either.
      
      Finally, it has the plumbing to be a module but is a "bool" Kconfig option.
      
      This fixes the whole lot by turning it into a machine_device_initcall
      that is only created on pseries, and adding the necessary hypervisor
      check before calling the H_GET_EM_PARMS hypercall.
      
      CC: <stable@vger.kernel.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  4. 25 Aug, 2013 1 commit
  5. 23 Aug, 2013 2 commits
  6. 22 Aug, 2013 1 commit
    • ARM: tegra: always enable USB VBUS regulators · 30ca2226
      Committed by Stephen Warren
      This fixes a regression exposed during the merge window by commit
      9f310ded "ARM: tegra: fix VBUS regulator GPIO polarity in DT"; namely that
      USB VBUS doesn't get turned on, so USB devices are not detected. This
      affects the internal USB port on TrimSlice (i.e. the USB->SATA bridge, to
      which the SSD is connected) and the external port(s) on Seaboard/
      Springbank and Whistler.
      
      The Tegra DT as written in v3.11 allows two paths to enable USB VBUS:
      
      1) Via the legacy DT binding for the USB controller; it can directly
         acquire a VBUS GPIO and activate it.
      
      2) Via a regulator for VBUS, which is referenced by the new DT binding
         for the USB controller.
      
      Those two methods both use the same GPIO, and hence whichever of the
      USB controller and regulator gets probed first ends up owning the GPIO.
      In practice, the USB driver only supports path (1) above, since the
      patches to support the new USB binding are not present until v3.12:-(
      
      In practice, the regulator ends up being probed first and owning the
      GPIO. Since nothing enables the regulator (the USB driver code is not
      yet present), the regulator ends up being turned off. This originally
      caused no problem, because the polarity in the regulator definition was
      incorrect, so attempting to turn off the regulator actually turned it
      on, and everything worked:-(
      
      However, when testing the new USB driver code in v3.12, I noticed the
      incorrect polarity and fixed it in commit 9f310ded "ARM: tegra: fix VBUS
      regulator GPIO polarity in DT". In the context of v3.11, this patch then
      caused the USB VBUS to actually turn off, which broke USB ports with VBUS
      control. I got this patch included in v3.11-rc1 since it fixed a bug in
      device tree (incorrect polarity specification), and hence was suitable to
      be included early in the rc series. I evidently did not test the patch at
      all, or correctly, in the context of v3.11, and hence did not notice the
      issue that I have explained above:-(
      
      Fix this by making the USB VBUS regulators always enabled. This way, if
      the regulator owns the GPIO, it will always be turned on, even if there
      is no USB driver code to request the regulator be turned on. Even
      ignoring this bug, this is a reasonable way to configure the HW anyway.
      
      If this patch is applied to v3.11, it will cause a couple pretty trivial
      conflicts in tegra20-{trimslice,seaboard}.dts when creating v3.12, since
      the context right above the added lines changed in patches destined for
      v3.12.
      Reported-by: Kyle McMartin <kmcmarti@redhat.com>
      Signed-off-by: Stephen Warren <swarren@nvidia.com>
      Signed-off-by: Olof Johansson <olof@lixom.net>
  7. 21 Aug, 2013 1 commit
  8. 20 Aug, 2013 9 commits
    • xen/smp: initialize IPI vectors before marking CPU online · fc78d343
      Committed by Chuck Anderson
      An older PVHVM guest (v3.0 based) crashed during vCPU hot-plug with:
      
      	kernel BUG at drivers/xen/events.c:1328!
      
      RCU has detected that a CPU has not entered a quiescent state within the
      grace period.  It needs to send the CPU a reschedule IPI if it is not
      offline.  rcu_implicit_offline_qs() does this check:
      
      	/*
      	 * If the CPU is offline, it is in a quiescent state.  We can
      	 * trust its state not to change because interrupts are disabled.
      	 */
      	if (cpu_is_offline(rdp->cpu)) {
      		rdp->offline_fqs++;
      		return 1;
      	}
      
      	Else the CPU is online.  Send it a reschedule IPI.
      
      The CPU is in the middle of being hot-plugged and has been marked online
      (!cpu_is_offline()).  See start_secondary():
      
      	set_cpu_online(smp_processor_id(), true);
      	...
      	per_cpu(cpu_state, smp_processor_id()) = CPU_ONLINE;
      
      start_secondary() then waits for the CPU bringing up the hot-plugged CPU to
      mark it as active:
      
      	/*
      	 * Wait until the cpu which brought this one up marked it
      	 * online before enabling interrupts. If we don't do that then
      	 * we can end up waking up the softirq thread before this cpu
      	 * reached the active state, which makes the scheduler unhappy
      	 * and schedule the softirq thread on the wrong cpu. This is
      	 * only observable with forced threaded interrupts, but in
      	 * theory it could also happen w/o them. It's just way harder
      	 * to achieve.
      	 */
      	while (!cpumask_test_cpu(smp_processor_id(), cpu_active_mask))
      		cpu_relax();
      
      	/* enable local interrupts */
      	local_irq_enable();
      
      The CPU being hot-plugged will be marked active after it has been fully
      initialized by the CPU managing the hot-plug.  In the Xen PVHVM case
      xen_smp_intr_init() is called to set up the hot-plugged vCPU's
      XEN_RESCHEDULE_VECTOR.
      
      The hot-plugging CPU is marked online, is not yet marked active, and does
      not have its IPI vectors set up.  rcu_implicit_offline_qs() sees that the
      hot-plugging CPU is !cpu_is_offline() and tries to send it a reschedule IPI.
      This leads to:
      
      	kernel BUG at drivers/xen/events.c:1328!
      
      	xen_send_IPI_one()
      	xen_smp_send_reschedule()
      	rcu_implicit_offline_qs()
      	rcu_implicit_dynticks_qs()
      	force_qs_rnp()
      	force_quiescent_state()
      	__rcu_process_callbacks()
      	rcu_process_callbacks()
      	__do_softirq()
      	call_softirq()
      	do_softirq()
      	irq_exit()
      	xen_evtchn_do_upcall()
      
      because xen_send_IPI_one() will attempt to use an uninitialized IRQ for
      the XEN_RESCHEDULE_VECTOR.
      
      There is at least one other place that has caused the same crash:
      
      	xen_smp_send_reschedule()
      	wake_up_idle_cpu()
      	add_timer_on()
      	clocksource_watchdog()
      	call_timer_fn()
      	run_timer_softirq()
      	__do_softirq()
      	call_softirq()
      	do_softirq()
      	irq_exit()
      	xen_evtchn_do_upcall()
      	xen_hvm_callback_vector()
      
      clocksource_watchdog() uses cpu_online_mask to pick the next CPU to handle
      a watchdog timer:
      
      	/*
      	 * Cycle through CPUs to check if the CPUs stay synchronized
      	 * to each other.
      	 */
      	next_cpu = cpumask_next(raw_smp_processor_id(), cpu_online_mask);
      	if (next_cpu >= nr_cpu_ids)
      		next_cpu = cpumask_first(cpu_online_mask);
      	watchdog_timer.expires += WATCHDOG_INTERVAL;
      	add_timer_on(&watchdog_timer, next_cpu);
      
      This resulted in an attempt to send an IPI to a hot-plugging CPU that
      had not initialized its reschedule vector. One option would be to make
      the RCU code check for CPU active rather than CPU offline, since
      becoming active happens after a CPU is online (in older kernels).
      
      But Srivatsa pointed out that "the cpu_active vs cpu_online ordering has been
      completely reworked - in the online path, cpu_active is set *before* cpu_online,
      and also, in the cpu offline path, the cpu_active bit is reset in the CPU_DYING
      notification instead of CPU_DOWN_PREPARE." Drilling into the bring-up
      path: "[brought up CPU].. send out a CPU_STARTING notification, and in response
      to that, the scheduler sets the CPU in the cpu_active_mask. Again, this mask
      is better left to the scheduler alone, since it has the intelligence to use it
      judiciously."
      
      The conclusion was that:
      "
      1. At the IPI sender side:
      
         It is incorrect to send an IPI to an offline CPU (cpu not present in
         the cpu_online_mask). There are numerous places where we check this
         and warn/complain.
      
      2. At the IPI receiver side:
      
         It is incorrect to let the world know of our presence (by setting
         ourselves in global bitmasks) until our initialization steps are complete
         to such an extent that we can handle the consequences (such as
         receiving interrupts without crashing the sender etc.)
      " (from Srivatsa)
      
      As the native code enables the interrupts at some point we need to be
      able to service them. In other words a CPU must have valid IPI vectors
      if it has been marked online.
      
      It doesn't need to handle the IPI (interrupts may be disabled) but needs
      to have valid IPI vectors because another CPU may find it in cpu_online_mask
      and attempt to send it an IPI.
      
      This patch will change the order of the Xen vCPU bring-up functions so that
      Xen vectors have been set up before start_secondary() is called.
      It also will not continue to bring up a Xen vCPU if xen_smp_intr_init() fails
      to initialize it.
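
      The ordering rule can be illustrated with a small user-space model.
      This is a sketch only: struct vcpu_model and the model_* helpers are
      hypothetical names invented for illustration, and model_intr_init()
      merely stands in for xen_smp_intr_init().

```c
#include <stdbool.h>

/* Hypothetical model of a vCPU; not a kernel structure. */
struct vcpu_model {
    bool ipi_vectors_ready;
    bool online;
};

/* Stand-in for xen_smp_intr_init(); `fail` simulates an error. */
static int model_intr_init(struct vcpu_model *c, bool fail)
{
    if (fail)
        return -1;
    c->ipi_vectors_ready = true;
    return 0;
}

/* Bring-up order after the patch: vectors first, online second. */
static int model_cpu_up(struct vcpu_model *c, bool fail)
{
    if (model_intr_init(c, fail) != 0)
        return -1;      /* do not continue bringing up this vCPU */
    c->online = true;   /* only now visible to IPI senders */
    return 0;
}

/* Invariant from the text: online implies valid IPI vectors. */
static bool model_invariant(const struct vcpu_model *c)
{
    return !c->online || c->ipi_vectors_ready;
}
```

      With this order, a sender that finds the CPU in the online set can
      rely on the vectors being valid, and a failed vector setup leaves the
      CPU offline.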
      
      Orabug 13823853
      Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
      Acked-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    • x86/xen: do not identity map UNUSABLE regions in the machine E820 · 3bc38cbc
      Committed by David Vrabel
      If there are UNUSABLE regions in the machine memory map, dom0 will
      attempt to map them 1:1 which is not permitted by Xen and the kernel
      will crash.
      
      There isn't anything interesting in the UNUSABLE region that the dom0
      kernel needs access to so we can avoid making the 1:1 mapping and
      treat it as RAM.
      
      We only do this for dom0, as that is where the tboot case shows up.
      A PV domU could have an UNUSABLE region in its pseudo-physical map;
      that would need to be handled in another patch.
      
      This fixes a boot failure on hosts with tboot.
      
      tboot marks a region in the e820 map as unusable. The dom0 kernel
      would attempt to map this region, but Xen does not permit unusable
      regions to be mapped by guests.
      
        (XEN)  0000000000000000 - 0000000000060000 (usable)
        (XEN)  0000000000060000 - 0000000000068000 (reserved)
        (XEN)  0000000000068000 - 000000000009e000 (usable)
        (XEN)  0000000000100000 - 0000000000800000 (usable)
        (XEN)  0000000000800000 - 0000000000972000 (unusable)
      
      tboot marked this region as unusable.
      
        (XEN)  0000000000972000 - 00000000cf200000 (usable)
        (XEN)  00000000cf200000 - 00000000cf38f000 (reserved)
        (XEN)  00000000cf38f000 - 00000000cf3ce000 (ACPI data)
        (XEN)  00000000cf3ce000 - 00000000d0000000 (reserved)
        (XEN)  00000000e0000000 - 00000000f0000000 (reserved)
        (XEN)  00000000fe000000 - 0000000100000000 (reserved)
        (XEN)  0000000100000000 - 0000000630000000 (usable)
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
      [v1: Altered the patch and description with domU's with UNUSABLE regions]
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    • arm64: perf: fix event validation for software group leaders · ee7538a0
      Committed by Will Deacon
      This is a port of c95eb318 ("ARM: 7809/1: perf: fix event validation
      for software group leaders") to arm64, which fixes a panic in the arm64
      perf backend found as a result of Vince's fuzzing tool.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: perf: fix array out of bounds access in armpmu_map_hw_event() · 868f6fea
      Committed by Will Deacon
      This is a port of d9f96635 ("ARM: 7810/1: perf: Fix array out of
      bounds access in armpmu_map_hw_event()") to arm64, which fixes an oops
      in the arm64 perf backend found as a result of Vince's fuzzing tool.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • x86/mm: Fix boot crash with DEBUG_PAGE_ALLOC=y and more than 512G RAM · 527bf129
      Committed by Yinghai Lu
      Dave Hansen reported that systems between 500G and 600G RAM
      crash early if DEBUG_PAGEALLOC is selected.
      
       > [    0.000000] init_memory_mapping: [mem 0x00000000-0x000fffff]
       > [    0.000000]  [mem 0x00000000-0x000fffff] page 4k
       > [    0.000000] BRK [0x02086000, 0x02086fff] PGTABLE
       > [    0.000000] BRK [0x02087000, 0x02087fff] PGTABLE
       > [    0.000000] BRK [0x02088000, 0x02088fff] PGTABLE
       > [    0.000000] init_memory_mapping: [mem 0xe80ee00000-0xe80effffff]
       > [    0.000000]  [mem 0xe80ee00000-0xe80effffff] page 4k
       > [    0.000000] BRK [0x02089000, 0x02089fff] PGTABLE
       > [    0.000000] BRK [0x0208a000, 0x0208afff] PGTABLE
       > [    0.000000] Kernel panic - not syncing: alloc_low_page: ran out of memory
      
      It turns out that we failed to increase the number of pages reserved
      in BRK for mapping the initial 2M and [0,1M) when we switched to using
      the #PF handler to set up memory mappings:
      
       > commit 8170e6be
       > Author: H. Peter Anvin <hpa@zytor.com>
       > Date:   Thu Jan 24 12:19:52 2013 -0800
       >
       >     x86, 64bit: Use a #PF handler to materialize early mappings on demand
      
      Before that, we had the mapping for [0,512M) in head_64.S, and we
      could spare two pages for [0,1M).  After that change, we can no longer
      reuse those pages.
      
      When we have more than 512G of RAM, we need an extra page for the pgd
      entry covering [512G, 1024G).
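
      The 512G boundary follows from the page-table geometry.  The sketch
      below assumes x86-64 4-level paging, where each top-level entry spans
      2^39 bytes (512 GiB), and computes the slot relative to the start of
      the direct mapping; pgd_slot() is a hypothetical helper, not a kernel
      function.

```c
#include <stdint.h>

/* Assumed: 4-level paging, so one PGD entry covers 2^39 bytes = 512 GiB. */
#define PGDIR_SHIFT  39
#define PTRS_PER_PGD 512

/* Top-level slot of a physical address, relative to the direct-map base. */
static inline unsigned int pgd_slot(uint64_t phys)
{
    return (unsigned int)((phys >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1));
}
```

      The address 0xe80ee00000 from the log above is at roughly 928 GiB, so
      it falls into slot 1 rather than slot 0 and needs its own lower-level
      table page.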
      
      Increase pages in BRK for page table to solve the boot crash.
      Reported-by: Dave Hansen <dave.hansen@intel.com>
      Bisected-by: Dave Hansen <dave.hansen@intel.com>
      Tested-by: Dave Hansen <dave.hansen@intel.com>
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Cc: <stable@vger.kernel.org> # v3.9 and later
      Link: http://lkml.kernel.org/r/1376351004-4015-1-git-send-email-yinghai@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • ARM: 7816/1: CONFIG_KUSER_HELPERS: fix help text · ac124504
      Committed by Nicolas Pitre
      Commit f6f91b0d ("ARM: allow kuser helpers to be removed from the
      vector page") introduced some help text for the CONFIG_KUSER_HELPERS
      option which is rather contradictory.
      
      Let's fix that, and improve it a little.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Nicolas Pitre <nico@linaro.org>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
    • ARM: 7815/1: kexec: offline non panic CPUs on Kdump panic · 4f9b4fb7
      Committed by Vijaya Kumar K
      In the case of a normal kexec kernel load, all CPUs are offlined
      before calling machine_kexec(). But in the crash panic case, the
      non-panic CPUs are relaxed in the machine_crash_nonpanic_core() SMP
      function but not offlined.
      
      When a crash kernel is loaded with kexec and a panic is triggered,
      machine_kexec() checks the number of CPUs online.
      If more than one CPU is online, machine_kexec() fails to load
      with the error below:
      
      kexec: error: multiple CPUs still online
      
      In the machine_crash_nonpanic_core() SMP function, offline the CPU
      before calling cpu_relax().
      Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
      Acked-by: Stephen Warren <swarren@wwwdotorg.org>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
    • ARM: 7819/1: fiq: Cast the first argument of flush_icache_range() · 7cb3be0a
      Committed by Fabio Estevam
      Commit 2ba85e7a (ARM: Fix FIQ code on VIVT CPUs) causes the following build warning:
      
      arch/arm/kernel/fiq.c:92:3: warning: passing argument 1 of 'cpu_cache.coherent_kern_range' makes integer from pointer without a cast [enabled by default]
      
      Cast it as '(unsigned long)base' to avoid the warning.
      Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
    • ARM: davinci: nand: specify ecc strength · acd36357
      Committed by Sekhar Nori
      Starting with kernel v3.5, it is mandatory
      to specify ECC strength when using hardware
      ECC. Without this, kernel panics with a warning
      of the sort:
      
      Driver must set ecc.strength when using hardware ECC
      ------------[ cut here ]------------
      kernel BUG at drivers/mtd/nand/nand_base.c:3519!
      
      Fix this by specifying ECC strength for the boards
      which were missing this.
      Reported-by: Holger Freyther <holger@freyther.de>
      Cc: <stable@vger.kernel.org> #v3.5+
      Signed-off-by: Sekhar Nori <nsekhar@ti.com>
      Signed-off-by: Kevin Hilman <khilman@linaro.org>
  9. 17 Aug, 2013 3 commits
  10. 16 Aug, 2013 1 commit
    • Fix TLB gather virtual address range invalidation corner cases · 2b047252
      Committed by Linus Torvalds
      Ben Tebulin reported:
      
       "Since v3.7.2 on two independent machines a very specific Git
        repository fails in 9/10 cases on git-fsck due to an SHA1/memory
        failures.  This only occurs on a very specific repository and can be
        reproduced stably on two independent laptops.  Git mailing list ran
        out of ideas and for me this looks like some very exotic kernel issue"
      
      and bisected the failure to the backport of commit 53a59fc6 ("mm:
      limit mmu_gather batching to fix soft lockups on !CONFIG_PREEMPT").
      
      That commit itself is not actually buggy, but what it does is to make it
      much more likely to hit the partial TLB invalidation case, since it
      introduces a new case in tlb_next_batch() that previously only ever
      happened when running out of memory.
      
      The real bug is that the TLB gather virtual memory range setup is subtly
      buggered.  It was introduced in commit 597e1c35 ("mm/mmu_gather:
      enable tlb flush range in generic mmu_gather"), and the range handling
      was already fixed at least once in commit e6c495a9 ("mm: fix the TLB
      range flushed when __tlb_remove_page() runs out of slots"), but that fix
      was not complete.
      
      The problem with the TLB gather virtual address range is that it isn't
      set up by the initial tlb_gather_mmu() initialization (which didn't get
      the TLB range information), but it is set up ad-hoc later by the
      functions that actually flush the TLB.  And so any such case that forgot
      to update the TLB range entries would potentially miss TLB invalidates.
      
      Rather than try to figure out exactly which particular ad-hoc range
      setup was missing (I personally suspect it's the hugetlb case in
      zap_huge_pmd(), which didn't have the same logic as zap_pte_range()
      did), this patch just gets rid of the problem at the source: make the
      TLB range information available to tlb_gather_mmu(), and initialize it
      when initializing all the other tlb gather fields.
      
      This makes the patch larger, but conceptually much simpler.  And the end
      result is much more understandable; even if you want to play games with
      partial ranges when invalidating the TLB contents in chunks, now the
      range information is always there, and anybody who doesn't want to
      bother with it won't introduce subtle bugs.
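
      The difference between the two approaches can be shown with a toy
      model.  This is a sketch under stated assumptions: struct gather_model
      and its helpers are hypothetical and only mimic the idea of recording
      the range at initialization time; they are not the kernel's
      mmu_gather interface.

```c
/* Hypothetical, simplified stand-in for the gather state. */
struct gather_model {
    unsigned long start;
    unsigned long end;
};

/* Old style: range left empty at init, to be filled in ad hoc later. */
static void gather_init_adhoc(struct gather_model *g)
{
    g->start = 0;
    g->end = 0;
}

/* New style: the caller passes the range when setting up the gather. */
static void gather_init_ranged(struct gather_model *g,
                               unsigned long start, unsigned long end)
{
    g->start = start;
    g->end = end;
}

/* Number of bytes a flush over the recorded range would cover. */
static unsigned long gather_flush_span(const struct gather_model *g)
{
    return g->end - g->start;
}
```

      In the ad-hoc variant, any path that forgets to update the range
      flushes nothing; in the ranged variant the full range is there from
      the start, so such a path still covers it.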
      
      Ben verified that this fixes his problem.
      Reported-bisected-and-tested-by: Ben Tebulin <tebulin@googlemail.com>
      Build-testing-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Build-testing-by: Richard Weinberger <richard.weinberger@gmail.com>
      Reviewed-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  11. 14 Aug, 2013 15 commits
    • m68k: Truncate base in do_div() · ea077b1b
      Committed by Andreas Schwab
      Explicitly truncate the second operand of do_div() to 32 bits to guard
      against bogus code calling it with a 64-bit divisor.
      
      [Thorsten]
      
      After upgrading from 3.2 to 3.10, mounting a btrfs volume fails with:
      
      btrfs: setting nodatacow, compression disabled
      btrfs: enabling auto recovery
      btrfs: disk space caching is enabled
      *** ZERO DIVIDE ***   FORMAT=2
      Current process id is 722
      BAD KERNEL TRAP: 00000000
      Modules linked in: evdev mac_hid ext4 crc16 jbd2 mbcache btrfs xor lzo_compress zlib_deflate raid6_pq crc32c libcrc32c
      PC: [<319535b2>] __btrfs_map_block+0x11c/0x119a [btrfs]
      SR: 2000  SP: 30c1fab4  a2: 30f0faf0
      d0: 00000000    d1: 00001000    d2: 00000000    d3: 00000000
      d4: 00010000    d5: 00000000    a0: 3085c72c    a1: 3085c72c
      Process mount (pid: 722, task=30f0faf0)
      Frame format=2 instr addr=319535ae
      Stack from 30c1faec:
              00000000 00000020 00000000 00001000 00000000 01401000 30253928 300ffc00
              00a843ac 3026f640 00000000 00010000 0009e250 00d106c0 00011220 00000000
              00001000 301c6830 0009e32a 000000ff 00000009 3085c72c 00000000 00000000
              30c1fd14 00000000 00000020 00000000 30c1fd14 0009e26c 00000020 00000003
              00000000 0009dd8a 300b0b6c 30253928 00a843ac 00001000 00000000 00000000
              0000a008 3194e76a 30253928 00a843ac 00001000 00000000 00000000 00000002
      Call Trace: [<00001000>] kernel_pg_dir+0x0/0x1000
      
          [...]
      
      Code: 222e ff74 2a2e ff5c 2c2e ff60 4c45 1402 <2d40> ff64 2d41 ff68 2205 4c2e 1800 ff68 4c04 0800 2041 d1c0 2206 4c2e 1400 ff68
      
      [Geert]
      
      As diagnosed by Andreas, fs/btrfs/volumes.c:__btrfs_map_block()
      calls
      
          do_div(stripe_nr, stripe_len);
      
      with stripe_len u64, while do_div() assumes the divisor is a 32-bit number.
      
      Due to the lack of truncation in the m68k-specific implementation of
      do_div(), the division is performed using the upper 32-bit word of
      stripe_len, which is zero.
      
      This was introduced by commit 53b381b3
      ("Btrfs: RAID5 and RAID6"), which changed the divisor from
      map->stripe_len (struct map_lookup.stripe_len is int) to a 64-bit temporary.
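
      The failure mode can be reproduced in portable C.  The sketch below
      is not the m68k implementation: the helper names are hypothetical,
      and the stripe length of 0x10000 is an illustrative value.  It only
      shows that taking the upper 32-bit word of a small 64-bit divisor
      yields zero, while explicit truncation yields the intended divisor.

```c
#include <stdint.h>

/* What the untruncated code effectively used: the upper 32-bit word. */
static inline uint32_t divisor_high_word(uint64_t v)
{
    return (uint32_t)(v >> 32);
}

/* What the fix uses: the divisor explicitly truncated to 32 bits. */
static inline uint32_t divisor_truncated(uint64_t v)
{
    return (uint32_t)v;
}

/*
 * Model of a 64-by-32 division: stores the quotient in *n and the
 * remainder in *rem.  Returns -1 instead of trapping on a zero divisor.
 */
static int model_do_div(uint64_t *n, uint32_t base, uint32_t *rem)
{
    if (base == 0)
        return -1;
    *rem = (uint32_t)(*n % base);
    *n /= base;
    return 0;
}
```

      With a 64-bit stripe length of 0x10000, divisor_high_word() is 0, so
      the division would trap, whereas divisor_truncated() is 0x10000 and
      the division succeeds.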
      Reported-by: Thorsten Glaser <tg@debian.org>
      Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
      Tested-by: Thorsten Glaser <tg@debian.org>
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: stable@vger.kernel.org
    • m68k/atari: ARAnyM - Fix NatFeat module support · e8184e10
      Committed by Geert Uytterhoeven
      As pointed out by Andreas Schwab, pointers passed to ARAnyM NatFeat calls
      should be physical addresses, not virtual addresses.
      
      Fortunately on Atari, physical and virtual kernel addresses are the same,
      as far as normal kernel memory is concerned, so this usually worked fine
      without conversion.
      
      But for modules, pointers to literal strings are located in vmalloc()ed
      memory. Depending on the version of ARAnyM, this causes the nf_get_id()
      call to just fail, or worse, crash ARAnyM itself with e.g.
      
          Gotcha! Illegal memory access. Atari PC = $968c
      
      This is a big issue for distro kernels, who want to have all drivers as
      loadable modules in an initrd.
      
      Add a wrapper for nf_get_id() that copies the literal to the stack to
      work around this issue.
      Reported-by: Thorsten Glaser <tg@debian.org>
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: stable@vger.kernel.org
    • ARM: at91/DT: fix at91sam9n12ek memory node · a57603ca
      Committed by Nicolas Ferre
      Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
      Cc: stable <stable@vger.kernel.org> # 3.5+
    • ARM: at91: add missing uart clocks DT entries · b524f389
      Committed by Boris BREZILLON
      Add clocks to clock lookup table for uart DT entries.
      Signed-off-by: Boris BREZILLON <b.brezillon@overkiz.com>
      Tested-by: Douglas Gilbert <dgilbert@interlog.com>
      Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
    • arch: *: Kconfig: add "kernel/Kconfig.freezer" to "arch/*/Kconfig" · 57a1a197
      Committed by Chen Gang
      All architectures include "kernel/Kconfig.freezer" except for three, so
      make those include it too; otherwise 'allmodconfig' reports errors.
      
      The related errors: (with allmodconfig for openrisc):
      
          CC      kernel/cgroup_freezer.o
        kernel/cgroup_freezer.c: In function 'freezer_css_online':
        kernel/cgroup_freezer.c:133:15: error: 'system_freezing_cnt' undeclared (first use in this function)
        kernel/cgroup_freezer.c:133:15: note: each undeclared identifier is reported only once for each function it appears in
        kernel/cgroup_freezer.c: In function 'freezer_css_offline':
        kernel/cgroup_freezer.c:157:15: error: 'system_freezing_cnt' undeclared (first use in this function)
        kernel/cgroup_freezer.c: In function 'freezer_attach':
        kernel/cgroup_freezer.c:200:4: error: implicit declaration of function 'freeze_task'
        kernel/cgroup_freezer.c: In function 'freezer_apply_state':
        kernel/cgroup_freezer.c:371:16: error: 'system_freezing_cnt' undeclared (first use in this function)
      Signed-off-by: Chen Gang <gang.chen@asianux.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Chen Liqin <liqin.chen@sunplusct.com>
      Cc: Lennox Wu <lennox.wu@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • x86 get_unmapped_area(): use proper mmap base for bottom-up direction · df54d6fa
      Committed by Radu Caragea
      When the stack is set to unlimited, the bottom-up direction is used for
      mmap allocations, but mmap_base is not used, which effectively renders
      ASLR for mmap allocations, along with PIE, useless.
      
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Cc: Adrian Sendroiu <molecula2788@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • microblaze: fix clone syscall · dfa9771a
      Committed by Michal Simek
      Fix inadvertent breakage in the clone syscall ABI for Microblaze that
      was introduced in commit f3268edb ("microblaze: switch to generic
      fork/vfork/clone").
      
      The Microblaze syscall ABI for clone takes the parent tid address in the
      4th argument; the third argument slot is used for the stack size.  The
      incorrectly-used CLONE_BACKWARDS type assigned parent tid to the 3rd
      slot.
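
      The slot mix-up can be pictured with a tiny model.  This is only a
      sketch: the argument array and the helper names are hypothetical, and
      the model covers just the two slots discussed above.

```c
/* Zero-based indexes of the 3rd and 4th syscall argument slots. */
enum { SLOT_3RD = 2, SLOT_4TH = 3 };

/* Microblaze ABI as described: parent tid address is the 4th argument. */
static unsigned long parent_tid_microblaze(const unsigned long args[6])
{
    return args[SLOT_4TH];
}

/* What the incorrectly used CLONE_BACKWARDS layout read: the 3rd argument. */
static unsigned long parent_tid_backwards(const unsigned long args[6])
{
    return args[SLOT_3RD];
}
```

      If userspace passes a stack size in the 3rd slot and the parent tid
      address in the 4th, parent_tid_backwards() returns the stack size
      where a pointer was expected.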
      
      This commit restores the original ABI so that existing userspace libc
      code will work correctly.
      
      All kernel versions from v3.8-rc1 were affected.
      Signed-off-by: Michal Simek <michal.simek@xilinx.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: save soft-dirty bits on file pages · 41bb3476
      Committed by Cyrill Gorcunov
      Andy reported that if a file page gets reclaimed we lose the soft-dirty bit
      if it was set, so save the _PAGE_BIT_SOFT_DIRTY bit when the page address gets
      encoded into the pte entry.  Thus when a #PF happens on such a non-present pte
      we can restore it.
      Reported-by: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Acked-by: Pavel Emelyanov <xemul@parallels.com>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: save soft-dirty bits on swapped pages · 179ef71c
      Committed by Cyrill Gorcunov
      Andy Lutomirski reported that if a page with the _PAGE_SOFT_DIRTY bit set
      gets swapped out, the bit is lost and is no longer available when the
      pte is read back.
      
      To resolve this we introduce the _PTE_SWP_SOFT_DIRTY bit, which is saved in
      the pte entry for the page being swapped out.  When such a page is read
      back from the swap cache, we check for the bit's presence and, if it is
      there, clear it and restore the former _PAGE_SOFT_DIRTY bit.
      
      One of the problems was finding a place in the pte entry where we can save
      the _PTE_SWP_SOFT_DIRTY bit while the page is in swap.  _PAGE_PSE was
      chosen for that, as it doesn't intersect with the swap entry format stored
      in the pte.
      Reported-by: NAndy Lutomirski <luto@amacapital.net>
      Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Acked-by: NPavel Emelyanov <xemul@parallels.com>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Reviewed-by: Minchan Kim <minchan@kernel.org>
      Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      179ef71c
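The save-and-restore idea in the message above can be sketched in plain C. This is only an illustration under stated assumptions: the `sketch_*` names, the 64-bit pte layout and the bit positions are hypothetical and are not the kernel's actual definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pte layout, chosen for illustration only. */
typedef uint64_t sketch_pte_t;

#define SKETCH_PRESENT        (UINT64_C(1) << 0)   /* pte maps a page in memory */
#define SKETCH_SWP_SOFT_DIRTY (UINT64_C(1) << 7)   /* spare bit, unused by the swap entry */
#define SKETCH_SOFT_DIRTY     (UINT64_C(1) << 11)  /* meaningful only while present */
#define SKETCH_PAYLOAD_SHIFT  12                   /* pfn or swap entry lives above the flags */

/* Swap-out: encode the swap entry and carry the soft-dirty state in the spare bit. */
static sketch_pte_t sketch_swap_out(sketch_pte_t pte, uint64_t swp_entry)
{
    sketch_pte_t swp = swp_entry << SKETCH_PAYLOAD_SHIFT;  /* present bit stays clear */

    if (pte & SKETCH_SOFT_DIRTY)
        swp |= SKETCH_SWP_SOFT_DIRTY;
    return swp;
}

/* Swap-in: build a present pte for the new frame and restore soft-dirty if it was saved. */
static sketch_pte_t sketch_swap_in(sketch_pte_t swp, uint64_t pfn)
{
    sketch_pte_t pte = (pfn << SKETCH_PAYLOAD_SHIFT) | SKETCH_PRESENT;

    if (swp & SKETCH_SWP_SOFT_DIRTY)
        pte |= SKETCH_SOFT_DIRTY;  /* the swap-only bit itself is not carried over */
    return pte;
}

/* True when a present pte carries the soft-dirty bit. */
static bool sketch_soft_dirty(sketch_pte_t pte)
{
    return (pte & SKETCH_PRESENT) && (pte & SKETCH_SOFT_DIRTY);
}
```

The point of the sketch is that the saved bit must sit outside the bits used by the swap entry encoding, which is the property the message attributes to _PAGE_PSE.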
    • S
      perf/arm: Fix armpmu_map_hw_event() · b88a2595
      Stephen Boyd committed
      Fix constraint check in armpmu_map_hw_event().
      Reported-and-tested-by: Vince Weaver <vincent.weaver@maine.edu>
      Cc: <stable@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b88a2595
    • S
      ARM: 7807/1: kexec: validate CPU hotplug support · 2103f6cb
      Stephen Warren committed
      Architectures should fully validate whether kexec is possible as part of
      machine_kexec_prepare(), so that user-space's kexec_load() operation can
      report any problems. Performing validation in machine_kexec() itself is
      too late, since it is not allowed to return.
      
      Prior to this patch, ARM's machine_kexec() was testing after-the-fact
      whether machine_kexec_prepare() was able to disable all but one CPU.
      Instead, modify machine_kexec_prepare() to validate all conditions
      necessary for machine_kexec() to succeed. BUG if the validation
      succeeded, yet disabling the CPUs didn't actually work.
      Signed-off-by: Stephen Warren <swarren@nvidia.com>
      Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      2103f6cb
    • W
      ARM: 7812/1: rwlocks: retry trylock operation if strex fails on free lock · 00efaa02
      Will Deacon committed
      Commit 15e7e5c1 ("ARM: 7749/1: spinlock: retry trylock operation if
      strex fails on free lock") modified our arch_spin_trylock to retry the
      acquisition if the lock appeared uncontended, but the strex failed.
      
      This patch does the same for rwlocks, which were missed by the original
      patch.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      00efaa02
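The retry rule described above has a portable analogue in C11 atomics, where a weak compare-and-exchange may fail spuriously much as a strex can. The following is a minimal sketch under that analogy, not the ARM assembly from the patch; the `sketch_*` names and the lock encoding (0 = free, top bit = writer, low bits = reader count) are assumptions made for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical rwlock encoding: 0 = free, top bit = writer, low bits = reader count. */
typedef struct {
    atomic_uint lock;
} sketch_rwlock_t;

#define SKETCH_WRITER 0x80000000u

/* Give up only when the lock is really held; retry when the store merely failed. */
static bool sketch_write_trylock(sketch_rwlock_t *rw)
{
    unsigned int expected;

    do {
        expected = atomic_load_explicit(&rw->lock, memory_order_relaxed);
        if (expected != 0)
            return false;  /* genuinely contended */
        /* A weak CAS may fail spuriously, so loop while the lock still looks free. */
    } while (!atomic_compare_exchange_weak_explicit(&rw->lock, &expected,
                                                    SKETCH_WRITER,
                                                    memory_order_acquire,
                                                    memory_order_relaxed));
    return true;
}

/* Readers fail only if a writer holds the lock; otherwise retry the increment. */
static bool sketch_read_trylock(sketch_rwlock_t *rw)
{
    unsigned int old = atomic_load_explicit(&rw->lock, memory_order_relaxed);

    do {
        if (old & SKETCH_WRITER)
            return false;
        /* On failure, old is refreshed with the current value and rechecked. */
    } while (!atomic_compare_exchange_weak_explicit(&rw->lock, &old, old + 1,
                                                    memory_order_acquire,
                                                    memory_order_relaxed));
    return true;
}
```

Without the loop, a trylock could report failure on a lock that nobody holds, which is the behaviour the commit message sets out to avoid.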
    • W
      ARM: 7811/1: locks: use early clobber in arch_spin_trylock · afa31d8e
      Will Deacon committed
      The res variable is written before we've finished with the input
      operands (namely the lock address), so ensure that we mark it as `early
      clobber' to avoid unintended register sharing.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      afa31d8e
    • S
      ARM: 7810/1: perf: Fix array out of bounds access in armpmu_map_hw_event() · d9f96635
      Stephen Boyd committed
      Vince Weaver reports an oops in the ARM perf event code while
      running his perf_fuzzer tool on a pandaboard running v3.11-rc4.
      
      Unable to handle kernel paging request at virtual address 73fd14cc
      pgd = eca6c000
      [73fd14cc] *pgd=00000000
      Internal error: Oops: 5 [#1] SMP ARM
      Modules linked in: snd_soc_omap_hdmi omapdss snd_soc_omap_abe_twl6040 snd_soc_twl6040 snd_soc_omap snd_soc_omap_hdmi_card snd_soc_omap_mcpdm snd_soc_omap_mcbsp snd_soc_core snd_compress regmap_spi snd_pcm snd_page_alloc snd_timer snd soundcore
      CPU: 1 PID: 2790 Comm: perf_fuzzer Not tainted 3.11.0-rc4 #6
      task: eddcab80 ti: ed892000 task.ti: ed892000
      PC is at armpmu_map_event+0x20/0x88
      LR is at armpmu_event_init+0x38/0x280
      pc : [<c001c3e4>]    lr : [<c001c17c>]    psr: 60000013
      sp : ed893e40  ip : ecececec  fp : edfaec00
      r10: 00000000  r9 : 00000000  r8 : ed8c3ac0
      r7 : ed8c3b5c  r6 : edfaec00  r5 : 00000000  r4 : 00000000
      r3 : 000000ff  r2 : c0496144  r1 : c049611c  r0 : edfaec00
      Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
      Control: 10c5387d  Table: aca6c04a  DAC: 00000015
      Process perf_fuzzer (pid: 2790, stack limit = 0xed892240)
      Stack: (0xed893e40 to 0xed894000)
      3e40: 00000800 c001c17c 00000002 c008a748 00000001 00000000 00000000 c00bf078
      3e60: 00000000 edfaee50 00000000 00000000 00000000 edfaec00 ed8c3ac0 edfaec00
      3e80: 00000000 c073ffac ed893f20 c00bf180 00000001 00000000 c00bf078 ed893f20
      3ea0: 00000000 ed8c3ac0 00000000 00000000 00000000 c0cb0818 eddcab80 c00bf440
      3ec0: ed893f20 00000000 eddcab80 eca76800 00000000 eca76800 00000000 00000000
      3ee0: 00000000 ec984c80 eddcab80 c00bfe68 00000000 00000000 00000000 00000080
      3f00: 00000000 ed892000 00000000 ed892030 00000004 ecc7e3c8 ecc7e3c8 00000000
      3f20: 00000000 00000048 ecececec 00000000 00000000 00000000 00000000 00000000
      3f40: 00000000 00000000 00297810 00000000 00000000 00000000 00000000 00000000
      3f60: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      3f80: 00000002 00000002 000103a4 00000002 0000016c c00128e8 ed892000 00000000
      3fa0: 00090998 c0012700 00000002 000103a4 00090ab8 00000000 00000000 0000000f
      3fc0: 00000002 000103a4 00000002 0000016c 00090ab0 00090ab8 000107a0 00090998
      3fe0: bed92be0 bed92bd0 0000b785 b6e8f6d0 40000010 00090ab8 00000000 00000000
      [<c001c3e4>] (armpmu_map_event+0x20/0x88) from [<c001c17c>] (armpmu_event_init+0x38/0x280)
      [<c001c17c>] (armpmu_event_init+0x38/0x280) from [<c00bf180>] (perf_init_event+0x108/0x180)
      [<c00bf180>] (perf_init_event+0x108/0x180) from [<c00bf440>] (perf_event_alloc+0x248/0x40c)
      [<c00bf440>] (perf_event_alloc+0x248/0x40c) from [<c00bfe68>] (SyS_perf_event_open+0x4f4/0x8fc)
      [<c00bfe68>] (SyS_perf_event_open+0x4f4/0x8fc) from [<c0012700>] (ret_fast_syscall+0x0/0x48)
      Code: 0a000005 e3540004 0a000016 e3540000 (0791010c)
      
      This is because event->attr.config in armpmu_event_init()
      contains a very large number copied directly from userspace and
      is never checked against the size of the array indexed in
      armpmu_map_hw_event(). Fix the problem by checking the value of
      config before indexing the array and rejecting invalid config
      values.
      Reported-by: Vince Weaver <vincent.weaver@maine.edu>
      Tested-by: Vince Weaver <vincent.weaver@maine.edu>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      d9f96635
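The fix described above, checking a user-supplied config before using it as an array index, can be sketched as follows. The table size, table contents and `sketch_*` names are hypothetical stand-ins for illustration; the error codes are an assumption rather than a statement of what the patch returns.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical table size and "unsupported" marker, for illustration only. */
#define SKETCH_HW_MAX            4
#define SKETCH_HW_OP_UNSUPPORTED 0xFFFFu

/* Example mapping from generic event number to a made-up hardware event code. */
static const unsigned int sketch_map[SKETCH_HW_MAX] = {
    0x11, 0x08, SKETCH_HW_OP_UNSUPPORTED, 0x10,
};

/*
 * config arrives unchecked from user space, so reject anything outside the
 * table before indexing. The comparison must be >=, not >, because
 * SKETCH_HW_MAX itself is already one past the last valid slot.
 */
static int sketch_map_hw_event(const unsigned int (*event_map)[SKETCH_HW_MAX],
                               uint64_t config)
{
    unsigned int mapping;

    if (config >= SKETCH_HW_MAX)
        return -EINVAL;

    mapping = (*event_map)[config];
    return mapping == SKETCH_HW_OP_UNSUPPORTED ? -ENOENT : (int)mapping;
}
```

Keeping `config` as a 64-bit unsigned value in the comparison matters here: narrowing it to `int` before the check could let a very large value wrap into the valid range.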
    • W
      ARM: 7809/1: perf: fix event validation for software group leaders · c95eb318
      Will Deacon committed
      It is possible to construct an event group with a software event as a
      group leader and then subsequently add a hardware event to the group.
      This results in the event group being validated by adding all members
      of the group to a fake PMU and attempting to allocate each event on
      their respective PMU.
      
      Unfortunately, for software events without a corresponding arm_pmu, this
      results in a kernel crash attempting to dereference the ->get_event_idx
      function pointer.
      
      This patch fixes the problem by checking explicitly for software events
      and ignoring those in event validation (since they can always be
      scheduled). We will probably want to revisit this for 3.12, since the
      validation checks don't appear to work correctly when dealing with
      multiple hardware PMUs anyway.
      
      Cc: <stable@vger.kernel.org>
      Reported-by: Vince Weaver <vincent.weaver@maine.edu>
      Tested-by: Vince Weaver <vincent.weaver@maine.edu>
      Tested-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      c95eb318
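The validation rule described above, skipping software events because they can always be scheduled, might look like this in outline. The structures and `sketch_*` names are hypothetical simplifications and do not reflect the kernel's perf data structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified PMU and event types for illustration. */
struct sketch_pmu {
    int (*get_event_idx)(int event_id);  /* NULL for the software PMU */
    bool is_software;
};

struct sketch_event {
    const struct sketch_pmu *pmu;
    int id;
};

/* Made-up hardware PMU with two counters: only ids 0 and 1 can be placed. */
static int sketch_hw_get_event_idx(int event_id)
{
    return (event_id >= 0 && event_id < 2) ? event_id : -1;
}

static const struct sketch_pmu sketch_hw_pmu = { sketch_hw_get_event_idx, false };
static const struct sketch_pmu sketch_sw_pmu = { NULL, true };

/* Software events are accepted up front, so their missing callback is never called. */
static bool sketch_validate_event(const struct sketch_event *ev)
{
    if (ev->pmu->is_software)
        return true;
    return ev->pmu->get_event_idx(ev->id) >= 0;
}

/* A group is valid only if every member passes validation. */
static bool sketch_validate_group(const struct sketch_event *events, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!sketch_validate_event(&events[i]))
            return false;
    }
    return true;
}
```

The ordering of the two checks is the whole point: the software test must come before any use of the hardware-specific callback.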
  12. 13 Aug, 2013 1 commit
    • O
      sched: fix the theoretical signal_wake_up() vs schedule() race · e0acd0a6
      Oleg Nesterov committed
      This is only theoretical, but after try_to_wake_up(p) was changed
      to check p->state under p->pi_lock, code like
      
      	__set_current_state(TASK_INTERRUPTIBLE);
      	schedule();
      
      can miss a signal. This is a special case of wait-for-condition:
      it relies on the try_to_wake_up()/schedule() interaction and thus does
      not need mb() between __set_current_state() and if(signal_pending).
      
      However, this __set_current_state() can move into the critical
      section protected by rq->lock. Now that try_to_wake_up() takes
      another lock, we need to ensure that it can't be reordered with the
      "if (signal_pending(current))" check inside that section.
      
      The patch is actually a one-liner: it simply adds smp_wmb() before
      spin_lock_irq(rq->lock). This is what try_to_wake_up() already
      does for the same reason.
      
      We turn this wmb() into the new helper, smp_mb__before_spinlock(),
      for better documentation and to allow the architectures to change
      the default implementation.
      
      While at it, kill smp_mb__after_lock(), it has no callers.
      
      Perhaps we can also add smp_mb__before/after_spinunlock() for
      prepare_to_wait().
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e0acd0a6