1. 14 Jun, 2012 7 commits
    • J
      ARM: OMAP: Remove loses_context variable from timer platform data · 1c2d076b
      Jon Hunter committed
      The platform data variable loses_context is used to determine if the timer may
      lose its logic state during power transitions and so needs to be restored. This
      information is also provided in the HWMOD device attributes for OMAP2+ devices
      via the OMAP_TIMER_ALWON flag. When this flag is set the timer will not lose
      context. So use the HWMOD device attributes to determine this.
      
      For OMAP1 devices, loses_context is never set and so set the OMAP_TIMER_ALWON
      flag for OMAP1 timers to ensure that code is equivalent.
      Signed-off-by: Jon Hunter <jon-hunter@ti.com>
      Signed-off-by: Tony Lindgren <tony@atomide.com>
      1c2d076b
    • J
      ARM: OMAP2+: Fix external clock support for dmtimers · 67d2e760
      Jon Hunter committed
      Currently, the dmtimer determines whether a timer can support an external
      clock source (sys_altclk) for driving the timer by the IP version. Only
      OMAP24xx devices can support an external clock source, but the IP version
      between OMAP24xx and OMAP3xxx is common and so this incorrectly indicates
      that OMAP3 devices can use an external clock source.
      
      Rather than use the IP version, just let the clock framework handle this.
      If the "alt_ck" does not exist for a timer then the clock framework will fail
      to find the clock and hence will return an error. By doing this we can eliminate
      the "timer_ip_version" variable passed as part of the platform data and simplify
      the code.
      
      We can also remove the timer IP version from the HWMOD data because the dmtimer
      driver uses the TIDR register to determine the IP version.
      Signed-off-by: Jon Hunter <jon-hunter@ti.com>
      Signed-off-by: Tony Lindgren <tony@atomide.com>
      67d2e760
    • J
      ARM: OMAP2+: HWMOD: Correct timer device attributes · 139486fa
      Jon Hunter committed
      Fix the following issues with the timer device attributes for OMAP2+ devices:
      
      1. For OMAP24xx devices, timers 2-8 have the ALWAYS-ON attribute indicating
         that these timers are in an ALWAYS-ON power domain. This is not the case;
         only timer1 is in an ALWAYS-ON power domain.
      2. For OMAP3xxx devices, timers 2-7 have the ALWAYS-ON attribute indicating
         that these timers are in an ALWAYS-ON power domain. This is not the case;
         only timer1 and timer12 are in an ALWAYS-ON power domain.
      3. For OMAP3xxx devices, timer12 does not have the ALWAYS-ON attribute but
         is in an always-on power domain.
      Signed-off-by: Jon Hunter <jon-hunter@ti.com>
      Acked-by: Paul Walmsley <paul@pwsan.com>
      Signed-off-by: Tony Lindgren <tony@atomide.com>
      139486fa
    • J
      ARM: OMAP: Add DMTIMER capability variable to represent timer features · d1c1691b
      Jon Hunter committed
      Although the OMAP timers share a common hardware design, there are some
      differences between the timer instances in a given device. For example, one
      timer may be in a power domain that can be powered off, so it can lose its
      logic state and need restoring, whereas another may be in a power domain that
      is always on. Another example is that a timer may support different clock
      sources to drive the timer. This information is passed to the dmtimer via the
      following platform data structure.
      
      struct dmtimer_platform_data {
      	int (*set_timer_src)(struct platform_device *pdev, int source);
      	int timer_ip_version;
      	u32 needs_manual_reset:1;
      	bool loses_context;
      	int (*get_context_loss_count)(struct device *dev);
      };
      
      The above structure uses multiple variables to represent the timer features.
      HWMOD also stores the timer capabilities using a bit-mask that represents the
      features supported. By using the same format for representing the timer
      features in the platform data as used by HWMOD, we can ...
      
      1. Use the flags defined in the plat/dmtimer.h to represent the features
         supported.
      2. For devices using HWMOD, we can retrieve the features supported from HWMOD.
      3. Eventually, simplify the platform data structure to be ...
      
      struct dmtimer_platform_data {
      	int (*set_timer_src)(struct platform_device *pdev, int source);
      	u32 timer_capability;
      };
      
      Another benefit of doing this is that it will simplify the migration of the
      dmtimer driver to device-tree. For example, in the current OMAP2+ timer code the
      "loses_context" variable is configured at runtime by calling an architecture
      specific function. For device tree this creates a problem, because we would need
      to call the architecture specific function from within the dmtimer driver.
      However, such attributes do not need to be queried at runtime and we can look up
      the attributes via HWMOD or device-tree.
      
      This change adds a new "capability" variable to the platform data and timer
      structure so we can start removing and simplifying the platform data structure.
      Signed-off-by: Jon Hunter <jon-hunter@ti.com>
      Signed-off-by: Tony Lindgren <tony@atomide.com>
      d1c1691b
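      As a rough illustration of the capability bit-mask described in the commit
      above, the sketch below derives the old per-feature booleans from a single
      "timer_capability" word. The bit positions, the EXAMPLE_TIMER_NEEDS_RESET
      flag, the structure name and the helper names are assumptions made for this
      example; only the OMAP_TIMER_ALWON name and the "timer_capability" field
      come from the commit messages.

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stddef.h>
      #include <stdint.h>

      /* Assumed bit positions, chosen only for illustration. */
      #define OMAP_TIMER_ALWON          (1u << 30)
      #define EXAMPLE_TIMER_NEEDS_RESET (1u << 28)  /* hypothetical flag */

      /* Simplified stand-in for the reduced platform data structure. */
      struct example_timer_pdata {
      	uint32_t timer_capability;
      };

      /* A timer keeps its context only when it sits in an always-on domain. */
      static bool example_timer_loses_context(const struct example_timer_pdata *pdata)
      {
      	assert(pdata != NULL);
      	return !(pdata->timer_capability & OMAP_TIMER_ALWON);
      }

      /* Other features are tested the same way, one bit per feature. */
      static bool example_timer_needs_reset(const struct example_timer_pdata *pdata)
      {
      	assert(pdata != NULL);
      	return (pdata->timer_capability & EXAMPLE_TIMER_NEEDS_RESET) != 0;
      }
      ```

      With this layout, a timer whose capability word has OMAP_TIMER_ALWON set
      reports that it does not lose context, and no separate "loses_context"
      member is needed.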
    • J
      ARM: OMAP2+: Add dmtimer platform function to reserve systimers · b7b4ff76
      Jon Hunter committed
      During early boot, one or two dmtimers are reserved by the kernel as system
      timers (for clocksource and clockevents). These timers are marked as reserved
      and the dmtimer driver is notified which timers have been reserved via the
      platform data information.
      
      For OMAP2+ devices the timers reserved may vary depending on device and compile
      flags. Therefore, it is not easy to assume which timers will be reserved for the
      system timers. In order to migrate the dmtimer driver to support device-tree we
      need a way to pass the timers reserved for system timers to the dmtimer driver.
      Using the platform data structure will not work in the same way as it is
      currently used because the platform data structure will be stored statically in
      the dmtimer itself and the platform data will be selected via the device-tree
      match device function (of_match_device).
      
      There are a couple of ways to work around this. One option is to store the system
      timers reserved for the kernel in the device-tree and query them on boot.
      The downside of this approach is that it adds some delay to parse the DT blob
      to search for the system timers. Secondly, for OMAP3 devices we have a
      dependency on compile time flags and the device-tree would not be aware of those
      kernel compile flags and so we would need to address that.
      
      The second option is to add a function to the dmtimer code to reserve the
      system timers during boot and so the dmtimer knows exactly which timers are
      being used for system timers. This also allows us to remove the "reserved"
      member from the timer platform data. This seemed like the simpler approach and
      so was implemented here.
      Signed-off-by: Jon Hunter <jon-hunter@ti.com>
      Signed-off-by: Tony Lindgren <tony@atomide.com>
      b7b4ff76
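      The second option from the commit above can be pictured with a minimal
      sketch: early boot code calls a reserve function, and the driver later
      consults the recorded mask before handing out a timer. The function names,
      the 32-timer limit and the bit-mask representation are assumptions for
      illustration and are not taken from the actual patch.

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stdint.h>

      #define EXAMPLE_MAX_TIMERS 32

      /* Bit (id - 1) is set when timer "id" is reserved as a system timer. */
      static uint32_t example_reserved_systimers;

      /* Called during early boot; returns 0 on success, -1 on a bad or repeated id. */
      static int example_reserve_systimer(int id)
      {
      	if (id < 1 || id > EXAMPLE_MAX_TIMERS)
      		return -1;
      	if (example_reserved_systimers & (1u << (id - 1)))
      		return -1;
      	example_reserved_systimers |= 1u << (id - 1);
      	return 0;
      }

      /* Queried by the driver before it offers a timer to other users. */
      static bool example_timer_is_reserved(int id)
      {
      	assert(id >= 1 && id <= EXAMPLE_MAX_TIMERS);
      	return (example_reserved_systimers & (1u << (id - 1))) != 0;
      }
      ```

      Because the reservation is recorded by a call made at boot, nothing about
      it has to live in the platform data or in the device-tree.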
    • J
      ARM: OMAP2+: Remove unused max number of timers definition · 26fe4e45
      Jon Hunter committed
      The OMAP2+ timer code has a definition for the maximum number of timers that
      OMAP2+ devices have. This definition is not used anywhere in the code and
      appears to be left over. Furthermore the definition is not accurate for OMAP4
      devices that only have 11 timers available because the 12th timer is reserved
      as a secure timer and for OMAP3 devices the 12th timer is not available on
      secure devices. Therefore, remove this definition.
      Signed-off-by: Jon Hunter <jon-hunter@ti.com>
      Signed-off-by: Tony Lindgren <tony@atomide.com>
      26fe4e45
    • J
      ARM: OMAP: Remove unnecessary clk structure · b8fd7331
      Jon Hunter committed
      In the plat/dmtimer.h there is a structure named "clk" declared. This structure
      is not used and appears to be left over from previous code. Hence, remove this
      unused structure.
      
      Verified that both omap1 and omap2plus kernel configurations build with this
      change.
      Signed-off-by: Jon Hunter <jon-hunter@ti.com>
      Signed-off-by: Tony Lindgren <tony@atomide.com>
      b8fd7331
  2. 09 Jun, 2012 14 commits
    • L
      Linux 3.5-rc2 · cfaf0251
      Linus Torvalds committed
      cfaf0251
    • D
      mm, oom: fix badness score underflow · 1e11ad8d
      David Rientjes committed
      If the privileges given to root threads (3% of allowable memory) or a
      negative value of /proc/pid/oom_score_adj happen to exceed the amount of
      rss of a thread, its badness score overflows as a result of commit
      a7f638f9 ("mm, oom: normalize oom scores to oom_score_adj scale only
      for userspace").
      
      Fix this by making the type signed and return 1, meaning the thread is
      still eligible for kill, if the value is negative.
      Reported-by: Dave Jones <davej@redhat.com>
      Acked-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1e11ad8d
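      The arithmetic behind the commit above can be shown with a small sketch.
      It is not the kernel's oom_badness() code: the function names and the
      single "discount" parameter (standing in for the root bonus or a negative
      oom_score_adj) are assumptions for illustration. The first function wraps
      around when the discount exceeds rss; the second uses a signed type and
      returns 1 for a negative result, as the commit message describes.

      ```c
      #include <assert.h>
      #include <limits.h>

      /* Pre-fix shape: unsigned subtraction wraps to a huge score. */
      static unsigned long example_badness_unsigned(unsigned long rss_pages,
      					      unsigned long discount_pages)
      {
      	return rss_pages - discount_pages;
      }

      /* Post-fix shape: signed arithmetic, negative result reported as 1. */
      static unsigned long example_badness_signed(unsigned long rss_pages,
      					    unsigned long discount_pages)
      {
      	long points;

      	assert(rss_pages <= LONG_MAX && discount_pages <= LONG_MAX);
      	points = (long)rss_pages - (long)discount_pages;
      	return points < 0 ? 1UL : (unsigned long)points;
      }
      ```

      With 100 pages of rss and a discount of 300 pages, the unsigned version
      yields a value close to ULONG_MAX, which would make the thread look like
      the worst offender, while the signed version yields 1.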
    • L
      Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 72494504
      Linus Torvalds committed
      Pull scheduler fixes from Ingo Molnar.
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched: Fix the relax_domain_level boot parameter
        sched: Validate assumptions in sched_init_numa()
        sched: Always initialize cpu-power
        sched: Fix domain iteration
        sched/rt: Fix lockdep annotation within find_lock_lowest_rq()
        sched/numa: Load balance between remote nodes
        sched/x86: Calculate booted cores after construction of sibling_mask
      72494504
    • R
      sched/fair: fix lots of kernel-doc warnings · cd96891d
      Randy Dunlap committed
      Fix lots of new kernel-doc warnings in kernel/sched/fair.c:
      
        Warning(kernel/sched/fair.c:3625): No description found for parameter 'env'
        Warning(kernel/sched/fair.c:3625): Excess function parameter 'sd' description in 'update_sg_lb_stats'
        Warning(kernel/sched/fair.c:3735): No description found for parameter 'env'
        Warning(kernel/sched/fair.c:3735): Excess function parameter 'sd' description in 'update_sd_pick_busiest'
        Warning(kernel/sched/fair.c:3735): Excess function parameter 'this_cpu' description in 'update_sd_pick_busiest'
        .. more warnings
      Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cd96891d
    • L
      Revert "drm/i915/crt: Do not rely upon the HPD presence pin" · 8f53369b
      Linus Torvalds committed
      This reverts commit 9e612a00.
      
      It incorrectly finds VGA connectors where none are attached, apparently
      not noticing that nothing replied to the EDID queries, and happily using
      the default EDID modes that have nothing to do with actual hardware.
      
      That in turn then causes X to fall down to the lowest common
      denominator, which is usually the default 1024x768 mode that is in the
      default EDID (and that pretty much anything supports).
      
      I'd suggest that if not relying on the HPD pin, the code should at least
      check whether it gets valid EDID data back, rather than just assume
      there's something on the VGA connector.
      
      Cc: Dave Airlie <airlied@linux.ie>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      8f53369b
    • L
      Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · 77249539
      Linus Torvalds committed
      Pull ext4 bug fixes from Theodore Ts'o:
       "This update contains two bug fixes, both destined for the stable tree.
        Perhaps the most important is one which fixes ext4 when used with file
        systems originally formatted for use with ext3, but then later
        converted to take advantage of ext4."
      
      * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        ext4: don't set i_flags in EXT4_IOC_SETFLAGS
        ext4: fix the free blocks calculation for ext3 file systems w/ uninit_bg
      77249539
    • L
      Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc · 3e9ca022
      Linus Torvalds committed
      Pull powerpc fixes from Paul Mackerras:
       "Two small fixes for powerpc:
         - a fix for a regression since 3.2 that causes 4-second (or longer)
           pauses
         - a fix for a potential oops when loading kernel modules on 32-bit
           embedded systems."
      
      * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
        powerpc: Fix kernel panic during kernel module load
        powerpc/time: Sanity check of decrementer expiration is necessary
      3e9ca022
    • L
      Merge tag 'upstream-3.5-rc2' of git://git.infradead.org/linux-ubifs · e7264308
      Linus Torvalds committed
      Pull UBI/UBIFS fixes from Artem Bityutskiy:
       "Fix UBI and UBIFS - they refuse to work without debugfs.  This was
        broken by the 3.5-rc1 UBI/UBIFS changes when we removed the debugging
        Kconfig switches.
      
        Also, correct locking in 'ubi_wl_flush()' - it was extended to support
        flushing a specific LEB in 3.5-rc1, and the locking was sub-optimal."
      
      * tag 'upstream-3.5-rc2' of git://git.infradead.org/linux-ubifs:
        UBI: correct ubi_wl_flush locking
        UBIFS: fix debugfs-less systems support
        UBI: fix debugfs-less systems support
      e7264308
    • L
      Revert "vfs: stop d_splice_alias creating directory aliases" · 32ba9c3f
      Linus Torvalds committed
      This reverts commit 7732a557 (and commit
      3f50fff4, which was a follow-up
      cleanup).
      
      We're chasing an elusive bug that Dave Jones can apparently reproduce
      using his system call fuzzer tool, and that looks like some kind of
      locking ordering problem on the directory i_mutex chain.  Our i_mutex
      locking is rather complex, and depends on the topological ordering of
      the directories, which is why we have been very wary of splicing
      directory entries around.
      
      Of course, we really don't want to ever see aliased unconnected
      directories anyway, so none of this should ever happen, but this revert
      aims to basically get us back to a known older state.
      
      Bruce points to some of the previous discussion at
      
             http://marc.info/?i=<20110310105821.GE22723@ZenIV.linux.org.uk>
      
      and in particular a long post from Neil:
      
             http://marc.info/?i=<20110311150749.2fa2be66@notabene.brown>
      
      It should be noted that it's possible that Dave's problems come from
      other changes altogether, including possibly just the fact that Dave
      constantly is teaching his fuzzer new tricks.  So what appears to be a
      new bug could in fact be an old one that just gets newly triggered, but
      reverting these patches as "still under heavy discussion" is the right
      thing regardless.
      Requested-by: Al Viro <viro@zeniv.linux.org.uk>
      Acked-by: J. Bruce Fields <bfields@fieldses.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      32ba9c3f
    • L
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 0b35d326
      Linus Torvalds committed
      Pull x86 fixes from Ingo Molnar.
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/nmi: Fix section mismatch warnings on 32-bit
        x86/uv: Fix UV2 BAU legacy mode
        x86/mm: Only add extra pages count for the first memory range during pre-allocation early page table space
        x86, efi stub: Add .reloc section back into image
        x86/ioapic: Fix NULL pointer dereference on CPU hotplug after disabling irqs
        x86/reboot: Fix a warning message triggered by stop_other_cpus()
        x86/intel/moorestown: Change intel_scu_devices_create() to __devinit
        x86/numa: Set numa_nodes_parsed at acpi_numa_memory_affinity_init()
        x86/gart: Fix kmemleak warning
        x86: mce: Add the dropped timer interval init back
        x86/mce: Fix the MCE poll timer logic
      0b35d326
    • L
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 106544d8
      Linus Torvalds committed
      Pull perf fixes from Ingo Molnar:
       "A bit larger than what I'd wish for - half of it is due to hw driver
        updates to Intel Ivy-Bridge which info got recently released,
        cycles:pp should work there now too, amongst other things.  (but we
        are generally making exceptions for hardware enablement of this type.)
      
        There are also callchain fixes in it - responding to mostly
        theoretical (but valid) concerns.  The tooling side sports perf.data
        endianness/portability fixes which did not make it for the merge
        window - and various other fixes as well."
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits)
        perf/x86: Check user address explicitly in copy_from_user_nmi()
        perf/x86: Check if user fp is valid
        perf: Limit callchains to 127
        perf/x86: Allow multiple stacks
        perf/x86: Update SNB PEBS constraints
        perf/x86: Enable/Add IvyBridge hardware support
        perf/x86: Implement cycles:p for SNB/IVB
        perf/x86: Fix Intel shared extra MSR allocation
        x86/decoder: Fix bsr/bsf/jmpe decoding with operand-size prefix
        perf: Remove duplicate invocation on perf_event_for_each
        perf uprobes: Remove unnecessary check before strlist__delete
        perf symbols: Check for valid dso before creating map
        perf evsel: Fix 32 bit values endianity swap for sample_id_all header
        perf session: Handle endianity swap on sample_id_all header data
        perf symbols: Handle different endians properly during symbol load
        perf evlist: Pass third argument to ioctl explicitly
        perf tools: Update ioctl documentation for PERF_IOC_FLAG_GROUP
        perf tools: Make --version show kernel version instead of pull req tag
        perf tools: Check if callchain is corrupted
        perf callchain: Make callchain cursors TLS
        ...
      106544d8
    • L
      Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · 03d8f540
      Linus Torvalds committed
      Pull drm intel and exynos fixes from Dave Airlie:
       "A bunch of fixes for Intel and exynos, nothing too major, a new intel
        PCI ID, and a fix for CRT detection."
      
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
        drm/i915: pch_irq_handler -> {ibx, cpt}_irq_handler
        char/agp: add another Ironlake host bridge
        drm/i915: fix up ivb plane 3 pageflips
        drm/exynos: fixed blending for hdmi graphic layer
        drm/exynos: Remove dummy encoder get_crtc operation implementation
        drm/exynos: Keep a reference to frame buffer GEM objects
        drm/exynos: Don't cast GEM object to Exynos GEM object when not needed
        drm/exynos: DRIVER_BUS_PLATFORM is not a driver feature
        drm/exynos: fixed size type.
        drm/exynos: Use DRM_FORMAT_{NV12, YUV420} instead of DRM_FORMAT_{NV12M, YUV420M}
        drm/i915: hold forcewake around ring hw init
        drm/i915: Mark the ringbuffers as being in the GTT domain
        drm/i915/crt: Do not rely upon the HPD presence pin
        drm/i915: Reset last_retired_head when resetting ring
      03d8f540
    • L
      Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b1e25f41
      Linus Torvalds committed
      Pull leap second timer fix from Thomas Gleixner.
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        timekeeping: Fix CLOCK_MONOTONIC inconsistency during leapsecond
      b1e25f41
    • L
      Merge tag 'moduleparam-for-linus' of... · 857505fa
      Linus Torvalds committed
      Merge tag 'moduleparam-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
      
      Pull minor module param fixes from Rusty Russell:
       "One bugfix for multiple moduleparam levels, one removal of overzealous
        printk."
      
      * tag 'moduleparam-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
        init: Drop initcall level output
        module_param: stop double-calling parameters.
      857505fa
  3. 08 Jun, 2012 19 commits
    • D
      x86/nmi: Fix section mismatch warnings on 32-bit · eeaaa96a
      Don Zickus committed
      It was reported that compiling for 32-bit caused a bunch of
      section mismatch warnings:
      
       VDSOSYM arch/x86/vdso/vdso32-syms.lds
        LD      arch/x86/vdso/built-in.o
        LD      arch/x86/built-in.o
      
       WARNING: arch/x86/built-in.o(.data+0x5af0): Section mismatch in
       reference from the variable test_nmi_ipi_callback_na.10451 to
       the function .init.text:test_nmi_ipi_callback() [...]
      
       WARNING: arch/x86/built-in.o(.data+0x5b04): Section mismatch in
       reference from the variable nmi_unk_cb_na.10399 to the function
       .init.text:nmi_unk_cb() The variable nmi_unk_cb_na.10399
       references the function __init nmi_unk_cb() [...]
      
      Both of these are attributed to the internal representation of
      the nmiaction struct created during register_nmi_handler.  The
      reason for this is that those structs are not defined in the
      init section whereas the rest of the code in nmi_selftest.c is.
      
      To resolve this, I created a new #define,
      register_nmi_handler_initonly, that tags the struct as
      __initdata to resolve the mismatch.  This #define should only be
      used in rare situations where the register/unregister is called
      during init of the kernel.
      
      Big thanks to Jan Beulich for decoding this for me as I didn't
      have a clue what was going on.
      Reported-by: Witold Baryluk <baryluk@smp.if.uj.edu.pl>
      Tested-by: Witold Baryluk <baryluk@smp.if.uj.edu.pl>
      Cc: Jan Beulich <JBeulich@suse.com>
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Link: http://lkml.kernel.org/r/1338991542-23000-1-git-send-email-dzickus@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      eeaaa96a
    • S
      powerpc: Fix kernel panic during kernel module load · 3c752965
      Steffen Rumler committed
      This fixes a problem which can cause kernel oopses while loading
      a kernel module.
      
      According to the PowerPC EABI specification, GPR r11 is assigned
      the dedicated function to point to the previous stack frame.
      In the powerpc-specific kernel module loader, do_plt_call()
      (in arch/powerpc/kernel/module_32.c), GPR r11 is also used
      to generate trampoline code.
      
      This combination crashes the kernel, in the case where the compiler
      chooses to use a helper function for saving GPRs on entry, and the
      module loader has placed the .init.text section far away from the
      .text section, meaning that it has to generate a trampoline for
      functions in the .init.text section to call the GPR save helper.
      Because the trampoline trashes r11, references to the stack frame
      using r11 can cause an oops.
      
      The fix just uses GPR r12 instead of GPR r11 for generating the
      trampoline code.  According to the statements from Freescale, this is
      safe from an EABI perspective.
      
      I've tested the fix for kernel 2.6.33 on MPC8541.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Steffen Rumler <steffen.rumler.ext@nsn.com>
      [paulus@samba.org: reworded the description]
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      3c752965
    • C
      x86/uv: Fix UV2 BAU legacy mode · d5d2d2ee
      Cliff Wickman committed
      The SGI Altix UV2 BAU (Broadcast Assist Unit) as used for
      tlb-shootdown (selective broadcast mode) always uses UV2
      broadcast descriptor format. There is no need to clear the
      'legacy' (UV1) mode, because the hardware always uses UV2 mode
      for selective broadcast.
      
      But the BIOS uses general broadcast and legacy mode, and the
      hardware pays attention to the legacy mode bit for general
      broadcast. So the kernel must not clear that mode bit.
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Cc: <stable@kernel.org>
      Link: http://lkml.kernel.org/r/E1SccoO-0002Lh-Cb@eag09.americas.sgi.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      d5d2d2ee
    • Y
      x86/mm: Only add extra pages count for the first memory range during... · bd2753b2
      Yinghai Lu committed
      x86/mm: Only add extra pages count for the first memory range during pre-allocation early page table space
      
      Robin found this regression:
      
      | I just tried to boot an 8TB system.  It fails very early in boot with:
      | Kernel panic - not syncing: Cannot find space for the kernel page tables
      
      git bisect points to commit 722bc6b1.
      
      A git revert of that commit does boot past that point on the 8TB
      configuration.
      
      That commit adds the extra page count for every memory range, even those
      above 4G.
      
      Limit the extra page count to the first range only.
      Bisected-by: Robin Holt <holt@sgi.com>
      Tested-by: Robin Holt <holt@sgi.com>
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Cc: WANG Cong <xiyou.wangcong@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/CAE9FiQUj3wyzQxtq9yzBNc9u220p8JZ1FYHG7t%3DMOzJ%3D9BZMYA@mail.gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      bd2753b2
    • D
      Merge branch 'exynos-drm-fixes' of... · 2d5c7cd3
      Dave Airlie committed
      Merge branch 'exynos-drm-fixes' of git://git.infradead.org/users/kmpark/linux-samsung into drm-fixes
      
      * 'exynos-drm-fixes' of git://git.infradead.org/users/kmpark/linux-samsung:
        drm/exynos: fixed blending for hdmi graphic layer
        drm/exynos: Remove dummy encoder get_crtc operation implementation
        drm/exynos: Keep a reference to frame buffer GEM objects
        drm/exynos: Don't cast GEM object to Exynos GEM object when not needed
        drm/exynos: DRIVER_BUS_PLATFORM is not a driver feature
        drm/exynos: fixed size type.
        drm/exynos: Use DRM_FORMAT_{NV12, YUV420} instead of DRM_FORMAT_{NV12M, YUV420M}
      2d5c7cd3
    • D
      Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel into drm-fixes · 6cf98d6e
      Dave Airlie committed
      * 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel:
        drm/i915: pch_irq_handler -> {ibx, cpt}_irq_handler
        char/agp: add another Ironlake host bridge
        drm/i915: fix up ivb plane 3 pageflips
        drm/i915: hold forcewake around ring hw init
        drm/i915: Mark the ringbuffers as being in the GTT domain
        drm/i915/crt: Do not rely upon the HPD presence pin
        drm/i915: Reset last_retired_head when resetting ring
      6cf98d6e
    • B
      init: Drop initcall level output · 19efb72f
      Borislav Petkov committed
      9fb48c74 ("params: add 3rd arg to option handler callback
      signature") added similar lines to dmesg:
      
      initlevel:0=early, 4 registered initcalls
      initlevel:1=core, 31 registered initcalls
      initlevel:2=postcore, 11 registered initcalls
      initlevel:3=arch, 7 registered initcalls
      initlevel:4=subsys, 40 registered initcalls
      initlevel:5=fs, 30 registered initcalls
      initlevel:6=device, 250 registered initcalls
      initlevel:7=late, 35 registered initcalls
      
      but they don't contain any info for the general user staring at dmesg.
      I'm very doubtful the count of initcalls registered per level helps
      anyone so drop that output completely.
      
      Cc: Jim Cromie <jim.cromie@gmail.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jason Baron <jbaron@redhat.com>
      Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      19efb72f
    • R
      module_param: stop double-calling parameters. · ae82fdb1
      Rusty Russell committed
      Commit 026cee00 "params:
      <level>_initcall-like kernel parameters" set old-style module
      parameters to level 0.  And we call those level 0 calls where we used
      to, early in start_kernel().
      
      We also loop through the initcall levels and call the levelled
      module_params before the corresponding initcall.  Unfortunately level
      0 is early_init(), so we call the standard module_param calls twice.
      
      (Turns out most things don't care, but at least ubi.mtd does).
      
      Change the level to -1 for standard module_param calls.
      Reported-by: Benoît Thébaudeau <benoit.thebaudeau@advansee.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: stable@kernel.org
      ae82fdb1
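      The double call described in the commit above can be reproduced with a toy
      model of the boot sequence: one early pass over the standard parameters,
      followed by one pass per initcall level. The structure and function names
      are assumptions for illustration; the level numbers 0 (early) to 7 (late)
      follow the dmesg listing quoted elsewhere in this log.

      ```c
      #include <assert.h>
      #include <stddef.h>

      #define EXAMPLE_NUM_INITCALL_LEVELS 8   /* levels 0 (early) .. 7 (late) */

      struct example_param {
      	int level;   /* level the parameter is tagged with */
      	int calls;   /* how many times its handler ran */
      };

      /* Run the handler of every parameter whose level lies in [min, max]. */
      static void example_parse_params(struct example_param *params, size_t n,
      				 int min_level, int max_level)
      {
      	for (size_t i = 0; i < n; i++)
      		if (params[i].level >= min_level && params[i].level <= max_level)
      			params[i].calls++;
      }

      /* Count handler calls for a standard parameter tagged with "level". */
      static int example_boot_call_count(int level)
      {
      	struct example_param p = { level, 0 };

      	assert(level >= -1 && level < EXAMPLE_NUM_INITCALL_LEVELS);
      	/* Early pass over the standard parameters. */
      	example_parse_params(&p, 1, level, level);
      	/* Levelled passes, one before each initcall level. */
      	for (int l = 0; l < EXAMPLE_NUM_INITCALL_LEVELS; l++)
      		example_parse_params(&p, 1, l, l);
      	return p.calls;
      }
      ```

      In this model a standard parameter tagged with level 0 is handled twice,
      once in the early pass and once in the level 0 pass, whereas a parameter
      tagged with -1 is handled only in the early pass.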
    • P
      powerpc/time: Sanity check of decrementer expiration is necessary · 860aed25
      Paul Mackerras committed
      This reverts 68568add ("powerpc/time: Remove unnecessary sanity check
      of decrementer expiration").  We do need to check whether we have reached
      the expiration time of the next event, because we sometimes get an early
      decrementer interrupt, most notably when we set the decrementer to 1 in
      arch_irq_work_raise().  The effect of not having the sanity check is that
      if timer_interrupt() gets called early, we leave the decrementer set to
      its maximum value, which means we then don't get any more decrementer
      interrupts for about 4 seconds (or longer, depending on timebase
      frequency).  I saw these pauses as a consequence of getting a stray
      hypervisor decrementer interrupt left over from exiting a KVM guest.
      
      This isn't quite a straight revert because of changes to the surrounding
      code, but it restores the same algorithm as was previously used.
      
      Cc: stable@vger.kernel.org
      Acked-by: Anton Blanchard <anton@samba.org>
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      860aed25
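      The sanity check restored by the commit above can be sketched as follows.
      This is a simplified model, not the powerpc timer_interrupt() code: the
      structure, the function name and the EXAMPLE_DECREMENTER_MAX value are
      assumptions for illustration. On an early interrupt the decrementer is
      re-armed with the time still remaining instead of being left at its
      maximum value.

      ```c
      #include <assert.h>
      #include <stddef.h>
      #include <stdint.h>

      #define EXAMPLE_DECREMENTER_MAX 0x7fffffffu  /* assumed maximum count */

      struct example_cpu_timer {
      	uint64_t next_event;     /* timebase value of the next event */
      	uint32_t decrementer;    /* value programmed into the decrementer */
      	int events_handled;
      };

      static void example_timer_interrupt(struct example_cpu_timer *t, uint64_t now)
      {
      	assert(t != NULL);
      	/* Park the decrementer first, as if no event were pending. */
      	t->decrementer = EXAMPLE_DECREMENTER_MAX;

      	if (now >= t->next_event) {
      		/* The event really expired: handle it. */
      		t->next_event = UINT64_MAX;
      		t->events_handled++;
      	} else {
      		/* Early interrupt: re-arm for the time that is left. */
      		uint64_t remaining = t->next_event - now;

      		if (remaining <= EXAMPLE_DECREMENTER_MAX)
      			t->decrementer = (uint32_t)remaining;
      	}
      }
      ```

      Without the else branch, an early interrupt would leave the decrementer
      at its maximum, which matches the multi-second pauses the commit message
      reports.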
    • L
      Revert "mm: correctly synchronize rss-counters at exit/exec" · 48d212a2
      Linus Torvalds committed
      This reverts commit 40af1bbd.
      
      It's horribly and utterly broken for at least the following reasons:
      
       - calling sync_mm_rss() from mmput() is fundamentally wrong, because
         there's absolutely no reason to believe that the task that does the
         mmput() always does it on its own VM.  Example: fork, ptrace, /proc -
         you name it.
      
       - calling it *after* having done mmdrop() on it is doubly insane, since
         the mm struct may well be gone now.
      
       - testing mm against NULL before you call it is insane too, since a
         NULL mm there would have caused oopses long before.
      
      .. and those are just the three bugs I found before I decided to give up
      looking for more and revert it asap.  I should have caught it before I
      even took it, but I trusted Andrew too much.
      
      Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
      Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      48d212a2
    • T
      ext4: don't set i_flags in EXT4_IOC_SETFLAGS · b22b1f17
      Tao Ma committed
      Commit 79906964 uses the ext4_{set,clear}_inode_flags() functions to
      change the i_flags automatically but fails to remove the erroneous setting
      of i_flags.  So we still have the problem of trashing state flags.
      Fix this by removing the assignment.
      Signed-off-by: Tao Ma <boyu.mt@taobao.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@kernel.org
      b22b1f17
    • T
      ext4: fix the free blocks calculation for ext3 file systems w/ uninit_bg · b0dd6b70
      Theodore Ts'o committed
      Ext3 filesystems that are converted to use as many ext4 file system
      features as possible will enable uninit_bg to speed up e2fsck times.
      These file systems will have a native ext3 layout of inode tables and
      block allocation bitmaps (as opposed to ext4's flex_bg layout).
      Unfortunately, in these cases, when first allocating a block in an
      uninitialized block group, ext4 would incorrectly calculate the number
      of free blocks in that block group, and then erroneously report that
      the file system was corrupt:
      
      EXT4-fs error (device vdd): ext4_mb_generate_buddy:741: group 30, 32254 clusters in bitmap, 32258 in gd
      
      This problem can be reproduced via:
      
          mke2fs -q -t ext4 -O ^flex_bg /dev/vdd 5g
          mount -t ext4 /dev/vdd /mnt
          fallocate -l 4600m /mnt/test
      
      The problem was caused by a bone headed mistake in the check to see if a
      particular metadata block was part of the block group.
      
      Many thanks to Kees Cook for finding and bisecting the buggy commit
      which introduced this bug (commit fd034a84, present since v3.2).
      Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
      Reported-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Tested-by: Kees Cook <keescook@chromium.org>
      Cc: stable@kernel.org
      b0dd6b70
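To illustrate the kind of check involved, here is a minimal sketch of counting how many of a group's metadata blocks actually fall inside that group when its free count is first initialized. It does not reproduce the buggy or the fixed kernel code, which the message does not show. All names (`sim_group_layout`, `sim_block_in_group`, `sim_free_blocks_after_init`) are hypothetical, and the sketch ignores superblock backups, group descriptors and clusters.

```c
#include <stdint.h>

/* Hypothetical, simplified description of one block group. */
struct sim_group_layout {
    uint64_t first_block;        /* first block number of the group */
    uint64_t blocks_per_group;   /* number of blocks in the group */
    uint64_t block_bitmap;       /* location of the block bitmap */
    uint64_t inode_bitmap;       /* location of the inode bitmap */
    uint64_t inode_table;        /* first block of the inode table */
    uint64_t inode_table_blocks; /* length of the inode table in blocks */
};

/* True if blk lies inside the group's own block range. */
static int sim_block_in_group(const struct sim_group_layout *g, uint64_t blk)
{
    return blk >= g->first_block &&
           blk < g->first_block + g->blocks_per_group;
}

/* Free blocks of a fresh group: total minus metadata that lives in it. */
static uint64_t sim_free_blocks_after_init(const struct sim_group_layout *g)
{
    uint64_t used = 0;

    if (sim_block_in_group(g, g->block_bitmap))
        used++;
    if (sim_block_in_group(g, g->inode_bitmap))
        used++;
    /* Each inode table block must be tested individually. */
    for (uint64_t i = 0; i < g->inode_table_blocks; i++)
        if (sim_block_in_group(g, g->inode_table + i))
            used++;

    return g->blocks_per_group - used;
}
```

With a native ext3 layout every metadata block is inside its own group and is subtracted; with a flex_bg-style layout the metadata may live elsewhere and nothing is subtracted. A mistake in the membership test makes the bitmap and the group descriptor disagree, which is the symptom in the error message above.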
    • L
      Merge branch 'akpm' (Andrew's fixups) · 46edaeda
      Linus Torvalds committed
      Merge random fixes from Andrew Morton.
      
      * emailed from Andrew Morton <akpm@linux-foundation.org>: (11 patches)
        mm: correctly synchronize rss-counters at exit/exec
        btree: catch NULL value before it does harm
        btree: fix tree corruption in btree_get_prev()
        ipc: shm: restore MADV_REMOVE functionality on shared memory segments
        drivers/platform/x86/acerhdf.c: correct Boris' mail address
        c/r: prctl: drop VMA flags test on PR_SET_MM_ stack data assignment
        c/r: prctl: add ability to get clear_tid_address
        c/r: prctl: add minimal address test to PR_SET_MM
        c/r: prctl: update prctl_set_mm_exe_file() after mm->num_exe_file_vmas removal
        MAINTAINERS: whitespace fixes
        shmem: replace_page must flush_dcache and others
      46edaeda
    • K
      mm: correctly synchronize rss-counters at exit/exec · 40af1bbd
      Konstantin Khlebnikov committed
      mm->rss_stat counters have per-task delta: task->rss_stat.  Before
      changing task->mm pointer the kernel must flush this delta with
      sync_mm_rss().
      
      do_exit() already calls sync_mm_rss() to flush the rss-counters before
      committing the rss statistics into task->signal->maxrss, taskstats,
      audit and other stuff.  Unfortunately the kernel does this before
      calling mm_release(), which can call put_user() for processing
      task->clear_child_tid.  So at this point we can trigger page-faults and
      task->rss_stat becomes non-zero again.  As a result mm->rss_stat becomes
      inconsistent and check_mm() will print something like this:
      
      | BUG: Bad rss-counter state mm:ffff88020813c380 idx:1 val:-1
      | BUG: Bad rss-counter state mm:ffff88020813c380 idx:2 val:1
      
      This patch moves sync_mm_rss() into mm_release(), and moves mm_release()
      out of do_exit() and calls it earlier.  After mm_release() there should
      be no pagefaults.
      
      [akpm@linux-foundation.org: tweak comment]
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
      Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: <stable@vger.kernel.org>		[3.4.x]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      40af1bbd
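The ordering problem described in this message can be sketched with a toy model of a per-task delta that is flushed into a shared counter. This is a user-space illustration only: `sim_mm`, `sim_task`, `sim_account_fault`, `sim_sync_mm_rss` and `sim_detach_mm` are hypothetical names, not the kernel's data structures or functions.

```c
#include <stddef.h>

/* Hypothetical shared counter, standing in for mm->rss_stat. */
struct sim_mm {
    long rss;
};

/* Hypothetical task with a private delta, standing in for task->rss_stat. */
struct sim_task {
    struct sim_mm *mm;
    long rss_delta;
};

/* A page fault is accounted in the task-local delta first. */
static void sim_account_fault(struct sim_task *t)
{
    t->rss_delta++;
}

/* Flush the task-local delta into the shared counter. */
static void sim_sync_mm_rss(struct sim_task *t)
{
    t->mm->rss += t->rss_delta;
    t->rss_delta = 0;
}

/*
 * Drop the task's mm pointer. Returns the delta that was never flushed;
 * a non-zero result means the shared counter is now inconsistent.
 */
static long sim_detach_mm(struct sim_task *t)
{
    long stranded = t->rss_delta;

    t->mm = NULL;
    t->rss_delta = 0;
    return stranded;
}
```

If a fault is accounted after the last `sim_sync_mm_rss()` but before `sim_detach_mm()`, the delta is stranded. That is the situation the message attributes to `put_user()` running from `mm_release()` after the flush.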
    • J
      btree: catch NULL value before it does harm · 39caa091
      Joern Engel committed
      Storing NULL values in the btree is illegal and can lead to memory
      corruption and possibly other fun as well.  Catch it on insert, instead
      of waiting for the inevitable.
      Signed-off-by: Joern Engel <joern@logfs.org>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      39caa091
    • R
      btree: fix tree corruption in btree_get_prev() · cbf8ae32
      Roland Dreier committed
      The memory the parameter __key points to is used as an iterator in
      btree_get_prev(), so if we save off a bkey() pointer in retry_key and
      then assign that to __key, we'll end up corrupting the btree internals
      when we do e.g.
      
      	longcpy(__key, bkey(geo, node, i), geo->keylen);
      
      to return the key value.  What we should do instead is use longcpy() to
      copy the key value that retry_key points to __key.
      
      This can cause a btree to get corrupted by seemingly read-only
      operations such as btree_for_each_safe.
      
      [akpm@linux-foundation.org: avoid the double longcpy()]
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      Acked-by: Joern Engel <joern@logfs.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cbf8ae32
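The aliasing hazard described in this message can be shown in isolation. The sketch below is not the btree code: `sim_node`, `sim_longcpy`, `sim_return_key_buggy` and `sim_return_key_fixed` are hypothetical names, and the key layout is simplified to two words per key.

```c
#include <stddef.h>

#define SIM_KEYLEN 2

/* Hypothetical tree node that stores two keys inline. */
struct sim_node {
    unsigned long keys[2][SIM_KEYLEN];
};

/* Word-wise copy, modelled on the longcpy() call quoted above. */
static void sim_longcpy(unsigned long *dest, const unsigned long *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dest[i] = src[i];
}

/* Buggy pattern: the iterator pointer is redirected into the node. */
static void sim_return_key_buggy(struct sim_node *node, unsigned long *caller_key)
{
    unsigned long *key = caller_key;
    unsigned long *retry_key = node->keys[1];

    key = retry_key;                              /* aliases tree storage */
    sim_longcpy(key, node->keys[0], SIM_KEYLEN);  /* overwrites keys[1] */
}

/* Fixed pattern: copy the value, keep pointing at the caller's buffer. */
static void sim_return_key_fixed(struct sim_node *node, unsigned long *caller_key)
{
    unsigned long *key = caller_key;
    unsigned long *retry_key = node->keys[1];

    sim_longcpy(key, retry_key, SIM_KEYLEN);      /* value copy only */
    sim_longcpy(key, node->keys[0], SIM_KEYLEN);  /* writes caller memory */
}
```

In the buggy variant the later write lands in the node, so a lookup that should be read-only modifies the tree. In the fixed variant only the caller's buffer changes.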
    • W
      ipc: shm: restore MADV_REMOVE functionality on shared memory segments · 7d8a4569
      Will Deacon committed
      Commit 17cf28af ("mm/fs: remove truncate_range") removed the
      truncate_range inode operation in favour of the fallocate file
      operation.
      
      When using SYSV IPC shared memory segments, calling madvise with the
      MADV_REMOVE advice on an area of shared memory will attempt to invoke
      the .fallocate function for the shm_file_operations, which is NULL and
      therefore returns -EOPNOTSUPP to userspace.  The previous behaviour
      would inherit the inode_operations from the underlying tmpfs file and
      invoke truncate_range there.
      
      This patch restores the previous behaviour by wrapping the underlying
      fallocate function in shm_fallocate, as we do for fsync.
      
      [hughd@google.com: use -ENOTSUPP in shm_fallocate()]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Acked-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7d8a4569
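The wrapping approach described in this message can be sketched with plain function pointers. This is a user-space model under stated assumptions: `sim_file`, `sim_file_ops`, `sim_tmpfs_fallocate`, `sim_shm_fallocate` and `sim_do_fallocate` are hypothetical names, and the error code used here is the userspace `EOPNOTSUPP` from `<errno.h>`.

```c
#include <errno.h>
#include <stddef.h>

struct sim_file;

/* Hypothetical operations table with a single optional operation. */
struct sim_file_ops {
    long (*fallocate)(struct sim_file *f, int mode, long offset, long len);
};

/* Hypothetical file; "backing" models the underlying tmpfs file. */
struct sim_file {
    const struct sim_file_ops *ops;
    struct sim_file *backing;
    long bytes_punched;
};

/* Stand-in for the underlying implementation: records the request. */
static long sim_tmpfs_fallocate(struct sim_file *f, int mode, long offset, long len)
{
    (void)mode;
    (void)offset;
    f->bytes_punched += len;
    return 0;
}

/* Wrapper: forward to the backing file if it supports the operation. */
static long sim_shm_fallocate(struct sim_file *f, int mode, long offset, long len)
{
    struct sim_file *under = f->backing;

    if (under == NULL || under->ops == NULL || under->ops->fallocate == NULL)
        return -EOPNOTSUPP;
    return under->ops->fallocate(under, mode, offset, len);
}

/* Generic dispatch: a NULL operation is reported as unsupported. */
static long sim_do_fallocate(struct sim_file *f, int mode, long offset, long len)
{
    if (f->ops == NULL || f->ops->fallocate == NULL)
        return -EOPNOTSUPP;
    return f->ops->fallocate(f, mode, offset, len);
}
```

A file whose table leaves `fallocate` as NULL fails in `sim_do_fallocate()`, which models the regression. Installing `sim_shm_fallocate` in the table forwards the request to the backing file instead.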
    • B
      drivers/platform/x86/acerhdf.c: correct Boris' mail address · 4e791c98
      Borislav Petkov committed
      Correct mail address reference to a mail account which I actually read.
      Signed-off-by: Borislav Petkov <bp@alien8.de>
      Cc: Peter Feuerer <peter@piie.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4e791c98
    • C
      c/r: prctl: drop VMA flags test on PR_SET_MM_ stack data assignment · 736f24d5
      Cyrill Gorcunov committed
      In commit b7643757 ("procfs: mark thread stack correctly in
      proc/<pid>/maps") the stack allocated via clone() is marked in
      /proc/<pid>/maps as [stack:%d] thus it might be out of the former
      mm->start_stack/end_stack values (and even has some custom VMA flags
      set).
      
      So to be able to restore mm->start_stack/end_stack drop vma flags test,
      but still require the underlying VMA to exist.
      
      As always note this feature is under CONFIG_CHECKPOINT_RESTORE and
      requires CAP_SYS_RESOURCE to be granted.
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Kees Cook <keescook@chromium.org>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Serge Hallyn <serge.hallyn@canonical.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      736f24d5