1. 16 November 2013 (2 commits)
  2. 15 November 2013 (22 commits)
  3. 13 November 2013 (16 commits)
    • ipc, msg: fix message length check for negative values · 4e9b45a1
      Authored by Mathias Krause
      On 64-bit systems the test for negative message sizes is bogus: the
      size, which may be positive when evaluated as a long, gets truncated
      to an int when passed to load_msg().  So a long might very well contain
      a positive value that becomes negative once truncated to an int.
      
      That, in combination with a small negative value of msg_ctlmax (which
      will be promoted to an unsigned type for the comparison against msgsz,
      making it a big positive value and therefore letting it pass the
      check), will lead to two problems: 1/ The kmalloc() call in alloc_msg()
      will allocate a buffer that is too small, as the addition of alen is
      effectively a subtraction.  2/ The copy_from_user() call in load_msg()
      will first overflow the buffer with userland data and then, when the
      userland access generates an access violation, the fixup handler
      copy_user_handle_tail() will try to fill the remainder with zeros --
      roughly 4GB.  That almost instantly results in a system crash or reset.
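
      As an illustration only (a minimal user-space sketch, not the kernel
      code itself), the two conversions described above look like this:

        #include <stdio.h>
        #include <stddef.h>

        int main(void)
        {
                size_t msgsz = 0xfffffff0;  /* size passed by userland        */
                int msg_ctlmax = -1;        /* sysctl set to a negative value */

                /* msg_ctlmax is converted to size_t (SIZE_MAX) for the
                 * comparison, so even this huge request "passes" the check. */
                if (msgsz > (size_t)msg_ctlmax)
                        puts("rejected");
                else
                        puts("check passed");

                /* Passing the size on as an int truncates it to a negative
                 * value, which is what load_msg() used to see. */
                printf("as int: %d\n", (int)msgsz);  /* prints -16 */
                return 0;
        }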
      
        ,-[ Reproducer (needs to be run as root) ]--
        | #include <sys/stat.h>
        | #include <sys/msg.h>
        | #include <unistd.h>
        | #include <fcntl.h>
        |
        | int main(void) {
        |     long msg = 1;
        |     int fd;
        |
        |     fd = open("/proc/sys/kernel/msgmax", O_WRONLY);
        |     write(fd, "-1", 2);
        |     close(fd);
        |
        |     msgsnd(0, &msg, 0xfffffff0, IPC_NOWAIT);
        |
        |     return 0;
        | }
        '---
      
      Fix the issue by preventing msgsz from getting truncated by consistently
      using size_t for the message length.  This way the size checks in
      do_msgsnd() could still be passed with a negative value for msg_ctlmax but
      we would fail on the buffer allocation in that case and error out.
      
      Also change the type of m_ts from int to size_t to avoid similar nastiness
      in other code paths -- it is used in similar constructs, i.e.  signed vs.
      unsigned checks.  It should never become negative under normal
      circumstances, though.
      
      Setting msg_ctlmax to a negative value is an odd configuration and should
      be prevented.  As that might break existing userland, it will be handled
      in a separate commit so it could easily be reverted and reworked without
      reintroducing the above described bug.
      
      Hardening mechanisms for user copy operations would have caught that
      bug early -- e.g. checking slab object sizes on user copy operations as
      the usercopy feature of the PaX patch does.  Or, for that matter,
      detecting the long vs. int sign change due to truncation, as the size
      overflow plugin of the very same patch does.
      
      [akpm@linux-foundation.org: fix i386 min() warnings]
      Signed-off-by: Mathias Krause <minipli@googlemail.com>
      Cc: Pax Team <pageexec@freemail.hu>
      Cc: Davidlohr Bueso <davidlohr@hp.com>
      Cc: Brad Spengler <spender@grsecurity.net>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: <stable@vger.kernel.org>	[ v2.3.27+ -- yes, that old ;) ]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • rbtree: fix rbtree_postorder_for_each_entry_safe() iterator · 1310a5a9
      Authored by Jan Kara
      The iterator rbtree_postorder_for_each_entry_safe() relies on pointer
      underflow behavior when testing for loop termination.  In particular it
      expects that
      
        &rb_entry(NULL, type, field)->field
      
      is NULL.  But the result of this expression is not defined by a C standard
      and some gcc versions (e.g.  4.3.4) assume the above expression can never
      be equal to NULL.  The net result is an oops because the iteration is not
      properly terminated.
      
      Fix the problem by modifying the iterator to avoid pointer underflows.
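
      For reference, a rough sketch of the reworked iterator: it assumes an
      rb_entry_safe()-style helper that maps a NULL node to a NULL entry
      instead of evaluating &rb_entry(NULL, ...)->field, so the loop can test
      pos directly.  Details may differ from the actual patch.

        /* Sketch only; relies on the existing rb_first_postorder() and
         * rb_next_postorder() helpers. */
        #define rb_entry_safe(ptr, type, member) \
                ({ typeof(ptr) ____ptr = (ptr); \
                   ____ptr ? rb_entry(____ptr, type, member) : NULL; })

        #define rbtree_postorder_for_each_entry_safe(pos, n, root, field) \
                for (pos = rb_entry_safe(rb_first_postorder(root), \
                                         typeof(*pos), field); \
                     pos && ({ n = rb_entry_safe(rb_next_postorder(&pos->field), \
                                                 typeof(*pos), field); 1; }); \
                     pos = n)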
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Artem Bityutskiy <dedekind1@gmail.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Cc: Pablo Neira Ayuso <pablo@netfilter.org>
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: <stable@vger.kernel.org>		[3.12.x]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • exec/ptrace: fix get_dumpable() incorrect tests · d049f74f
      Authored by Kees Cook
      The get_dumpable() return value is not boolean.  Most users of the
      function actually want to be testing for non-SUID_DUMP_USER (1) rather
      than SUID_DUMP_DISABLE (0).  SUID_DUMP_ROOT (2) is also considered a
      protected state.  Almost all places did this correctly, except for the
      two places fixed in this patch.
      
      Wrong logic:
          if (dumpable == SUID_DUMP_DISABLE) { /* be protective */ }
              or
          if (dumpable == 0) { /* be protective */ }
              or
          if (!dumpable) { /* be protective */ }
      
      Correct logic:
          if (dumpable != SUID_DUMP_USER) { /* be protective */ }
              or
          if (dumpable != 1) { /* be protective */ }
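
      For context, the dumpable states are small integer constants, shown
      here as a sketch together with a hypothetical helper that applies the
      correct test (the constant values match the commit message; the helper
      is illustrative only):

        #define SUID_DUMP_DISABLE 0   /* no setuid dumping              */
        #define SUID_DUMP_USER    1   /* dump as the user of the task   */
        #define SUID_DUMP_ROOT    2   /* dump as root -- also protected */

        /* Hypothetical helper: anything other than SUID_DUMP_USER must be
         * treated as a protected state. */
        static inline int dumpable_is_protected(int dumpable)
        {
                return dumpable != SUID_DUMP_USER;
        }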
      
      Without this patch, if the system had set the sysctl fs/suid_dumpable=2, a
      user was able to ptrace attach to processes that had dropped privileges to
      that user.  (This may have been partially mitigated if Yama was enabled.)
      
      The macros have been moved into the file that declares get/set_dumpable(),
      which means things like the ia64 code can see them too.
      
      CVE-2013-2929
      Reported-by: Vasily Kulikov <segoon@openwall.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • rtc: s5m-rtc: add real-time clock driver for s5m8767 · 5bccae6e
      Authored by Sangbeom Kim
      Add real-time clock driver for s5m8767.
      Signed-off-by: Sangbeom Kim <sbkim73@samsung.com>
      Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
      Cc: Todd Broch <tbroch@chromium.org>
      Cc: Mark Brown <broonie@kernel.org>
      Acked-by: Lee Jones <lee.jones@linaro.org>	[mfd parts]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • init.h: document the existence of __initconst · 65321547
      Authored by Geert Uytterhoeven
      Initdata has been allowed to be const for more than five years, using
      the __initconst keyword.
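
      A brief usage sketch (the symbol name is hypothetical):

        static const char example_banner[] __initconst = "booting...";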
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • list: introduce list_last_entry(), use list_{first,last}_entry() · 93be3c2e
      Authored by Oleg Nesterov
      We already have list_first_entry(); it makes sense to also add
      list_last_entry() for consistency.  And we use both helpers in
      list_for_each_*().
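
      A sketch of the pair (list_first_entry() already exists;
      list_last_entry() simply mirrors it on the prev pointer):

        /* Sketch; both build on the existing list_entry() helper. */
        #define list_first_entry(ptr, type, member) \
                list_entry((ptr)->next, type, member)

        #define list_last_entry(ptr, type, member) \
                list_entry((ptr)->prev, type, member)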
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Eilon Greenstein <eilong@broadcom.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • list: change list_for_each_entry*() to use list_*_entry() · 8120e2e5
      Authored by Oleg Nesterov
      Now that we have list_{next,prev}_entry() we can change
      list_for_each_entry*() and list_safe_reset_next() to use the new helpers
      to improve the readability.
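
      As a sketch of the idea, the basic forward iterator can be written in
      terms of the new helpers (the other variants follow the same pattern;
      the exact upstream macros may differ slightly):

        #define list_for_each_entry(pos, head, member)                    \
                for (pos = list_first_entry(head, typeof(*pos), member);  \
                     &pos->member != (head);                              \
                     pos = list_next_entry(pos, member))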
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Eilon Greenstein <eilong@broadcom.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • list: introduce list_next_entry() and list_prev_entry() · 008208c6
      Authored by Oleg Nesterov
      Add two trivial helpers list_next_entry() and list_prev_entry(), they
      can have a lot of users including list.h itself.  In fact the 1st one is
      already defined in events/core.c and bnx2x_sp.c, so the patch simply
      moves the definition to list.h.
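
      A sketch of the two helpers as they would appear in list.h (both
      expand to list_entry() on the embedded list_head):

        #define list_next_entry(pos, member) \
                list_entry((pos)->member.next, typeof(*(pos)), member)

        #define list_prev_entry(pos, member) \
                list_entry((pos)->member.prev, typeof(*(pos)), member)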
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Eilon Greenstein <eilong@broadcom.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • lib/genalloc: add a helper function for DMA buffer allocation · 684f0d3d
      Authored by Nicolin Chen
      When using pool space for a DMA buffer, each implementation ends up
      duplicating the same calls to gen_pool_alloc() and
      gen_pool_virt_to_phys().
      
      Thus it is better to add a simple helper function, compatible with the
      common dma_alloc_coherent(), to save some code.
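
      A rough sketch of such a helper (the upstream function is named
      gen_pool_dma_alloc(); exact details may differ):

        /* Sketch; assumes <linux/genalloc.h>.  Allocates DMA memory from a
         * gen_pool and reports its bus address through *dma. */
        void *gen_pool_dma_alloc(struct gen_pool *pool, size_t size,
                                 dma_addr_t *dma)
        {
                unsigned long vaddr;

                if (!pool)
                        return NULL;

                vaddr = gen_pool_alloc(pool, size);
                if (!vaddr)
                        return NULL;

                if (dma)
                        *dma = gen_pool_virt_to_phys(pool, vaddr);

                return (void *)vaddr;
        }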
      Signed-off-by: Nicolin Chen <b42378@freescale.com>
      Cc: "Hans J. Koch" <hjk@hansjkoch.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Eric Miao <eric.y.miao@gmail.com>
      Cc: Grant Likely <grant.likely@linaro.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Haojian Zhuang <haojian.zhuang@gmail.com>
      Cc: Jaroslav Kysela <perex@perex.cz>
      Cc: Kevin Hilman <khilman@deeprootsystems.com>
      Cc: Liam Girdwood <lgirdwood@gmail.com>
      Cc: Mark Brown <broonie@kernel.org>
      Cc: Mauro Carvalho Chehab <m.chehab@samsung.com>
      Cc: Rob Herring <rob.herring@calxeda.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Sekhar Nori <nsekhar@ti.com>
      Cc: Takashi Iwai <tiwai@suse.de>
      Cc: Vinod Koul <vinod.koul@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • backlight: lm3630: apply chip revision · 28e64a68
      Authored by Daniel Jeong
      The LM3630 chip was revised by TI and renamed to LM3630A.  The register
      map, default values and initialization sequences changed as well.  The
      files lm3630_bl.{c,h} are replaced by lm3630a_bl.{c,h}.  More
      information about the LM3630A (datasheet, EVM, etc.) can be found at
      http://www.ti.com/product/lm3630a
      Signed-off-by: Daniel Jeong <gshark.jeong@gmail.com>
      Signed-off-by: Jingoo Han <jg1.han@samsung.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • backlight: lp855x_bl: support new LP8555 device · 5812c13a
      Authored by Milo Kim
      LP8555 is one of the LP855x family devices.
      
      This device needs the pre_init_device() and post_init_device() driver
      hooks.  This is the same as for the LP8557, so the device configuration
      code is shared with LP8557.  Backlight outputs are generated from dual
      DC-DC boost converters, which are configured through EPROM settings
      defined in the platform data.
      
      Driver documentation and device tree bindings are updated.
      Signed-off-by: Milo Kim <milo.kim@ti.com>
      Signed-off-by: Jingoo Han <jg1.han@samsung.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • sched: remove ARCH specific fpu_counter from task_struct · 27f69e68
      Authored by Vineet Gupta
      fpu_counter in task_struct was used only by sh/x86.  Both of these now
      carry it in ARCH specific thread_struct, hence this can now be removed
      from generic task_struct, shrinking it slightly for other arches.
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Paul Mundt <paul.mundt@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • jump_label: unlikely(x) > 0 · 261adc9a
      Authored by Roel Kluin
      if (unlikely(x) > 0) doesn't help branch prediction: the hint is
      applied to x itself rather than to the comparison that is actually
      branched on.
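
      A minimal sketch of the pattern being fixed (illustrative only; not the
      exact lines touched by the patch, and do_rare_thing() is hypothetical):

        /* before: the hint wraps only the value; the comparison is unhinted */
        if (unlikely(x) > 0)
                do_rare_thing();

        /* after: the whole condition carries the hint */
        if (unlikely(x > 0))
                do_rare_thing();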
      Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
      Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: "H. Peter Anvin" <hpa@linux.intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • syscalls.h: use gcc alias instead of assembler aliases for syscalls · 83460ec8
      Authored by Andi Kleen
      Use standard gcc __attribute__((alias(foo))) to define the syscall aliases
      instead of custom assembler macros.
      
      This is far cleaner, and also fixes my LTO kernel build.
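
      For readers unfamiliar with the mechanism, a generic sketch of a gcc
      alias (not the kernel macro itself; foo/foo_impl are hypothetical
      names):

        /* foo_impl() is the real definition; foo() becomes an alias for it
         * and resolves to the same symbol at link time. */
        long foo_impl(long x)
        {
                return x + 1;
        }

        long foo(long x) __attribute__((alias("foo_impl")));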
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: numa: return the number of base pages altered by protection changes · 72403b4a
      Authored by Mel Gorman
      Commit 0255d491 ("mm: Account for a THP NUMA hinting update as one
      PTE update") was added to account for the number of PTE updates when
      marking pages prot_numa.  task_numa_work was using the old return value
      to track how much address space had been updated.  Altering the return
      value causes the scanner to do more work than it is configured or
      documented to in a single unit of work.
      
      This patch reverts that commit and accounts for the number of THP
      updates separately in vmstat.  It is up to the administrator to
      interpret the pair of values correctly.  This is a straight-forward
      operation and likely to only be of interest when actively debugging NUMA
      balancing problems.
      
      The impact of this patch is that the NUMA PTE scanner will scan more
      slowly when THP is enabled and workloads may converge more slowly as a
      result.  On the flip side, system CPU usage should be lower than recent
      tests reported.  This is an illustrative example of a short single-JVM
      specjbb test:
      
      specjbb
                             3.12.0                3.12.0
                            vanilla      acctupdates
      TPut 1      26143.00 (  0.00%)     25747.00 ( -1.51%)
      TPut 7     185257.00 (  0.00%)    183202.00 ( -1.11%)
      TPut 13    329760.00 (  0.00%)    346577.00 (  5.10%)
      TPut 19    442502.00 (  0.00%)    460146.00 (  3.99%)
      TPut 25    540634.00 (  0.00%)    549053.00 (  1.56%)
      TPut 31    512098.00 (  0.00%)    519611.00 (  1.47%)
      TPut 37    461276.00 (  0.00%)    474973.00 (  2.97%)
      TPut 43    403089.00 (  0.00%)    414172.00 (  2.75%)
      
                    3.12.0      3.12.0
                   vanilla acctupdates
      User         5169.64     5184.14
      System        100.45       80.02
      Elapsed       252.75      251.85
      
      Performance is similar, but note the reduction in system CPU time.
      While this run showed a performance gain, that will not be universal,
      but at least the scanner will be behaving as documented.  The vmstats
      are obviously different; here is an interpretation of them from
      mmtests.
      
                                      3.12.0      3.12.0
                                     vanilla acctupdates
      NUMA page range updates        1408326    11043064
      NUMA huge PMD updates                0       21040
      NUMA PTE updates               1408326      291624
      
      "NUMA page range updates" == nr_pte_updates and is the value returned to
      the NUMA pte scanner.  NUMA huge PMD updates were the number of THP
      updates which in combination can be used to calculate how many ptes were
      updated from userspace.
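
      For example, assuming HPAGE_PMD_NR == 512 base pages per THP and that
      each huge PMD update is also counted once in the PTE updates figure,
      the numbers above are consistent: (291624 - 21040) + 21040 * 512 =
      11043064, which is exactly the reported number of page range updates.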
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Reported-by: Alex Thorlton <athorlton@sgi.com>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: factor commit limit calculation · 00619bcc
      Authored by Jerome Marchand
      The same calculation is currently done in three different places.
      Factor that code so future changes have to be made in only one place.
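
      A rough sketch of the factored helper, named vm_commit_limit() as in
      the note below (the expression mirrors the usual overcommit formula and
      may differ in detail from the patch):

        /* Committable address space: (RAM - hugetlb) * overcommit% + swap. */
        unsigned long vm_commit_limit(void)
        {
                return ((totalram_pages - hugetlb_total_pages())
                        * sysctl_overcommit_ratio / 100) + total_swap_pages;
        }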
      
      [akpm@linux-foundation.org: uninline vm_commit_limit()]
      Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>