1. 15 11月, 2013 5 次提交
    • K
      mm, thp: move ptl taking inside page_check_address_pmd() · 117b0791
      Kirill A. Shutemov 提交于
      With split page table lock we can't know which lock we need to take
      before we find the relevant pmd.
      
      Let's move lock taking inside the function.
      Signed-off-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Tested-by: NAlex Thorlton <athorlton@sgi.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "Eric W . Biederman" <ebiederm@xmission.com>
      Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Robin Holt <robinmholt@gmail.com>
      Cc: Sedat Dilek <sedat.dilek@gmail.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Hugh Dickins <hughd@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      117b0791
    • K
      mm, thp: change pmd_trans_huge_lock() to return taken lock · bf929152
      Kirill A. Shutemov 提交于
      With split ptlock it's important to know which lock
      pmd_trans_huge_lock() took.  This patch adds one more parameter to the
      function to return the lock.
      
      In most places migration to new api is trivial.  Exception is
      move_huge_pmd(): we need to take two locks if pmd tables are different.
      Signed-off-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Tested-by: NAlex Thorlton <athorlton@sgi.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "Eric W . Biederman" <ebiederm@xmission.com>
      Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Robin Holt <robinmholt@gmail.com>
      Cc: Sedat Dilek <sedat.dilek@gmail.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Hugh Dickins <hughd@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      bf929152
    • K
      mm: introduce api for split page table lock for PMD level · 9a86cb7b
      Kirill A. Shutemov 提交于
      Basic api, backed by mm->page_table_lock for now. Actual implementation
      will be added later.
      Signed-off-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Tested-by: NAlex Thorlton <athorlton@sgi.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "Eric W . Biederman" <ebiederm@xmission.com>
      Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Robin Holt <robinmholt@gmail.com>
      Cc: Sedat Dilek <sedat.dilek@gmail.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Hugh Dickins <hughd@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9a86cb7b
    • K
      mm: convert mm->nr_ptes to atomic_long_t · e1f56c89
      Kirill A. Shutemov 提交于
      With split page table lock for PMD level we can't hold mm->page_table_lock
      while updating nr_ptes.
      
      Let's convert it to atomic_long_t to avoid races.
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Tested-by: NAlex Thorlton <athorlton@sgi.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: "Eric W . Biederman" <ebiederm@xmission.com>
      Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Robin Holt <robinmholt@gmail.com>
      Cc: Sedat Dilek <sedat.dilek@gmail.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Hugh Dickins <hughd@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e1f56c89
    • K
      mm: rename USE_SPLIT_PTLOCKS to USE_SPLIT_PTE_PTLOCKS · 57c1ffce
      Kirill A. Shutemov 提交于
      We're going to introduce split page table lock for PMD level.  Let's
      rename existing split ptlock for PTE level to avoid confusion.
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Tested-by: NAlex Thorlton <athorlton@sgi.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: "Eric W . Biederman" <ebiederm@xmission.com>
      Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Robin Holt <robinmholt@gmail.com>
      Cc: Sedat Dilek <sedat.dilek@gmail.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Hugh Dickins <hughd@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      57c1ffce
  2. 13 11月, 2013 26 次提交
  3. 12 11月, 2013 4 次提交
    • D
      random32: upgrade taus88 generator to taus113 from errata paper · a98814ce
      Daniel Borkmann 提交于
      Since we use prandom*() functions quite often in networking code
      i.e. in UDP port selection, netfilter code, etc, upgrade the PRNG
      from Pierre L'Ecuyer's original paper "Maximally Equidistributed
      Combined Tausworthe Generators", Mathematics of Computation, 65,
      213 (1996), 203--213 to the version published in his errata paper [1].
      
      The Tausworthe generator is a maximally-equidistributed generator,
      that is fast and has good statistical properties [1].
      
      The version presented there upgrades the 3 state LFSR to a 4 state
      LFSR with increased periodicity from about 2^88 to 2^113. The
      algorithm is presented in [1] by the very same author who also
      designed the original algorithm in [2].
      
      Also, by increasing the state, we make it a bit harder for attackers
      to "guess" the PRNGs internal state. See also discussion in [3].
      
      Now, as we use this sort of weak initialization discussed in [3]
      only between core_initcall() until late_initcall() time [*] for
      prandom32*() users, namely in prandom_init(), it is less relevant
      from late_initcall() onwards as we overwrite seeds through
      prandom_reseed() anyways with a seed source of higher entropy, that
      is, get_random_bytes(). In other words, a exhaustive keysearch of
      96 bit would be needed. Now, with the help of this patch, this
      state-search increases further to 128 bit. Initialization needs
      to make sure that s1 > 1, s2 > 7, s3 > 15, s4 > 127.
      
      taus88 and taus113 algorithm is also part of GSL. I added a test
      case in the next patch to verify internal behaviour of this patch
      with GSL and ran tests with the dieharder 3.31.1 RNG test suite:
      
      $ dieharder -g 052 -a -m 10 -s 1 -S 4137730333 #taus88
      $ dieharder -g 054 -a -m 10 -s 1 -S 4137730333 #taus113
      
      With this seed configuration, in order to compare both, we get
      the following differences:
      
      algorithm                 taus88           taus113
      rands/second [**]         1.61e+08         1.37e+08
      sts_serial(4, 1st run)    WEAK             PASSED
      sts_serial(9, 2nd run)    WEAK             PASSED
      rgb_lagged_sum(31)        WEAK             PASSED
      
      We took out diehard_sums test as according to the authors it is
      considered broken and unusable [4]. Despite that and the slight
      decrease in performance (which is acceptable), taus113 here passes
      all 113 tests (only rgb_minimum_distance_5 in WEAK, the rest PASSED).
      In general, taus/taus113 is considered "very good" by the authors
      of dieharder [5].
      
      The papers [1][2] states a single warm-up step is sufficient by
      running quicktaus once on each state to ensure proper initialization
      of ~s_{0}:
      
      Our selection of (s) according to Table 1 of [1] row 1 holds the
      condition L - k <= r - s, that is,
      
        (32 32 32 32) - (31 29 28 25) <= (25 27 15 22) - (18 2 7 13)
      
      with r = k - q and q = (6 2 13 3) as also stated by the paper.
      So according to [2] we are safe with one round of quicktaus for
      initialization. However we decided to include the warm-up phase
      of the PRNG as done in GSL in every case as a safety net. We also
      use the warm up phase to make the output of the RNG easier to
      verify by the GSL output.
      
      In prandom_init(), we also mix random_get_entropy() into it, just
      like drivers/char/random.c does it, jiffies ^ random_get_entropy().
      random-get_entropy() is get_cycles(). xor is entropy preserving so
      it is fine if it is not implemented by some architectures.
      
      Note, this PRNG is *not* used for cryptography in the kernel, but
      rather as a fast PRNG for various randomizations i.e. in the
      networking code, or elsewhere for debugging purposes, for example.
      
      [*]: In order to generate some "sort of pseduo-randomness", since
      get_random_bytes() is not yet available for us, we use jiffies and
      initialize states s1 - s3 with a simple linear congruential generator
      (LCG), that is x <- x * 69069; and derive s2, s3, from the 32bit
      initialization from s1. So the above quote from [3] accounts only
      for the time from core to late initcall, not afterwards.
      [**] Single threaded run on MacBook Air w/ Intel Core i5-3317U
      
       [1] http://www.iro.umontreal.ca/~lecuyer/myftp/papers/tausme2.ps
       [2] http://www.iro.umontreal.ca/~lecuyer/myftp/papers/tausme.ps
       [3] http://thread.gmane.org/gmane.comp.encryption.general/12103/
       [4] http://code.google.com/p/dieharder/source/browse/trunk/libdieharder/diehard_sums.c?spec=svn490&r=490#20
       [5] http://www.phy.duke.edu/~rgb/General/dieharder.php
      
      Joint work with Hannes Frederic Sowa.
      
      Cc: Florian Weimer <fweimer@redhat.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a98814ce
    • D
      random32: move rnd_state to linux/random.h · 38e9efcd
      Daniel Borkmann 提交于
      struct rnd_state got mistakenly pulled into uapi header. It is not
      used anywhere and does also not belong there!
      
      Commit 5960164f ("lib/random32: export pseudo-random number
      generator for modules"), the last commit on rnd_state before it
      got moved to uapi, says:
      
        This patch moves the definition of struct rnd_state and the inline
        __seed() function to linux/random.h.  It renames the static __random32()
        function to prandom32() and exports it for use in modules.
      
      Hence, the structure was moved from lib/random32.c to linux/random.h
      so that it can be used within modules (FCoE-related code in this
      case), but not from user space. However, it seems to have been
      mistakenly moved to uapi header through the uapi script. Since no-one
      should make use of it from the linux headers, move the structure back
      to the kernel for internal use, so that it can be modified on demand.
      
      Joint work with Hannes Frederic Sowa.
      
      Cc: Joe Eykholt <jeykholt@cisco.com>
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      38e9efcd
    • H
      random32: add prandom_reseed_late() and call when nonblocking pool becomes initialized · 4af712e8
      Hannes Frederic Sowa 提交于
      The Tausworthe PRNG is initialized at late_initcall time. At that time the
      entropy pool serving get_random_bytes is not filled sufficiently. This
      patch adds an additional reseeding step as soon as the nonblocking pool
      gets marked as initialized.
      
      On some machines it might be possible that late_initcall gets called after
      the pool has been initialized. In this situation we won't reseed again.
      
      (A call to prandom_seed_late blocks later invocations of early reseed
      attempts.)
      
      Joint work with Daniel Borkmann.
      
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Acked-by: N"Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4af712e8
    • D
      random32: fix off-by-one in seeding requirement · 51c37a70
      Daniel Borkmann 提交于
      For properly initialising the Tausworthe generator [1], we have
      a strict seeding requirement, that is, s1 > 1, s2 > 7, s3 > 15.
      
      Commit 697f8d03 ("random32: seeding improvement") introduced
      a __seed() function that imposes boundary checks proposed by the
      errata paper [2] to properly ensure above conditions.
      
      However, we're off by one, as the function is implemented as:
      "return (x < m) ? x + m : x;", and called with __seed(X, 1),
      __seed(X, 7), __seed(X, 15). Thus, an unwanted seed of 1, 7, 15
      would be possible, whereas the lower boundary should actually
      be of at least 2, 8, 16, just as GSL does. Fix this, as otherwise
      an initialization with an unwanted seed could have the effect
      that Tausworthe's PRNG properties cannot not be ensured.
      
      Note that this PRNG is *not* used for cryptography in the kernel.
      
       [1] http://www.iro.umontreal.ca/~lecuyer/myftp/papers/tausme.ps
       [2] http://www.iro.umontreal.ca/~lecuyer/myftp/papers/tausme2.ps
      
      Joint work with Hannes Frederic Sowa.
      
      Fixes: 697f8d03 ("random32: seeding improvement")
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Cc: Florian Weimer <fweimer@redhat.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      51c37a70
  4. 11 11月, 2013 3 次提交
  5. 10 11月, 2013 2 次提交