1. 23 December 2014 (2 commits)
    • ea174f4c
    • x86: Fix step size adjustment during initial memory mapping · 132978b9
      Committed by Jan Beulich
      The old scheme can lead to failure in certain cases - the
      problem is that after bumping step_size the next (non-final)
      iteration is only guaranteed to make available a memory block
      the size of what step_size was before. E.g. for a memory block
      [0,3004600000) we'd have:
      
       iter	start		end		step		amount
       1	3004400000	30045fffff	 2M		  2M
       2	3004000000	30043fffff	64M		  4M
       3	3000000000	3003ffffff	 2G		 64M
       4	2000000000	2fffffffff	64G		 64G
      
      Yet to map 64G with 4k pages (as happens e.g. under PV Xen) we need
      slightly over 128M of page tables (64G of 4k pages means 16M PTEs at
      8 bytes each, i.e. 128M, plus the higher-level tables above them),
      but the first three iterations made only about 70M available.
      
      The condition (new_mapped_ram_size > mapped_ram_size) for
      bumping step_size is just not suitable. Instead we want to bump
      it when we know we have enough memory available to cover a block
      of the new step_size. And rather than making that condition more
      complicated than needed, simply adjust step_size by the largest
      possible factor we know we can cover at that point - which is
      shifting it left by one less than the difference between page
      table level shifts. (Interestingly the original STEP_SIZE_SHIFT
      definition had a comment hinting at that having been the
      intention, just that it should have been PUD_SHIFT-PMD_SHIFT-1
      instead of (PUD_SHIFT-PMD_SHIFT)/2, and of course for non-PAE
      32-bit we can't really use these two constants as they're equal
      there.)
      
      Furthermore the comment in get_new_step_size() didn't get
      updated when the bottom-up mapping logic got added. Yet while
      an overflow of the shift (which flushes step_size to zero)
      doesn't matter for the top-down method, it does for bottom-up
      because round_up(x, 0) = 0, and an upper range boundary of zero
      can't really work well.
      Signed-off-by: Jan Beulich <jbeulich@suse.com>
      Acked-by: Yinghai Lu <yinghai@kernel.org>
      Link: http://lkml.kernel.org/r/54945C1E020000780005114E@mail.emea.novell.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
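
      As a rough illustration of the bump described above, here is a minimal,
      self-contained C sketch (not the verbatim kernel code; the constant
      values and the demo loop are assumptions). The shift is written as
      PMD_SHIFT - PAGE_SHIFT - 1, which on x86-64 equals
      PUD_SHIFT - PMD_SHIFT - 1 but, unlike the latter, stays usable on
      non-PAE 32-bit where PUD_SHIFT and PMD_SHIFT coincide:

      #include <stdio.h>

      #define PAGE_SHIFT 12UL    /* 4k pages                  */
      #define PMD_SHIFT  21UL    /* 2M entries (x86-64 / PAE) */

      /*
       * Grow step_size by the largest factor the already-mapped block is
       * known to cover: the page tables needed for the next block then
       * always fit into what the previous iteration made available.
       */
      static unsigned long get_new_step_size(unsigned long step_size)
      {
              return step_size << (PMD_SHIFT - PAGE_SHIFT - 1);
      }

      int main(void)
      {
              unsigned long step = 1UL << PMD_SHIFT;   /* start with one 2M block */

              for (int i = 1; i <= 4; i++) {
                      printf("iter %d: step_size = %lu MB\n", i, step >> 20);
                      step = get_new_step_size(step);
              }
              return 0;
      }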
  2. 21 December 2014 (1 commit)
    • x86_64, vdso: Fix the vdso address randomization algorithm · 394f56fe
      Committed by Andy Lutomirski
      The theory behind vdso randomization is that it's mapped at a random
      offset above the top of the stack.  To avoid wasting a page of
      memory for an extra page table, the vdso isn't supposed to extend
      past the lowest PMD into which it can fit.  Other than that, the
      address should be a uniformly distributed address that meets all of
      the alignment requirements.
      
      The current algorithm is buggy: the vdso has about a 50% probability
      of being at the very end of a PMD.  The current algorithm also has a
      decent chance of failing outright due to incorrect handling of the
      case where the top of the stack is near the top of its PMD.
      
      This fixes the implementation.  The paxtest estimate of vdso
      "randomisation" improves from 11 bits to 18 bits.  (Disclaimer: I
      don't know what the paxtest code is actually calculating.)
      
      It's worth noting that this algorithm is inherently biased: the vdso
      is more likely to end up near the end of its PMD than near the
      beginning.  Ideally we would either nix the PMD sharing requirement
      or jointly randomize the vdso and the stack to reduce the bias.
      
      In the meantime, this is a considerable improvement with basically
      no risk of compatibility issues, since the allowed outputs of the
      algorithm are unchanged.
      
      As an easy test, doing this:
      
      for i in `seq 10000`
        do grep -P vdso /proc/self/maps |cut -d- -f1
      done |sort |uniq -d
      
      used to produce lots of output (1445 lines on my most recent run).
      A tiny subset looks like this:
      
      7fffdfffe000
      7fffe01fe000
      7fffe05fe000
      7fffe07fe000
      7fffe09fe000
      7fffe0bfe000
      7fffe0dfe000
      
      Note the suspicious fe000 endings.  With the fix, I get a much more
      palatable 76 repeated addresses.
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Andy Lutomirski <luto@amacapital.net>
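
      For illustration, a hedged userspace simulation of the fixed placement
      logic (the constant values, the use of rand(), and the helper name are
      assumptions for this sketch, not the kernel's exact code): align the
      stack top up to a page, round the lowest possible end of the vdso up
      to a PMD boundary, clamp it, and pick a uniformly random page offset
      in the remaining range.

      #include <stdio.h>
      #include <stdlib.h>

      #define PAGE_SHIFT    12UL
      #define PAGE_SIZE     (1UL << PAGE_SHIFT)
      #define PMD_SIZE      (1UL << 21)                      /* 2M       */
      #define PMD_MASK      (~(PMD_SIZE - 1))
      #define TASK_SIZE_MAX 0x7ffffffff000UL                 /* assumed  */
      #define PAGE_ALIGN(x) (((x) + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1))

      /* Pick a page-aligned vdso start above 'start' such that the vdso
       * never extends past the lowest PMD into which it can fit, and every
       * allowed start address is equally likely. */
      static unsigned long pick_vdso_addr(unsigned long start, unsigned long len)
      {
              unsigned long end, offset;

              start = PAGE_ALIGN(start);      /* stack top may be unaligned */

              /* Lowest possible end of the vdso, rounded up to a PMD boundary. */
              end = (start + len + PMD_SIZE - 1) & PMD_MASK;
              if (end >= TASK_SIZE_MAX)
                      end = TASK_SIZE_MAX;
              end -= len;                     /* highest allowed start      */

              if (end <= start)
                      return start;

              /* Uniform over all page-aligned starts in [start, end]. */
              offset = (unsigned long)rand() % (((end - start) >> PAGE_SHIFT) + 1);
              return start + (offset << PAGE_SHIFT);
      }

      int main(void)
      {
              /* Hypothetical stack top and a two-page vdso image. */
              printf("%lx\n", pick_vdso_addr(0x7ffcdeadb000UL, 2 * PAGE_SIZE));
              return 0;
      }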
  3. 18 December 2014 (5 commits)
  4. 16 December 2014 (32 commits)