1. 19 7月, 2012 1 次提交
  2. 15 7月, 2012 3 次提交
    • T
      random: add new get_random_bytes_arch() function · c2557a30
      Theodore Ts'o 提交于
      Create a new function, get_random_bytes_arch() which will use the
      architecture-specific hardware random number generator if it is
      present.  Change get_random_bytes() to not use the HW RNG, even if it
      is avaiable.
      
      The reason for this is that the hw random number generator is fast (if
      it is present), but it requires that we trust the hardware
      manufacturer to have not put in a back door.  (For example, an
      increasing counter encrypted by an AES key known to the NSA.)
      
      It's unlikely that Intel (for example) was paid off by the US
      Government to do this, but it's impossible for them to prove otherwise
      --- especially since Bull Mountain is documented to use AES as a
      whitener.  Hence, the output of an evil, trojan-horse version of
      RDRAND is statistically indistinguishable from an RDRAND implemented
      to the specifications claimed by Intel.  Short of using a tunnelling
      electronic microscope to reverse engineer an Ivy Bridge chip and
      disassembling and analyzing the CPU microcode, there's no way for us
      to tell for sure.
      
      Since users of get_random_bytes() in the Linux kernel need to be able
      to support hardware systems where the HW RNG is not present, most
      time-sensitive users of this interface have already created their own
      cryptographic RNG interface which uses get_random_bytes() as a seed.
      So it's much better to use the HW RNG to improve the existing random
      number generator, by mixing in any entropy returned by the HW RNG into
      /dev/random's entropy pool, but to always _use_ /dev/random's entropy
      pool.
      
      This way we get almost of the benefits of the HW RNG without any
      potential liabilities.  The only benefits we forgo is the
      speed/performance enhancements --- and generic kernel code can't
      depend on depend on get_random_bytes() having the speed of a HW RNG
      anyway.
      
      For those places that really want access to the arch-specific HW RNG,
      if it is available, we provide get_random_bytes_arch().
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      c2557a30
    • L
      random: create add_device_randomness() interface · a2080a67
      Linus Torvalds 提交于
      Add a new interface, add_device_randomness() for adding data to the
      random pool that is likely to differ between two devices (or possibly
      even per boot).  This would be things like MAC addresses or serial
      numbers, or the read-out of the RTC. This does *not* add any actual
      entropy to the pool, but it initializes the pool to different values
      for devices that might otherwise be identical and have very little
      entropy available to them (particularly common in the embedded world).
      
      [ Modified by tytso to mix in a timestamp, since there may be some
        variability caused by the time needed to detect/configure the hardware
        in question. ]
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      a2080a67
    • T
      random: make 'add_interrupt_randomness()' do something sane · 775f4b29
      Theodore Ts'o 提交于
      We've been moving away from add_interrupt_randomness() for various
      reasons: it's too expensive to do on every interrupt, and flooding the
      CPU with interrupts could theoretically cause bogus floods of entropy
      from a somewhat externally controllable source.
      
      This solves both problems by limiting the actual randomness addition
      to just once a second or after 64 interrupts, whicever comes first.
      During that time, the interrupt cycle data is buffered up in a per-cpu
      pool.  Also, we make sure the the nonblocking pool used by urandom is
      initialized before we start feeding the normal input pool.  This
      assures that /dev/urandom is returning unpredictable data as soon as
      possible.
      
      (Based on an original patch by Linus, but significantly modified by
      tytso.)
      Tested-by: NEric Wustrow <ewust@umich.edu>
      Reported-by: NEric Wustrow <ewust@umich.edu>
      Reported-by: NNadia Heninger <nadiah@cs.ucsd.edu>
      Reported-by: NZakir Durumeric <zakir@umich.edu>
      Reported-by: J. Alex Halderman <jhalderm@umich.edu>.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      775f4b29
  3. 01 7月, 2012 1 次提交
  4. 21 6月, 2012 3 次提交
  5. 19 6月, 2012 1 次提交
  6. 18 6月, 2012 2 次提交
    • S
      ftrace: Make all inline tags also include notrace · 93b3cca1
      Steven Rostedt 提交于
      Commit 5963e317 ("ftrace/x86: Do not change stacks in DEBUG when
      calling lockdep") prevented lockdep calls from the int3 breakpoint handler
      from reseting the stack if a function that was called was in the process
      of being converted for tracing and had a breakpoint on it. The idea is,
      before calling the lockdep code, do a load_idt() to the special IDT that
      kept the breakpoint stack from reseting. This worked well as a quick fix
      for this kernel release, until a certain config caused a lockup in the
      function tracer start up tests.
      
      Investigating it, I found that the load_idt that was used to prevent
      the int3 from changing stacks was itself being traced!
      
      Even though the config had CONFIG_OPTIMIZE_INLINING disabled, and
      all 'inline' tags were set to always inline, there were still cases that
      it did not inline! This was caused by CONFIG_PARAVIRT_GUEST, where it
      would add a pointer to the native_load_idt() which made that function
      to be traced.
      
      Commit 45959ee7 ("ftrace: Do not function trace inlined functions")
      only touched the 'inline' tags when CONFIG_OPMITIZE_INLINING was enabled.
      PARAVIRT_GUEST shows that this was not enough and we need to also
      mark always_inline with notrace as well.
      Reported-by: NFengguang Wu <wfg@linux.intel.com>
      Tested-by: NFengguang Wu <wfg@linux.intel.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      93b3cca1
    • T
      NFSv4.1: Fix umount when filelayout DS is also the MDS · 2a4c8994
      Trond Myklebust 提交于
      Currently there is a 'chicken and egg' issue when the DS is also the mounted
      MDS. The nfs_match_client() reference from nfs4_set_ds_client bumps the
      cl_count, the nfs_client is not freed at umount, and nfs4_deviceid_purge_client
      is not called to dereference the MDS usage of a deviceid which holds a
      reference to the DS nfs_client.  The result is the umount program returns,
      but the nfs_client is not freed, and the cl_session hearbeat continues.
      
      The MDS (and all other nfs mounts) lose their last nfs_client reference in
      nfs_free_server when the last nfs_server (fsid) is umounted.
      The file layout DS lose their last nfs_client reference in destroy_ds
      when the last deviceid referencing the data server is put and destroy_ds is
      called. This is triggered by a call to nfs4_deviceid_purge_client which
      removes references to a pNFS deviceid used by an MDS mount.
      
      The fix is to track how many pnfs enabled filesystems are mounted from
      this server, and then to purge the device id cache once that count reaches
      zero.
      Reported-by: NJorge Mora <Jorge.Mora@netapp.com>
      Reported-by: NAndy Adamson <andros@netapp.com>
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      2a4c8994
  7. 16 6月, 2012 4 次提交
    • R
      vga_switcheroo.h: fix pci_dev warning · f8fee8f5
      Randy Dunlap 提交于
      Fix warnings on some architectures/configs (not on x86):
      
      include/linux/vga_switcheroo.h:28:30: warning: 'struct pci_dev' declared inside parameter list [enabled by default]
      include/linux/vga_switcheroo.h:28:30: warning: its scope is only this definition or declaration, which is probably not what you want [enabled by default]
      Signed-off-by: NRandy Dunlap <rdunlap@xenotime.net>
      Cc:	Takashi Iwai <tiwai@suse.de>
      Reported-by: NGeert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: NDave Airlie <airlied@redhat.com>
      f8fee8f5
    • H
      swap: fix shmem swapping when more than 8 areas · 9b15b817
      Hugh Dickins 提交于
      Minchan Kim reports that when a system has many swap areas, and tmpfs
      swaps out to the ninth or more, shmem_getpage_gfp()'s attempts to read
      back the page cannot locate it, and the read fails with -ENOMEM.
      
      Whoops.  Yes, I blindly followed read_swap_header()'s pte_to_swp_entry(
      swp_entry_to_pte()) technique for determining maximum usable swap
      offset, without stopping to realize that that actually depends upon the
      pte swap encoding shifting swap offset to the higher bits and truncating
      it there.  Whereas our radix_tree swap encoding leaves offset in the
      lower bits: it's swap "type" (that is, index of swap area) that was
      truncated.
      
      Fix it by reducing the SWP_TYPE_SHIFT() in swapops.h, and removing the
      broken radix_to_swp_entry(swp_to_radix_entry()) from read_swap_header().
      
      This does not reduce the usable size of a swap area any further, it
      leaves it as claimed when making the original commit: no change from 3.0
      on x86_64, nor on i386 without PAE; but 3.0's 512GB is reduced to 128GB
      per swapfile on i386 with PAE.  It's not a change I would have risked
      five years ago, but with x86_64 supported for ten years, I believe it's
      appropriate now.
      
      Hmm, and what if some architecture implements its swap pte with offset
      encoded below type? That would equally break the maximum usable swap
      offset check.  Happily, they all follow the same tradition of encoding
      offset above type, but I'll prepare a check on that for next.
      Reported-and-Reviewed-and-Tested-by: NMinchan Kim <minchan@kernel.org>
      Signed-off-by: NHugh Dickins <hughd@google.com>
      Cc: stable@vger.kernel.org [3.1, 3.2, 3.3, 3.4]
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9b15b817
    • E
      net: remove skb_orphan_try() · 62b1a8ab
      Eric Dumazet 提交于
      Orphaning skb in dev_hard_start_xmit() makes bonding behavior
      unfriendly for applications sending big UDP bursts : Once packets
      pass the bonding device and come to real device, they might hit a full
      qdisc and be dropped. Without orphaning, the sender is automatically
      throttled because sk->sk_wmemalloc reaches sk->sk_sndbuf (assuming
      sk_sndbuf is not too big)
      
      We could try to defer the orphaning adding another test in
      dev_hard_start_xmit(), but all this seems of little gain,
      now that BQL tends to make packets more likely to be parked
      in Qdisc queues instead of NIC TX ring, in cases where performance
      matters.
      
      Reverts commits :
      fc6055a5 net: Introduce skb_orphan_try()
      87fd308c net: skb_tx_hash() fix relative to skb_orphan_try()
      and removes SKBTX_DRV_NEEDS_SK_REF flag
      Reported-and-bisected-by: NJean-Michel Hautbois <jhautbois@gmail.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Tested-by: NOliver Hartkopp <socketcan@hartkopp.net>
      Acked-by: NOliver Hartkopp <socketcan@hartkopp.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      62b1a8ab
    • K
      kmsg - kmsg_dump() use iterator to receive log buffer content · e2ae715d
      Kay Sievers 提交于
      Provide an iterator to receive the log buffer content, and convert all
      kmsg_dump() users to it.
      
      The structured data in the kmsg buffer now contains binary data, which
      should no longer be copied verbatim to the kmsg_dump() users.
      
      The iterator should provide reliable access to the buffer data, and also
      supports proper log line-aware chunking of data while iterating.
      Signed-off-by: NKay Sievers <kay@vrfy.org>
      Tested-by: NTony Luck <tony.luck@intel.com>
      Reported-by: NAnton Vorontsov <anton.vorontsov@linaro.org>
      Tested-by: NAnton Vorontsov <anton.vorontsov@linaro.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e2ae715d
  8. 14 6月, 2012 3 次提交
  9. 12 6月, 2012 1 次提交
  10. 11 6月, 2012 2 次提交
  11. 10 6月, 2012 1 次提交
    • P
      net: Make linux/tcp.h C++ friendly (trivial) · 8876d6b5
      Paul Pluzhnikov 提交于
      I originally sent this patch to <trivial@kernel.org>, but Jiri Kosina did
      not feel that this is fully appropriate for the trivial tree.
      
      Using linux/tcp.h from C++ results in:
      
      cat t.cc
      #include <linux/tcp.h>
      int main() { }
      
      g++ -c t.cc
      
      In file included from t.cc:1:
      /usr/include/linux/tcp.h:72: error: '__u32 __fswab32(__u32)' cannot appear in a constant-expression
      /usr/include/linux/tcp.h:72: error: a function call cannot appear in a constant-expression
      ...
      
      Attached trivial patch fixes this problem.
      
      Tested:
      - the t.cc above compiles with g++ and
      - the following program generates the same output before/after
        the patch:
      
      #include <linux/tcp.h>
      #include <stdio.h>
      
      int main ()
      {
      #define P(a) printf("%s: %08x\n", #a, (int)a)
       P(TCP_FLAG_CWR);
       P(TCP_FLAG_ECE);
       P(TCP_FLAG_URG);
       P(TCP_FLAG_ACK);
       P(TCP_FLAG_PSH);
       P(TCP_FLAG_RST);
       P(TCP_FLAG_SYN);
       P(TCP_FLAG_FIN);
       P(TCP_RESERVED_BITS);
       P(TCP_DATA_OFFSET);
      #undef P
       return 0;
      }
      Signed-off-by: NPaul Pluzhnikov <ppluzhnikov@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8876d6b5
  12. 08 6月, 2012 5 次提交
  13. 07 6月, 2012 2 次提交
  14. 06 6月, 2012 5 次提交
  15. 05 6月, 2012 1 次提交
    • T
      NFSv4: Fix an Oops in the open recovery code · 1549210f
      Trond Myklebust 提交于
      The open recovery code does not need to request a new value for the
      mdsthreshold, and so does not allocate a struct nfs4_threshold.
      The problem is that encode_getfattr_open() will still request an
      mdsthreshold, and so we end up Oopsing in decode_attr_mdsthreshold.
      
      This patch fixes encode_getfattr_open so that it doesn't request an
      mdsthreshold when the caller isn't asking for one. It also fixes
      decode_attr_mdsthreshold so that it errors if the server returns
      an mdsthreshold that we didn't ask for (instead of Oopsing).
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      Cc: Andy Adamson <andros@netapp.com>
      1549210f
  16. 04 6月, 2012 3 次提交
  17. 03 6月, 2012 1 次提交
    • L
      tty: Revert the tty locking series, it needs more work · f309532b
      Linus Torvalds 提交于
      This reverts the tty layer change to use per-tty locking, because it's
      not correct yet, and fixing it will require some more deep surgery.
      
      The main revert is d29f3ef3 ("tty_lock: Localise the lock"), but
      there are several smaller commits that built upon it, they also get
      reverted here. The list of reverted commits is:
      
        fde86d31 - tty: add lockdep annotations
        8f6576ad - tty: fix ldisc lock inversion trace
        d3ca8b64 - pty: Fix lock inversion
        b1d679af - tty: drop the pty lock during hangup
        abcefe5f - tty/amiserial: Add missing argument for tty_unlock()
        fd11b42e - cris: fix missing tty arg in wait_event_interruptible_tty call
        d29f3ef3 - tty_lock: Localise the lock
      
      The revert had a trivial conflict in the 68360serial.c staging driver
      that got removed in the meantime.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f309532b
  18. 02 6月, 2012 1 次提交
    • A
      new helper: signal_delivered() · efee984c
      Al Viro 提交于
      Does block_sigmask() + tracehook_signal_handler();  called when
      sigframe has been successfully built.  All architectures converted
      to it; block_sigmask() itself is gone now (merged into this one).
      
      I'm still not too happy with the signature, but that's a separate
      story (IMO we need a structure that would contain signal number +
      siginfo + k_sigaction, so that get_signal_to_deliver() would fill one,
      signal_delivered(), handle_signal() and probably setup...frame() -
      take one).
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      efee984c