1. 04 11月, 2013 4 次提交
    • T
      random: initialize the last_time field in struct timer_rand_state · 644008df
      Theodore Ts'o 提交于
      Since we initialize jiffies to wrap five minutes before boot (see
      INITIAL_JIFFIES defined in include/linux/jiffies.h) it's important to
      make sure the last_time field is initialized to INITIAL_JIFFIES.
      Otherwise, the entropy estimator will overestimate the amount of
      entropy resulting from the first call to add_timer_randomness(),
      generally by about 8 bits.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      644008df
    • T
      random: don't zap entropy count in rand_initialize() · ae9ecd92
      Theodore Ts'o 提交于
      The rand_initialize() function was being run fairly late in the kernel
      boot sequence.  This was unfortunate, since it zero'ed the entropy
      counters, thus throwing away credit that was accumulated earlier in
      the boot sequence, and it also meant that initcall functions run
      before rand_initialize were using a minimally initialized pool.
      
      To fix this, fix init_std_data() to no longer zap the entropy counter;
      it wasn't necessary, and move rand_initialize() to be an early
      initcall.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      ae9ecd92
    • T
      random: printk notifications for urandom pool initialization · 301f0595
      Theodore Ts'o 提交于
      Print a notification to the console when the nonblocking pool is
      initialized.  Also printk a warning when a process tries reading from
      /dev/urandom before it is fully initialized.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      301f0595
    • T
      random: make add_timer_randomness() fill the nonblocking pool first · 40db23e5
      Theodore Ts'o 提交于
      Change add_timer_randomness() so that it directs incoming entropy to
      the nonblocking pool first if it hasn't been fully initialized yet.
      This matches the strategy we use in add_interrupt_randomness(), which
      allows us to push the randomness where we need it the most during when
      the system is first booting up, so that get_random_bytes() and
      /dev/urandom become safe to use as soon as possible.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      40db23e5
  2. 11 10月, 2013 14 次提交
    • T
      random: convert DEBUG_ENT to tracepoints · f80bbd8b
      Theodore Ts'o 提交于
      Instead of using the random driver's ad-hoc DEBUG_ENT() mechanism, use
      tracepoints instead.  This allows for a much more fine-grained control
      of which debugging mechanism which a developer might need, and unifies
      the debugging messages with all of the existing tracepoints.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      f80bbd8b
    • T
      random: push extra entropy to the output pools · 6265e169
      Theodore Ts'o 提交于
      As the input pool gets filled, start transfering entropy to the output
      pools until they get filled.  This allows us to use the output pools
      to store more system entropy.  Waste not, want not....
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      6265e169
    • T
      random: drop trickle mode · 95b709b6
      Theodore Ts'o 提交于
      The add_timer_randomness() used to drop into trickle mode when entropy
      pool was estimated to be 87.5% full.  This was important when
      add_timer_randomness() was used to sample interrupts.  It's not used
      for this any more --- add_interrupt_randomness() now uses fast_mix()
      instead.  By elimitating trickle mode, it allows us to fully utilize
      entropy provided by add_input_randomness() and add_disk_randomness()
      even when the input pool is above the old trickle threshold of 87.5%.
      
      This helps to answer the criticism in [1] in their hypothetical
      scenario where our entropy estimator was inaccurate, even though the
      measurements in [2] seem to indicate that our entropy estimator given
      real-life entropy collection is actually pretty good, albeit on the
      conservative side (which was as it was designed).
      
      [1] http://eprint.iacr.org/2013/338.pdf
      [2] http://eprint.iacr.org/2012/251.pdfSigned-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      95b709b6
    • T
      random: adjust the generator polynomials in the mixing function slightly · 6e9fa2c8
      Theodore Ts'o 提交于
      Our mixing functions were analyzed by Lacharme, Roeck, Strubel, and
      Videau in their paper, "The Linux Pseudorandom Number Generator
      Revisited" (see: http://eprint.iacr.org/2012/251.pdf).
      
      They suggested a slight change to improve our mixing functions
      slightly.  I also adjusted the comments to better explain what is
      going on, and to document why the polynomials were changed.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      6e9fa2c8
    • T
      random: speed up the fast_mix function by a factor of four · 655b2264
      Theodore Ts'o 提交于
      By mixing the entropy in chunks of 32-bit words instead of byte by
      byte, we can speed up the fast_mix function significantly.  Since it
      is called on every single interrupt, on systems with a very heavy
      interrupt load, this can make a noticeable difference.
      
      Also fix a compilation warning in add_interrupt_randomness() and avoid
      xor'ing cycles and jiffies together just in case we have an
      architecture which tries to define random_get_entropy() by returning
      jiffies.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Reported-by: NJörn Engel <joern@logfs.org>
      655b2264
    • T
      random: cap the rate which the /dev/urandom pool gets reseeded · f5c2742c
      Theodore Ts'o 提交于
      In order to avoid draining the input pool of its entropy at too high
      of a rate, enforce a minimum time interval between reseedings of the
      urandom pool.  This is set to 60 seconds by default.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      f5c2742c
    • T
      random: optimize the entropy_store structure · c59974ae
      Theodore Ts'o 提交于
      Use smaller types to slightly shrink the size of the entropy store
      structure.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      c59974ae
    • T
      random: optimize spinlock use in add_device_randomness() · 3ef4cb2d
      Theodore Ts'o 提交于
      The add_device_randomness() function calls mix_pool_bytes() twice for
      the input pool and the non-blocking pool, for a total of four times.
      By using _mix_pool_byte() and taking the spinlock in
      add_device_randomness(), we can halve the number of times we need
      take each pool's spinlock.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      3ef4cb2d
    • T
      random: fix the tracepoint for get_random_bytes(_arch) · 5910895f
      Theodore Ts'o 提交于
      Fix a problem where get_random_bytes_arch() was calling the tracepoint
      get_random_bytes().  So add a new tracepoint for
      get_random_bytes_arch(), and make get_random_bytes() and
      get_random_bytes_arch() call their correct tracepoint.
      
      Also, add a new tracepoint for add_device_randomness()
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      5910895f
    • H
      random: account for entropy loss due to overwrites · 30e37ec5
      H. Peter Anvin 提交于
      When we write entropy into a non-empty pool, we currently don't
      account at all for the fact that we will probabilistically overwrite
      some of the entropy in that pool.  This means that unless the pool is
      fully empty, we are currently *guaranteed* to overestimate the amount
      of entropy in the pool!
      
      Assuming Shannon entropy with zero correlations we end up with an
      exponentally decaying value of new entropy added:
      
      	entropy <- entropy + (pool_size - entropy) *
      		(1 - exp(-add_entropy/pool_size))
      
      However, calculations involving fractional exponentials are not
      practical in the kernel, so apply a piecewise linearization:
      
      	  For add_entropy <= pool_size/2 then
      
      	  (1 - exp(-add_entropy/pool_size)) >= (add_entropy/pool_size)*0.7869...
      
      	  ... so we can approximate the exponential with
      	  3/4*add_entropy/pool_size and still be on the
      	  safe side by adding at most pool_size/2 at a time.
      
      In order for the loop not to take arbitrary amounts of time if a bad
      ioctl is received, terminate if we are within one bit of full.  This
      way the loop is guaranteed to terminate after no more than
      log2(poolsize) iterations, no matter what the input value is.  The
      vast majority of the time the loop will be executed exactly once.
      
      The piecewise linearization is very conservative, approaching 3/4 of
      the usable input value for small inputs, however, our entropy
      estimation is pretty weak at best, especially for small values; we
      have no handle on correlation; and the Shannon entropy measure (Rényi
      entropy of order 1) is not the correct one to use in the first place,
      but rather the correct entropy measure is the min-entropy, the Rényi
      entropy of infinite order.
      
      As such, this conservatism seems more than justified.
      
      This does introduce fractional bit values.  I have left it to have 3
      bits of fraction, so that with a pool of 2^12 bits the multiply in
      credit_entropy_bits() can still fit into an int, as 2*(3+12) < 31.  It
      is definitely possible to allow for more fractional accounting, but
      that multiply then would have to be turned into a 32*32 -> 64 multiply.
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Cc: DJ Johnston <dj.johnston@intel.com>
      30e37ec5
    • H
      random: allow fractional bits to be tracked · a283b5c4
      H. Peter Anvin 提交于
      Allow fractional bits of entropy to be tracked by scaling the entropy
      counter (fixed point).  This will be used in a subsequent patch that
      accounts for entropy lost due to overwrites.
      
      [ Modified by tytso to fix up a few missing places where the
        entropy_count wasn't properly converted from fractional bits to
        bits. ]
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      a283b5c4
    • H
      random: statically compute poolbitshift, poolbytes, poolbits · 9ed17b70
      H. Peter Anvin 提交于
      Use a macro to statically compute poolbitshift (will be used in a
      subsequent patch), poolbytes, and poolbits.  On virtually all
      architectures the cost of a memory load with an offset is the same as
      the one of a memory load.
      
      It is still possible for this to generate worse code since the C
      compiler doesn't know the fixed relationship between these fields, but
      that is somewhat unlikely.
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      9ed17b70
    • T
      random: mix in architectural randomness earlier in extract_buf() · 85a1f777
      Theodore Ts'o 提交于
      Previously if CPU chip had a built-in random number generator (i.e.,
      RDRAND on newer x86 chips), we mixed it in at the very end of
      extract_buf() using an XOR operation.
      
      We now mix it in right after the calculate a hash across the entire
      pool.  This has the advantage that any contribution of entropy from
      the CPU's HWRNG will get mixed back into the pool.  In addition, it
      means that if the HWRNG has any defects (either accidentally or
      maliciously introduced), this will be mitigated via the non-linear
      transform of the SHA-1 hash function before we hand out generated
      output.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      85a1f777
    • T
      random: allow architectures to optionally define random_get_entropy() · 61875f30
      Theodore Ts'o 提交于
      Allow architectures which have a disabled get_cycles() function to
      provide a random_get_entropy() function which provides a fine-grained,
      rapidly changing counter that can be used by the /dev/random driver.
      
      For example, an architecture might have a rapidly changing register
      used to control random TLB cache eviction, or DRAM refresh that
      doesn't meet the requirements of get_cycles(), but which is good
      enough for the needs of the random driver.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      61875f30
  3. 23 9月, 2013 1 次提交
  4. 03 9月, 2013 4 次提交
  5. 31 8月, 2013 15 次提交
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · a8787645
      Linus Torvalds 提交于
      Pull networking fixes from David Miller:
      
       1) There was a simplification in the ipv6 ndisc packet sending
          attempted here, which avoided using memory accounting on the
          per-netns ndisc socket for sending NDISC packets.  It did fix some
          important issues, but it causes regressions so it gets reverted here
          too.  Specifically, the problem with this change is that the IPV6
          output path really depends upon there being a valid skb->sk
          attached.
      
          The reason we want to do this change in some form when we figure out
          how to do it right, is that if a device goes down the ndisc_sk
          socket send queue will fill up and block NDISC packets that we want
          to send to other devices too.  That's really bad behavior.
      
          Hopefully Thomas can come up with a better version of this change.
      
       2) Fix a severe TCP performance regression by reverting a change made
          to dev_pick_tx() quite some time ago.  From Eric Dumazet.
      
       3) TIPC returns wrongly signed error codes, fix from Erik Hugne.
      
       4) Fix OOPS when doing IPSEC over ipv4 tunnels due to orphaning the
          skb->sk too early.  Fix from Li Hongjun.
      
       5) RAW ipv4 sockets can use the wrong routing key during lookup, from
          Chris Clark.
      
       6) Similar to #1 revert an older change that tried to use plain
          alloc_skb() for SYN/ACK TCP packets, this broke the netfilter owner
          mark which needs to see the skb->sk for such frames.  From Phil
          Oester.
      
       7) BNX2x driver bug fixes from Ariel Elior and Yuval Mintz,
          specifically in the handling of virtual functions.
      
       8) IPSEC path error propagations to sockets is not done properly when
          we have v4 in v6, and v6 in v4 type rules.  Fix from Hannes Frederic
          Sowa.
      
       9) Fix missing channel context release in mac80211, from Johannes Berg.
      
      10) Fix network namespace handing wrt.  SCM_RIGHTS, from Andy
          Lutomirski.
      
      11) Fix usage of bogus NAPI weight in jme, netxen, and ps3_gelic
          drivers.  From Michal Schmidt.
      
      12) Hopefully a complete and correct fix for the genetlink dump locking
          and module reference counting.  From Pravin B Shelar.
      
      13) sk_busy_loop() must do a cpu_relax(), from Eliezer Tamir.
      
      14) Fix handling of timestamp offset when restoring a snapshotted TCP
          socket.  From Andrew Vagin.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits)
        net: fec: fix time stamping logic after napi conversion
        net: bridge: convert MLDv2 Query MRC into msecs_to_jiffies for max_delay
        mISDN: return -EINVAL on error in dsp_control_req()
        net: revert 8728c544 ("net: dev_pick_tx() fix")
        Revert "ipv6: Don't depend on per socket memory for neighbour discovery messages"
        ipv4 tunnels: fix an oops when using ipip/sit with IPsec
        tipc: set sk_err correctly when connection fails
        tcp: tcp_make_synack() should use sock_wmalloc
        bridge: separate querier and query timer into IGMP/IPv4 and MLD/IPv6 ones
        ipv6: Don't depend on per socket memory for neighbour discovery messages
        ipv4: sendto/hdrincl: don't use destination address found in header
        tcp: don't apply tsoffset if rcv_tsecr is zero
        tcp: initialize rcv_tstamp for restored sockets
        net: xilinx: fix memleak
        net: usb: Add HP hs2434 device to ZLP exception table
        net: add cpu_relax to busy poll loop
        net: stmmac: fixed the pbl setting with DT
        genl: Hold reference on correct module while netlink-dump.
        genl: Fix genl dumpit() locking.
        xfrm: Fix potential null pointer dereference in xdst_queue_output
        ...
      a8787645
    • I
      MAINTAINERS: change my DT related maintainer address · de80963e
      Ian Campbell 提交于
      Filtering capabilities on my work email are pretty much non-existent and this
      has turned out to be something of a firehose...
      
      Cc: Stephen Warren <swarren@wwwdotorg.org>
      Cc: Rob Herring <rob.herring@calxeda.com>
      Cc: Olof Johansson <olof@lixom.net>
      Cc: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: NIan Campbell <ian.campbell@citrix.com>
      Acked-by: NPawel Moll <pawel.moll@arm.com>
      Acked-by: NMark Rutland <mark.rutland@arm.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      de80963e
    • L
      Merge tag 'sound-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 936dbcc3
      Linus Torvalds 提交于
      Pull sound fixes from Takashi Iwai:
       "This contains two Oops fixes (opti9xx and HD-audio) and a simple fixup
        for an Acer laptop.  All marked as stable patches"
      
      * tag 'sound-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: opti9xx: Fix conflicting driver object name
        ALSA: hda - Fix NULL dereference with CONFIG_SND_DYNAMIC_MINORS=n
        ALSA: hda - Add inverted digital mic fixup for Acer Aspire One
      936dbcc3
    • L
      Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · d9eda0fa
      Linus Torvalds 提交于
      Pull ARM SoC fixes from Olof Johansson:
       "Two straggling fixes that I had missed as they were posted a couple of
        weeks ago, causing problems with interrupts (breaking them completely)
        on the CSR SiRF platforms"
      
      * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        arm: prima2: drop nr_irqs in mach as we moved to linear irqdomain
        irqchip: sirf: move from legacy mode to linear irqdomain
      d9eda0fa
    • L
      Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · 418a95bc
      Linus Torvalds 提交于
      Pull drm fixes from Dave Airlie:
       "Since we are getting to the pointy end, one i915 black screen on some
        machines, and one vmwgfx stop userspace ability to nuke the VM,
      
        There might be one or two ati or nouveau fixes trickle in before
        final, but I think this should pretty much be it"
      
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
        drm/vmwgfx: Split GMR2_REMAP commands if they are to large
        drm/i915: ivb: fix edp voltage swing reg val
      418a95bc
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 155e3a35
      Linus Torvalds 提交于
      Pull input layer updates from Dmitry Torokhov:
       "Just a couple of new IDs in Wacom and xpad drivers, i8042 is now
        disabled on ARC, and data checks in Elantech driver that were overly
        relaxed by the previous patch are now tightened"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
        Input: i8042 - disable the driver on ARC platforms
        Input: xpad - add signature for Razer Onza Classic Edition
        Input: elantech - fix packet check for v3 and v4 hardware
        Input: wacom - add support for 0x300 and 0x301
      155e3a35
    • R
      net: fec: fix time stamping logic after napi conversion · 0affdf34
      Richard Cochran 提交于
      Commit dc975382 "net: fec: add napi support to improve proformance"
      converted the fec driver to the napi model. However, that commit
      forgot to remove the call to skb_defer_rx_timestamp which is only
      needed in non-napi drivers.
      
      (The function napi_gro_receive eventually calls netif_receive_skb,
      which in turn calls skb_defer_rx_timestamp.)
      
      This patch should also be applied to the 3.9 and 3.10 kernels.
      Signed-off-by: NRichard Cochran <richardcochran@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0affdf34
    • D
      net: bridge: convert MLDv2 Query MRC into msecs_to_jiffies for max_delay · 2d98c29b
      Daniel Borkmann 提交于
      While looking into MLDv1/v2 code, I noticed that bridging code does
      not convert it's max delay into jiffies for MLDv2 messages as we do
      in core IPv6' multicast code.
      
      RFC3810, 5.1.3. Maximum Response Code says:
      
        The Maximum Response Code field specifies the maximum time allowed
        before sending a responding Report. The actual time allowed, called
        the Maximum Response Delay, is represented in units of milliseconds,
        and is derived from the Maximum Response Code as follows: [...]
      
      As we update timers that work with jiffies, we need to convert it.
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Cc: Linus Lüssing <linus.luessing@web.de>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2d98c29b
    • D
      mISDN: return -EINVAL on error in dsp_control_req() · 0d63c27d
      Dan Carpenter 提交于
      If skb->len is too short then we should return an error.  Otherwise we
      read beyond the end of skb->data for several bytes.
      Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0d63c27d
    • E
      net: revert 8728c544 ("net: dev_pick_tx() fix") · 702821f4
      Eric Dumazet 提交于
      commit 8728c544 ("net: dev_pick_tx() fix") and commit
      b6fe83e9 ("bonding: refine IFF_XMIT_DST_RELEASE capability")
      are quite incompatible : Queue selection is disabled because skb
      dst was dropped before entering bonding device.
      
      This causes major performance regression, mainly because TCP packets
      for a given flow can be sent to multiple queues.
      
      This is particularly visible when using the new FQ packet scheduler
      with MQ + FQ setup on the slaves.
      
      We can safely revert the first commit now that 416186fb
      ("net: Split core bits of netdev_pick_tx into __netdev_pick_tx")
      properly caps the queue_index.
      Reported-by: NXi Wang <xii@google.com>
      Diagnosed-by: NXi Wang <xii@google.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Alexander Duyck <alexander.h.duyck@intel.com>
      Cc: Denys Fedorysychenko <nuclearcat@nuclearcat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      702821f4
    • D
      Revert "ipv6: Don't depend on per socket memory for neighbour discovery messages" · 25ad6117
      David S. Miller 提交于
      This reverts commit 1f324e38.
      
      It seems to cause regressions, and in particular the output path
      really depends upon there being a socket attached to skb->sk for
      checks such as sk_mc_loop(skb->sk) for example.  See ip6_output_finish2().
      Reported-by: NStephen Warren <swarren@wwwdotorg.org>
      Reported-by: NFabio Estevam <festevam@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      25ad6117
    • L
      ipv4 tunnels: fix an oops when using ipip/sit with IPsec · 737e828b
      Li Hongjun 提交于
      Since commit 3d7b46cd (ip_tunnel: push generic protocol handling to
      ip_tunnel module.), an Oops is triggered when an xfrm policy is configured on
      an IPv4 over IPv4 tunnel.
      
      xfrm4_policy_check() calls __xfrm_policy_check2(), which uses skb_dst(skb). But
      this field is NULL because iptunnel_pull_header() calls skb_dst_drop(skb).
      Signed-off-by: NLi Hongjun <hongjun.li@6wind.com>
      Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      737e828b
    • E
      tipc: set sk_err correctly when connection fails · 2c8d8518
      Erik Hugne 提交于
      Should a connect fail, if the publication/server is unavailable or
      due to some other error, a positive value will be returned and errno
      is never set. If the application code checks for an explicit zero
      return from connect (success) or a negative return (failure), it
      will not catch the error and subsequent send() calls will fail as
      shown from the strace snippet below.
      
      socket(0x1e /* PF_??? */, SOCK_SEQPACKET, 0) = 3
      connect(3, {sa_family=0x1e /* AF_??? */, sa_data="\2\1\322\4\0\0\322\4\0\0\0\0\0\0"}, 16) = 111
      sendto(3, "test", 4, 0, NULL, 0)        = -1 EPIPE (Broken pipe)
      
      The reason for this behaviour is that TIPC wrongly inverts error
      codes set in sk_err.
      Signed-off-by: NErik Hugne <erik.hugne@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2c8d8518
    • P
      tcp: tcp_make_synack() should use sock_wmalloc · eb8895de
      Phil Oester 提交于
      In commit 90ba9b19 (tcp: tcp_make_synack() can use alloc_skb()), Eric changed
      the call to sock_wmalloc in tcp_make_synack to alloc_skb.  In doing so,
      the netfilter owner match lost its ability to block the SYNACK packet on
      outbound listening sockets.  Revert the change, restoring the owner match
      functionality.
      
      This closes netfilter bugzilla #847.
      Signed-off-by: NPhil Oester <kernel@linuxace.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      eb8895de
    • L
      bridge: separate querier and query timer into IGMP/IPv4 and MLD/IPv6 ones · cc0fdd80
      Linus Lüssing 提交于
      Currently we would still potentially suffer multicast packet loss if there
      is just either an IGMP or an MLD querier: For the former case, we would
      possibly drop IPv6 multicast packets, for the latter IPv4 ones. This is
      because we are currently assuming that if either an IGMP or MLD querier
      is present that the other one is present, too.
      
      This patch makes the behaviour and fix added in
      "bridge: disable snooping if there is no querier" (b00589af)
      to also work if there is either just an IGMP or an MLD querier on the
      link: It refines the deactivation of the snooping to be protocol
      specific by using separate timers for the snooped IGMP and MLD queries
      as well as separate timers for our internal IGMP and MLD queriers.
      Signed-off-by: NLinus Lüssing <linus.luessing@web.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cc0fdd80
  6. 30 8月, 2013 2 次提交
    • L
      Merge branch 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup · 41615e81
      Linus Torvalds 提交于
      Pull cgroup fix from Tejun Heo:
       "During the percpu reference counting update which was merged during
        v3.11-rc1, the cgroup destruction path was updated so that a cgroup in
        the process of dying may linger on the children list, which was
        necessary as the cgroup should still be included in child/descendant
        iteration while percpu ref is being killed.
      
        Unfortunately, I forgot to update cgroup destruction path accordingly
        and cgroup destruction may fail spuriously with -EBUSY due to
        lingering dying children even when there's no live child left - e.g.
        "rmdir parent/child parent" will usually fail.
      
        This can be easily fixed by iterating through the children list to
        verify that there's no live child left.  While this is very late in
        the release cycle, this bug is very visible to userland and I believe
        the fix is relatively safe.
      
        Thanks Hugh for spotting and providing fix for the issue"
      
      * 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
        cgroup: fix rmdir EBUSY regression in 3.11
      41615e81
    • L
      Merge branch 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq · ff497452
      Linus Torvalds 提交于
      Pull workqueue fix from Tejun Heo:
       "This contains one fix which could lead to system-wide lockup on
        !PREEMPT kernels.  It's very late in the cycle but this definitely is
        a -stable material.
      
        The problem is that workqueue worker tasks may process unlimited
        number of work items back-to-back without every yielding inbetween.
        This usually isn't noticeable but a work item which re-queues itself
        waiting for someone else to do something can deadlock with
        stop_machine.  stop_machine will ensure nothing else happens on all
        other cpus and the requeueing work item will reqeueue itself
        indefinitely without ever yielding and thus preventing the CPU from
        entering stop_machine.
      
        Kudos to Jamie Liu for spotting and diagnosing the problem.  This can
        be trivially fixed by adding cond_resched() after processing each work
        item"
      
      * 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
        workqueue: cond_resched() after processing each work item
      ff497452