1. 04 6月, 2015 1 次提交
  2. 23 5月, 2015 1 次提交
    • E
      tcp: fix a potential deadlock in tcp_get_info() · d654976c
      Eric Dumazet 提交于
      Taking socket spinlock in tcp_get_info() can deadlock, as
      inet_diag_dump_icsk() holds the &hashinfo->ehash_locks[i],
      while packet processing can use the reverse locking order.
      
      We could avoid this locking for TCP_LISTEN states, but lockdep would
      certainly get confused as all TCP sockets share same lockdep classes.
      
      [  523.722504] ======================================================
      [  523.728706] [ INFO: possible circular locking dependency detected ]
      [  523.734990] 4.1.0-dbg-DEV #1676 Not tainted
      [  523.739202] -------------------------------------------------------
      [  523.745474] ss/18032 is trying to acquire lock:
      [  523.750002]  (slock-AF_INET){+.-...}, at: [<ffffffff81669d44>] tcp_get_info+0x2c4/0x360
      [  523.758129]
      [  523.758129] but task is already holding lock:
      [  523.763968]  (&(&hashinfo->ehash_locks[i])->rlock){+.-...}, at: [<ffffffff816bcb75>] inet_diag_dump_icsk+0x1d5/0x6c0
      [  523.774661]
      [  523.774661] which lock already depends on the new lock.
      [  523.774661]
      [  523.782850]
      [  523.782850] the existing dependency chain (in reverse order) is:
      [  523.790326]
      -> #1 (&(&hashinfo->ehash_locks[i])->rlock){+.-...}:
      [  523.796599]        [<ffffffff811126bb>] lock_acquire+0xbb/0x270
      [  523.802565]        [<ffffffff816f5868>] _raw_spin_lock+0x38/0x50
      [  523.808628]        [<ffffffff81665af8>] __inet_hash_nolisten+0x78/0x110
      [  523.815273]        [<ffffffff816819db>] tcp_v4_syn_recv_sock+0x24b/0x350
      [  523.822067]        [<ffffffff81684d41>] tcp_check_req+0x3c1/0x500
      [  523.828199]        [<ffffffff81682d09>] tcp_v4_do_rcv+0x239/0x3d0
      [  523.834331]        [<ffffffff816842fe>] tcp_v4_rcv+0xa8e/0xc10
      [  523.840202]        [<ffffffff81658fa3>] ip_local_deliver_finish+0x133/0x3e0
      [  523.847214]        [<ffffffff81659a9a>] ip_local_deliver+0xaa/0xc0
      [  523.853440]        [<ffffffff816593b8>] ip_rcv_finish+0x168/0x5c0
      [  523.859624]        [<ffffffff81659db7>] ip_rcv+0x307/0x420
      
      Lets use u64_sync infrastructure instead. As a bonus, 64bit
      arches get optimized, as these are nop for them.
      
      Fixes: 0df48c26 ("tcp: add tcpi_bytes_acked to tcp_info")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d654976c
  3. 20 5月, 2015 2 次提交
    • F
      Revert "netfilter: bridge: query conntrack about skb dnat" · faecbb45
      Florian Westphal 提交于
      This reverts commit c055d5b0.
      
      There are two issues:
      'dnat_took_place' made me think that this is related to
      -j DNAT/MASQUERADE.
      
      But thats only one part of the story.  This is also relevant for SNAT
      when we undo snat translation in reverse/reply direction.
      
      Furthermore, I originally wanted to do this mainly to avoid
      storing ipv6 addresses once we make DNAT/REDIRECT work
      for ipv6 on bridges.
      
      However, I forgot about SNPT/DNPT which is stateless.
      
      So we can't escape storing address for ipv6 anyway. Might as
      well do it for ipv4 too.
      Reported-and-tested-by: NBernhard Thaler <bernhard.thaler@wvnet.at>
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      faecbb45
    • D
      xen/events: don't bind non-percpu VIRQs with percpu chip · 77bb3dfd
      David Vrabel 提交于
      A non-percpu VIRQ (e.g., VIRQ_CONSOLE) may be freed on a different
      VCPU than it is bound to.  This can result in a race between
      handle_percpu_irq() and removing the action in __free_irq() because
      handle_percpu_irq() does not take desc->lock.  The interrupt handler
      sees a NULL action and oopses.
      
      Only use the percpu chip/handler for per-CPU VIRQs (like VIRQ_TIMER).
      
        # cat /proc/interrupts | grep virq
         40:      87246          0  xen-percpu-virq      timer0
         44:          0          0  xen-percpu-virq      debug0
         47:          0      20995  xen-percpu-virq      timer1
         51:          0          0  xen-percpu-virq      debug1
         69:          0          0   xen-dyn-virq      xen-pcpu
         74:          0          0   xen-dyn-virq      mce
         75:         29          0   xen-dyn-virq      hvc_console
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      Cc: <stable@vger.kernel.org>
      77bb3dfd
  4. 19 5月, 2015 1 次提交
  5. 17 5月, 2015 1 次提交
    • H
      rhashtable: Add cap on number of elements in hash table · 07ee0722
      Herbert Xu 提交于
      We currently have no limit on the number of elements in a hash table.
      This is a problem because some users (tipc) set a ceiling on the
      maximum table size and when that is reached the hash table may
      degenerate.  Others may encounter OOM when growing and if we allow
      insertions when that happens the hash table perofrmance may also
      suffer.
      
      This patch adds a new paramater insecure_max_entries which becomes
      the cap on the table.  If unset it defaults to max_size * 2.  If
      it is also zero it means that there is no cap on the number of
      elements in the table.  However, the table will grow whenever the
      utilisation hits 100% and if that growth fails, you will get ENOMEM
      on insertion.
      
      As allowing oversubscription is potentially dangerous, the name
      contains the word insecure.
      
      Note that the cap is not a hard limit.  This is done for performance
      reasons as enforcing a hard limit will result in use of atomic ops
      that are heavier than the ones we currently use.
      
      The reasoning is that we're only guarding against a gross over-
      subscription of the table, rather than a small breach of the limit.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      07ee0722
  6. 16 5月, 2015 1 次提交
  7. 15 5月, 2015 3 次提交
    • R
      rename RTNH_F_EXTERNAL to RTNH_F_OFFLOAD · eea39946
      Roopa Prabhu 提交于
      RTNH_F_EXTERNAL today is printed as "offload" in iproute2 output.
      
      This patch renames the flag to be consistent with what the user sees.
      Signed-off-by: NRoopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      eea39946
    • J
      uidgid: make uid_valid and gid_valid work with !CONFIG_MULTIUSER · 929aa5b2
      Josh Triplett 提交于
      {u,g}id_valid call {u,g}id_eq, which calls __k{u,g}id_val on both
      arguments and compares.  With !CONFIG_MULTIUSER, __k{u,g}id_val return a
      constant 0, which makes {u,g}id_valid always return false.  Change
      {u,g}id_valid to compare their argument against -1 instead.  That produces
      identical results in the normal CONFIG_MULTIUSER=y case, but with
      !CONFIG_MULTIUSER will make {u,g}id_valid constant-fold into "return
      true;" rather than "return false;".
      
      This fixes uses of devpts without CONFIG_MULTIUSER.
      Signed-off-by: NJosh Triplett <josh@joshtriplett.org>
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>,
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      929aa5b2
    • V
      gfp: add __GFP_NOACCOUNT · 8f4fc071
      Vladimir Davydov 提交于
      Not all kmem allocations should be accounted to memcg.  The following
      patch gives an example when accounting of a certain type of allocations to
      memcg can effectively result in a memory leak.  This patch adds the
      __GFP_NOACCOUNT flag which if passed to kmalloc and friends will force the
      allocation to go through the root cgroup.  It will be used by the next
      patch.
      
      Note, since in case of kmemleak enabled each kmalloc implies yet another
      allocation from the kmemleak_object cache, we add __GFP_NOACCOUNT to
      gfp_kmemleak_mask.
      
      Alternatively, we could introduce a per kmem cache flag disabling
      accounting for all allocations of a particular kind, but (a) we would not
      be able to bypass accounting for kmalloc then and (b) a kmem cache with
      this flag set could not be merged with a kmem cache without this flag,
      which would increase the number of global caches and therefore
      fragmentation even if the memory cgroup controller is not used.
      
      Despite its generic name, currently __GFP_NOACCOUNT disables accounting
      only for kmem allocations while user page allocations are always charged.
      To catch abusing of this flag, a warning is issued on an attempt of
      passing it to mem_cgroup_try_charge.
      Signed-off-by: NVladimir Davydov <vdavydov@parallels.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: <stable@vger.kernel.org>	[4.0.x]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8f4fc071
  8. 13 5月, 2015 4 次提交
  9. 12 5月, 2015 1 次提交
    • S
      HID: hid-sensor-hub: Fix debug lock warning · 2d94e522
      Srinivas Pandruvada 提交于
      When CONFIG_DEBUG_LOCK_ALLOC is defined, mutex magic is compared and
      warned for (l->magic != l), here l is the address of mutex pointer.
      In hid-sensor-hub as part of hsdev creation, a per hsdev mutex is
      initialized during MFD cell creation. This hsdev, which contains, mutex
      is part of platform data for the a cell. But platform_data is copied
      in platform_device_add_data() in platform.c. This copy will copy the
      whole hsdev structure including mutex. But once copied the magic
      will no longer match. So when client driver call
      sensor_hub_input_attr_get_raw_value, this will trigger mutex warning.
      So to avoid this allocate mutex dynamically. This will be same even
      after copy.
      Signed-off-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Signed-off-by: NJiri Kosina <jkosina@suse.cz>
      2d94e522
  10. 11 5月, 2015 1 次提交
    • P
      pty: Fix input race when closing · 1a48632f
      Peter Hurley 提交于
      A read() from a pty master may mistakenly indicate EOF (errno == -EIO)
      after the pty slave has closed, even though input data remains to be read.
      For example,
      
             pty slave       |        input worker        |    pty master
                             |                            |
                             |                            |   n_tty_read()
      pty_write()            |                            |     input avail? no
        add data             |                            |     sleep
        schedule worker  --->|                            |     .
                             |---> flush_to_ldisc()       |     .
      pty_close()            |       fill read buffer     |     .
        wait for worker      |       wakeup reader    --->|     .
                             |       read buffer full?    |---> input avail ? yes
                             |<---   yes - exit worker    |     copy 4096 bytes to user
        TTY_OTHER_CLOSED <---|                            |<--- kick worker
                             |                            |
      
      		                **** New read() before worker starts ****
      
                             |                            |   n_tty_read()
                             |                            |     input avail? no
                             |                            |     TTY_OTHER_CLOSED? yes
                             |                            |     return -EIO
      
      Several conditions are required to trigger this race:
      1. the ldisc read buffer must become full so the input worker exits
      2. the read() count parameter must be >= 4096 so the ldisc read buffer
         is empty
      3. the subsequent read() occurs before the kicked worker has processed
         more input
      
      However, the underlying cause of the race is that data is pipelined, while
      tty state is not; ie., data already written by the pty slave end is not
      yet visible to the pty master end, but state changes by the pty slave end
      are visible to the pty master end immediately.
      
      Pipeline the TTY_OTHER_CLOSED state through input worker to the reader.
      1. Introduce TTY_OTHER_DONE which is set by the input worker when
         TTY_OTHER_CLOSED is set and either the input buffers are flushed or
         input processing has completed. Readers/polls are woken when
         TTY_OTHER_DONE is set.
      2. Reader/poll checks TTY_OTHER_DONE instead of TTY_OTHER_CLOSED.
      3. A new input worker is started from pty_close() after setting
         TTY_OTHER_CLOSED, which ensures the TTY_OTHER_DONE state will be
         set if the last input worker is already finished (or just about to
         exit).
      
      Remove tty_flush_to_ldisc(); no in-tree callers.
      
      Fixes: 52bce7f8 ("pty, n_tty: Simplify input processing on final close")
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=96311
      BugLink: http://bugs.launchpad.net/bugs/1429756
      Cc: <stable@vger.kernel.org> # 3.19+
      Reported-by: NAndy Whitcroft <apw@canonical.com>
      Reported-by: NH.J. Lu <hjl.tools@gmail.com>
      Signed-off-by: NPeter Hurley <peter@hurleysoftware.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1a48632f
  11. 10 5月, 2015 1 次提交
  12. 09 5月, 2015 1 次提交
  13. 08 5月, 2015 1 次提交
  14. 07 5月, 2015 1 次提交
  15. 06 5月, 2015 5 次提交
  16. 05 5月, 2015 2 次提交
    • T
      RDMA/core: Enable the iWarp Port Mapper to provide the actual address of the... · 6eec1774
      Tatyana Nikolova 提交于
      RDMA/core: Enable the iWarp Port Mapper to provide the actual address of the connecting peer to its clients
      
      Add functionality to enable the port mapper on the passive side to provide to its
      clients the actual (non-mapped) ip/tcp address information of the connecting peer
      
      1) Adding remote_info_cb() to process the address info of the connecting peer
         The address info is provided by the user space port mapper service when
         the connection is initiated by the peer
      2) Adding a hash list to store the remote address info
      3) Adding functionality to add/remove the remote address info
         After the info has been provided to the port mapper client,
         it is removed from the hash list
      Signed-off-by: NTatyana Nikolova <tatyana.e.nikolova@intel.com>
      Reviewed-by: NSteve Wise <swise@opengridcomputing.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      6eec1774
    • S
      blk-mq: fix FUA request hang · b2387ddc
      Shaohua Li 提交于
      When a FUA request enters its DATA stage of flush pipeline, the
      request is added to mq requeue list, the request will then be added to
      ctx->rq_list. blk_mq_attempt_merge() might merge the request with a bio.
      Later when the request is finished the flush pipeline, the
      request->__data_len is 0. Then I only saw the bio gets endio called, the
      original request never finish.
      
      Adding REQ_FLUSH_SEQ into REQ_NOMERGE_FLAGS looks an easy fix.
      
      stable: 3.15+
      Signed-off-by: NShaohua Li <shli@fb.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      b2387ddc
  17. 04 5月, 2015 3 次提交
    • R
      mac80211: fix 90 kernel-doc warnings · ff419b3f
      Randy Dunlap 提交于
      Eliminate 90 of these warnings:
      
      Warning(..//include/net/mac80211.h:1682): No description found for parameter 'drv_priv[0]'
      Signed-off-by: NRandy Dunlap <rdunlap@infradead.org>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      ff419b3f
    • D
      lib: make memzero_explicit more robust against dead store elimination · 7829fb09
      Daniel Borkmann 提交于
      In commit 0b053c95 ("lib: memzero_explicit: use barrier instead
      of OPTIMIZER_HIDE_VAR"), we made memzero_explicit() more robust in
      case LTO would decide to inline memzero_explicit() and eventually
      find out it could be elimiated as dead store.
      
      While using barrier() works well for the case of gcc, recent efforts
      from LLVMLinux people suggest to use llvm as an alternative to gcc,
      and there, Stephan found in a simple stand-alone user space example
      that llvm could nevertheless optimize and thus elimitate the memset().
      A similar issue has been observed in the referenced llvm bug report,
      which is regarded as not-a-bug.
      
      Based on some experiments, icc is a bit special on its own, while it
      doesn't seem to eliminate the memset(), it could do so with an own
      implementation, and then result in similar findings as with llvm.
      
      The fix in this patch now works for all three compilers (also tested
      with more aggressive optimization levels). Arguably, in the current
      kernel tree it's more of a theoretical issue, but imho, it's better
      to be pedantic about it.
      
      It's clearly visible with gcc/llvm though, with the below code: if we
      would have used barrier() only here, llvm would have omitted clearing,
      not so with barrier_data() variant:
      
        static inline void memzero_explicit(void *s, size_t count)
        {
          memset(s, 0, count);
          barrier_data(s);
        }
      
        int main(void)
        {
          char buff[20];
          memzero_explicit(buff, sizeof(buff));
          return 0;
        }
      
        $ gcc -O2 test.c
        $ gdb a.out
        (gdb) disassemble main
        Dump of assembler code for function main:
         0x0000000000400400  <+0>: lea   -0x28(%rsp),%rax
         0x0000000000400405  <+5>: movq  $0x0,-0x28(%rsp)
         0x000000000040040e <+14>: movq  $0x0,-0x20(%rsp)
         0x0000000000400417 <+23>: movl  $0x0,-0x18(%rsp)
         0x000000000040041f <+31>: xor   %eax,%eax
         0x0000000000400421 <+33>: retq
        End of assembler dump.
      
        $ clang -O2 test.c
        $ gdb a.out
        (gdb) disassemble main
        Dump of assembler code for function main:
         0x00000000004004f0  <+0>: xorps  %xmm0,%xmm0
         0x00000000004004f3  <+3>: movaps %xmm0,-0x18(%rsp)
         0x00000000004004f8  <+8>: movl   $0x0,-0x8(%rsp)
         0x0000000000400500 <+16>: lea    -0x18(%rsp),%rax
         0x0000000000400505 <+21>: xor    %eax,%eax
         0x0000000000400507 <+23>: retq
        End of assembler dump.
      
      As gcc, clang, but also icc defines __GNUC__, it's sufficient to define
      this in compiler-gcc.h only to be picked up. For a fallback or otherwise
      unsupported compiler, we define it as a barrier. Similarly, for ecc which
      does not support gcc inline asm.
      
      Reference: https://llvm.org/bugs/show_bug.cgi?id=15495Reported-by: NStephan Mueller <smueller@chronox.de>
      Tested-by: NStephan Mueller <smueller@chronox.de>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Stephan Mueller <smueller@chronox.de>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Cc: mancha security <mancha1@zoho.com>
      Cc: Mark Charlebois <charlebm@gmail.com>
      Cc: Behan Webster <behanw@converseincode.com>
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      7829fb09
    • E
      codel: fix maxpacket/mtu confusion · a5d28090
      Eric Dumazet 提交于
      Under presence of TSO/GSO/GRO packets, codel at low rates can be quite
      useless. In following example, not a single packet was ever dropped,
      while average delay in codel queue is ~100 ms !
      
      qdisc codel 0: parent 1:12 limit 16000p target 5.0ms interval 100.0ms
       Sent 134376498 bytes 88797 pkt (dropped 0, overlimits 0 requeues 0)
       backlog 13626b 3p requeues 0
        count 0 lastcount 0 ldelay 96.9ms drop_next 0us
        maxpacket 9084 ecn_mark 0 drop_overlimit 0
      
      This comes from a confusion of what should be the minimal backlog. It is
      pretty clear it is not 64KB or whatever max GSO packet ever reached the
      qdisc.
      
      codel intent was to use MTU of the device.
      
      After the fix, we finally drop some packets, and rtt/cwnd of my single
      TCP flow are meeting our expectations.
      
      qdisc codel 0: parent 1:12 limit 16000p target 5.0ms interval 100.0ms
       Sent 102798497 bytes 67912 pkt (dropped 1365, overlimits 0 requeues 0)
       backlog 6056b 3p requeues 0
        count 1 lastcount 1 ldelay 36.3ms drop_next 0us
        maxpacket 10598 ecn_mark 0 drop_overlimit 0
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Kathleen Nichols <nichols@pollere.com>
      Cc: Dave Taht <dave.taht@gmail.com>
      Cc: Van Jacobson <vanj@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a5d28090
  18. 02 5月, 2015 1 次提交
  19. 01 5月, 2015 2 次提交
  20. 30 4月, 2015 6 次提交
    • E
      tcp: add TCP_CC_INFO socket option · 6e9250f5
      Eric Dumazet 提交于
      Some Congestion Control modules can provide per flow information,
      but current way to get this information is to use netlink.
      
      Like TCP_INFO, let's add TCP_CC_INFO so that applications can
      issue a getsockopt() if they have a socket file descriptor,
      instead of playing complex netlink games.
      
      Sample usage would be :
      
        union tcp_cc_info info;
        socklen_t len = sizeof(info);
      
        if (getsockopt(fd, SOL_TCP, TCP_CC_INFO, &info, &len) == -1)
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NYuchung Cheng <ycheng@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6e9250f5
    • E
      tcp: prepare CC get_info() access from getsockopt() · 64f40ff5
      Eric Dumazet 提交于
      We would like that optional info provided by Congestion Control
      modules using netlink can also be read using getsockopt()
      
      This patch changes get_info() to put this information in a buffer,
      instead of skb, like tcp_get_info(), so that following patch
      can reuse this common infrastructure.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NYuchung Cheng <ycheng@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      64f40ff5
    • E
      tcp: add tcpi_bytes_received to tcp_info · bdd1f9ed
      Eric Dumazet 提交于
      This patch tracks total number of payload bytes received on a TCP socket.
      This is the sum of all changes done to tp->rcv_nxt
      
      RFC4898 named this : tcpEStatsAppHCThruOctetsReceived
      
      This is a 64bit field, and can be fetched both from TCP_INFO
      getsockopt() if one has a handle on a TCP socket, or from inet_diag
      netlink facility (iproute2/ss patch will follow)
      
      Note that tp->bytes_received was placed near tp->rcv_nxt for
      best data locality and minimal performance impact.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Matt Mathis <mattmathis@google.com>
      Cc: Eric Salo <salo@google.com>
      Cc: Martin Lau <kafai@fb.com>
      Cc: Chris Rapier <rapier@psc.edu>
      Acked-by: NYuchung Cheng <ycheng@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bdd1f9ed
    • E
      tcp: add tcpi_bytes_acked to tcp_info · 0df48c26
      Eric Dumazet 提交于
      This patch tracks total number of bytes acked for a TCP socket.
      This is the sum of all changes done to tp->snd_una, and allows
      for precise tracking of delivered data.
      
      RFC4898 named this : tcpEStatsAppHCThruOctetsAcked
      
      This is a 64bit field, and can be fetched both from TCP_INFO
      getsockopt() if one has a handle on a TCP socket, or from inet_diag
      netlink facility (iproute2/ss patch will follow)
      
      Note that tp->bytes_acked was placed near tp->snd_una for
      best data locality and minimal performance impact.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Acked-by: NYuchung Cheng <ycheng@google.com>
      Cc: Matt Mathis <mattmathis@google.com>
      Cc: Eric Salo <salo@google.com>
      Cc: Martin Lau <kafai@fb.com>
      Cc: Chris Rapier <rapier@psc.edu>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0df48c26
    • N
      bridge/nl: remove wrong use of NLM_F_MULTI · 46c264da
      Nicolas Dichtel 提交于
      NLM_F_MULTI must be used only when a NLMSG_DONE message is sent. In fact,
      it is sent only at the end of a dump.
      
      Libraries like libnl will wait forever for NLMSG_DONE.
      
      Fixes: e5a55a89 ("net: create generic bridge ops")
      Fixes: 815cccbf ("ixgbe: add setlink, getlink support to ixgbe and ixgbevf")
      CC: John Fastabend <john.r.fastabend@intel.com>
      CC: Sathya Perla <sathya.perla@emulex.com>
      CC: Subbu Seetharaman <subbu.seetharaman@emulex.com>
      CC: Ajit Khaparde <ajit.khaparde@emulex.com>
      CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      CC: intel-wired-lan@lists.osuosl.org
      CC: Jiri Pirko <jiri@resnulli.us>
      CC: Scott Feldman <sfeldma@gmail.com>
      CC: Stephen Hemminger <stephen@networkplumber.org>
      CC: bridge@lists.linux-foundation.org
      Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      46c264da
    • B
      xen: Suspend ticks on all CPUs during suspend · 2b953a5e
      Boris Ostrovsky 提交于
      Commit 77e32c89 ("clockevents: Manage device's state separately for
      the core") decouples clockevent device's modes from states. With this
      change when a Xen guest tries to resume, it won't be calling its
      set_mode op which needs to be done on each VCPU in order to make the
      hypervisor aware that we are in oneshot mode.
      
      This happens because clockevents_tick_resume() (which is an intermediate
      step of resuming ticks on a processor) doesn't call clockevents_set_state()
      anymore and because during suspend clockevent devices on all VCPUs (except
      for the one doing the suspend) are left in ONESHOT state. As result, during
      resume the clockevents state machine will assume that device is already
      where it should be and doesn't need to be updated.
      
      To avoid this problem we should suspend ticks on all VCPUs during
      suspend.
      Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      2b953a5e
  21. 29 4月, 2015 1 次提交
    • P
      ALSA: emu10k1: Emu10k2 32 bit DMA mode · 7241ea55
      Peter Zubaj 提交于
      Looks like audigy emu10k2 (probably emu10k1 - sb live too) support two
      modes for DMA. Second mode is useful for 64 bit os with more then 2 GB
      of ram (fixes problems with big soundfont loading)
      
      1) 32MB from 2 GB address space using 8192 pages (used now as default)
      2) 16MB from 4 GB address space using 4096 pages
      
      Mode is set using HCFG_EXPANDED_MEM flag in HCFG register.
      Also format of emu10k2 page table is then different.
      Signed-off-by: NPeter Zubaj <pzubaj@marticonet.sk>
      Tested-by: NTakashi Iwai <tiwai@suse.de>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NTakashi Iwai <tiwai@suse.de>
      7241ea55