1. 20 10月, 2014 1 次提交
  2. 19 10月, 2014 3 次提交
  3. 18 10月, 2014 17 次提交
  4. 16 10月, 2014 3 次提交
  5. 15 10月, 2014 16 次提交
    • M
      9p/trans_virtio: enable VQs early · 64b4cc39
      Michael S. Tsirkin 提交于
      virtio spec requires drivers to set DRIVER_OK before using VQs.
      This is set automatically after probe returns, but virtio 9p device
      adds self to channel list within probe, at which point VQ can be
      used in violation of the spec.
      
      To fix, call virtio_device_ready before using VQs.
      Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      64b4cc39
    • E
      tcp: TCP Small Queues and strange attractors · 9b462d02
      Eric Dumazet 提交于
      TCP Small queues tries to keep number of packets in qdisc
      as small as possible, and depends on a tasklet to feed following
      packets at TX completion time.
      Choice of tasklet was driven by latencies requirements.
      
      Then, TCP stack tries to avoid reorders, by locking flows with
      outstanding packets in qdisc in a given TX queue.
      
      What can happen is that many flows get attracted by a low performing
      TX queue, and cpu servicing TX completion has to feed packets for all of
      them, making this cpu 100% busy in softirq mode.
      
      This became particularly visible with latest skb->xmit_more support
      
      Strategy adopted in this patch is to detect when tcp_wfree() is called
      from ksoftirqd and let the outstanding queue for this flow being drained
      before feeding additional packets, so that skb->ooo_okay can be set
      to allow select_queue() to select the optimal queue :
      
      Incoming ACKS are normally handled by different cpus, so this patch
      gives more chance for these cpus to take over the burden of feeding
      qdisc with future packets.
      
      Tested:
      
      lpaa23:~# ./super_netperf 1400 --google-pacing-rate 3028000 -H lpaa24 -l 3600 &
      
      lpaa23:~# sar -n DEV 1 10 | grep eth1
      06:16:18 AM      eth1 595448.00 1190564.00  38381.09 1760253.12      0.00      0.00      1.00
      06:16:19 AM      eth1 594858.00 1189686.00  38340.76 1758952.72      0.00      0.00      0.00
      06:16:20 AM      eth1 597017.00 1194019.00  38480.79 1765370.29      0.00      0.00      1.00
      06:16:21 AM      eth1 595450.00 1190936.00  38380.19 1760805.05      0.00      0.00      0.00
      06:16:22 AM      eth1 596385.00 1193096.00  38442.56 1763976.29      0.00      0.00      1.00
      06:16:23 AM      eth1 598155.00 1195978.00  38552.97 1768264.60      0.00      0.00      0.00
      06:16:24 AM      eth1 594405.00 1188643.00  38312.57 1757414.89      0.00      0.00      1.00
      06:16:25 AM      eth1 593366.00 1187154.00  38252.16 1755195.83      0.00      0.00      0.00
      06:16:26 AM      eth1 593188.00 1186118.00  38232.88 1753682.57      0.00      0.00      1.00
      06:16:27 AM      eth1 596301.00 1192241.00  38440.94 1762733.09      0.00      0.00      0.00
      Average:         eth1 595457.30 1190843.50  38381.69 1760664.84      0.00      0.00      0.50
      lpaa23:~# ./tc -s -d qd sh dev eth1 | grep backlog
       backlog 7606336b 2513p requeues 167982
       backlog 224072b 74p requeues 566
       backlog 581376b 192p requeues 5598
       backlog 181680b 60p requeues 1070
       backlog 5305056b 1753p requeues 110166    // Here, this TX queue is attracting flows
       backlog 157456b 52p requeues 1758
       backlog 672216b 222p requeues 3025
       backlog 60560b 20p requeues 24541
       backlog 448144b 148p requeues 21258
      
      lpaa23:~# echo 1 >/proc/sys/net/ipv4/tcp_tsq_enable_tcp_wfree_ksoftirqd_detect
      
      Immediate jump to full bandwidth, and traffic is properly
      shard on all tx queues.
      
      lpaa23:~# sar -n DEV 1 10 | grep eth1
      06:16:46 AM      eth1 1397632.00 2795397.00  90081.87 4133031.26      0.00      0.00      1.00
      06:16:47 AM      eth1 1396874.00 2793614.00  90032.99 4130385.46      0.00      0.00      0.00
      06:16:48 AM      eth1 1395842.00 2791600.00  89966.46 4127409.67      0.00      0.00      1.00
      06:16:49 AM      eth1 1395528.00 2791017.00  89946.17 4126551.24      0.00      0.00      0.00
      06:16:50 AM      eth1 1397891.00 2795716.00  90098.74 4133497.39      0.00      0.00      1.00
      06:16:51 AM      eth1 1394951.00 2789984.00  89908.96 4125022.51      0.00      0.00      0.00
      06:16:52 AM      eth1 1394608.00 2789190.00  89886.90 4123851.36      0.00      0.00      1.00
      06:16:53 AM      eth1 1395314.00 2790653.00  89934.33 4125983.09      0.00      0.00      0.00
      06:16:54 AM      eth1 1396115.00 2792276.00  89984.25 4128411.21      0.00      0.00      1.00
      06:16:55 AM      eth1 1396829.00 2793523.00  90030.19 4130250.28      0.00      0.00      0.00
      Average:         eth1 1396158.40 2792297.00  89987.09 4128439.35      0.00      0.00      0.50
      
      lpaa23:~# tc -s -d qd sh dev eth1 | grep backlog
       backlog 7900052b 2609p requeues 173287
       backlog 878120b 290p requeues 589
       backlog 1068884b 354p requeues 5621
       backlog 996212b 329p requeues 1088
       backlog 984100b 325p requeues 115316
       backlog 956848b 316p requeues 1781
       backlog 1080996b 357p requeues 3047
       backlog 975016b 322p requeues 24571
       backlog 990156b 327p requeues 21274
      
      (All 8 TX queues get a fair share of the traffic)
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9b462d02
    • D
      net: Trap attempts to call sock_kfree_s() with a NULL pointer. · e53da5fb
      David S. Miller 提交于
      Unlike normal kfree() it is never right to call sock_kfree_s() with
      a NULL pointer, because sock_kfree_s() also has the side effect of
      discharging the memory from the sockets quota.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e53da5fb
    • C
      rds: avoid calling sock_kfree_s() on allocation failure · dee49f20
      Cong Wang 提交于
      It is okay to free a NULL pointer but not okay to mischarge the socket optmem
      accounting. Compile test only.
      
      Reported-by: rucsoftsec@gmail.com
      Cc: Chien Yen <chien.yen@oracle.com>
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: NCong Wang <cwang@twopensource.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dee49f20
    • F
      caif_usb: use target structure member in memset · 91c4467e
      Fabian Frederick 提交于
      parent cfusbl was used instead of first structure member 'layer'
      Suggested-by: NJoe Perches <joe@perches.com>
      Signed-off-by: NFabian Frederick <fabf@skynet.be>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      91c4467e
    • F
      caif_usb: remove redundant memory message · 7970f191
      Fabian Frederick 提交于
      Let MM subsystem display out of memory messages.
      Signed-off-by: NFabian Frederick <fabf@skynet.be>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7970f191
    • F
      caif: replace kmalloc/memset 0 by kzalloc · 6ff1e1e3
      Fabian Frederick 提交于
      Also add blank line after declaration
      Signed-off-by: NFabian Frederick <fabf@skynet.be>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6ff1e1e3
    • J
      ipv4: fix nexthop attlen check in fib_nh_match · f76936d0
      Jiri Pirko 提交于
      fib_nh_match does not match nexthops correctly. Example:
      
      ip route add 172.16.10/24 nexthop via 192.168.122.12 dev eth0 \
                                nexthop via 192.168.122.13 dev eth0
      ip route del 172.16.10/24 nexthop via 192.168.122.14 dev eth0 \
                                nexthop via 192.168.122.15 dev eth0
      
      Del command is successful and route is removed. After this patch
      applied, the route is correctly matched and result is:
      RTNETLINK answers: No such process
      
      Please consider this for stable trees as well.
      
      Fixes: 4e902c57 ("[IPv4]: FIB configuration using struct fib_config")
      Signed-off-by: NJiri Pirko <jiri@resnulli.us>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f76936d0
    • E
      tcp: fix tcp_ack() performance problem · ad971f61
      Eric Dumazet 提交于
      We worked hard to improve tcp_ack() performance, by not accessing
      skb_shinfo() in fast path (cd7d8498 tcp: change tcp_skb_pcount()
      location)
      
      We still have one spurious access because of ACK timestamping,
      added in commit e1c8a607 ("net-timestamp: ACK timestamp for
      bytestreams")
      
      By checking if sk_tsflags has SOF_TIMESTAMPING_TX_ACK set,
      we can avoid two cache line misses for the common case.
      
      While we are at it, add two prefetchw() :
      
      One in tcp_ack() to bring skb at the head of write queue.
      
      One in tcp_clean_rtx_queue() loop to bring following skb,
      as we will delete skb from the write queue and dirty skb->next->prev.
      
      Add a couple of [un]likely() clauses.
      
      After this patch, tcp_ack() is no longer the most consuming
      function in tcp stack.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Van Jacobson <vanj@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ad971f61
    • I
      libceph: ceph-msgr workqueue needs a resque worker · f9865f06
      Ilya Dryomov 提交于
      Commit f363e45f ("net/ceph: make ceph_msgr_wq non-reentrant")
      effectively removed WQ_MEM_RECLAIM flag from ceph_msgr_wq.  This is
      wrong - libceph is very much a memory reclaim path, so restore it.
      
      Cc: stable@vger.kernel.org # needs backporting for < 3.12
      Signed-off-by: NIlya Dryomov <idryomov@redhat.com>
      Tested-by: NMicha Krause <micha@krausam.de>
      Reviewed-by: NSage Weil <sage@redhat.com>
      f9865f06
    • I
      libceph: separate multiple ops with commas in debugfs output · 25f89777
      Ilya Dryomov 提交于
      For requests with multiple ops, separate ops with commas instead of \t,
      which is a field separator here.
      Signed-off-by: NIlya Dryomov <idryomov@redhat.com>
      Reviewed-by: NSage Weil <sage@redhat.com>
      25f89777
    • I
      libceph: sync osd op definitions in rados.h · 70b5bfa3
      Ilya Dryomov 提交于
      Bring in missing osd ops and strings, use macros to eliminate multiple
      points of maintenance.
      Signed-off-by: NIlya Dryomov <idryomov@redhat.com>
      Reviewed-by: NSage Weil <sage@redhat.com>
      70b5bfa3
    • Y
      libceph: reference counting pagelist · e4339d28
      Yan, Zheng 提交于
      this allow pagelist to present data that may be sent multiple times.
      Signed-off-by: NYan, Zheng <zyan@redhat.com>
      Reviewed-by: NSage Weil <sage@redhat.com>
      e4339d28
    • L
      ipv6: remove aca_lock spinlock from struct ifacaddr6 · 02ea8074
      Li RongQing 提交于
      no user uses this lock.
      Signed-off-by: NLi RongQing <roy.qing.li@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      02ea8074
    • E
      tcp: fix ooo_okay setting vs Small Queues · b2532eb9
      Eric Dumazet 提交于
      TCP Small Queues (tcp_tsq_handler()) can hold one reference on
      sk->sk_wmem_alloc, preventing skb->ooo_okay being set.
      
      We should relax test done to set skb->ooo_okay to take care
      of this extra reference.
      
      Minimal truesize of skb containing one byte of payload is
      SKB_TRUESIZE(1)
      
      Without this fix, we have more chance locking flows into the wrong
      transmit queue.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b2532eb9
    • I
      libceph: don't try checking queue_work() return value · 91883cd2
      Ilya Dryomov 提交于
      queue_work() doesn't "fail to queue", it returns false if work was
      already on a queue, which can't happen here since we allocate
      event_work right before we queue it.  So don't bother at all.
      Signed-off-by: NIlya Dryomov <ilya.dryomov@inktank.com>
      Reviewed-by: NAlex Elder <elder@linaro.org>
      91883cd2