1. 12 Dec, 2012 2 commits
    • E
      pkt_sched: avoid requeues if possible · 1abbe139
      Committed by Eric Dumazet
      With BQL being deployed, we are more likely to see the following behavior:
      
      We dequeue a packet from qdisc in dequeue_skb(), then we realize target
      tx queue is in XOFF state in sch_direct_xmit(), and we have to hold the
      skb into gso_skb for later.
      
      This shows in stats (tc -s qdisc dev eth0) as requeues.
      
      The problem with these requeues is that high priority packets cannot be
      dequeued until this packet (possibly a low prio, big TSO packet) is
      removed from gso_skb.
      
      At 1 Gbps, a full size TSO packet adds about 500 us of extra latency.
      
      In some cases, we know that all packets dequeued from a qdisc are
      for one particular, known txq:
      
      - If device is non multi queue
      - For all MQ/MQPRIO slave qdiscs
      
      This patch introduces a new qdisc flag, TCQ_F_ONETXQUEUE to mark
      this capability, so that dequeue_skb() is allowed to dequeue a packet
      only if the associated txq is not stopped.
      
      This does reduce latency for high prio packets (or improves fairness
      with sfq/fq_codel), and almost eliminates qdisc 'requeues'.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
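The dequeue rule described in this commit can be modeled in userspace C. This is a minimal sketch, not the kernel code: `model_qdisc`, `model_txq`, `MODEL_F_ONETXQUEUE` and `model_dequeue()` are hypothetical stand-ins for the qdisc, the tx queue, TCQ_F_ONETXQUEUE and dequeue_skb(), and the fixed-size FIFO is an assumption made only to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for TCQ_F_ONETXQUEUE; not the kernel's value. */
#define MODEL_F_ONETXQUEUE 0x1u
#define MODEL_FIFO_SIZE    8

struct model_txq {
    bool stopped;                 /* true while the tx queue is in XOFF state */
};

struct model_qdisc {
    unsigned int flags;
    const struct model_txq *txq;  /* the single txq this qdisc feeds, if known */
    int pkts[MODEL_FIFO_SIZE];    /* tiny FIFO of packet ids */
    size_t head, len;
};

/* Returns a packet id, or -1 when nothing may be dequeued right now. */
static int model_dequeue(struct model_qdisc *q)
{
    assert(q != NULL);

    /* With the flag set, refuse to dequeue while the txq is stopped, so the
     * packet stays inside the qdisc where priorities still apply, instead
     * of being parked in a one-slot holding area and counted as a requeue. */
    if ((q->flags & MODEL_F_ONETXQUEUE) && q->txq && q->txq->stopped)
        return -1;

    if (q->len == 0)
        return -1;

    int id = q->pkts[q->head];
    q->head = (q->head + 1) % MODEL_FIFO_SIZE;
    q->len--;
    return id;
}
```

In this model a caller that sees -1 simply retries once the queue is woken; nothing has left the qdisc, so a higher priority packet enqueued in the meantime can still be picked first.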
    • E
      net: fix a race in gro_cell_poll() · f8e8f97c
      Committed by Eric Dumazet
      Dmitry Kravkov reported packet drops for GRE packets since GRO support
      was added.
      
      There is a race in gro_cell_poll() because we call napi_complete()
      without any synchronization with a concurrent gro_cells_receive().
      
      Once the bug was triggered, we queued packets but did not schedule a
      NAPI poll.
      
      We can fix this issue using the spinlock protecting the napi_skbs queue,
      as we have to hold it to perform skb dequeue anyway.
      
      As we open-code skb_dequeue(), we no longer need to mask IRQS, as both
      producer and consumer run under BH context.
      
      Bug added in commit c9e6bc64 (net: add gro_cells infrastructure)
      Reported-by: Dmitry Kravkov <dmitry@broadcom.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Tested-by: Dmitry Kravkov <dmitry@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 11 Dec, 2012 1 commit
  3. 09 Dec, 2012 2 commits
    • J
      virtio_net: multiqueue support · 986a4f4d
      Committed by Jason Wang
      This patch adds multiqueue (VIRTIO_NET_F_MQ) support to the virtio_net
      driver. A VIRTIO_NET_F_MQ capable device allows the driver to do packet
      transmission and reception through multiple queue pairs and does the packet
      steering to get better performance. By default, only one queue pair is used; the
      user can change the number of queue pairs with ethtool in the next patch.
      
      When multiple queue pairs are used and the number of queue pairs is equal to the
      number of vcpus, the driver does the following optimizations to implement per-cpu
      virt queue pairs:
      
      - select the txq based on the smp processor id.
      - smp affinity hint to the cpu that owns the queue pairs.
      
      This could be used with the flow steering support of the device to guarantee that
      the packets of a single flow are handled by the same cpu.
      Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
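The "select the txq based on the smp processor id" optimization can be sketched as a small mapping function. This is only an illustration under stated assumptions: `pick_txq()` is a hypothetical helper, not the driver's function, and folding the cpu id with a modulo when there are fewer queue pairs than cpus is an assumption of this sketch.

```c
#include <assert.h>

/* Hypothetical helper: choose a tx queue index for the sending CPU. */
static unsigned int pick_txq(unsigned int cpu, unsigned int num_queue_pairs)
{
    assert(num_queue_pairs > 0);

    /* When the number of queue pairs equals the number of vcpus this is
     * the identity mapping, i.e. one private queue pair per cpu.
     * Otherwise the cpu id is folded into the available range. */
    return cpu % num_queue_pairs;
}
```

With 4 vcpus and 4 queue pairs, cpu 3 transmits on queue 3; with a single queue pair every cpu maps to queue 0, which matches the default described above.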
    • J
      net: Add support for hardware-offloaded encapsulation · 6a674e9c
      Committed by Joseph Gasparakis
      This patch adds kernel support for offloading Tx and Rx checksumming
      of encapsulated packets (such as VXLAN and IP GRE) to the NIC.
      
      For Tx encapsulation offload, the driver will need to set the right bits
      in netdev->hw_enc_features. The protocol driver will have to set the
      skb->encapsulation bit and populate the inner headers, so the NIC driver will
      use those inner headers to calculate the csum in hardware.
      
      For Rx encapsulation offload, the driver will again need to set the
      skb->encapsulation flag and set skb->ip_summed to CHECKSUM_UNNECESSARY.
      In that case the protocol driver should push the decapsulated packet up
      to the stack, again with CHECKSUM_UNNECESSARY. In either case, the protocol
      driver should set the skb->encapsulation flag back to zero. Finally the
      protocol driver should have NETIF_F_RXCSUM flag set in its features.
      Signed-off-by: Joseph Gasparakis <joseph.gasparakis@intel.com>
      Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
      Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 08 Dec, 2012 3 commits
  5. 07 Dec, 2012 7 commits
  6. 06 Dec, 2012 2 commits
    • J
      wireless: fix VHT max AMPDU exponent definition · 01331040
      Committed by Johannes Berg
      This is really a 3-bit field, not a single bit,
      so declare a mask and shift. Also fix hwsim, it
      advertises the maximum possible.
      
      While at it reindent all the defines using tabs
      instead of spaces.
      
      Change-Id: I7cd81c0d72f76deb5010aba5bfa3dd312006e898
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
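Declaring "a mask and shift" for a 3-bit field, as this commit describes, might look like the sketch below. The macro and function names are hypothetical, not the kernel's identifiers, and the shift of 23 is an assumption based on the field occupying bits 23 to 25 of the VHT capabilities word.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; bit position 23 is an assumption of this sketch. */
#define VHT_MAX_AMPDU_EXP_SHIFT 23
#define VHT_MAX_AMPDU_EXP_MASK  (UINT32_C(7) << VHT_MAX_AMPDU_EXP_SHIFT)

/* Store a 3-bit exponent (0..7) into the capability word. */
static uint32_t vht_set_max_ampdu_exp(uint32_t cap, unsigned int exp)
{
    assert(exp <= 7);
    return (cap & ~VHT_MAX_AMPDU_EXP_MASK) |
           ((uint32_t)exp << VHT_MAX_AMPDU_EXP_SHIFT);
}

/* Extract the 3-bit exponent from the capability word. */
static unsigned int vht_get_max_ampdu_exp(uint32_t cap)
{
    return (cap & VHT_MAX_AMPDU_EXP_MASK) >> VHT_MAX_AMPDU_EXP_SHIFT;
}
```

A single-bit define could only ever express exponent values 0 and 1; with the mask, "advertising the maximum possible" means storing 7 in the field.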
    • D
      bridge: implement multicast fast leave · c2d3babf
      Committed by David S. Miller
      V3: make it a flag
      V2: make the toggle per-port
      
      Fast leave allows the bridge to immediately stop multicast
      traffic on a port that receives an IGMP Leave when IGMP snooping is
      enabled; no timeouts are observed.
      
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Cong Wang <amwang@redhat.com>
  7. 05 Dec, 2012 4 commits
  8. 04 Dec, 2012 6 commits
    • M
      tun: only queue packets on device · 5d097109
      Committed by Michael S. Tsirkin
      Historically tun supported two modes of operation:
      - in default mode, a small number of packets would get queued
        at the device, the rest would be queued in qdisc
      - in one queue mode, all packets would get queued at the device
      
      This might have made sense up to the point where we made the
      queue depth for both modes the same and set it to
      a huge value (500), so unless the consumer
      is stuck the chance of losing packets is small.
      
      Thus in practice both modes behave the same, but the
      default mode has some problems:
      - if packets are never consumed, fragments are never orphaned
        which causes a DoS for a sender using zero copy transmit
      - overrun errors are hard to diagnose: fifo error is incremented
        only once so you can not distinguish between
        userspace that is stuck and a transient failure,
        tcpdump on the device does not show any traffic
      
      Userspace solves this simply by enabling IFF_ONE_QUEUE
      but there seems to be little point in not doing the
      right thing for everyone, by default.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • M
      sctp: Add support to per-association statistics via a new SCTP_GET_ASSOC_STATS call · 196d6759
      Committed by Michele Baldessari
      The current SCTP stack is lacking a mechanism to have per association
      statistics. This is an implementation modeled after OpenSolaris'
      SCTP_GET_ASSOC_STATS.
      
      Userspace part will follow on lksctp if/when there is a general ACK on
      this.
      V4:
      - Move ipackets++ before q->immediate.func() for consistency reasons
      - Move sctp_max_rto() at the end of sctp_transport_update_rto() to avoid
        returning bogus RTO values
      - return asoc->rto_min when max_obs_rto value has not changed
      
      V3:
      - Increase ictrlchunks in sctp_assoc_bh_rcv() as well
      - Move ipackets++ to sctp_inq_push()
      - return 0 when no rto updates took place since the last call
      
      V2:
      - Implement partial retrieval of stat struct to cope with future expansion
      - Kill the rtxpackets counter as it cannot be precise anyway
      - Rename outseqtsns to outofseqtsns to make it clearer that these are out
        of sequence unexpected TSNs
      - Move asoc->ipackets++ under a lock to avoid potential miscounts
      - Fold asoc->opackets++ into the already existing asoc check
      - Kill unneeded (q->asoc) test when increasing rtxchunks
      - Do not count octrlchunks if sending failed (SCTP_XMIT_OK != 0)
      - Don't count SHUTDOWNs as SACKs
      - Move SCTP_GET_ASSOC_STATS to the private space API
      - Adjust the len check in sctp_getsockopt_assoc_stats() to allow for
        future struct growth
      - Move association statistics in their own struct
      - Update idupchunks when we send a SACK with dup TSNs
      - return min_rto in max_rto when RTO has not changed. Also return the
        transport when max_rto last changed.
      
      Signed-off: Michele Baldessari <michele@acksyn.org>
      Acked-by: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
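The V2 notes mention partial retrieval of the stats struct and a length check that allows for future struct growth. A minimal sketch of that idea follows; `model_assoc_stats` and `model_get_stats()` are hypothetical and do not reproduce the SCTP structures or the getsockopt handler. The assumption here is that a caller must provide room for at least the first field, and otherwise receives as many bytes as its buffer holds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stats layout; the first field is the mandatory part. */
struct model_assoc_stats {
    unsigned int assoc_id;
    unsigned int ipackets;
    unsigned int opackets;
};

/* Copy as much of *src as the caller's buffer can hold.
 * Returns the number of bytes copied, or -1 if the buffer cannot
 * hold even the mandatory first field. */
static long model_get_stats(void *buf, size_t len,
                            const struct model_assoc_stats *src)
{
    assert(buf != NULL && src != NULL);

    if (len < sizeof(src->assoc_id))
        return -1;

    /* An older caller with a smaller struct gets a prefix; a newer caller
     * with a larger buffer gets only what this version knows about. */
    if (len > sizeof(*src))
        len = sizeof(*src);

    memcpy(buf, src, len);
    return (long)len;
}
```

Because only a prefix is ever copied, new counters can be appended to the end of the struct later without breaking callers built against the shorter layout.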
    • A
      Bluetooth: trivial: Change NO_FCS_RECV to RECV_NO_FCS · f2592d3e
      Committed by Andrei Emeltchenko
      Make code more readable by changing CONF_NO_FCS_RECV which is read
      as "No L2CAP FCS option received" to CONF_RECV_NO_FCS which means
      "Received L2CAP option NO_FCS". This flag really means that we have
      received L2CAP FRAME CHECK SEQUENCE (FCS) OPTION with value "No FCS".
      Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
      Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
    • A
      Bluetooth: AMP: Check that AMP is present and active · 5d05416e
      Committed by Andrei Emeltchenko
      Before starting to query remote AMP controllers, make sure
      that there is a local active AMP controller.
      Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
      Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
    • G
      Bluetooth: Move double negation to macros · ffa88e02
      Committed by Gustavo Padovan
      Some comparisons need double negation (!!) in order to make the value
      of the field boolean. Adding it to the macro makes the code more readable.
      Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
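The effect of moving the double negation into a macro can be illustrated as follows. `FLAG_IS_SET()` and `same_state()` are hypothetical names for this sketch and are not the Bluetooth macros touched by the commit.

```c
#include <assert.h>

/* Hypothetical flag test. Without "!!" the result would be 0 or the raw
 * bit value (for example 0x40); with it, the result is exactly 0 or 1. */
#define FLAG_IS_SET(flags, bit) (!!((flags) & (bit)))

/* Compile-time check that the macro yields 1, not the bit value. */
static_assert(FLAG_IS_SET(0x40u, 0x40u) == 1, "normalised to 1");

/* Compare the state of two flags that live at different bit positions.
 * This only works because both sides are normalised to 0 or 1. */
static int same_state(unsigned int a, unsigned int a_bit,
                      unsigned int b, unsigned int b_bit)
{
    return FLAG_IS_SET(a, a_bit) == FLAG_IS_SET(b, b_bit);
}
```

Callers no longer have to remember to write `!!` at each comparison site, which is the readability gain the commit message refers to.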
    • F
      Bluetooth: Implement deferred sco socket setup · 20714bfe
      Committed by Frédéric Dalleau
      In order to authenticate and configure an incoming SCO connection, the
      BT_DEFER_SETUP option was added. This option is intended to defer reply
      to Connect Request on SCO sockets.
      When a connection is requested, the listening socket is unblocked but
      the effective connection setup happens only on first recv. Any send
      between accept and recv fails with -ENOTCONN.
      Signed-off-by: Frédéric Dalleau <frederic.dalleau@linux.intel.com>
      Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
  9. 03 Dec, 2012 6 commits
  10. 02 Dec, 2012 2 commits
  11. 01 Dec, 2012 3 commits
    • E
      net: move inet_dport/inet_num in sock_common · ce43b03e
      Committed by Eric Dumazet
      commit 68835aba (net: optimize INET input path further)
      moved some fields used for tcp/udp sockets lookup in the first cache
      line of struct sock_common.
      
      This patch moves inet_dport/inet_num as well, filling a 32bit hole
      on 64 bit arches and reducing number of cache line misses in lookups.
      
      Also change INET_MATCH()/INET_TW_MATCH() to perform the ports match
      before addresses match, as this check is more discriminant.
      
      Remove the hash check from MATCH() macros because we don't need to
      revalidate the hash value after taking a refcount on the socket, and
      use likely/unlikely compiler hints, as the sk_hash/hash check
      makes the following conditional tests 100% predicted by the cpu.
      
      Introduce skc_addrpair/skc_portpair pair values to better
      document the alignment requirements of the port/addr pairs
      used in the various MATCH() macros, and remove some casts.
      
      The namespace check can also be done last.
      
      This slightly improves TCP/UDP lookup times.
      
      IP/TCP early demux needs inet->rx_dst_ifindex and
      TCP needs inet->min_ttl, so let's group them together in the same cache line.
      
      With help from Ben Hutchings & Joe Perches.
      
      The idea for this patch came after Ling Ma's proposal to move skc_hash
      to the beginning of struct sock_common, and should allow him
      to submit a final version of his patch. My tests show an improvement
      doing so.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Ben Hutchings <bhutchings@solarflare.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Ling Ma <ling.ma.program@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
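The idea of matching on a port pair before the addresses can be sketched in plain C. This is an illustration only: `model_sock`, `portpair()` and `model_match()` are hypothetical, and the sketch builds the 32-bit pair with shifts rather than relying on the field layout and alignment that the real macros depend on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified socket key. */
struct model_sock {
    uint32_t daddr;      /* remote address */
    uint32_t rcv_saddr;  /* local address */
    uint16_t dport;      /* remote port */
    uint16_t num;        /* local port */
};

/* Combine two 16-bit ports so both are checked with one comparison. */
static uint32_t portpair(uint16_t remote, uint16_t local)
{
    return ((uint32_t)remote << 16) | local;
}

/* Match an incoming packet (saddr:sport -> daddr:dport) against a socket.
 * Ports are compared first because they discriminate better than
 * addresses, so most non-matching sockets are rejected by the first test. */
static int model_match(const struct model_sock *sk,
                       uint32_t saddr, uint32_t daddr,
                       uint16_t sport, uint16_t dport)
{
    assert(sk != NULL);
    return portpair(sk->dport, sk->num) == portpair(sport, dport) &&
           sk->daddr == saddr &&
           sk->rcv_saddr == daddr;
}
```

Many sockets on a busy server share the same local address, so an address-first comparison would pass its first test for most candidates; the combined port comparison fails fast instead.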
    • H
      ssb: extif: fix compile errors · 0362063b
      Committed by Hauke Mehrtens
      If CONFIG_SSB_EMBEDDED or CONFIG_SSB_DRIVER_MIPS is set and
      CONFIG_SSB_DRIVER_EXTIF is not set, it will cause compile problems
      because of missing functions. This patch fixes these problems.
      
      The mips driver now also uses ssb_chipco_available() instead of
      checking bus->chipco.dev manually.
      Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
    • R
      rtnelink: remove unused parameter from rtnl_create_link(). · c0713563
      Committed by Rami Rosen
      This patch removes an unused parameter (src_net) from the rtnl_create_link()
      method and from the method's single invocation, in veth.
      This parameter was used in the past when calling
      ops->get_tx_queues(src_net, tb) in rtnl_create_link().
      The get_tx_queues() member of rtnl_link_ops was replaced by two methods,
      get_num_tx_queues() and get_num_rx_queues(), which do not get any
      parameter. This was done in commit d40156aa by
      Jiri Pirko ("rtnl: allow to specify different num for rx and tx queue count").
      Signed-off-by: Rami Rosen <ramirose@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  12. 30 Nov, 2012 2 commits