1. 05 2月, 2015 13 次提交
  2. 04 2月, 2015 4 次提交
    • D
      Merge branch 'virtio_net_ufo' · 4c122f4c
      David S. Miller 提交于
      Vladislav Yasevich says:
      
      ====================
      Restore UFO support to virtio_net devices
      
      commit 3d0ad094
      Author: Ben Hutchings <ben@decadent.org.uk>
      Date:   Thu Oct 30 18:27:12 2014 +0000
      
          drivers/net: Disable UFO through virtio
      
      Turned off UFO support to virtio-net based devices due to issues
      with IPv6 fragment id generation for UFO packets.  The issue
      was that IPv6 UFO/GSO implementation expects the fragment id
      to be supplied in skb_shinfo().  However, for packets generated
      by the VMs, the fragment id is not supplied which causes all
      IPv6 fragments to have the id of 0.
      
      The problem is that turning off UFO support on tap/macvtap
      as well as virtio devices caused issues with migrations.
      Migrations would fail when moving a vm from a kernel supporting
      expecting UFO to work to the newer kernels that disabled UFO.
      
      This series provides a partial solution to address the migration
      issue.  The series allows us to track whether skb_shinfo()->ip6_frag_id
      has been set by treating value of 0 as unset.
      This lets GSO code to generate fragment ids if they are necessary
      (ex: packet was generated by VM or packet socket).
      
      Since v3:
        - Resolved build issue when IPv6 is a module.
        - Removed trailing white space.
      
      Since v2:
        - Rebase and rebuild to make sure everything works.  No changes
          to the patches were done.
      
      Since v1:
        - Removed the skb bit and use value of 0 as tracker.
        - Used Eric's suggestion to set fragment id as 0x80000000 if id
          generation procedure yeilded a 0 result.
        - Consolidated ipv6 id genration code.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4c122f4c
    • V
      Revert "drivers/net: Disable UFO through virtio" · e3e3c423
      Vlad Yasevich 提交于
      This reverts commit 3d0ad094.
      
      Now that GSO functionality can correctly track if the fragment
      id has been selected and select a fragment id if necessary,
      we can re-enable UFO on tap/macvap and virtio devices.
      Signed-off-by: NVladislav Yasevich <vyasevic@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e3e3c423
    • V
      Revert "drivers/net, ipv6: Select IPv6 fragment idents for virtio UFO packets" · 72f65107
      Vlad Yasevich 提交于
      This reverts commit 5188cd44.
      
      Now that GSO layer can track if fragment id has been selected
      and can allocate one if necessary, we don't need to do this in
      tap and macvtap.  This reverts most of the code and only keeps
      the new ipv6 fragment id generation function that is still needed.
      
      Fixes: 3d0ad094 (drivers/net: Disable UFO through virtio)
      Signed-off-by: NVladislav Yasevich <vyasevic@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      72f65107
    • V
      ipv6: Select fragment id during UFO segmentation if not set. · 0508c07f
      Vlad Yasevich 提交于
      If the IPv6 fragment id has not been set and we perform
      fragmentation due to UFO, select a new fragment id.
      We now consider a fragment id of 0 as unset and if id selection
      process returns 0 (after all the pertrubations), we set it to
      0x80000000, thus giving us ample space not to create collisions
      with the next packet we may have to fragment.
      
      When doing UFO integrity checking, we also select the
      fragment id if it has not be set yet.   This is stored into
      the skb_shinfo() thus allowing UFO to function correclty.
      
      This patch also removes duplicate fragment id generation code
      and moves ipv6_select_ident() into the header as it may be
      used during GSO.
      Signed-off-by: NVladislav Yasevich <vyasevic@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0508c07f
  3. 03 2月, 2015 7 次提交
  4. 02 2月, 2015 1 次提交
    • E
      ipv4: tcp: get rid of ugly unicast_sock · bdbbb852
      Eric Dumazet 提交于
      In commit be9f4a44 ("ipv4: tcp: remove per net tcp_sock")
      I tried to address contention on a socket lock, but the solution
      I chose was horrible :
      
      commit 3a7c384f ("ipv4: tcp: unicast_sock should not land outside
      of TCP stack") addressed a selinux regression.
      
      commit 0980e56e ("ipv4: tcp: set unicast_sock uc_ttl to -1")
      took care of another regression.
      
      commit b5ec8eea ("ipv4: fix ip_send_skb()") fixed another regression.
      
      commit 811230cd ("tcp: ipv4: initialize unicast_sock sk_pacing_rate")
      was another shot in the dark.
      
      Really, just use a proper socket per cpu, and remove the skb_orphan()
      call, to re-enable flow control.
      
      This solves a serious problem with FQ packet scheduler when used in
      hostile environments, as we do not want to allocate a flow structure
      for every RST packet sent in response to a spoofed packet.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bdbbb852
  5. 01 2月, 2015 2 次提交
    • E
      net: sched: fix panic in rate estimators · 0d32ef8c
      Eric Dumazet 提交于
      Doing the following commands on a non idle network device
      panics the box instantly, because cpu_bstats gets overwritten
      by stats.
      
      tc qdisc add dev eth0 root <your_favorite_qdisc>
      ... some traffic (one packet is enough) ...
      tc qdisc replace dev eth0 root est 1sec 4sec <your_favorite_qdisc>
      
      [  325.355596] BUG: unable to handle kernel paging request at ffff8841dc5a074c
      [  325.362609] IP: [<ffffffff81541c9e>] __gnet_stats_copy_basic+0x3e/0x90
      [  325.369158] PGD 1fa7067 PUD 0
      [  325.372254] Oops: 0000 [#1] SMP
      [  325.375514] Modules linked in: ...
      [  325.398346] CPU: 13 PID: 14313 Comm: tc Not tainted 3.19.0-smp-DEV #1163
      [  325.412042] task: ffff8800793ab5d0 ti: ffff881ff2fa4000 task.ti: ffff881ff2fa4000
      [  325.419518] RIP: 0010:[<ffffffff81541c9e>]  [<ffffffff81541c9e>] __gnet_stats_copy_basic+0x3e/0x90
      [  325.428506] RSP: 0018:ffff881ff2fa7928  EFLAGS: 00010286
      [  325.433824] RAX: 000000000000000c RBX: ffff881ff2fa796c RCX: 000000000000000c
      [  325.440988] RDX: ffff8841dc5a0744 RSI: 0000000000000060 RDI: 0000000000000060
      [  325.448120] RBP: ffff881ff2fa7948 R08: ffffffff81cd4f80 R09: 0000000000000000
      [  325.455268] R10: ffff883ff223e400 R11: 0000000000000000 R12: 000000015cba0744
      [  325.462405] R13: ffffffff81cd4f80 R14: ffff883ff223e460 R15: ffff883feea0722c
      [  325.469536] FS:  00007f2ee30fa700(0000) GS:ffff88407fa20000(0000) knlGS:0000000000000000
      [  325.477630] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  325.483380] CR2: ffff8841dc5a074c CR3: 0000003feeae9000 CR4: 00000000001407e0
      [  325.490510] Stack:
      [  325.492524]  ffff883feea0722c ffff883fef719dc0 ffff883feea0722c ffff883ff223e4a0
      [  325.499990]  ffff881ff2fa79a8 ffffffff815424ee ffff883ff223e49c 000000015cba0744
      [  325.507460]  00000000f2fa7978 0000000000000000 ffff881ff2fa79a8 ffff883ff223e4a0
      [  325.514956] Call Trace:
      [  325.517412]  [<ffffffff815424ee>] gen_new_estimator+0x8e/0x230
      [  325.523250]  [<ffffffff815427aa>] gen_replace_estimator+0x4a/0x60
      [  325.529349]  [<ffffffff815718ab>] tc_modify_qdisc+0x52b/0x590
      [  325.535117]  [<ffffffff8155edd0>] rtnetlink_rcv_msg+0xa0/0x240
      [  325.540963]  [<ffffffff8155ed30>] ? __rtnl_unlock+0x20/0x20
      [  325.546532]  [<ffffffff8157f811>] netlink_rcv_skb+0xb1/0xc0
      [  325.552145]  [<ffffffff8155b355>] rtnetlink_rcv+0x25/0x40
      [  325.557558]  [<ffffffff8157f0d8>] netlink_unicast+0x168/0x220
      [  325.563317]  [<ffffffff8157f47c>] netlink_sendmsg+0x2ec/0x3e0
      
      Lets play safe and not use an union : percpu 'pointers' are mostly read
      anyway, and we have typically few qdiscs per host.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Fixes: 22e0f8b9 ("net: sched: make bstats per cpu and estimator RCU safe")
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0d32ef8c
    • H
      hyperv: Fix the error processing in netvsc_send() · d953ca4d
      Haiyang Zhang 提交于
      The existing code frees the skb in EAGAIN case, in which the skb will be
      retried from upper layer and used again.
      Also, the existing code doesn't free send buffer slot in error case, because
      there is no completion message for unsent packets.
      This patch fixes these problems.
      
      (Please also include this patch for stable trees. Thanks!)
      Signed-off-by: NHaiyang Zhang <haiyangz@microsoft.com>
      Reviewed-by: NK. Y. Srinivasan <kys@microsoft.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d953ca4d
  6. 31 1月, 2015 9 次提交
    • I
      drivers: net: xgene: fix: Out of order descriptor bytes read · ecf6ba83
      Iyappan Subramanian 提交于
      This patch fixes the following kernel crash,
      
      	WARNING: CPU: 2 PID: 0 at net/ipv4/tcp_input.c:3079 tcp_clean_rtx_queue+0x658/0x80c()
      	Call trace:
      	[<fffffe0000096b7c>] dump_backtrace+0x0/0x184
      	[<fffffe0000096d10>] show_stack+0x10/0x1c
      	[<fffffe0000685ea0>] dump_stack+0x74/0x98
      	[<fffffe00000b44e0>] warn_slowpath_common+0x88/0xb0
      	[<fffffe00000b461c>] warn_slowpath_null+0x14/0x20
      	[<fffffe00005b5c1c>] tcp_clean_rtx_queue+0x654/0x80c
      	[<fffffe00005b6228>] tcp_ack+0x454/0x688
      	[<fffffe00005b6ca8>] tcp_rcv_established+0x4a4/0x62c
      	[<fffffe00005bf4b4>] tcp_v4_do_rcv+0x16c/0x350
      	[<fffffe00005c225c>] tcp_v4_rcv+0x8e8/0x904
      	[<fffffe000059d470>] ip_local_deliver_finish+0x100/0x26c
      	[<fffffe000059dad8>] ip_local_deliver+0xac/0xc4
      	[<fffffe000059d6c4>] ip_rcv_finish+0xe8/0x328
      	[<fffffe000059dd3c>] ip_rcv+0x24c/0x38c
      	[<fffffe0000563950>] __netif_receive_skb_core+0x29c/0x7c8
      	[<fffffe0000563ea4>] __netif_receive_skb+0x28/0x7c
      	[<fffffe0000563f54>] netif_receive_skb_internal+0x5c/0xe0
      	[<fffffe0000564810>] napi_gro_receive+0xb4/0x110
      	[<fffffe0000482a2c>] xgene_enet_process_ring+0x144/0x338
      	[<fffffe0000482d18>] xgene_enet_napi+0x1c/0x50
      	[<fffffe0000565454>] net_rx_action+0x154/0x228
      	[<fffffe00000b804c>] __do_softirq+0x110/0x28c
      	[<fffffe00000b8424>] irq_exit+0x8c/0xc0
      	[<fffffe0000093898>] handle_IRQ+0x44/0xa8
      	[<fffffe000009032c>] gic_handle_irq+0x38/0x7c
      	[...]
      
      Software writes poison data into the descriptor bytes[15:8] and upon
      receiving the interrupt, if those bytes are overwritten by the hardware with
      the valid data, software also reads bytes[7:0] and executes receive/tx
      completion logic.
      
      If the CPU executes the above two reads in out of order fashion, then the
      bytes[7:0] will have older data and causing the kernel panic.  We have to
      force the order of the reads and thus this patch introduces read memory
      barrier between these reads.
      Signed-off-by: NIyappan Subramanian <isubramanian@apm.com>
      Signed-off-by: NKeyur Chudgar <kchudgar@apm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ecf6ba83
    • D
      Merge branch 'vlan_get_protocol' · 08178e5a
      David S. Miller 提交于
      Toshiaki Makita says:
      
      ====================
      Fix checksum error when using stacked vlan
      
      When I was testing 802.1ad, I found several drivers don't take into
      account 802.1ad or multiple vlans when retrieving L3 (IP/IPv6) or
      L4 (TCP/UDP) protocol for checksum offload.
      
      It is mainly due to vlan_get_protocol(), which extracts ether type only
      when it is tagged with single 802.1Q. When 802.1ad is used or there are
      multiple vlans, it extracts vlan protocol and drivers cannot determine
      which L3/L4 protocol is used.
      
      Those drivers, most of which have IP_CSUM/IPV6_CSUM features, get L3/L4
      header-offset by software, so it seems that their checksum offload works
      with multiple vlans if we can parse protocols correctly.
      (They know mac header length, and probably don't care about what is in it.)
      
      And another thing, some of Intel's drivers seem to use skb->protocol where
      vlan_get_protocol() is more suitable.
      
      I tested that at least igb/igbvf on I350 works with this patch set.
      
      Note:
      We can hand a double tagged packet with CHECKSUM_PARTIAL to a HW driver
      by creating a vlan device on a bridge device and enabling vlan_filtering
      of the bridge with 802.1ad protocol.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      08178e5a
    • T
      ixgbevf: Fix checksum error when using stacked vlan · 10e4fb33
      Toshiaki Makita 提交于
      When a skb has multiple vlans and it is CHECKSUM_PARTIAL,
      ixgbevf_tx_csum() fails to get the network protocol and checksum related
      descriptor fields are not configured correctly because skb->protocol
      doesn't show the L3 protocol in this case.
      
      Use first->protocol instead of skb->protocol to get the proper network
      protocol.
      Signed-off-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      10e4fb33
    • T
      ixgbe: Fix checksum error when using stacked vlan · 0213668f
      Toshiaki Makita 提交于
      When a skb has multiple vlans and it is CHECKSUM_PARTIAL,
      ixgbe_tx_csum() fails to get the network protocol and checksum related
      descriptor fields are not configured correctly because skb->protocol
      doesn't show the L3 protocol in this case.
      
      Use vlan_get_protocol() to get the proper network protocol.
      Signed-off-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0213668f
    • T
      igbvf: Fix checksum error when using stacked vlan · 72b14059
      Toshiaki Makita 提交于
      When a skb has multiple vlans and it is CHECKSUM_PARTIAL,
      igbvf_tx_csum() fails to get the network protocol and checksum related
      descriptor fields are not configured correctly because skb->protocol
      doesn't show the L3 protocol in this case.
      
      Use vlan_get_protocol() to get the proper network protocol.
      Signed-off-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      72b14059
    • T
      net: Fix vlan_get_protocol for stacked vlan · d4bcef3f
      Toshiaki Makita 提交于
      vlan_get_protocol() could not get network protocol if a skb has a 802.1ad
      vlan tag or multiple vlans, which caused incorrect checksum calculation
      in several drivers.
      
      Fix vlan_get_protocol() to retrieve network protocol instead of incorrect
      vlan protocol.
      
      As the logic is the same as skb_network_protocol(), create a common helper
      function __vlan_get_protocol() and call it from existing functions.
      Signed-off-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d4bcef3f
    • S
      net: sctp: fix passing wrong parameter header to param_type2af in sctp_process_param · cfbf654e
      Saran Maruti Ramanara 提交于
      When making use of RFC5061, section 4.2.4. for setting the primary IP
      address, we're passing a wrong parameter header to param_type2af(),
      resulting always in NULL being returned.
      
      At this point, param.p points to a sctp_addip_param struct, containing
      a sctp_paramhdr (type = 0xc004, length = var), and crr_id as a correlation
      id. Followed by that, as also presented in RFC5061 section 4.2.4., comes
      the actual sctp_addr_param, which also contains a sctp_paramhdr, but
      this time with the correct type SCTP_PARAM_IPV{4,6}_ADDRESS that
      param_type2af() can make use of. Since we already hold a pointer to
      addr_param from previous line, just reuse it for param_type2af().
      
      Fixes: d6de3097 ("[SCTP]: Add the handling of "Set Primary IP Address" parameter to INIT")
      Signed-off-by: NSaran Maruti Ramanara <saran.neti@telus.com>
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Acked-by: NVlad Yasevich <vyasevich@gmail.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cfbf654e
    • P
      netlink: fix wrong subscription bitmask to group mapping in · 8b7c36d8
      Pablo Neira 提交于
      The subscription bitmask passed via struct sockaddr_nl is converted to
      the group number when calling the netlink_bind() and netlink_unbind()
      callbacks.
      
      The conversion is however incorrect since bitmask (1 << 0) needs to be
      mapped to group number 1. Note that you cannot specify the group number 0
      (usually known as _NONE) from setsockopt() using NETLINK_ADD_MEMBERSHIP
      since this is rejected through -EINVAL.
      
      This problem became noticeable since 97840cb6 ("netfilter: nfnetlink:
      fix insufficient validation in nfnetlink_bind") when binding to bitmask
      (1 << 0) in ctnetlink.
      Reported-by: NAndre Tomt <andre@tomt.net>
      Reported-by: NIvan Delalande <colona@arista.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8b7c36d8
    • P
      netfilter: nf_tables: fix leaks in error path of nf_tables_newchain() · f5553c19
      Pablo Neira Ayuso 提交于
      Release statistics and module refcount on memory allocation problems.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      f5553c19
  7. 30 1月, 2015 4 次提交