1. 16 Dec, 2014 1 commit
    • gre: fix the inner mac header in nbma tunnel xmit path · 8a0033a9
      Committed by Timo Teräs
      The NBMA GRE tunnels temporarily push a GRE header that contains the
      per-packet NBMA destination onto the skb via header ops early in the
      xmit path. It is later pulled before the real GRE header is constructed.
      
      The inner mac was thus set differently in nbma case: the GRE header
      has been pushed by neighbor layer, and mac header points to beginning
      of the temporary gre header (set by dev_queue_xmit).
      
      Now that the offloads expect mac header to point to the gre payload,
      fix the xmit path to:
       - first pull the temporary gre header away
       - and reset mac header to point to gre payload
      
      This fixes tso to work again with nbma tunnels.
      
      Fixes: 14051f04 ("gre: Use inner mac length when computing tunnel length")
      Signed-off-by: Timo Teräs <timo.teras@iki.fi>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
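The two steps listed in the commit above (pull the temporary header, then reset the mac header) can be sketched in userspace C. This is only an illustrative model, not the kernel patch: `toy_skb`, `toy_pull`, `toy_reset_mac_header` and `toy_nbma_fixup` are hypothetical names that loosely mirror `struct sk_buff`, `skb_pull()` and `skb_reset_mac_header()`, and the temporary header length is passed in as an assumption.

```c
#include <stddef.h>

/* Toy stand-in for struct sk_buff: only the fields this sketch needs. */
struct toy_skb {
    unsigned char *head;   /* start of the buffer */
    unsigned char *data;   /* current start of packet data */
    size_t len;            /* bytes of packet data starting at data */
    size_t mac_header;     /* offset of the mac header from head */
};

/* Remove n bytes from the front of the packet, in the spirit of skb_pull(). */
static unsigned char *toy_pull(struct toy_skb *skb, size_t n)
{
    if (n > skb->len)
        return NULL;
    skb->data += n;
    skb->len -= n;
    return skb->data;
}

/* Point the mac header at the current data, like skb_reset_mac_header(). */
static void toy_reset_mac_header(struct toy_skb *skb)
{
    skb->mac_header = (size_t)(skb->data - skb->head);
}

/* Drop the temporary header first, then re-anchor the mac header so that
 * it points at the payload. Returns 0 on success, -1 if the packet is
 * shorter than the temporary header. */
static int toy_nbma_fixup(struct toy_skb *skb, size_t tmp_hdr_len)
{
    if (!toy_pull(skb, tmp_hdr_len))
        return -1;
    toy_reset_mac_header(skb);
    return 0;
}
```

The order matters in this model: resetting the mac header before the pull would leave it pointing at the temporary header, which is the situation the commit describes as broken for the offloads.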
  2. 06 Nov, 2014 1 commit
  3. 08 Oct, 2014 1 commit
    • net: better IFF_XMIT_DST_RELEASE support · 02875878
      Committed by Eric Dumazet
      Testing xmit_more support with netperf and connected UDP sockets,
      I found strange dst refcount false sharing.
      
      Current handling of IFF_XMIT_DST_RELEASE is not optimal.
      
      Dropping dst in validate_xmit_skb() is certainly too late in case
      the packet was queued by cpu X but dequeued by cpu Y.
      
      The logical point to take care of drop/force is in __dev_queue_xmit()
      before even taking qdisc lock.
      
      As Julian Anastasov pointed out, need for skb_dst() might come from some
      packet schedulers or classifiers.
      
      This patch adds new helper to cleanly express needs of various drivers
      or qdiscs/classifiers.
      
      Drivers that need skb_dst() in their ndo_start_xmit() should call the
      following helper in their setup instead of the prior:
      
      	dev->priv_flags &= ~IFF_XMIT_DST_RELEASE;
      ->
      	netif_keep_dst(dev);
      
      Instead of using a single bit, we use two bits, one being
      eventually rebuilt in bonding/team drivers.
      
      The other one is permanent and blocks IFF_XMIT_DST_RELEASE from being
      rebuilt in bonding/team. We could add something smarter later.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: David S. Miller <davem@davemloft.net>
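The two-bit scheme described in the commit above can be modelled in a few lines of userspace C. This is a sketch under assumptions, not the kernel code: the `TOY_*` bit values, `toy_dev`, `toy_keep_dst` and `toy_rebuild` are hypothetical, `toy_keep_dst` is assumed to clear both bits the way `netif_keep_dst()` is described, and the rebuild rule for a bonding/team master is a guess at the behaviour the message outlines.

```c
/* Hypothetical bit values; the kernel's real flag values differ. */
enum {
    TOY_XMIT_DST_RELEASE      = 1u << 0, /* may be rebuilt by a master device */
    TOY_XMIT_DST_RELEASE_PERM = 1u << 1  /* permanent: once cleared, stays cleared */
};

struct toy_dev {
    unsigned int priv_flags;
};

/* A device that needs skb_dst() at xmit time clears both bits. */
static void toy_keep_dst(struct toy_dev *dev)
{
    dev->priv_flags &= ~(TOY_XMIT_DST_RELEASE | TOY_XMIT_DST_RELEASE_PERM);
}

/* Recompute the rebuildable bit of a master from its slaves: it is set
 * only if the master's permanent bit is still set and every slave still
 * has both bits set. */
static void toy_rebuild(struct toy_dev *master,
                        const struct toy_dev *slaves, int n)
{
    const unsigned int all = TOY_XMIT_DST_RELEASE | TOY_XMIT_DST_RELEASE_PERM;
    unsigned int common = all;
    int i;

    for (i = 0; i < n; i++)
        common &= slaves[i].priv_flags;

    master->priv_flags &= ~TOY_XMIT_DST_RELEASE;
    if ((master->priv_flags & TOY_XMIT_DST_RELEASE_PERM) && common == all)
        master->priv_flags |= TOY_XMIT_DST_RELEASE;
}
```

In this model a single slave that called `toy_keep_dst()` turns off dst release on the master, and a master that called it itself can never have the bit rebuilt, which is the "permanent" behaviour the message refers to.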
  4. 02 Oct, 2014 1 commit
  5. 20 Sep, 2014 1 commit
  6. 11 Jun, 2014 1 commit
  7. 24 Apr, 2014 1 commit
    • gre: add x-netns support · b57708ad
      Committed by Nicolas Dichtel
      This patch allows switching the netns when a packet is encapsulated or
      decapsulated. In other words, the encapsulated packet is received in a
      netns, where the lookup is done to find the tunnel. Once the tunnel is
      found, the packet is decapsulated and injected into the corresponding
      interface, which resides in another netns.
      
      When one of the two netns is removed, the tunnel is destroyed.
      Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  8. 13 Apr, 2014 1 commit
    • gre: don't allow to add the same tunnel twice · 5a455275
      Committed by Nicolas Dichtel
      Before the patch, it was possible to add the same tunnel twice:
      ip l a gre1 type gre remote 10.16.0.121 local 10.16.0.249
      ip l a gre2 type gre remote 10.16.0.121 local 10.16.0.249
      
      It was possible, because ip_tunnel_newlink() calls ip_tunnel_find() with the
      argument dev->type, which was set only later (when calling ndo_init handler
      in register_netdevice()). Let's set this type in the setup handler, which is
      called before newlink handler.
      
      Introduced by commit c5441932 ("GRE: Refactor GRE tunneling code.").
      
      CC: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 28 Jan, 2014 1 commit
    • net: gre: use icmp_hdr() to get inner ip header · c0c0c50f
      Committed by Duan Jiong
      When dealing with icmp messages, skb->data points to the
      ip header that triggered the sending of the icmp message.
      
      In gre_cisco_err(), parse_gre_header() is called, and
      iptunnel_pull_header() is called to pull the skb at the end of
      parse_gre_header(), so skb->data no longer points to the
      inner ip header.
      
      Unfortunately, the ipgre_err still needs those ip addresses in
      inner ip header to look up tunnel by ip_tunnel_lookup().
      
      So just use icmp_hdr() to get inner ip header instead of skb->data.
      Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  10. 19 Jan, 2014 1 commit
  11. 19 Dec, 2013 1 commit
  12. 15 Aug, 2013 1 commit
    • ipip: add x-netns support · 6c742e71
      Committed by Nicolas Dichtel
      This patch allows switching the netns when a packet is encapsulated or
      decapsulated. In other words, the encapsulated packet is received in a
      netns, where the lookup is done to find the tunnel. Once the tunnel is
      found, the packet is decapsulated and injected into the corresponding
      interface, which resides in another netns.
      
      When one of the two netns is removed, the tunnel is destroyed.
      Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  13. 10 Aug, 2013 1 commit
  14. 02 Jul, 2013 1 commit
    • gre: fix a regression in ioctl · 6c734fb8
      Committed by Cong Wang
      When testing GRE tunnel, I got:
      
       # ip tunnel show
       get tunnel gre0 failed: Invalid argument
       get tunnel gre1 failed: Invalid argument
      
      This is a regression introduced by commit c5441932
      ("GRE: Refactor GRE tunneling code."). Previously we only
      checked the parameters for SIOCADDTUNNEL and SIOCCHGTUNNEL;
      after that commit, the check is applied to all commands.
      
      So, just check for SIOCADDTUNNEL and SIOCCHGTUNNEL.
      
      After this patch I got:
      
       # ip tunnel show
       gre0: gre/ip  remote any  local any  ttl inherit  nopmtudisc
       gre1: gre/ip  remote 192.168.122.101  local 192.168.122.45  ttl inherit
      
      Cc: Pravin B Shelar <pshelar@nicira.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Cong Wang <amwang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
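The shape of the fix in the commit above, validating parameters only for the add and change commands, can be sketched as follows. This is a simplified userspace model: `toy_cmd`, `toy_parm`, `toy_parm_valid` and `toy_tunnel_ioctl` are hypothetical, and the validity test shown (IPv4 with a 5-word header) is an assumption standing in for whatever the real handler checks.

```c
#include <errno.h>

/* Hypothetical command codes standing in for the SIOC*TUNNEL ioctls. */
enum toy_cmd { TOY_GETTUNNEL, TOY_ADDTUNNEL, TOY_DELTUNNEL, TOY_CHGTUNNEL };

/* Hypothetical subset of the tunnel parameters passed from userspace. */
struct toy_parm {
    int version;
    int ihl;
};

/* Assumed validity rule for this sketch only. */
static int toy_parm_valid(const struct toy_parm *p)
{
    return p->version == 4 && p->ihl == 5;
}

/* Validate parameters only when a tunnel is being added or changed.
 * A "get" request typically carries little more than a name, so
 * validating it the same way rejects legitimate queries. */
static int toy_tunnel_ioctl(enum toy_cmd cmd, const struct toy_parm *p)
{
    if (cmd == TOY_ADDTUNNEL || cmd == TOY_CHGTUNNEL) {
        if (!toy_parm_valid(p))
            return -EINVAL;
    }
    /* The actual get/add/del/change work would follow here. */
    return 0;
}
```

With the check applied to every command, a zero-filled "get" request would fail with `-EINVAL` in this model, which matches the "Invalid argument" output quoted in the commit message.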
  15. 20 Jun, 2013 3 commits
  16. 01 Jun, 2013 1 commit
  17. 20 May, 2013 1 commit
  18. 25 Apr, 2013 1 commit
  19. 09 Apr, 2013 1 commit
  20. 31 Mar, 2013 1 commit
    • ip_gre: don't overwrite iflink during net_dev init · 537aadc3
      Committed by Antonio Quartulli
      iflink is currently set to 0 in __gre_tunnel_init(). This
      function is invoked in gre_tap_init() and
      ipgre_tunnel_init() which are both used to initialise the
      ndo_init field of the respective net_device_ops structs
      (ipgre.. and gre_tap..) used by GRE interfaces.
      
      However, in register_netdevice() iflink is first set to -1,
      then ndo_init is invoked, and then iflink is assigned a
      proper value if and only if it is still -1.
      
      Assigning 0 to iflink in ndo_init therefore first prevents
      register_netdevice() from assigning it a proper value, and
      then breaks iflink altogether, since 0 has no valid
      meaning.
      
      Fix this by removing the iflink assignment in
      __gre_tunnel_init().
      
      Introduced by c5441932
      ("GRE: Refactor GRE tunneling code.")
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Pravin B Shelar <pshelar@nicira.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Antonio Quartulli <ordex@autistici.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
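The registration sequence described in the commit above (set iflink to -1, call ndo_init, then fill in iflink only if it is still -1) is easy to reproduce in a small userspace model. Everything here is hypothetical: `toy_netdev`, `toy_register` and the two init callbacks are invented for illustration, and using the device's own ifindex as the "proper value" is an assumption.

```c
/* Toy stand-in for struct net_device with an init callback. */
struct toy_netdev {
    int ifindex;
    int iflink;
    int (*ndo_init)(struct toy_netdev *dev);
};

/* Model of the registration order: sentinel, init callback, default. */
static int toy_register(struct toy_netdev *dev, int new_ifindex)
{
    int err;

    dev->iflink = -1;                 /* sentinel meaning "not set yet" */
    if (dev->ndo_init) {
        err = dev->ndo_init(dev);
        if (err)
            return err;
    }
    dev->ifindex = new_ifindex;
    if (dev->iflink == -1)            /* only fill in if init left it alone */
        dev->iflink = dev->ifindex;
    return 0;
}

/* Buggy init: overwrites the sentinel, so the default is never applied. */
static int toy_init_buggy(struct toy_netdev *dev)
{
    dev->iflink = 0;
    return 0;
}

/* Fixed init: leaves iflink untouched. */
static int toy_init_fixed(struct toy_netdev *dev)
{
    (void)dev;
    return 0;
}
```

Registering a device with `toy_init_buggy` leaves `iflink` at 0 in this model, while `toy_init_fixed` lets the registration code assign the default, which is the effect of removing the assignment from `__gre_tunnel_init()`.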
  21. 27 Mar, 2013 1 commit
    • GRE: Refactor GRE tunneling code. · c5441932
      Committed by Pravin B Shelar
      Following patch refactors GRE code into ip tunneling code and GRE
      specific code. Common tunneling code is moved to ip_tunnel module.
      ip_tunnel module is written as generic library which can be used
      by different tunneling implementations.
      
      ip_tunnel module contains following components:
       - packet xmit and rcv generic code. xmit flow looks like
         (gre_xmit/ipip_xmit)->ip_tunnel_xmit->ip_local_out.
       - hash table of all devices.
       - lookup for tunnel devices.
       - control plane operations like device create, destroy, ioctl, netlink
         operations code.
       - registration for tunneling modules, like gre, ipip etc.
       - define single pcpu_tstats dev->tstats.
       - struct tnl_ptk_info added to pass parsed tunnel packet parameters.
      
      ipip.h header is renamed to ip_tunnel.h
      Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  22. 17 Mar, 2013 1 commit
  23. 10 Mar, 2013 1 commit
  24. 26 Feb, 2013 3 commits
    • Revert "ip_gre: propogate target device GSO capability to the tunnel device" · 7992ae6d
      Committed by Pravin B Shelar
      This reverts commit eb6b9a8c.
      
      Above commit limits GSO capability of gre device to just TSO, but
      software GRE-GSO is capable of handling all GSO capabilities.
      
      This patch also fixes following panic which reverted commit introduced:-
      
      BUG: unable to handle kernel NULL pointer dereference at 00000000000000a2
      IP: [<ffffffffa0680fd1>] ipgre_tunnel_bind_dev+0x161/0x1f0 [ip_gre]
      PGD 42bc19067 PUD 42bca9067 PMD 0
      Oops: 0000 [#1] SMP
      Pid: 2636, comm: ip Tainted: GF            3.8.0+ #83 Dell Inc. PowerEdge R620/0KCKR5
      RIP: 0010:[<ffffffffa0680fd1>]  [<ffffffffa0680fd1>] ipgre_tunnel_bind_dev+0x161/0x1f0 [ip_gre]
      RSP: 0018:ffff88042bfcb708  EFLAGS: 00010246
      RAX: 00000000000005b6 RBX: ffff88042d2fa000 RCX: 0000000000000044
      RDX: 0000000000000018 RSI: 0000000000000078 RDI: 0000000000000060
      RBP: ffff88042bfcb748 R08: 0000000000000018 R09: 000000000000000c
      R10: 0000000000000020 R11: 000000000101010a R12: ffff88042d2fa800
      R13: 0000000000000000 R14: ffff88042d2fa800 R15: ffff88042cd7f650
      FS:  00007fa784f55700(0000) GS:ffff88043fd20000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000000000a2 CR3: 000000042d8b9000 CR4: 00000000000407e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process ip (pid: 2636, threadinfo ffff88042bfca000, task ffff88042d142a80)
      Stack:
       0000000100000000 002f000000000000 0a01010100000000 000000000b010101
       ffff88042d2fa800 ffff88042d2fa000 ffff88042bfcb858 ffff88042f418c00
       ffff88042bfcb798 ffffffffa068199a ffff88042bfcb798 ffff88042d2fa830
      Call Trace:
       [<ffffffffa068199a>] ipgre_newlink+0xca/0x160 [ip_gre]
       [<ffffffff8143b692>] rtnl_newlink+0x532/0x5f0
       [<ffffffff8143b2fc>] ? rtnl_newlink+0x19c/0x5f0
       [<ffffffff81438978>] rtnetlink_rcv_msg+0x2c8/0x340
       [<ffffffff814386b0>] ? rtnetlink_rcv+0x40/0x40
       [<ffffffff814560f9>] netlink_rcv_skb+0xa9/0xd0
       [<ffffffff81438695>] rtnetlink_rcv+0x25/0x40
       [<ffffffff81455ddc>] netlink_unicast+0x1ac/0x230
       [<ffffffff81456a45>] netlink_sendmsg+0x265/0x380
       [<ffffffff814138c0>] sock_sendmsg+0xb0/0xe0
       [<ffffffff8141141e>] ? move_addr_to_kernel+0x4e/0x90
       [<ffffffff81420445>] ? verify_iovec+0x85/0xf0
       [<ffffffff81414ffd>] __sys_sendmsg+0x3fd/0x420
       [<ffffffff8114b701>] ? handle_mm_fault+0x251/0x3b0
       [<ffffffff8114f39f>] ? vma_link+0xcf/0xe0
       [<ffffffff81415239>] sys_sendmsg+0x49/0x90
       [<ffffffff814ffd19>] system_call_fastpath+0x16/0x1b
      
      CC: Dmitry Kravkov <dmitry@broadcom.com>
      Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
      Acked-by: Dmitry Kravkov <dmitry@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • IP_GRE: Fix GRE_CSUM case. · 8f10098f
      Committed by Pravin B Shelar
      commit "ip_gre: allow CSUM capable devices to handle packets"
      aa0e51cd, broke GRE_CSUM case.
      GRE_CSUM needs checksum computed for inner packet. Therefore
      csum-calculation can not be offloaded if tunnel device requires
      GRE_CSUM.  Following patch fixes it by computing inner packet checksum
      for GRE_CSUM type, for all other type of GRE devices csum is offloaded.
      
      CC: Dmitry Kravkov <dmitry@broadcom.com>
      Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
      Acked-by: Dmitry Kravkov <dmitry@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • IP_GRE: Fix IP-Identification. · 490ab081
      Committed by Pravin B Shelar
      GRE-GSO generates ip fragments with id 0,2,3,4... for every
      GSO packet, which is not correct. The following patch fixes it
      by setting a unique ip-header id if fragments are allowed.
      As Eric Dumazet suggested, it is optimized by using the inner
      ip-header whenever the inner packet is ipv4.
      Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  25. 20 Feb, 2013 1 commit
  26. 19 Feb, 2013 2 commits
  27. 16 Feb, 2013 1 commit
    • v4 GRE: Add TCP segmentation offload for GRE · 68c33163
      Committed by Pravin B Shelar
      Following patch adds GRE protocol offload handler so that
      skb_gso_segment() can segment GRE packets.
      SKB GSO CB is added to keep track of total header length so that
      skb_segment can push the entire header, e.g. in case of GRE, skb_segment
      needs to push inner and outer headers to every segment.
      New NETIF_F_GRE_GSO feature is added for devices which support HW
      GRE TSO offload. Currently no devices support it, therefore GRE GSO
      always falls back to software GSO.
      
      [ Compute pkt_len before ip_local_out() invocation. -DaveM ]
      Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  28. 30 Jan, 2013 1 commit
    • ip_gre: When TOS is inherited, use configured TOS value for non-IP packets · 040468a0
      Committed by David Ward
      A GRE tunnel can be configured so that outgoing tunnel packets inherit
      the value of the TOS field from the inner IP header. In doing so, when
      a non-IP packet is transmitted through the tunnel, the TOS field will
      always be set to 0.
      
      Instead, the user should be able to configure a different TOS value as
      the fallback to use for non-IP packets. This is helpful when the non-IP
      packets are all control packets and should be handled by routers outside
      the tunnel as having Internet Control precedence. One example of this is
      the NHRP packets that control a DMVPN-compatible mGRE tunnel; they are
      encapsulated directly by GRE and do not contain an inner IP header.
      
      Under the existing behavior, the IFLA_GRE_TOS parameter must be set to
      '1' for the TOS value to be inherited. Now, only the least significant
      bit of this parameter must be set to '1', and when a non-IP packet is
      sent through the tunnel, the upper 6 bits of this same parameter will be
      copied into the TOS field. (The ECN bits get masked off as before.)
      
      This behavior is backwards-compatible with existing configurations and
      iproute2 versions.
      Signed-off-by: David Ward <david.ward@ll.mit.edu>
      Signed-off-by: David S. Miller <davem@davemloft.net>
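The TOS selection rule described in the commit above can be written as a small pure function. This is a sketch under assumptions rather than the kernel code: `toy_outer_tos` and the `TOY_*` masks are hypothetical, and ECN handling is reduced to clearing the two low bits, which is a simplification of whatever the tunnel code does with ECN for IP payloads.

```c
#include <stdint.h>

#define TOY_TOS_INHERIT 0x01u  /* least significant bit: inherit from inner IP */
#define TOY_ECN_MASK    0x03u  /* two low bits of the TOS byte */

/* Choose the outer TOS for one packet.
 *  - inherit bit clear: use the configured value;
 *  - inherit bit set, inner packet is IP: use the inner header's TOS;
 *  - inherit bit set, inner packet is not IP: fall back to the upper
 *    six bits of the configured value.
 * The ECN bits are masked off in every case in this simplified model. */
static uint8_t toy_outer_tos(uint8_t configured, int inner_is_ip,
                             uint8_t inner_tos)
{
    uint8_t tos = configured;

    if ((configured & TOY_TOS_INHERIT) && inner_is_ip)
        tos = inner_tos;

    return (uint8_t)(tos & ~TOY_ECN_MASK);
}
```

A configured value of `0x01` behaves as before (non-IP packets get TOS 0), while `0xC1` keeps inheritance for IP packets and marks non-IP packets such as NHRP with `0xC0`, the Internet Control precedence mentioned in the message.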
  29. 28 Jan, 2013 2 commits
    • net: fix possible wrong checksum generation · cef401de
      Committed by Eric Dumazet
      Pravin Shelar mentioned that GSO could potentially generate
      wrong TX checksum if skb has fragments that are overwritten
      by the user between the checksum computation and transmit.
      
      He suggested to linearize skbs but this extra copy can be
      avoided for normal tcp skbs cooked by tcp_sendmsg().
      
      This patch introduces a new SKB_GSO_SHARED_FRAG flag, set
      in skb_shinfo(skb)->gso_type if at least one frag can be
      modified by the user.
      
      Typical sources of such possible overwrites are {vm}splice(),
      sendfile(), and macvtap/tun/virtio_net drivers.
      
      Tested:
      
      $ netperf -H 7.7.8.84
      MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
      7.7.8.84 () port 0 AF_INET
      Recv   Send    Send
      Socket Socket  Message  Elapsed
      Size   Size    Size     Time     Throughput
      bytes  bytes   bytes    secs.    10^6bits/sec
      
       87380  16384  16384    10.00    3959.52
      
      $ netperf -H 7.7.8.84 -t TCP_SENDFILE
      TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.8.84 ()
      port 0 AF_INET
      Recv   Send    Send
      Socket Socket  Message  Elapsed
      Size   Size    Size     Time     Throughput
      bytes  bytes   bytes    secs.    10^6bits/sec
      
       87380  16384  16384    10.00    3216.80
      
      Performance of the SENDFILE is impacted by the extra allocation and
      copy, and because we use order-0 pages, while the TCP_STREAM uses
      bigger pages.
      Reported-by: Pravin Shelar <pshelar@nicira.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • IP_GRE: Fix kernel panic in IP_GRE with GRE csum. · 5465740a
      Committed by Pravin B Shelar
      Due to IP_GRE GSO support, GRE can receive a non-linear skb, which
      results in a panic in case of GRE_CSUM. The following patch fixes it
      by using the correct csum API.
      
      Bug introduced in commit 6b78f16e (gre: add GSO support)
      Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  30. 27 Dec, 2012 1 commit
  31. 22 Dec, 2012 2 commits
  32. 19 Nov, 2012 1 commit
    • net: Allow userns root to control ipv4 · 52e804c6
      Committed by Eric W. Biederman
      Allow an unprivileged user who has created a user namespace, and then
      created a network namespace, to effectively use the new network
      namespace, by reducing capable(CAP_NET_ADMIN) and
      capable(CAP_NET_RAW) calls to be ns_capable(net->user_ns,
      CAP_NET_ADMIN), or ns_capable(net->user_ns, CAP_NET_RAW) calls.
      
      Settings that merely control a single network device are allowed.
      Either the network device is a logical network device where
      restrictions make no difference or the network device is a hardware NIC
      that has been explicitly moved from the initial network namespace.
      
      In general policy and network stack state changes are allowed
      while resource control is left unchanged.
      
      Allow creating raw sockets.
      Allow the SIOCSARP ioctl to control the arp cache.
      Allow the SIOCSIFFLAGS ioctl to allow setting network device flags.
      Allow the SIOCSIFADDR ioctl to allow setting a netdevice ipv4 address.
      Allow the SIOCSIFBRDADDR ioctl to allow setting a netdevice ipv4 broadcast address.
      Allow the SIOCSIFDSTADDR ioctl to allow setting a netdevice ipv4 destination address.
      Allow the SIOCSIFNETMASK ioctl to allow setting a netdevice ipv4 netmask.
      Allow the SIOCADDRT and SIOCDELRT ioctls to allow adding and deleting ipv4 routes.
      
      Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL and SIOCDELTUNNEL ioctls for
      adding, changing and deleting gre tunnels.
      
      Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL and SIOCDELTUNNEL ioctls for
      adding, changing and deleting ipip tunnels.
      
      Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL and SIOCDELTUNNEL ioctls for
      adding, changing and deleting ipsec virtual tunnel interfaces.
      
      Allow setting the MRT_INIT, MRT_DONE, MRT_ADD_VIF, MRT_DEL_VIF, MRT_ADD_MFC,
      MRT_DEL_MFC, MRT_ASSERT, MRT_PIM, MRT_TABLE socket options on multicast routing
      sockets.
      
      Allow setting and receiving IPOPT_CIPSO, IP_OPT_SEC, IP_OPT_SID and
      arbitrary ip options.
      
      Allow setting IP_SEC_POLICY/IP_XFRM_POLICY ipv4 socket option.
      Allow setting the IP_TRANSPARENT ipv4 socket option.
      Allow setting the TCP_REPAIR socket option.
      Allow setting the TCP_CONGESTION socket option.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  33. 15 Nov, 2012 1 commit