1. 26 2月, 2013 3 次提交
    • P
      Revert "ip_gre: propogate target device GSO capability to the tunnel device" · 7992ae6d
      Pravin B Shelar 提交于
      This reverts commit eb6b9a8c.
      
      Above commit limits GSO capability of gre device to just TSO, but
      software GRE-GSO is capable of handling all GSO capabilities.
      
      This patch also fixes following panic which reverted commit introduced:-
      
      BUG: unable to handle kernel NULL pointer dereference at 00000000000000a2
      IP: [<ffffffffa0680fd1>] ipgre_tunnel_bind_dev+0x161/0x1f0 [ip_gre]
      PGD 42bc19067 PUD 42bca9067 PMD 0
      Oops: 0000 [#1] SMP
      Pid: 2636, comm: ip Tainted: GF            3.8.0+ #83 Dell Inc. PowerEdge R620/0KCKR5
      RIP: 0010:[<ffffffffa0680fd1>]  [<ffffffffa0680fd1>] ipgre_tunnel_bind_dev+0x161/0x1f0 [ip_gre]
      RSP: 0018:ffff88042bfcb708  EFLAGS: 00010246
      RAX: 00000000000005b6 RBX: ffff88042d2fa000 RCX: 0000000000000044
      RDX: 0000000000000018 RSI: 0000000000000078 RDI: 0000000000000060
      RBP: ffff88042bfcb748 R08: 0000000000000018 R09: 000000000000000c
      R10: 0000000000000020 R11: 000000000101010a R12: ffff88042d2fa800
      R13: 0000000000000000 R14: ffff88042d2fa800 R15: ffff88042cd7f650
      FS:  00007fa784f55700(0000) GS:ffff88043fd20000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000000000a2 CR3: 000000042d8b9000 CR4: 00000000000407e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process ip (pid: 2636, threadinfo ffff88042bfca000, task ffff88042d142a80)
      Stack:
       0000000100000000 002f000000000000 0a01010100000000 000000000b010101
       ffff88042d2fa800 ffff88042d2fa000 ffff88042bfcb858 ffff88042f418c00
       ffff88042bfcb798 ffffffffa068199a ffff88042bfcb798 ffff88042d2fa830
      Call Trace:
       [<ffffffffa068199a>] ipgre_newlink+0xca/0x160 [ip_gre]
       [<ffffffff8143b692>] rtnl_newlink+0x532/0x5f0
       [<ffffffff8143b2fc>] ? rtnl_newlink+0x19c/0x5f0
       [<ffffffff81438978>] rtnetlink_rcv_msg+0x2c8/0x340
       [<ffffffff814386b0>] ? rtnetlink_rcv+0x40/0x40
       [<ffffffff814560f9>] netlink_rcv_skb+0xa9/0xd0
       [<ffffffff81438695>] rtnetlink_rcv+0x25/0x40
       [<ffffffff81455ddc>] netlink_unicast+0x1ac/0x230
       [<ffffffff81456a45>] netlink_sendmsg+0x265/0x380
       [<ffffffff814138c0>] sock_sendmsg+0xb0/0xe0
       [<ffffffff8141141e>] ? move_addr_to_kernel+0x4e/0x90
       [<ffffffff81420445>] ? verify_iovec+0x85/0xf0
       [<ffffffff81414ffd>] __sys_sendmsg+0x3fd/0x420
       [<ffffffff8114b701>] ? handle_mm_fault+0x251/0x3b0
       [<ffffffff8114f39f>] ? vma_link+0xcf/0xe0
       [<ffffffff81415239>] sys_sendmsg+0x49/0x90
       [<ffffffff814ffd19>] system_call_fastpath+0x16/0x1b
      
      CC: Dmitry Kravkov <dmitry@broadcom.com>
      Signed-off-by: NPravin B Shelar <pshelar@nicira.com>
      Acked-by: NDmitry Kravkov <dmitry@broadcom.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7992ae6d
    • P
      IP_GRE: Fix GRE_CSUM case. · 8f10098f
      Pravin B Shelar 提交于
      commit "ip_gre: allow CSUM capable devices to handle packets"
      aa0e51cd, broke GRE_CSUM case.
      GRE_CSUM needs checksum computed for inner packet. Therefore
      csum-calculation can not be offloaded if tunnel device requires
      GRE_CSUM.  Following patch fixes it by computing inner packet checksum
      for GRE_CSUM type, for all other type of GRE devices csum is offloaded.
      
      CC: Dmitry Kravkov <dmitry@broadcom.com>
      Signed-off-by: NPravin B Shelar <pshelar@nicira.com>
      Acked-by: NDmitry Kravkov <dmitry@broadcom.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8f10098f
    • P
      IP_GRE: Fix IP-Identification. · 490ab081
      Pravin B Shelar 提交于
      GRE-GSO generates ip fragments with id 0,2,3,4... for every
      GSO packet, which is not correct. Following patch fixes it
      by setting ip-header id unique id of fragments are allowed.
      As Eric Dumazet suggested it is optimized by using inner ip-header
      whenever inner packet is ipv4.
      Signed-off-by: NPravin B Shelar <pshelar@nicira.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      490ab081
  2. 23 2月, 2013 3 次提交
  3. 22 2月, 2013 2 次提交
  4. 20 2月, 2013 2 次提交
  5. 19 2月, 2013 6 次提交
  6. 16 2月, 2013 2 次提交
  7. 14 2月, 2013 4 次提交
    • P
      net: Fix possible wrong checksum generation. · c9af6db4
      Pravin B Shelar 提交于
      Patch cef401de (net: fix possible wrong checksum
      generation) fixed wrong checksum calculation but it broke TSO by
      defining new GSO type but not a netdev feature for that type.
      net_gso_ok() would not allow hardware checksum/segmentation
      offload of such packets without the feature.
      
      Following patch fixes TSO and wrong checksum. This patch uses
      same logic that Eric Dumazet used. Patch introduces new flag
      SKBTX_SHARED_FRAG if at least one frag can be modified by
      the user. but SKBTX_SHARED_FRAG flag is kept in skb shared
      info tx_flags rather than gso_type.
      
      tx_flags is better compared to gso_type since we can have skb with
      shared frag without gso packet. It does not link SHARED_FRAG to
      GSO, So there is no need to define netdev feature for this.
      Signed-off-by: NPravin B Shelar <pshelar@nicira.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c9af6db4
    • A
      tcp: send packets with a socket timestamp · ee684b6f
      Andrey Vagin 提交于
      A socket timestamp is a sum of the global tcp_time_stamp and
      a per-socket offset.
      
      A socket offset is added in places where externally visible
      tcp timestamp option is parsed/initialized.
      
      Connections in the SYN_RECV state are not supported, global
      tcp_time_stamp is used for them, because repair mode doesn't support
      this state. In a future it can be implemented by the similar way
      as for TIME_WAIT sockets.
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Cc: James Morris <jmorris@namei.org>
      Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Signed-off-by: NAndrey Vagin <avagin@openvz.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ee684b6f
    • A
      tcp: set and get per-socket timestamp · 93be6ce0
      Andrey Vagin 提交于
      A timestamp can be set, only if a socket is in the repair mode.
      
      This patch adds a new socket option TCP_TIMESTAMP, which allows to
      get and set current tcp times stamp.
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Cc: James Morris <jmorris@namei.org>
      Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Signed-off-by: NAndrey Vagin <avagin@openvz.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      93be6ce0
    • A
      tcp: adding a per-socket timestamp offset · ceaa1fef
      Andrey Vagin 提交于
      This functionality is used for restoring tcp sockets. A tcp timestamp
      depends on how long a system has been running, so it's differ for each
      host. The solution is to set a per-socket offset.
      
      A per-socket offset for a TIME_WAIT socket is inherited from a proper
      tcp socket.
      
      tcp_request_sock doesn't have a timestamp offset, because the repair
      mode for them are not implemented.
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Cc: James Morris <jmorris@namei.org>
      Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Signed-off-by: NAndrey Vagin <avagin@openvz.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ceaa1fef
  8. 11 2月, 2013 1 次提交
  9. 07 2月, 2013 1 次提交
  10. 06 2月, 2013 4 次提交
  11. 05 2月, 2013 3 次提交
  12. 04 2月, 2013 2 次提交
  13. 01 2月, 2013 1 次提交
  14. 30 1月, 2013 6 次提交
    • N
      tcp: Increment LISTENOVERFLOW and LISTENDROPS in tcp_v4_conn_request() · 2aeef18d
      Nivedita Singhvi 提交于
      We drop a connection request if the accept backlog is full and there are
      sufficient packets in the syn queue to warrant starting drops. Increment the
      appropriate counters so this isn't silent, for accurate stats and help in
      debugging.
      
      This patch assumes LINUX_MIB_LISTENDROPS is a superset of/includes the
      counter LINUX_MIB_LISTENOVERFLOWS.
      Signed-off-by: NNivedita Singhvi <niv@us.ibm.com>
      Acked-by: NVijay Subramanian <subramanian.vijay@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2aeef18d
    • D
      ip_gre: When TOS is inherited, use configured TOS value for non-IP packets · 040468a0
      David Ward 提交于
      A GRE tunnel can be configured so that outgoing tunnel packets inherit
      the value of the TOS field from the inner IP header. In doing so, when
      a non-IP packet is transmitted through the tunnel, the TOS field will
      always be set to 0.
      
      Instead, the user should be able to configure a different TOS value as
      the fallback to use for non-IP packets. This is helpful when the non-IP
      packets are all control packets and should be handled by routers outside
      the tunnel as having Internet Control precedence. One example of this is
      the NHRP packets that control a DMVPN-compatible mGRE tunnel; they are
      encapsulated directly by GRE and do not contain an inner IP header.
      
      Under the existing behavior, the IFLA_GRE_TOS parameter must be set to
      '1' for the TOS value to be inherited. Now, only the least significant
      bit of this parameter must be set to '1', and when a non-IP packet is
      sent through the tunnel, the upper 6 bits of this same parameter will be
      copied into the TOS field. (The ECN bits get masked off as before.)
      
      This behavior is backwards-compatible with existing configurations and
      iproute2 versions.
      Signed-off-by: NDavid Ward <david.ward@ll.mit.edu>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      040468a0
    • J
      ipv4: introduce address lifetime · 5c766d64
      Jiri Pirko 提交于
      There are some usecase when lifetime of ipv4 addresses might be helpful.
      For example:
      1) initramfs networkmanager uses a DHCP daemon to learn network
      configuration parameters
      2) initramfs networkmanager addresses, routes and DNS configuration
      3) initramfs networkmanager is requested to stop
      4) initramfs networkmanager stops all daemons including dhclient
      5) there are addresses and routes configured but no daemon running. If
      the system doesn't start networkmanager for some reason, addresses and
      routes will be used forever, which violates RFC 2131.
      
      This patch is essentially a backport of ivp6 address lifetime mechanism
      for ipv4 addresses.
      
      Current "ip" tool supports this without any patch (since it does not
      distinguish between ipv4 and ipv6 addresses in this perspective.
      
      Also, this should be back-compatible with all current netlink users.
      Reported-by: NPavel Šimerda <psimerda@redhat.com>
      Signed-off-by: NJiri Pirko <jiri@resnulli.us>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5c766d64
    • J
      net: frag, move LRU list maintenance outside of rwlock · 3ef0eb0d
      Jesper Dangaard Brouer 提交于
      Updating the fragmentation queues LRU (Least-Recently-Used) list,
      required taking the hash writer lock.  However, the LRU list isn't
      tied to the hash at all, so we can use a separate lock for it.
      Original-idea-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3ef0eb0d
    • J
      net: use lib/percpu_counter API for fragmentation mem accounting · 6d7b857d
      Jesper Dangaard Brouer 提交于
      Replace the per network namespace shared atomic "mem" accounting
      variable, in the fragmentation code, with a lib/percpu_counter.
      
      Getting percpu_counter to scale to the fragmentation code usage
      requires some tweaks.
      
      At first view, percpu_counter looks superfast, but it does not
      scale on multi-CPU/NUMA machines, because the default batch size
      is too small, for frag code usage.  Thus, I have adjusted the
      batch size by using __percpu_counter_add() directly, instead of
      percpu_counter_sub() and percpu_counter_add().
      
      The batch size is increased to 130.000, based on the largest 64K
      fragment memory usage.  This does introduce some imprecise
      memory accounting, but its does not need to be strict for this
      use-case.
      
      It is also essential, that the percpu_counter, does not
      share cacheline with other writers, to make this scale.
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6d7b857d
    • J
      net: frag helper functions for mem limit tracking · d433673e
      Jesper Dangaard Brouer 提交于
      This change is primarily a preparation to ease the extension of memory
      limit tracking.
      
      The change does reduce the number atomic operation, during freeing of
      a frag queue.  This does introduce a some performance improvement, as
      these atomic operations are at the core of the performance problems
      seen on NUMA systems.
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d433673e
新手
引导
客服 返回
顶部