1. 22 Aug, 2010 1 commit
    • PPTP: PPP over IPv4 (Point-to-Point Tunneling Protocol) · 00959ade
      Authored by Dmitry Kozlov
      PPP: introduce the "pptp" module, which implements the Point-to-Point Tunneling Protocol using the pppox framework
      NET: introduce the "gre" module for demultiplexing GRE packets by version
           (required so that pptp and ip_gre can coexist)
      NET: ip_gre: update to use the "gre" module
      
      This patch introduces pptp support to the Linux kernel, which
      dramatically speeds up pptp VPN connections and decreases CPU usage
      compared with the existing user-space implementations
      (poptop/pptpclient). The accel-pptp project
      (https://sourceforge.net/projects/accel-pptp/) uses this module;
      it contains a plugin for pppd to use pptp in client mode and a modified
      pptpd (poptop) to build a high-performance pptp NAS.
      
      There were many changes from the initially submitted patch; the most important are:
      1. using rcu instead of read-write locks
      2. using static bitmap instead of dynamically allocated
      3. using vmalloc for memory allocation instead of BITS_PER_LONG + __get_free_pages
      4. fixed many coding style issues
      Thanks to Eric Dumazet.
      Signed-off-by: Dmitry Kozlov <xeb@mail.ru>
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      00959ade
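The version-based demultiplexing described in this commit can be sketched in user space as follows. This is a minimal illustration, not the kernel's "gre" module: the names gre_owner and gre_classify are hypothetical. It relies only on the GRE header layout, where the version is the low 3 bits of the second header byte, version 0 is plain GRE (handled by ip_gre) and version 1 is the enhanced GRE used by PPTP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical labels for which consumer should receive a GRE packet. */
enum gre_owner { GRE_OWNER_NONE, GRE_OWNER_IP_GRE, GRE_OWNER_PPTP };

/*
 * Classify a GRE packet by the version field of its header.
 * pkt points at the first byte of the GRE header, len is the number
 * of bytes available. The version occupies the low 3 bits of byte 1.
 */
enum gre_owner gre_classify(const uint8_t *pkt, size_t len)
{
    assert(pkt != NULL);

    if (len < 4)
        return GRE_OWNER_NONE; /* too short for flags/version + protocol */

    switch (pkt[1] & 0x07) {
    case 0:
        return GRE_OWNER_IP_GRE; /* plain GRE */
    case 1:
        return GRE_OWNER_PPTP;   /* enhanced GRE, as used by PPTP */
    default:
        return GRE_OWNER_NONE;   /* unknown version: nobody claims it */
    }
}
```

Dispatching on the version alone is what lets two independent consumers share IP protocol 47 without inspecting each other's payloads.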
  2. 20 Aug, 2010 2 commits
  3. 19 Aug, 2010 1 commit
  4. 08 Aug, 2010 1 commit
  5. 03 Aug, 2010 2 commits
  6. 02 Aug, 2010 4 commits
  7. 31 Jul, 2010 1 commit
  8. 23 Jul, 2010 4 commits
  9. 22 Jul, 2010 1 commit
  10. 20 Jul, 2010 1 commit
  11. 16 Jul, 2010 1 commit
  12. 15 Jul, 2010 1 commit
  13. 13 Jul, 2010 2 commits
  14. 09 Jul, 2010 1 commit
    • gre: propagate ipv6 transport class · dd4ba83d
      Authored by Stephen Hemminger
      This patch makes IPV6 over IPv4 GRE tunnel propagate the transport
      class field from the underlying IPV6 header to the IPV4 Type Of Service
      field. Without the patch, all IPV6 packets in tunnel look the same to QoS.
      
      This assumes that IPV6 transport class is exactly the same
      as IPv4 TOS. Not sure if that is always the case?  Maybe need
      to mask off some bits.
      
      The mask and shift to get tclass is copied from ipv6/datagram.c
      Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      dd4ba83d
  15. 08 Jul, 2010 1 commit
  16. 06 Jul, 2010 1 commit
  17. 05 Jul, 2010 3 commits
  18. 01 Jul, 2010 2 commits
    • fragment: add fast path for in-order fragments · d6bebca9
      Authored by Changli Gao
      add fast path for in-order fragments
      
      As fragments are sent in order by most OSes, such as Windows, Darwin and
      FreeBSD, new fragments are likely to belong at the end of the inet_frag_queue.
      In the fast path, we check whether the skb at the end of the inet_frag_queue is the
      prev we expect.
      Signed-off-by: Changli Gao <xiaosuo@gmail.com>
      ----
       include/net/inet_frag.h |    1 +
       net/ipv4/ip_fragment.c  |   12 ++++++++++++
       net/ipv6/reassembly.c   |   11 +++++++++++
       3 files changed, 24 insertions(+)
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d6bebca9
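The fast path described in this commit can be sketched with a toy queue that keeps a tail pointer. This is an assumption-laden model, not the kernel code: struct frag, struct frag_queue and frag_queue_insert are hypothetical names, fragments are plain structs rather than skbs, and overlap handling is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy fragment: only the fields needed to order the queue. */
struct frag {
    uint32_t offset;   /* byte offset within the original datagram */
    uint32_t len;      /* payload length of this fragment */
    struct frag *next;
};

/* Queue sorted by offset, with a tail pointer for the fast path. */
struct frag_queue {
    struct frag *head;
    struct frag *tail;
};

/*
 * Insert f in offset order.
 * Returns 1 if the tail check (fast path) was enough, 0 if the list
 * had to be walked from the head.
 */
int frag_queue_insert(struct frag_queue *q, struct frag *f)
{
    struct frag *prev = NULL, *next = NULL;
    int fast = 0;

    assert(q != NULL && f != NULL);

    if (q->tail == NULL || q->tail->offset < f->offset) {
        /* In-order arrival: the new fragment goes after the tail. */
        prev = q->tail;
        fast = 1;
    } else {
        /* Out-of-order arrival: find the first entry at or past f. */
        for (next = q->head; next != NULL; next = next->next) {
            if (next->offset >= f->offset)
                break;
            prev = next;
        }
    }

    f->next = next;
    if (next == NULL)
        q->tail = f;
    if (prev != NULL)
        prev->next = f;
    else
        q->head = f;
    return fast;
}
```

With in-order senders every insert is O(1); the linear walk is only paid when a fragment arrives out of order.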
    • snmp: 64bit ipstats_mib for all arches · 4ce3c183
      Authored by Eric Dumazet
      /proc/net/snmp and /proc/net/netstat expose SNMP counters.
      
      Width of these counters is either 32 or 64 bits, depending on the size
      of "unsigned long" in kernel.
      
      This means user programs parsing these files must already be prepared to
      deal with 64bit values, regardless of whether the program is 32 or 64 bit.
      
      This patch introduces 64bit snmp values for IPSTAT mib, where some
      counters can wrap pretty fast if they are 32bit wide.
      
      # netstat -s|egrep "InOctets|OutOctets"
          InOctets: 244068329096
          OutOctets: 244069348848
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4ce3c183
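A user program that reads such counters should parse them into a 64-bit type even when built as a 32-bit binary. The sketch below is one way to do that for a "Name: value" line like the netstat -s output quoted above; parse_counter is a hypothetical helper, and the header-row plus value-row layout of the /proc files themselves is not handled here.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse a line of the form "  Name: 12345" into a 64-bit counter.
 * Returns 0 on success, -1 if the name does not match or the value
 * is missing or out of range.
 */
int parse_counter(const char *line, const char *name, uint64_t *out)
{
    assert(line != NULL && name != NULL && out != NULL);

    /* Skip the indentation used by netstat -s. */
    while (*line == ' ' || *line == '\t')
        line++;

    size_t n = strlen(name);
    if (strncmp(line, name, n) != 0 || line[n] != ':')
        return -1;

    const char *p = line + n + 1;
    char *end;

    errno = 0;
    /* strtoull, not strtoul: unsigned long may be only 32 bits wide. */
    unsigned long long v = strtoull(p, &end, 10);
    if (end == p || errno == ERANGE)
        return -1;

    *out = (uint64_t)v;
    return 0;
}
```

The InOctets value shown in the commit message does not fit in 32 bits, so storing it in an unsigned long on a 32-bit build would silently lose the high bits.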
  19. 29 Jun, 2010 2 commits
  20. 28 Jun, 2010 2 commits
  21. 27 Jun, 2010 2 commits
  22. 26 Jun, 2010 2 commits
  23. 25 Jun, 2010 1 commit
    • tcp: do not send reset to already closed sockets · 565b7b2d
      Authored by Konstantin Khorenko
      I've found that tcp_close() can be called for an already closed
      socket, but still sends a reset in this case (tcp_send_active_reset()),
      which seems to be incorrect.  Moreover, the reset packet is sent
      with a different source port, as the original port number has already
      been cleared on the socket.  Besides that, incrementing the stat counter
      LINUX_MIB_TCPABORTONCLOSE also does not look correct in this case.
      
      Initially this issue was found on a 2.6.18-x RHEL5 kernel, but the same
      seems to be true for the current mainline kernel (checked on
      2.6.35-rc3).  Please correct me if I missed something.
      
      How that happens:
      
      1) the server receives a packet for socket in TCP_CLOSE_WAIT state
         that triggers a tcp_reset():
      
      Call Trace:
       <IRQ>  [<ffffffff8025b9b9>] tcp_reset+0x12f/0x1e8
       [<ffffffff80046125>] tcp_rcv_state_process+0x1c0/0xa08
       [<ffffffff8003eb22>] tcp_v4_do_rcv+0x310/0x37a
       [<ffffffff80028bea>] tcp_v4_rcv+0x74d/0xb43
       [<ffffffff8024ef4c>] ip_local_deliver_finish+0x0/0x259
       [<ffffffff80037131>] ip_local_deliver+0x200/0x2f4
       [<ffffffff8003843c>] ip_rcv+0x64c/0x69f
       [<ffffffff80021d89>] netif_receive_skb+0x4c4/0x4fa
       [<ffffffff80032eca>] process_backlog+0x90/0xec
       [<ffffffff8000cc50>] net_rx_action+0xbb/0x1f1
       [<ffffffff80012d3a>] __do_softirq+0xf5/0x1ce
       [<ffffffff8001147a>] handle_IRQ_event+0x56/0xb0
       [<ffffffff8006334c>] call_softirq+0x1c/0x28
       [<ffffffff80070476>] do_softirq+0x2c/0x85
       [<ffffffff80070441>] do_IRQ+0x149/0x152
       [<ffffffff80062665>] ret_from_intr+0x0/0xa
       <EOI>  [<ffffffff80008a2e>] __handle_mm_fault+0x6cd/0x1303
       [<ffffffff80008903>] __handle_mm_fault+0x5a2/0x1303
       [<ffffffff80033a9d>] cache_free_debugcheck+0x21f/0x22e
       [<ffffffff8006a263>] do_page_fault+0x49a/0x7dc
       [<ffffffff80066487>] thread_return+0x89/0x174
       [<ffffffff800c5aee>] audit_syscall_exit+0x341/0x35c
       [<ffffffff80062e39>] error_exit+0x0/0x84
      
      tcp_rcv_state_process()
      ...  // (sk_state == TCP_CLOSE_WAIT here)
      ...
              /* step 2: check RST bit */
              if(th->rst) {
                      tcp_reset(sk);
                      goto discard;
              }
      ...
      ---------------------------------
      tcp_rcv_state_process
       tcp_reset
        tcp_done
         tcp_set_state(sk, TCP_CLOSE);
           inet_put_port
            __inet_put_port
             inet_sk(sk)->num = 0;
      
         sk->sk_shutdown = SHUTDOWN_MASK;
      
      2) After that the process (socket owner) tries to write something to
         that socket and "inet_autobind" sets a _new_ (which differs from
         the original!) port number for the socket:
      
       Call Trace:
        [<ffffffff80255a12>] inet_bind_hash+0x33/0x5f
        [<ffffffff80257180>] inet_csk_get_port+0x216/0x268
        [<ffffffff8026bcc9>] inet_autobind+0x22/0x8f
        [<ffffffff80049140>] inet_sendmsg+0x27/0x57
        [<ffffffff8003a9d9>] do_sock_write+0xae/0xea
        [<ffffffff80226ac7>] sock_writev+0xdc/0xf6
        [<ffffffff800680c7>] _spin_lock_irqsave+0x9/0xe
        [<ffffffff8001fb49>] __pollwait+0x0/0xdd
        [<ffffffff8008d533>] default_wake_function+0x0/0xe
        [<ffffffff800a4f10>] autoremove_wake_function+0x0/0x2e
        [<ffffffff800f0b49>] do_readv_writev+0x163/0x274
        [<ffffffff80066538>] thread_return+0x13a/0x174
        [<ffffffff800145d8>] tcp_poll+0x0/0x1c9
        [<ffffffff800c56d3>] audit_syscall_entry+0x180/0x1b3
        [<ffffffff800f0dd0>] sys_writev+0x49/0xe4
        [<ffffffff800622dd>] tracesys+0xd5/0xe0
      
      3) sendmsg fails at last with -EPIPE (=> 'write' returns -EPIPE in userspace):
      
      F: tcp_sendmsg1 -EPIPE: sk=ffff81000bda00d0, sport=49847, old_state=7, new_state=7, sk_err=0, sk_shutdown=3
      
      Call Trace:
       [<ffffffff80027557>] tcp_sendmsg+0xcb/0xe87
       [<ffffffff80033300>] release_sock+0x10/0xae
       [<ffffffff8016f20f>] vgacon_cursor+0x0/0x1a7
       [<ffffffff8026bd32>] inet_autobind+0x8b/0x8f
       [<ffffffff8003a9d9>] do_sock_write+0xae/0xea
       [<ffffffff80226ac7>] sock_writev+0xdc/0xf6
       [<ffffffff800680c7>] _spin_lock_irqsave+0x9/0xe
       [<ffffffff8001fb49>] __pollwait+0x0/0xdd
       [<ffffffff8008d533>] default_wake_function+0x0/0xe
       [<ffffffff800a4f10>] autoremove_wake_function+0x0/0x2e
       [<ffffffff800f0b49>] do_readv_writev+0x163/0x274
       [<ffffffff80066538>] thread_return+0x13a/0x174
       [<ffffffff800145d8>] tcp_poll+0x0/0x1c9
       [<ffffffff800c56d3>] audit_syscall_entry+0x180/0x1b3
       [<ffffffff800f0dd0>] sys_writev+0x49/0xe4
       [<ffffffff800622dd>] tracesys+0xd5/0xe0
      
      tcp_sendmsg()
      ...
              /* Wait for a connection to finish. */
              if ((1 << sk->sk_state) & ~(TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)) {
                      int old_state = sk->sk_state;
                      if ((err = sk_stream_wait_connect(sk, &timeo)) != 0) {
      if (f_d && (err == -EPIPE)) {
              printk("F: tcp_sendmsg1 -EPIPE: sk=%p, sport=%u, old_state=%d, new_state=%d, "
                      "sk_err=%d, sk_shutdown=%d\n",
                      sk, ntohs(inet_sk(sk)->sport), old_state, sk->sk_state,
                      sk->sk_err, sk->sk_shutdown);
              dump_stack();
      }
                              goto out_err;
                      }
              }
      ...
      
      4) Then the process (socket owner) understands that it's time to close
         that socket and does that (and thus triggers sending reset packet):
      
      Call Trace:
      ...
       [<ffffffff80032077>] dev_queue_xmit+0x343/0x3d6
       [<ffffffff80034698>] ip_output+0x351/0x384
       [<ffffffff80251ae9>] dst_output+0x0/0xe
       [<ffffffff80036ec6>] ip_queue_xmit+0x567/0x5d2
       [<ffffffff80095700>] vprintk+0x21/0x33
       [<ffffffff800070f0>] check_poison_obj+0x2e/0x206
       [<ffffffff80013587>] poison_obj+0x36/0x45
       [<ffffffff8025dea6>] tcp_send_active_reset+0x15/0x14d
       [<ffffffff80023481>] dbg_redzone1+0x1c/0x25
       [<ffffffff8025dea6>] tcp_send_active_reset+0x15/0x14d
       [<ffffffff8000ca94>] cache_alloc_debugcheck_after+0x189/0x1c8
       [<ffffffff80023405>] tcp_transmit_skb+0x764/0x786
       [<ffffffff8025df8a>] tcp_send_active_reset+0xf9/0x14d
       [<ffffffff80258ff1>] tcp_close+0x39a/0x960
       [<ffffffff8026be12>] inet_release+0x69/0x80
       [<ffffffff80059b31>] sock_release+0x4f/0xcf
       [<ffffffff80059d4c>] sock_close+0x2c/0x30
       [<ffffffff800133c9>] __fput+0xac/0x197
       [<ffffffff800252bc>] filp_close+0x59/0x61
       [<ffffffff8001eff6>] sys_close+0x85/0xc7
       [<ffffffff800622dd>] tracesys+0xd5/0xe0
      
      So, in brief:
      
      * a received packet for a socket in TCP_CLOSE_WAIT state triggers
        tcp_reset(), which clears inet_sk(sk)->num and puts the socket into
        TCP_CLOSE state
      
      * an attempt to write to that socket forces inet_autobind() to get a
        new port (but the write itself fails with -EPIPE)
      
      * tcp_close() called for socket in TCP_CLOSE state sends an active
        reset via socket with newly allocated port
      
      This adds an additional check in tcp_close() for already closed
      sockets. We do not want to send anything to closed sockets.
      Signed-off-by: Konstantin Khorenko <khorenko@openvz.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      565b7b2d
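The effect of the added check can be modelled with a toy state machine. Everything here is hypothetical (toy_sock, toy_reset_received, toy_close) and greatly simplified compared with tcp_close(); in particular, the model assumes a reset is sent on close only when unread data remains. It is meant only to show the sequence summarized above: a received RST moves the socket to the closed state, and a later close must then send nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum toy_state { TOY_ESTABLISHED, TOY_CLOSE_WAIT, TOY_CLOSE };

/* Toy socket: just enough state to follow the scenario. */
struct toy_sock {
    enum toy_state state;
    uint16_t port;     /* 0 means the port has been released */
    int unread_data;   /* nonzero if the receive queue is not empty */
    int resets_sent;   /* counts resets this socket has emitted */
};

/* Peer sent an RST: the socket is closed and its port released. */
void toy_reset_received(struct toy_sock *sk)
{
    assert(sk != NULL);
    sk->state = TOY_CLOSE;
    sk->port = 0;
}

/* Application closes the socket. */
void toy_close(struct toy_sock *sk)
{
    assert(sk != NULL);

    /* The added check: never send anything for a closed socket. */
    if (sk->state == TOY_CLOSE)
        return;

    /* Simplified abort-on-close rule (an assumption of this model). */
    if (sk->unread_data)
        sk->resets_sent++;

    sk->state = TOY_CLOSE;
    sk->port = 0;
}
```

Without the early return, the first scenario (RST received, then close) would count a reset even though the connection is already gone.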
  24. 24 Jun, 2010 1 commit