1. 08 9月, 2012 3 次提交
  2. 07 9月, 2012 1 次提交
    • E
      tcp: fix TFO regression · 7ab4551f
      Eric Dumazet 提交于
      Fengguang Wu reported various panics and bisected to commit
      8336886f (tcp: TCP Fast Open Server - support TFO listeners)
      
      Fix this by making sure socket is a TCP socket before accessing TFO data
      structures.
      
      [  233.046014] kfree_debugcheck: out of range ptr ea6000000bb8h.
      [  233.047399] ------------[ cut here ]------------
      [  233.048393] kernel BUG at /c/kernel-tests/src/stable/mm/slab.c:3074!
      [  233.048393] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
      [  233.048393] Modules linked in:
      [  233.048393] CPU 0
      [  233.048393] Pid: 3929, comm: trinity-watchdo Not tainted 3.6.0-rc3+
      #4192 Bochs Bochs
      [  233.048393] RIP: 0010:[<ffffffff81169653>]  [<ffffffff81169653>]
      kfree_debugcheck+0x27/0x2d
      [  233.048393] RSP: 0018:ffff88000facbca8  EFLAGS: 00010092
      [  233.048393] RAX: 0000000000000031 RBX: 0000ea6000000bb8 RCX:
      00000000a189a188
      [  233.048393] RDX: 000000000000a189 RSI: ffffffff8108ad32 RDI:
      ffffffff810d30f9
      [  233.048393] RBP: ffff88000facbcb8 R08: 0000000000000002 R09:
      ffffffff843846f0
      [  233.048393] R10: ffffffff810ae37c R11: 0000000000000908 R12:
      0000000000000202
      [  233.048393] R13: ffffffff823dbd5a R14: ffff88000ec5bea8 R15:
      ffffffff8363c780
      [  233.048393] FS:  00007faa6899c700(0000) GS:ffff88001f200000(0000)
      knlGS:0000000000000000
      [  233.048393] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [  233.048393] CR2: 00007faa6841019c CR3: 0000000012c82000 CR4:
      00000000000006f0
      [  233.048393] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
      0000000000000000
      [  233.048393] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
      0000000000000400
      [  233.048393] Process trinity-watchdo (pid: 3929, threadinfo
      ffff88000faca000, task ffff88000faec600)
      [  233.048393] Stack:
      [  233.048393]  0000000000000000 0000ea6000000bb8 ffff88000facbce8
      ffffffff8116ad81
      [  233.048393]  ffff88000ff588a0 ffff88000ff58850 ffff88000ff588a0
      0000000000000000
      [  233.048393]  ffff88000facbd08 ffffffff823dbd5a ffffffff823dbcb0
      ffff88000ff58850
      [  233.048393] Call Trace:
      [  233.048393]  [<ffffffff8116ad81>] kfree+0x5f/0xca
      [  233.048393]  [<ffffffff823dbd5a>] inet_sock_destruct+0xaa/0x13c
      [  233.048393]  [<ffffffff823dbcb0>] ? inet_sk_rebuild_header
      +0x319/0x319
      [  233.048393]  [<ffffffff8231c307>] __sk_free+0x21/0x14b
      [  233.048393]  [<ffffffff8231c4bd>] sk_free+0x26/0x2a
      [  233.048393]  [<ffffffff825372db>] sctp_close+0x215/0x224
      [  233.048393]  [<ffffffff810d6835>] ? lock_release+0x16f/0x1b9
      [  233.048393]  [<ffffffff823daf12>] inet_release+0x7e/0x85
      [  233.048393]  [<ffffffff82317d15>] sock_release+0x1f/0x77
      [  233.048393]  [<ffffffff82317d94>] sock_close+0x27/0x2b
      [  233.048393]  [<ffffffff81173bbe>] __fput+0x101/0x20a
      [  233.048393]  [<ffffffff81173cd5>] ____fput+0xe/0x10
      [  233.048393]  [<ffffffff810a3794>] task_work_run+0x5d/0x75
      [  233.048393]  [<ffffffff8108da70>] do_exit+0x290/0x7f5
      [  233.048393]  [<ffffffff82707415>] ? retint_swapgs+0x13/0x1b
      [  233.048393]  [<ffffffff8108e23f>] do_group_exit+0x7b/0xba
      [  233.048393]  [<ffffffff8108e295>] sys_exit_group+0x17/0x17
      [  233.048393]  [<ffffffff8270de10>] tracesys+0xdd/0xe2
      [  233.048393] Code: 59 01 5d c3 55 48 89 e5 53 41 50 0f 1f 44 00 00 48
      89 fb e8 d4 b0 f0 ff 84 c0 75 11 48 89 de 48 c7 c7 fc fa f7 82 e8 0d 0f
      57 01 <0f> 0b 5f 5b 5d c3 55 48 89 e5 0f 1f 44 00 00 48 63 87 d8 00 00
      [  233.048393] RIP  [<ffffffff81169653>] kfree_debugcheck+0x27/0x2d
      [  233.048393]  RSP <ffff88000facbca8>
      Reported-by: NFengguang Wu <wfg@linux.intel.com>
      Tested-by: NFengguang Wu <wfg@linux.intel.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: "H.K. Jerry Chu" <hkchu@google.com>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Acked-by: NH.K. Jerry Chu <hkchu@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7ab4551f
  3. 06 9月, 2012 5 次提交
    • N
      ipv6: fix handling of blackhole and prohibit routes · ef2c7d7b
      Nicolas Dichtel 提交于
      When adding a blackhole or a prohibit route, they were handling like classic
      routes. Moreover, it was only possible to add this kind of routes by specifying
      an interface.
      
      Bug already reported here:
        http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=498498
      
      Before the patch:
        $ ip route add blackhole 2001::1/128
        RTNETLINK answers: No such device
        $ ip route add blackhole 2001::1/128 dev eth0
        $ ip -6 route | grep 2001
        2001::1 dev eth0  metric 1024
      
      After:
        $ ip route add blackhole 2001::1/128
        $ ip -6 route | grep 2001
        blackhole 2001::1 dev lo  metric 1024  error -22
      
      v2: wrong patch
      v3: add a field fc_type in struct fib6_config to store RTN_* type
      Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ef2c7d7b
    • E
      net: qdisc busylock needs lockdep annotations · 23d3b8bf
      Eric Dumazet 提交于
      It seems we need to provide ability for stacked devices
      to use specific lock_class_key for sch->busylock
      
      We could instead default l2tpeth tx_queue_len to 0 (no qdisc), but
      a user might use a qdisc anyway.
      
      (So same fixes are probably needed on non LLTX stacked drivers)
      
      Noticed while stressing L2TPV3 setup :
      
      ======================================================
       [ INFO: possible circular locking dependency detected ]
       3.6.0-rc3+ #788 Not tainted
       -------------------------------------------------------
       netperf/4660 is trying to acquire lock:
        (l2tpsock){+.-...}, at: [<ffffffffa0208db2>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
      
       but task is already holding lock:
        (&(&sch->busylock)->rlock){+.-...}, at: [<ffffffff81596595>] dev_queue_xmit+0xd75/0xe00
      
       which lock already depends on the new lock.
      
       the existing dependency chain (in reverse order) is:
      
       -> #1 (&(&sch->busylock)->rlock){+.-...}:
              [<ffffffff810a5df0>] lock_acquire+0x90/0x200
              [<ffffffff817499fc>] _raw_spin_lock_irqsave+0x4c/0x60
              [<ffffffff81074872>] __wake_up+0x32/0x70
              [<ffffffff8136d39e>] tty_wakeup+0x3e/0x80
              [<ffffffff81378fb3>] pty_write+0x73/0x80
              [<ffffffff8136cb4c>] tty_put_char+0x3c/0x40
              [<ffffffff813722b2>] process_echoes+0x142/0x330
              [<ffffffff813742ab>] n_tty_receive_buf+0x8fb/0x1230
              [<ffffffff813777b2>] flush_to_ldisc+0x142/0x1c0
              [<ffffffff81062818>] process_one_work+0x198/0x760
              [<ffffffff81063236>] worker_thread+0x186/0x4b0
              [<ffffffff810694d3>] kthread+0x93/0xa0
              [<ffffffff81753e24>] kernel_thread_helper+0x4/0x10
      
       -> #0 (l2tpsock){+.-...}:
              [<ffffffff810a5288>] __lock_acquire+0x1628/0x1b10
              [<ffffffff810a5df0>] lock_acquire+0x90/0x200
              [<ffffffff817498c1>] _raw_spin_lock+0x41/0x50
              [<ffffffffa0208db2>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
              [<ffffffffa021a802>] l2tp_eth_dev_xmit+0x32/0x60 [l2tp_eth]
              [<ffffffff815952b2>] dev_hard_start_xmit+0x502/0xa70
              [<ffffffff815b63ce>] sch_direct_xmit+0xfe/0x290
              [<ffffffff81595a05>] dev_queue_xmit+0x1e5/0xe00
              [<ffffffff815d9d60>] ip_finish_output+0x3d0/0x890
              [<ffffffff815db019>] ip_output+0x59/0xf0
              [<ffffffff815da36d>] ip_local_out+0x2d/0xa0
              [<ffffffff815da5a3>] ip_queue_xmit+0x1c3/0x680
              [<ffffffff815f4192>] tcp_transmit_skb+0x402/0xa60
              [<ffffffff815f4a94>] tcp_write_xmit+0x1f4/0xa30
              [<ffffffff815f5300>] tcp_push_one+0x30/0x40
              [<ffffffff815e6672>] tcp_sendmsg+0xe82/0x1040
              [<ffffffff81614495>] inet_sendmsg+0x125/0x230
              [<ffffffff81576cdc>] sock_sendmsg+0xdc/0xf0
              [<ffffffff81579ece>] sys_sendto+0xfe/0x130
              [<ffffffff81752c92>] system_call_fastpath+0x16/0x1b
        Possible unsafe locking scenario:
      
              CPU0                    CPU1
              ----                    ----
         lock(&(&sch->busylock)->rlock);
                                      lock(l2tpsock);
                                      lock(&(&sch->busylock)->rlock);
         lock(l2tpsock);
      
        *** DEADLOCK ***
      
       5 locks held by netperf/4660:
        #0:  (sk_lock-AF_INET){+.+.+.}, at: [<ffffffff815e581c>] tcp_sendmsg+0x2c/0x1040
        #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff815da3e0>] ip_queue_xmit+0x0/0x680
        #2:  (rcu_read_lock_bh){.+....}, at: [<ffffffff815d9ac5>] ip_finish_output+0x135/0x890
        #3:  (rcu_read_lock_bh){.+....}, at: [<ffffffff81595820>] dev_queue_xmit+0x0/0xe00
        #4:  (&(&sch->busylock)->rlock){+.-...}, at: [<ffffffff81596595>] dev_queue_xmit+0xd75/0xe00
      
       stack backtrace:
       Pid: 4660, comm: netperf Not tainted 3.6.0-rc3+ #788
       Call Trace:
        [<ffffffff8173dbf8>] print_circular_bug+0x1fb/0x20c
        [<ffffffff810a5288>] __lock_acquire+0x1628/0x1b10
        [<ffffffff810a334b>] ? check_usage+0x9b/0x4d0
        [<ffffffff810a3f44>] ? __lock_acquire+0x2e4/0x1b10
        [<ffffffff810a5df0>] lock_acquire+0x90/0x200
        [<ffffffffa0208db2>] ? l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
        [<ffffffff817498c1>] _raw_spin_lock+0x41/0x50
        [<ffffffffa0208db2>] ? l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
        [<ffffffffa0208db2>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
        [<ffffffffa021a802>] l2tp_eth_dev_xmit+0x32/0x60 [l2tp_eth]
        [<ffffffff815952b2>] dev_hard_start_xmit+0x502/0xa70
        [<ffffffff81594e0e>] ? dev_hard_start_xmit+0x5e/0xa70
        [<ffffffff81595961>] ? dev_queue_xmit+0x141/0xe00
        [<ffffffff815b63ce>] sch_direct_xmit+0xfe/0x290
        [<ffffffff81595a05>] dev_queue_xmit+0x1e5/0xe00
        [<ffffffff81595820>] ? dev_hard_start_xmit+0xa70/0xa70
        [<ffffffff815d9d60>] ip_finish_output+0x3d0/0x890
        [<ffffffff815d9ac5>] ? ip_finish_output+0x135/0x890
        [<ffffffff815db019>] ip_output+0x59/0xf0
        [<ffffffff815da36d>] ip_local_out+0x2d/0xa0
        [<ffffffff815da5a3>] ip_queue_xmit+0x1c3/0x680
        [<ffffffff815da3e0>] ? ip_local_out+0xa0/0xa0
        [<ffffffff815f4192>] tcp_transmit_skb+0x402/0xa60
        [<ffffffff815fa25e>] ? tcp_md5_do_lookup+0x18e/0x1a0
        [<ffffffff815f4a94>] tcp_write_xmit+0x1f4/0xa30
        [<ffffffff815f5300>] tcp_push_one+0x30/0x40
        [<ffffffff815e6672>] tcp_sendmsg+0xe82/0x1040
        [<ffffffff81614495>] inet_sendmsg+0x125/0x230
        [<ffffffff81614370>] ? inet_create+0x6b0/0x6b0
        [<ffffffff8157e6e2>] ? sock_update_classid+0xc2/0x3b0
        [<ffffffff8157e750>] ? sock_update_classid+0x130/0x3b0
        [<ffffffff81576cdc>] sock_sendmsg+0xdc/0xf0
        [<ffffffff81162579>] ? fget_light+0x3f9/0x4f0
        [<ffffffff81579ece>] sys_sendto+0xfe/0x130
        [<ffffffff810a69ad>] ? trace_hardirqs_on+0xd/0x10
        [<ffffffff8174a0b0>] ? _raw_spin_unlock_irq+0x30/0x50
        [<ffffffff810757e3>] ? finish_task_switch+0x83/0xf0
        [<ffffffff810757a6>] ? finish_task_switch+0x46/0xf0
        [<ffffffff81752cb7>] ? sysret_check+0x1b/0x56
        [<ffffffff81752c92>] system_call_fastpath+0x16/0x1b
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      23d3b8bf
    • S
      netfilter: ipv6: using csum_ipv6_magic requires net/ip6_checksum.h · 2c969322
      Stephen Rothwell 提交于
      Fixes this build error:
      
      net/ipv6/netfilter/nf_nat_l3proto_ipv6.c: In function 'nf_nat_ipv6_csum_recalc':
      net/ipv6/netfilter/nf_nat_l3proto_ipv6.c:144:4: error: implicit declaration of function 'csum_ipv6_magic' [-Werror=implicit-function-declaration]
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2c969322
    • N
      net: add unknown state to sysfs NIC duplex export · c6c13965
      Nikolay Aleksandrov 提交于
      Currently when the NIC duplex state is DUPLEX_UNKNOWN it is exported as
      full through sysfs, this patch adds support for DUPLEX_UNKNOWN. It is
      handled the same way as in ethtool.
      Signed-off-by: NNikolay Aleksandrov <naleksan@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c6c13965
    • J
      tcp: add generic netlink support for tcp_metrics · d23ff701
      Julian Anastasov 提交于
      Add support for genl "tcp_metrics". No locking
      is changed, only that now we can unlink and delete
      entries after grace period. We implement get/del for
      single entry and dump to support show/flush filtering
      in user space. Del without address attribute causes
      flush for all addresses, sadly under genl_mutex.
      
      v2:
      - remove rcu_assign_pointer as suggested by Eric Dumazet,
      it is not needed because there are no other writes under lock
      - move the flushing code in tcp_metrics_flush_all
      
      v3:
      - remove synchronize_rcu on flush as suggested by Eric Dumazet
      Signed-off-by: NJulian Anastasov <ja@ssi.bg>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d23ff701
  4. 05 9月, 2012 3 次提交
    • M
      net: Providing protocol type via system.sockprotoname xattr of /proc/PID/fd entries · 600e1779
      Masatake YAMATO 提交于
      lsof reports some of socket descriptors as "can't identify protocol" like:
      
          [yamato@localhost]/tmp% sudo lsof | grep dbus | grep iden
          dbus-daem   652          dbus    6u     sock ... 17812 can't identify protocol
          dbus-daem   652          dbus   34u     sock ... 24689 can't identify protocol
          dbus-daem   652          dbus   42u     sock ... 24739 can't identify protocol
          dbus-daem   652          dbus   48u     sock ... 22329 can't identify protocol
          ...
      
      lsof cannot resolve the protocol used in a socket because procfs
      doesn't provide the map between inode number on sockfs and protocol
      type of the socket.
      
      For improving the situation this patch adds an extended attribute named
      'system.sockprotoname' in which the protocol name for
      /proc/PID/fd/SOCKET is stored. So lsof can know the protocol for a
      given /proc/PID/fd/SOCKET with getxattr system call.
      
      A few weeks ago I submitted a patch for the same purpose. The patch
      was introduced /proc/net/sockfs which enumerates inodes and protocols
      of all sockets alive on a system. However, it was rejected because (1)
      a global lock was needed, and (2) the layout of struct socket was
      changed with the patch.
      
      This patch doesn't use any global lock; and doesn't change the layout
      of any structs.
      
      In this patch, a protocol name is stored to dentry->d_name of sockfs
      when new socket is associated with a file descriptor. Before this
      patch dentry->d_name was not used; it was just filled with empty
      string. lsof may use an extended attribute named
      'system.sockprotoname' to retrieve the value of dentry->d_name.
      
      It is nice if we can see the protocol name with ls -l
      /proc/PID/fd. However, "socket:[#INODE]", the name format returned
      from sockfs_dname() was already defined. To keep the compatibility
      between kernel and user land, the extended attribute is used to
      prepare the value of dentry->d_name.
      Signed-off-by: NMasatake YAMATO <yamato@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      600e1779
    • D
      net: Add INET dependency on aes crypto for the sake of TCP fastopen. · 798b2cbf
      David S. Miller 提交于
      Stephen Rothwell says:
      
      ====================
      After merging the final tree, today's linux-next build (powerpc
      ppc44x_defconfig) failed like this:
      
      net/built-in.o: In function `tcp_fastopen_ctx_free':
      tcp_fastopen.c:(.text+0x5cc5c): undefined reference to `crypto_destroy_tfm'
      net/built-in.o: In function `tcp_fastopen_reset_cipher':
      (.text+0x5cccc): undefined reference to `crypto_alloc_base'
      net/built-in.o: In function `tcp_fastopen_reset_cipher':
      (.text+0x5cd6c): undefined reference to `crypto_destroy_tfm'
      
      Presumably caused by commit 10467163 ("tcp: TCP Fast Open Server -
      header & support functions") from the net-next tree.  I assume that some
      dependency on the CRYPTO infrastructure is missing.
      
      I have reverted commit 1bed966c ("Merge branch
      'tcp_fastopen_server'") for today.
      ====================
      Reported-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      798b2cbf
    • W
      sctp: use list_move_tail instead of list_del/list_add_tail · 54a27924
      Wei Yongjun 提交于
      Using list_move_tail() instead of list_del() + list_add_tail().
      
      spatch with a semantic match is used to found this problem.
      (http://coccinelle.lip6.fr/)
      Signed-off-by: NWei Yongjun <yongjun_wei@trendmicro.com.cn>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      54a27924
  5. 04 9月, 2012 4 次提交
  6. 03 9月, 2012 7 次提交
    • J
      netfilter: properly annotate ipv4_netfilter_{init,fini}() · ce9f3f31
      Jan Beulich 提交于
      Despite being just a few bytes of code, they should still have proper
      annotations.
      Signed-off-by: NJan Beulich <jbeulich@suse.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      ce9f3f31
    • M
      netfilter: pass 'nf_hook_ops' instead of 'list_head' to nf_queue() · 1c15b677
      Michael Wang 提交于
      Since 'list_for_each_continue_rcu' has already been replaced by
      'list_for_each_entry_continue_rcu', pass 'list_head' to nf_queue() as a
      parameter can not benefit us any more.
      
      This patch will replace 'list_head' with 'nf_hook_ops' as the parameter of
      nf_queue() and __nf_queue() to save code.
      Signed-off-by: NMichael Wang <wangyun@linux.vnet.ibm.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      1c15b677
    • M
      netfilter: pass 'nf_hook_ops' instead of 'list_head' to nf_iterate() · 2a6decfd
      Michael Wang 提交于
      Since 'list_for_each_continue_rcu' has already been replaced by
      'list_for_each_entry_continue_rcu', pass 'list_head' to nf_iterate() as a
      parameter can not benefit us any more.
      
      This patch will replace 'list_head' with 'nf_hook_ops' as the parameter of
      nf_iterate() to save code.
      Signed-off-by: NMichael Wang <wangyun@linux.vnet.ibm.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      2a6decfd
    • C
      netfilter: remove xt_NOTRACK · 96550501
      Cong Wang 提交于
      It was scheduled to be removed for a long time.
      
      Cc: Pablo Neira Ayuso <pablo@netfilter.org>
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: netfilter@vger.kernel.org
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      96550501
    • P
      netfilter: nf_conntrack: add nf_ct_timeout_lookup · 84b5ee93
      Pablo Neira Ayuso 提交于
      This patch adds the new nf_ct_timeout_lookup function to encapsulate
      the timeout policy attachment that is called in the nf_conntrack_in
      path.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      84b5ee93
    • P
      netfilter: xt_CT: refactorize xt_ct_tg_check · 236df005
      Pablo Neira Ayuso 提交于
      This patch adds xt_ct_set_helper and xt_ct_set_timeout to reduce
      the size of xt_ct_tg_check.
      
      This aims to improve code mantainability by splitting xt_ct_tg_check
      in smaller chunks.
      
      Suggested by Eric Dumazet.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      236df005
    • P
      netfilter: xt_socket: fix compilation warnings with gcc 4.7 · 6703aa74
      Pablo Neira Ayuso 提交于
      This patch fixes compilation warnings in xt_socket with gcc-4.7.
      
      In file included from net/netfilter/xt_socket.c:22:0:
      net/netfilter/xt_socket.c: In function ‘socket_mt6_v1’:
      include/net/netfilter/nf_tproxy_core.h:175:23: warning: ‘sport’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:265:16: note: ‘sport’ was declared here
      In file included from net/netfilter/xt_socket.c:22:0:
      include/net/netfilter/nf_tproxy_core.h:175:23: warning: ‘dport’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:265:9: note: ‘dport’ was declared here
      In file included from net/netfilter/xt_socket.c:22:0:
      include/net/netfilter/nf_tproxy_core.h:175:6: warning: ‘saddr’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:264:27: note: ‘saddr’ was declared here
      In file included from net/netfilter/xt_socket.c:22:0:
      include/net/netfilter/nf_tproxy_core.h:175:6: warning: ‘daddr’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:264:19: note: ‘daddr’ was declared here
      In file included from net/netfilter/xt_socket.c:22:0:
      net/netfilter/xt_socket.c: In function ‘socket_match.isra.4’:
      include/net/netfilter/nf_tproxy_core.h:75:2: warning: ‘protocol’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:113:5: note: ‘protocol’ was declared here
      In file included from include/net/tcp.h:37:0,
                       from net/netfilter/xt_socket.c:17:
      include/net/inet_hashtables.h:356:45: warning: ‘sport’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:112:16: note: ‘sport’ was declared here
      In file included from net/netfilter/xt_socket.c:22:0:
      include/net/netfilter/nf_tproxy_core.h:106:23: warning: ‘dport’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:112:9: note: ‘dport’ was declared here
      In file included from include/net/tcp.h:37:0,
                       from net/netfilter/xt_socket.c:17:
      include/net/inet_hashtables.h:356:15: warning: ‘saddr’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:111:16: note: ‘saddr’ was declared here
      In file included from include/net/tcp.h:37:0,
                       from net/netfilter/xt_socket.c:17:
      include/net/inet_hashtables.h:356:15: warning: ‘daddr’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:111:9: note: ‘daddr’ was declared here
      In file included from net/netfilter/xt_socket.c:22:0:
      net/netfilter/xt_socket.c: In function ‘socket_mt6_v1’:
      include/net/netfilter/nf_tproxy_core.h:175:23: warning: ‘sport’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:268:16: note: ‘sport’ was declared here
      In file included from net/netfilter/xt_socket.c:22:0:
      include/net/netfilter/nf_tproxy_core.h:175:23: warning: ‘dport’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:268:9: note: ‘dport’ was declared here
      In file included from net/netfilter/xt_socket.c:22:0:
      include/net/netfilter/nf_tproxy_core.h:175:6: warning: ‘saddr’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:267:27: note: ‘saddr’ was declared here
      In file included from net/netfilter/xt_socket.c:22:0:
      include/net/netfilter/nf_tproxy_core.h:175:6: warning: ‘daddr’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:267:19: note: ‘daddr’ was declared here
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      6703aa74
  7. 02 9月, 2012 2 次提交
  8. 01 9月, 2012 10 次提交
    • J
      tcp: TCP Fast Open Server - main code path · 168a8f58
      Jerry Chu 提交于
      This patch adds the main processing path to complete the TFO server
      patches.
      
      A TFO request (i.e., SYN+data packet with a TFO cookie option) first
      gets processed in tcp_v4_conn_request(). If it passes the various TFO
      checks by tcp_fastopen_check(), a child socket will be created right
      away to be accepted by applications, rather than waiting for the 3WHS
      to finish.
      
      In additon to the use of TFO cookie, a simple max_qlen based scheme
      is put in place to fend off spoofed TFO attack.
      
      When a valid ACK comes back to tcp_rcv_state_process(), it will cause
      the state of the child socket to switch from either TCP_SYN_RECV to
      TCP_ESTABLISHED, or TCP_FIN_WAIT1 to TCP_FIN_WAIT2. At this time
      retransmission will resume for any unack'ed (data, FIN,...) segments.
      Signed-off-by: NH.K. Jerry Chu <hkchu@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      168a8f58
    • J
      tcp: TCP Fast Open Server - support TFO listeners · 8336886f
      Jerry Chu 提交于
      This patch builds on top of the previous patch to add the support
      for TFO listeners. This includes -
      
      1. allocating, properly initializing, and managing the per listener
      fastopen_queue structure when TFO is enabled
      
      2. changes to the inet_csk_accept code to support TFO. E.g., the
      request_sock can no longer be freed upon accept(), not until 3WHS
      finishes
      
      3. allowing a TCP_SYN_RECV socket to properly poll() and sendmsg()
      if it's a TFO socket
      
      4. properly closing a TFO listener, and a TFO socket before 3WHS
      finishes
      
      5. supporting TCP_FASTOPEN socket option
      
      6. modifying tcp_check_req() to use to check a TFO socket as well
      as request_sock
      
      7. supporting TCP's TFO cookie option
      
      8. adding a new SYN-ACK retransmit handler to use the timer directly
      off the TFO socket rather than the listener socket. Note that TFO
      server side will not retransmit anything other than SYN-ACK until
      the 3WHS is completed.
      
      The patch also contains an important function
      "reqsk_fastopen_remove()" to manage the somewhat complex relation
      between a listener, its request_sock, and the corresponding child
      socket. See the comment above the function for the detail.
      Signed-off-by: NH.K. Jerry Chu <hkchu@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8336886f
    • J
      tcp: TCP Fast Open Server - header & support functions · 10467163
      Jerry Chu 提交于
      This patch adds all the necessary data structure and support
      functions to implement TFO server side. It also documents a number
      of flags for the sysctl_tcp_fastopen knob, and adds a few Linux
      extension MIBs.
      
      In addition, it includes the following:
      
      1. a new TCP_FASTOPEN socket option an application must call to
      supply a max backlog allowed in order to enable TFO on its listener.
      
      2. A number of key data structures:
      "fastopen_rsk" in tcp_sock - for a big socket to access its
      request_sock for retransmission and ack processing purpose. It is
      non-NULL iff 3WHS not completed.
      
      "fastopenq" in request_sock_queue - points to a per Fast Open
      listener data structure "fastopen_queue" to keep track of qlen (# of
      outstanding Fast Open requests) and max_qlen, among other things.
      
      "listener" in tcp_request_sock - to point to the original listener
      for book-keeping purpose, i.e., to maintain qlen against max_qlen
      as part of defense against IP spoofing attack.
      
      3. various data structure and functions, many in tcp_fastopen.c, to
      support server side Fast Open cookie operations, including
      /proc/sys/net/ipv4/tcp_fastopen_key to allow manual rekeying.
      Signed-off-by: NH.K. Jerry Chu <hkchu@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      10467163
    • S
      ipv6: remove some deadcode · eb7e0575
      Sorin Dumitru 提交于
      __ipv6_regen_rndid no longer returns anything other than 0
      so there's no point in verifying what it returns
      Signed-off-by: NSorin Dumitru <sdumitru@ixiacom.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      eb7e0575
    • R
      net: fix documentation of skb_needs_linearize(). · d1a53dfd
      Rami Rosen 提交于
      skb_needs_linearize() does not check highmem DMA as it does not call
      illegal_highdma() anymore, so there is no need to mention highmem DMA here.
      
      (Indeed, ~NETIF_F_SG flag, which is checked in skb_needs_linearize(), can
      be set when illegal_highdma() returns true, and we are assured that
      illegal_highdma() is invoked prior to skb_needs_linearize() as
      skb_needs_linearize() is a static method called only once.
      But ~NETIF_F_SG can be set not only there in this same invocation path.
      It can also be set when can_checksum_protocol() returns false).
      
      see commit 02932ce9,
      Convert skb_need_linearize() to use precomputed features.
      Signed-off-by: NRami Rosen <rosenr@marvell.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d1a53dfd
    • A
      ipv4: Minor logic clean-up in ipv4_mtu · 98d75c37
      Alexander Duyck 提交于
      In ipv4_mtu there is some logic where we are testing for a non-zero value
      and a timer expiration, then setting the value to zero, and then testing if
      the value is zero we set it to a value based on the dst.  Instead of
      bothering with the extra steps it is easier to just cleanup the logic so
      that we set it to the dst based value if it is zero or if the timer has
      expired.
      Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com>
      98d75c37
    • W
      net:atm:fix up ENOIOCTLCMD error handling · 4a2c2406
      Wanlong Gao 提交于
      At commit 07d106d0, Linus pointed out that ENOIOCTLCMD should be
      translated as ENOTTY to user mode.
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: netdev@vger.kernel.org
      Signed-off-by: NWanlong Gao <gaowanlong@cn.fujitsu.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4a2c2406
    • W
      openvswitch: using kfree_rcu() to simplify the code · 80f0fd8a
      Wei Yongjun 提交于
      The callback function of call_rcu() just calls a kfree(), so we
      can use kfree_rcu() instead of call_rcu() + callback function.
      
      spatch with a semantic match is used to found this problem.
      (http://coccinelle.lip6.fr/)
      Signed-off-by: NWei Yongjun <yongjun_wei@trendmicro.com.cn>
      Acked-by: NJesse Gross <jesse@nicira.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      80f0fd8a
    • X
      af_unix: fix shutdown parameter checking · fc61b928
      Xi Wang 提交于
      Return -EINVAL rather than 0 given an invalid "mode" parameter.
      Signed-off-by: NXi Wang <xi.wang@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fc61b928
    • X
      decnet: fix shutdown parameter checking · 46b66d70
      Xi Wang 提交于
      The allowed value of "how" is SHUT_RD/SHUT_WR/SHUT_RDWR (0/1/2),
      rather than SHUTDOWN_MASK (3).
      Signed-off-by: NXi Wang <xi.wang@gmail.com>
      Acked-by: NSteven Whitehouse <swhiteho@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      46b66d70
  9. 31 8月, 2012 5 次提交
    • P
      netfilter: nf_conntrack: fix racy timer handling with reliable events · 5b423f6a
      Pablo Neira Ayuso 提交于
      Existing code assumes that del_timer returns true for alive conntrack
      entries. However, this is not true if reliable events are enabled.
      In that case, del_timer may return true for entries that were
      just inserted in the dying list. Note that packets / ctnetlink may
      hold references to conntrack entries that were just inserted to such
      list.
      
      This patch fixes the issue by adding an independent timer for
      event delivery. This increases the size of the ecache extension.
      Still we can revisit this later and use variable size extensions
      to allocate this area on demand.
      Tested-by: NOliver Smith <olipro@8.c.9.b.0.7.4.0.1.0.0.2.ip6.arpa>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      5b423f6a
    • E
      ipv4: must use rcu protection while calling fib_lookup · c5ae7d41
      Eric Dumazet 提交于
      Following lockdep splat was reported by Pavel Roskin :
      
      [ 1570.586223] ===============================
      [ 1570.586225] [ INFO: suspicious RCU usage. ]
      [ 1570.586228] 3.6.0-rc3-wl-main #98 Not tainted
      [ 1570.586229] -------------------------------
      [ 1570.586231] /home/proski/src/linux/net/ipv4/route.c:645 suspicious rcu_dereference_check() usage!
      [ 1570.586233]
      [ 1570.586233] other info that might help us debug this:
      [ 1570.586233]
      [ 1570.586236]
      [ 1570.586236] rcu_scheduler_active = 1, debug_locks = 0
      [ 1570.586238] 2 locks held by Chrome_IOThread/4467:
      [ 1570.586240]  #0:  (slock-AF_INET){+.-...}, at: [<ffffffff814f2c0c>] release_sock+0x2c/0xa0
      [ 1570.586253]  #1:  (fnhe_lock){+.-...}, at: [<ffffffff815302fc>] update_or_create_fnhe+0x2c/0x270
      [ 1570.586260]
      [ 1570.586260] stack backtrace:
      [ 1570.586263] Pid: 4467, comm: Chrome_IOThread Not tainted 3.6.0-rc3-wl-main #98
      [ 1570.586265] Call Trace:
      [ 1570.586271]  [<ffffffff810976ed>] lockdep_rcu_suspicious+0xfd/0x130
      [ 1570.586275]  [<ffffffff8153042c>] update_or_create_fnhe+0x15c/0x270
      [ 1570.586278]  [<ffffffff815305b3>] __ip_rt_update_pmtu+0x73/0xb0
      [ 1570.586282]  [<ffffffff81530619>] ip_rt_update_pmtu+0x29/0x90
      [ 1570.586285]  [<ffffffff815411dc>] inet_csk_update_pmtu+0x2c/0x80
      [ 1570.586290]  [<ffffffff81558d1e>] tcp_v4_mtu_reduced+0x2e/0xc0
      [ 1570.586293]  [<ffffffff81553bc4>] tcp_release_cb+0xa4/0xb0
      [ 1570.586296]  [<ffffffff814f2c35>] release_sock+0x55/0xa0
      [ 1570.586300]  [<ffffffff815442ef>] tcp_sendmsg+0x4af/0xf50
      [ 1570.586305]  [<ffffffff8156fc60>] inet_sendmsg+0x120/0x230
      [ 1570.586308]  [<ffffffff8156fb40>] ? inet_sk_rebuild_header+0x40/0x40
      [ 1570.586312]  [<ffffffff814f4bdd>] ? sock_update_classid+0xbd/0x3b0
      [ 1570.586315]  [<ffffffff814f4c50>] ? sock_update_classid+0x130/0x3b0
      [ 1570.586320]  [<ffffffff814ec435>] do_sock_write+0xc5/0xe0
      [ 1570.586323]  [<ffffffff814ec4a3>] sock_aio_write+0x53/0x80
      [ 1570.586328]  [<ffffffff8114bc83>] do_sync_write+0xa3/0xe0
      [ 1570.586332]  [<ffffffff8114c5a5>] vfs_write+0x165/0x180
      [ 1570.586335]  [<ffffffff8114c805>] sys_write+0x45/0x90
      [ 1570.586340]  [<ffffffff815d2722>] system_call_fastpath+0x16/0x1b
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NPavel Roskin <proski@gnu.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c5ae7d41
    • F
      net: ipv4: ipmr_expire_timer causes crash when removing net namespace · acbb219d
      Francesco Ruggeri 提交于
      When tearing down a net namespace, ipv4 mr_table structures are freed
      without first deactivating their timers. This can result in a crash in
      run_timer_softirq.
      This patch mimics the corresponding behaviour in ipv6.
      Locking and synchronization seem to be adequate.
      We are about to kfree mrt, so existing code should already make sure that
      no other references to mrt are pending or can be created by incoming traffic.
      The functions invoked here do not cause new references to mrt or other
      race conditions to be created.
      Invoking del_timer_sync guarantees that ipmr_expire_timer is inactive.
      Both ipmr_expire_process (whose completion we may have to wait in
      del_timer_sync) and mroute_clean_tables internally use mfc_unres_lock
      or other synchronizations when needed, and they both only modify mrt.
      
      Tested in Linux 3.4.8.
      Signed-off-by: NFrancesco Ruggeri <fruggeri@aristanetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      acbb219d
    • E
      netpoll: provide an IP ident in UDP frames · ee130409
      Eric Dumazet 提交于
      Let's fill IP header ident field with a meaningful value,
      it might help some setups.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ee130409
    • X
      l2tp: avoid to use synchronize_rcu in tunnel free function · 99469c32
      xeb@mail.ru 提交于
      Avoid to use synchronize_rcu in l2tp_tunnel_free because context may be
      atomic.
      Signed-off-by: NDmitry Kozlov <xeb@mail.ru>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      99469c32