1. 08 9月, 2012 2 次提交
  2. 07 9月, 2012 1 次提交
    • E
      tcp: fix TFO regression · 7ab4551f
      Eric Dumazet 提交于
      Fengguang Wu reported various panics and bisected to commit
      8336886f (tcp: TCP Fast Open Server - support TFO listeners)
      
      Fix this by making sure socket is a TCP socket before accessing TFO data
      structures.
      
      [  233.046014] kfree_debugcheck: out of range ptr ea6000000bb8h.
      [  233.047399] ------------[ cut here ]------------
      [  233.048393] kernel BUG at /c/kernel-tests/src/stable/mm/slab.c:3074!
      [  233.048393] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
      [  233.048393] Modules linked in:
      [  233.048393] CPU 0
      [  233.048393] Pid: 3929, comm: trinity-watchdo Not tainted 3.6.0-rc3+
      #4192 Bochs Bochs
      [  233.048393] RIP: 0010:[<ffffffff81169653>]  [<ffffffff81169653>]
      kfree_debugcheck+0x27/0x2d
      [  233.048393] RSP: 0018:ffff88000facbca8  EFLAGS: 00010092
      [  233.048393] RAX: 0000000000000031 RBX: 0000ea6000000bb8 RCX:
      00000000a189a188
      [  233.048393] RDX: 000000000000a189 RSI: ffffffff8108ad32 RDI:
      ffffffff810d30f9
      [  233.048393] RBP: ffff88000facbcb8 R08: 0000000000000002 R09:
      ffffffff843846f0
      [  233.048393] R10: ffffffff810ae37c R11: 0000000000000908 R12:
      0000000000000202
      [  233.048393] R13: ffffffff823dbd5a R14: ffff88000ec5bea8 R15:
      ffffffff8363c780
      [  233.048393] FS:  00007faa6899c700(0000) GS:ffff88001f200000(0000)
      knlGS:0000000000000000
      [  233.048393] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [  233.048393] CR2: 00007faa6841019c CR3: 0000000012c82000 CR4:
      00000000000006f0
      [  233.048393] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
      0000000000000000
      [  233.048393] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
      0000000000000400
      [  233.048393] Process trinity-watchdo (pid: 3929, threadinfo
      ffff88000faca000, task ffff88000faec600)
      [  233.048393] Stack:
      [  233.048393]  0000000000000000 0000ea6000000bb8 ffff88000facbce8
      ffffffff8116ad81
      [  233.048393]  ffff88000ff588a0 ffff88000ff58850 ffff88000ff588a0
      0000000000000000
      [  233.048393]  ffff88000facbd08 ffffffff823dbd5a ffffffff823dbcb0
      ffff88000ff58850
      [  233.048393] Call Trace:
      [  233.048393]  [<ffffffff8116ad81>] kfree+0x5f/0xca
      [  233.048393]  [<ffffffff823dbd5a>] inet_sock_destruct+0xaa/0x13c
      [  233.048393]  [<ffffffff823dbcb0>] ? inet_sk_rebuild_header
      +0x319/0x319
      [  233.048393]  [<ffffffff8231c307>] __sk_free+0x21/0x14b
      [  233.048393]  [<ffffffff8231c4bd>] sk_free+0x26/0x2a
      [  233.048393]  [<ffffffff825372db>] sctp_close+0x215/0x224
      [  233.048393]  [<ffffffff810d6835>] ? lock_release+0x16f/0x1b9
      [  233.048393]  [<ffffffff823daf12>] inet_release+0x7e/0x85
      [  233.048393]  [<ffffffff82317d15>] sock_release+0x1f/0x77
      [  233.048393]  [<ffffffff82317d94>] sock_close+0x27/0x2b
      [  233.048393]  [<ffffffff81173bbe>] __fput+0x101/0x20a
      [  233.048393]  [<ffffffff81173cd5>] ____fput+0xe/0x10
      [  233.048393]  [<ffffffff810a3794>] task_work_run+0x5d/0x75
      [  233.048393]  [<ffffffff8108da70>] do_exit+0x290/0x7f5
      [  233.048393]  [<ffffffff82707415>] ? retint_swapgs+0x13/0x1b
      [  233.048393]  [<ffffffff8108e23f>] do_group_exit+0x7b/0xba
      [  233.048393]  [<ffffffff8108e295>] sys_exit_group+0x17/0x17
      [  233.048393]  [<ffffffff8270de10>] tracesys+0xdd/0xe2
      [  233.048393] Code: 59 01 5d c3 55 48 89 e5 53 41 50 0f 1f 44 00 00 48
      89 fb e8 d4 b0 f0 ff 84 c0 75 11 48 89 de 48 c7 c7 fc fa f7 82 e8 0d 0f
      57 01 <0f> 0b 5f 5b 5d c3 55 48 89 e5 0f 1f 44 00 00 48 63 87 d8 00 00
      [  233.048393] RIP  [<ffffffff81169653>] kfree_debugcheck+0x27/0x2d
      [  233.048393]  RSP <ffff88000facbca8>
      Reported-by: NFengguang Wu <wfg@linux.intel.com>
      Tested-by: NFengguang Wu <wfg@linux.intel.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: "H.K. Jerry Chu" <hkchu@google.com>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Acked-by: NH.K. Jerry Chu <hkchu@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7ab4551f
  3. 06 9月, 2012 1 次提交
    • J
      tcp: add generic netlink support for tcp_metrics · d23ff701
      Julian Anastasov 提交于
      Add support for genl "tcp_metrics". No locking
      is changed, only that now we can unlink and delete
      entries after grace period. We implement get/del for
      single entry and dump to support show/flush filtering
      in user space. Del without address attribute causes
      flush for all addresses, sadly under genl_mutex.
      
      v2:
      - remove rcu_assign_pointer as suggested by Eric Dumazet,
      it is not needed because there are no other writes under lock
      - move the flushing code in tcp_metrics_flush_all
      
      v3:
      - remove synchronize_rcu on flush as suggested by Eric Dumazet
      Signed-off-by: NJulian Anastasov <ja@ssi.bg>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d23ff701
  4. 04 9月, 2012 3 次提交
  5. 03 9月, 2012 1 次提交
  6. 01 9月, 2012 4 次提交
    • J
      tcp: TCP Fast Open Server - main code path · 168a8f58
      Jerry Chu 提交于
      This patch adds the main processing path to complete the TFO server
      patches.
      
      A TFO request (i.e., SYN+data packet with a TFO cookie option) first
      gets processed in tcp_v4_conn_request(). If it passes the various TFO
      checks by tcp_fastopen_check(), a child socket will be created right
      away to be accepted by applications, rather than waiting for the 3WHS
      to finish.
      
      In additon to the use of TFO cookie, a simple max_qlen based scheme
      is put in place to fend off spoofed TFO attack.
      
      When a valid ACK comes back to tcp_rcv_state_process(), it will cause
      the state of the child socket to switch from either TCP_SYN_RECV to
      TCP_ESTABLISHED, or TCP_FIN_WAIT1 to TCP_FIN_WAIT2. At this time
      retransmission will resume for any unack'ed (data, FIN,...) segments.
      Signed-off-by: NH.K. Jerry Chu <hkchu@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      168a8f58
    • J
      tcp: TCP Fast Open Server - support TFO listeners · 8336886f
      Jerry Chu 提交于
      This patch builds on top of the previous patch to add the support
      for TFO listeners. This includes -
      
      1. allocating, properly initializing, and managing the per listener
      fastopen_queue structure when TFO is enabled
      
      2. changes to the inet_csk_accept code to support TFO. E.g., the
      request_sock can no longer be freed upon accept(), not until 3WHS
      finishes
      
      3. allowing a TCP_SYN_RECV socket to properly poll() and sendmsg()
      if it's a TFO socket
      
      4. properly closing a TFO listener, and a TFO socket before 3WHS
      finishes
      
      5. supporting TCP_FASTOPEN socket option
      
      6. modifying tcp_check_req() to use to check a TFO socket as well
      as request_sock
      
      7. supporting TCP's TFO cookie option
      
      8. adding a new SYN-ACK retransmit handler to use the timer directly
      off the TFO socket rather than the listener socket. Note that TFO
      server side will not retransmit anything other than SYN-ACK until
      the 3WHS is completed.
      
      The patch also contains an important function
      "reqsk_fastopen_remove()" to manage the somewhat complex relation
      between a listener, its request_sock, and the corresponding child
      socket. See the comment above the function for the detail.
      Signed-off-by: NH.K. Jerry Chu <hkchu@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8336886f
    • J
      tcp: TCP Fast Open Server - header & support functions · 10467163
      Jerry Chu 提交于
      This patch adds all the necessary data structure and support
      functions to implement TFO server side. It also documents a number
      of flags for the sysctl_tcp_fastopen knob, and adds a few Linux
      extension MIBs.
      
      In addition, it includes the following:
      
      1. a new TCP_FASTOPEN socket option an application must call to
      supply a max backlog allowed in order to enable TFO on its listener.
      
      2. A number of key data structures:
      "fastopen_rsk" in tcp_sock - for a big socket to access its
      request_sock for retransmission and ack processing purpose. It is
      non-NULL iff 3WHS not completed.
      
      "fastopenq" in request_sock_queue - points to a per Fast Open
      listener data structure "fastopen_queue" to keep track of qlen (# of
      outstanding Fast Open requests) and max_qlen, among other things.
      
      "listener" in tcp_request_sock - to point to the original listener
      for book-keeping purpose, i.e., to maintain qlen against max_qlen
      as part of defense against IP spoofing attack.
      
      3. various data structure and functions, many in tcp_fastopen.c, to
      support server side Fast Open cookie operations, including
      /proc/sys/net/ipv4/tcp_fastopen_key to allow manual rekeying.
      Signed-off-by: NH.K. Jerry Chu <hkchu@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      10467163
    • A
      ipv4: Minor logic clean-up in ipv4_mtu · 98d75c37
      Alexander Duyck 提交于
      In ipv4_mtu there is some logic where we are testing for a non-zero value
      and a timer expiration, then setting the value to zero, and then testing if
      the value is zero we set it to a value based on the dst.  Instead of
      bothering with the extra steps it is easier to just cleanup the logic so
      that we set it to the dst based value if it is zero or if the timer has
      expired.
      Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com>
      98d75c37
  7. 31 8月, 2012 3 次提交
    • E
      ipv4: must use rcu protection while calling fib_lookup · c5ae7d41
      Eric Dumazet 提交于
      Following lockdep splat was reported by Pavel Roskin :
      
      [ 1570.586223] ===============================
      [ 1570.586225] [ INFO: suspicious RCU usage. ]
      [ 1570.586228] 3.6.0-rc3-wl-main #98 Not tainted
      [ 1570.586229] -------------------------------
      [ 1570.586231] /home/proski/src/linux/net/ipv4/route.c:645 suspicious rcu_dereference_check() usage!
      [ 1570.586233]
      [ 1570.586233] other info that might help us debug this:
      [ 1570.586233]
      [ 1570.586236]
      [ 1570.586236] rcu_scheduler_active = 1, debug_locks = 0
      [ 1570.586238] 2 locks held by Chrome_IOThread/4467:
      [ 1570.586240]  #0:  (slock-AF_INET){+.-...}, at: [<ffffffff814f2c0c>] release_sock+0x2c/0xa0
      [ 1570.586253]  #1:  (fnhe_lock){+.-...}, at: [<ffffffff815302fc>] update_or_create_fnhe+0x2c/0x270
      [ 1570.586260]
      [ 1570.586260] stack backtrace:
      [ 1570.586263] Pid: 4467, comm: Chrome_IOThread Not tainted 3.6.0-rc3-wl-main #98
      [ 1570.586265] Call Trace:
      [ 1570.586271]  [<ffffffff810976ed>] lockdep_rcu_suspicious+0xfd/0x130
      [ 1570.586275]  [<ffffffff8153042c>] update_or_create_fnhe+0x15c/0x270
      [ 1570.586278]  [<ffffffff815305b3>] __ip_rt_update_pmtu+0x73/0xb0
      [ 1570.586282]  [<ffffffff81530619>] ip_rt_update_pmtu+0x29/0x90
      [ 1570.586285]  [<ffffffff815411dc>] inet_csk_update_pmtu+0x2c/0x80
      [ 1570.586290]  [<ffffffff81558d1e>] tcp_v4_mtu_reduced+0x2e/0xc0
      [ 1570.586293]  [<ffffffff81553bc4>] tcp_release_cb+0xa4/0xb0
      [ 1570.586296]  [<ffffffff814f2c35>] release_sock+0x55/0xa0
      [ 1570.586300]  [<ffffffff815442ef>] tcp_sendmsg+0x4af/0xf50
      [ 1570.586305]  [<ffffffff8156fc60>] inet_sendmsg+0x120/0x230
      [ 1570.586308]  [<ffffffff8156fb40>] ? inet_sk_rebuild_header+0x40/0x40
      [ 1570.586312]  [<ffffffff814f4bdd>] ? sock_update_classid+0xbd/0x3b0
      [ 1570.586315]  [<ffffffff814f4c50>] ? sock_update_classid+0x130/0x3b0
      [ 1570.586320]  [<ffffffff814ec435>] do_sock_write+0xc5/0xe0
      [ 1570.586323]  [<ffffffff814ec4a3>] sock_aio_write+0x53/0x80
      [ 1570.586328]  [<ffffffff8114bc83>] do_sync_write+0xa3/0xe0
      [ 1570.586332]  [<ffffffff8114c5a5>] vfs_write+0x165/0x180
      [ 1570.586335]  [<ffffffff8114c805>] sys_write+0x45/0x90
      [ 1570.586340]  [<ffffffff815d2722>] system_call_fastpath+0x16/0x1b
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NPavel Roskin <proski@gnu.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c5ae7d41
    • F
      net: ipv4: ipmr_expire_timer causes crash when removing net namespace · acbb219d
      Francesco Ruggeri 提交于
      When tearing down a net namespace, ipv4 mr_table structures are freed
      without first deactivating their timers. This can result in a crash in
      run_timer_softirq.
      This patch mimics the corresponding behaviour in ipv6.
      Locking and synchronization seem to be adequate.
      We are about to kfree mrt, so existing code should already make sure that
      no other references to mrt are pending or can be created by incoming traffic.
      The functions invoked here do not cause new references to mrt or other
      race conditions to be created.
      Invoking del_timer_sync guarantees that ipmr_expire_timer is inactive.
      Both ipmr_expire_process (whose completion we may have to wait in
      del_timer_sync) and mroute_clean_tables internally use mfc_unres_lock
      or other synchronizations when needed, and they both only modify mrt.
      
      Tested in Linux 3.4.8.
      Signed-off-by: NFrancesco Ruggeri <fruggeri@aristanetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      acbb219d
    • P
      netfilter: nf_nat_sip: fix incorrect handling of EBUSY for RTCP expectation · 3f509c68
      Pablo Neira Ayuso 提交于
      We're hitting bug while trying to reinsert an already existing
      expectation:
      
      kernel BUG at kernel/timer.c:895!
      invalid opcode: 0000 [#1] SMP
      [...]
      Call Trace:
       <IRQ>
       [<ffffffffa0069563>] nf_ct_expect_related_report+0x4a0/0x57a [nf_conntrack]
       [<ffffffff812d423a>] ? in4_pton+0x72/0x131
       [<ffffffffa00ca69e>] ip_nat_sdp_media+0xeb/0x185 [nf_nat_sip]
       [<ffffffffa00b5b9b>] set_expected_rtp_rtcp+0x32d/0x39b [nf_conntrack_sip]
       [<ffffffffa00b5f15>] process_sdp+0x30c/0x3ec [nf_conntrack_sip]
       [<ffffffff8103f1eb>] ? irq_exit+0x9a/0x9c
       [<ffffffffa00ca738>] ? ip_nat_sdp_media+0x185/0x185 [nf_nat_sip]
      
      We have to remove the RTP expectation if the RTCP expectation hits EBUSY
      since we keep trying with other ports until we succeed.
      Reported-by: NRafal Fitt <rafalf@aplusc.com.pl>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      3f509c68
  8. 30 8月, 2012 8 次提交
  9. 27 8月, 2012 1 次提交
  10. 25 8月, 2012 1 次提交
  11. 24 8月, 2012 2 次提交
  12. 23 8月, 2012 3 次提交
    • E
      net: remove delay at device dismantle · 0115e8e3
      Eric Dumazet 提交于
      I noticed extra one second delay in device dismantle, tracked down to
      a call to dst_dev_event() while some call_rcu() are still in RCU queues.
      
      These call_rcu() were posted by rt_free(struct rtable *rt) calls.
      
      We then wait a little (but one second) in netdev_wait_allrefs() before
      kicking again NETDEV_UNREGISTER.
      
      As the call_rcu() are now completed, dst_dev_event() can do the needed
      device swap on busy dst.
      
      To solve this problem, add a new NETDEV_UNREGISTER_FINAL, called
      after a rcu_barrier(), but outside of RTNL lock.
      
      Use NETDEV_UNREGISTER_FINAL with care !
      
      Change dst_dev_event() handler to react to NETDEV_UNREGISTER_FINAL
      
      Also remove NETDEV_UNREGISTER_BATCH, as its not used anymore after
      IP cache removal.
      
      With help from Gao feng
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Mahesh Bandewar <maheshb@google.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Gao feng <gaofeng@cn.fujitsu.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0115e8e3
    • E
      ipv4: properly update pmtu · 9b04f350
      Eric Dumazet 提交于
      Sylvain Munault reported following info :
      
       - TCP connection get "stuck" with data in send queue when doing
         "large" transfers ( like typing 'ps ax' on a ssh connection )
       - Only happens on path where the PMTU is lower than the MTU of
         the interface
       - Is not present right after boot, it only appears 10-20min after
         boot or so. (and that's inside the _same_ TCP connection, it works
         fine at first and then in the same ssh session, it'll get stuck)
       - Definitely seems related to fragments somehow since I see a router
         sending ICMP message saying fragmentation is needed.
       - Exact same setup works fine with kernel 3.5.1
      
      Problem happens when the 10 minutes (ip_rt_mtu_expires) expiration
      period is over.
      
      ip_rt_update_pmtu() calls dst_set_expires() to rearm a new expiration,
      but dst_set_expires() does nothing because dst.expires is already set.
      
      It seems we want to set the expires field to a new value, regardless
      of prior one.
      
      With help from Julian Anastasov.
      Reported-by: NSylvain Munaut <s.munaut@whatever-company.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      CC: Julian Anastasov <ja@ssi.bg>
      Tested-by: NSylvain Munaut <s.munaut@whatever-company.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9b04f350
    • J
      netfilter: remove unnecessary goto statement for error recovery · 90efbed1
      Jean Sacren 提交于
      Usually it's a good practice to use goto statement for error recovery
      when initializing the module. This approach could be an overkill if:
      
       1) there is only one fail case;
       2) success and failure use the same return statement.
      
      For a cleaner approach, remove the unnecessary goto statement and
      directly implement error recovery.
      Signed-off-by: NJean Sacren <sakiwit@gmail.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      90efbed1
  13. 22 8月, 2012 3 次提交
    • E
      ipv4: fix ip header ident selection in __ip_make_skb() · a9915a1b
      Eric Dumazet 提交于
      Christian Casteyde reported a kmemcheck 32-bit read from uninitialized
      memory in __ip_select_ident().
      
      It turns out that __ip_make_skb() called ip_select_ident() before
      properly initializing iph->daddr.
      
      This is a bug uncovered by commit 1d861aa4 (inet: Minimize use of
      cached route inetpeer.)
      
      Addresses https://bugzilla.kernel.org/show_bug.cgi?id=46131Reported-by: NChristian Casteyde <casteyde.christian@free.fr>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a9915a1b
    • C
      ipv4: Use newinet->inet_opt in inet_csk_route_child_sock() · 1a7b27c9
      Christoph Paasch 提交于
      Since 0e734419 ("ipv4: Use inet_csk_route_child_sock() in DCCP and
      TCP."), inet_csk_route_child_sock() is called instead of
      inet_csk_route_req().
      
      However, after creating the child-sock in tcp/dccp_v4_syn_recv_sock(),
      ireq->opt is set to NULL, before calling inet_csk_route_child_sock().
      Thus, inside inet_csk_route_child_sock() opt is always NULL and the
      SRR-options are not respected anymore.
      Packets sent by the server won't have the correct destination-IP.
      
      This patch fixes it by accessing newinet->inet_opt instead of ireq->opt
      inside inet_csk_route_child_sock().
      Reported-by: NLuca Boccassi <luca.boccassi@gmail.com>
      Signed-off-by: NChristoph Paasch <christoph.paasch@uclouvain.be>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1a7b27c9
    • E
      tcp: fix possible socket refcount problem · 144d56e9
      Eric Dumazet 提交于
      Commit 6f458dfb (tcp: improve latencies of timer triggered events)
      added bug leading to following trace :
      
      [ 2866.131281] IPv4: Attempt to release TCP socket in state 1 ffff880019ec0000
      [ 2866.131726]
      [ 2866.132188] =========================
      [ 2866.132281] [ BUG: held lock freed! ]
      [ 2866.132281] 3.6.0-rc1+ #622 Not tainted
      [ 2866.132281] -------------------------
      [ 2866.132281] kworker/0:1/652 is freeing memory ffff880019ec0000-ffff880019ec0a1f, with a lock still held there!
      [ 2866.132281]  (sk_lock-AF_INET-RPC){+.+...}, at: [<ffffffff81903619>] tcp_sendmsg+0x29/0xcc6
      [ 2866.132281] 4 locks held by kworker/0:1/652:
      [ 2866.132281]  #0:  (rpciod){.+.+.+}, at: [<ffffffff81083567>] process_one_work+0x1de/0x47f
      [ 2866.132281]  #1:  ((&task->u.tk_work)){+.+.+.}, at: [<ffffffff81083567>] process_one_work+0x1de/0x47f
      [ 2866.132281]  #2:  (sk_lock-AF_INET-RPC){+.+...}, at: [<ffffffff81903619>] tcp_sendmsg+0x29/0xcc6
      [ 2866.132281]  #3:  (&icsk->icsk_retransmit_timer){+.-...}, at: [<ffffffff81078017>] run_timer_softirq+0x1ad/0x35f
      [ 2866.132281]
      [ 2866.132281] stack backtrace:
      [ 2866.132281] Pid: 652, comm: kworker/0:1 Not tainted 3.6.0-rc1+ #622
      [ 2866.132281] Call Trace:
      [ 2866.132281]  <IRQ>  [<ffffffff810bc527>] debug_check_no_locks_freed+0x112/0x159
      [ 2866.132281]  [<ffffffff818a0839>] ? __sk_free+0xfd/0x114
      [ 2866.132281]  [<ffffffff811549fa>] kmem_cache_free+0x6b/0x13a
      [ 2866.132281]  [<ffffffff818a0839>] __sk_free+0xfd/0x114
      [ 2866.132281]  [<ffffffff818a08c0>] sk_free+0x1c/0x1e
      [ 2866.132281]  [<ffffffff81911e1c>] tcp_write_timer+0x51/0x56
      [ 2866.132281]  [<ffffffff81078082>] run_timer_softirq+0x218/0x35f
      [ 2866.132281]  [<ffffffff81078017>] ? run_timer_softirq+0x1ad/0x35f
      [ 2866.132281]  [<ffffffff810f5831>] ? rb_commit+0x58/0x85
      [ 2866.132281]  [<ffffffff81911dcb>] ? tcp_write_timer_handler+0x148/0x148
      [ 2866.132281]  [<ffffffff81070bd6>] __do_softirq+0xcb/0x1f9
      [ 2866.132281]  [<ffffffff81a0a00c>] ? _raw_spin_unlock+0x29/0x2e
      [ 2866.132281]  [<ffffffff81a1227c>] call_softirq+0x1c/0x30
      [ 2866.132281]  [<ffffffff81039f38>] do_softirq+0x4a/0xa6
      [ 2866.132281]  [<ffffffff81070f2b>] irq_exit+0x51/0xad
      [ 2866.132281]  [<ffffffff81a129cd>] do_IRQ+0x9d/0xb4
      [ 2866.132281]  [<ffffffff81a0a3ef>] common_interrupt+0x6f/0x6f
      [ 2866.132281]  <EOI>  [<ffffffff8109d006>] ? sched_clock_cpu+0x58/0xd1
      [ 2866.132281]  [<ffffffff81a0a172>] ? _raw_spin_unlock_irqrestore+0x4c/0x56
      [ 2866.132281]  [<ffffffff81078692>] mod_timer+0x178/0x1a9
      [ 2866.132281]  [<ffffffff818a00aa>] sk_reset_timer+0x19/0x26
      [ 2866.132281]  [<ffffffff8190b2cc>] tcp_rearm_rto+0x99/0xa4
      [ 2866.132281]  [<ffffffff8190dfba>] tcp_event_new_data_sent+0x6e/0x70
      [ 2866.132281]  [<ffffffff8190f7ea>] tcp_write_xmit+0x7de/0x8e4
      [ 2866.132281]  [<ffffffff818a565d>] ? __alloc_skb+0xa0/0x1a1
      [ 2866.132281]  [<ffffffff8190f952>] __tcp_push_pending_frames+0x2e/0x8a
      [ 2866.132281]  [<ffffffff81904122>] tcp_sendmsg+0xb32/0xcc6
      [ 2866.132281]  [<ffffffff819229c2>] inet_sendmsg+0xaa/0xd5
      [ 2866.132281]  [<ffffffff81922918>] ? inet_autobind+0x5f/0x5f
      [ 2866.132281]  [<ffffffff810ee7f1>] ? trace_clock_local+0x9/0xb
      [ 2866.132281]  [<ffffffff8189adab>] sock_sendmsg+0xa3/0xc4
      [ 2866.132281]  [<ffffffff810f5de6>] ? rb_reserve_next_event+0x26f/0x2d5
      [ 2866.132281]  [<ffffffff8103e6a9>] ? native_sched_clock+0x29/0x6f
      [ 2866.132281]  [<ffffffff8103e6f8>] ? sched_clock+0x9/0xd
      [ 2866.132281]  [<ffffffff810ee7f1>] ? trace_clock_local+0x9/0xb
      [ 2866.132281]  [<ffffffff8189ae03>] kernel_sendmsg+0x37/0x43
      [ 2866.132281]  [<ffffffff8199ce49>] xs_send_kvec+0x77/0x80
      [ 2866.132281]  [<ffffffff8199cec1>] xs_sendpages+0x6f/0x1a0
      [ 2866.132281]  [<ffffffff8107826d>] ? try_to_del_timer_sync+0x55/0x61
      [ 2866.132281]  [<ffffffff8199d0d2>] xs_tcp_send_request+0x55/0xf1
      [ 2866.132281]  [<ffffffff8199bb90>] xprt_transmit+0x89/0x1db
      [ 2866.132281]  [<ffffffff81999bcd>] ? call_connect+0x3c/0x3c
      [ 2866.132281]  [<ffffffff81999d92>] call_transmit+0x1c5/0x20e
      [ 2866.132281]  [<ffffffff819a0d55>] __rpc_execute+0x6f/0x225
      [ 2866.132281]  [<ffffffff81999bcd>] ? call_connect+0x3c/0x3c
      [ 2866.132281]  [<ffffffff819a0f33>] rpc_async_schedule+0x28/0x34
      [ 2866.132281]  [<ffffffff810835d6>] process_one_work+0x24d/0x47f
      [ 2866.132281]  [<ffffffff81083567>] ? process_one_work+0x1de/0x47f
      [ 2866.132281]  [<ffffffff819a0f0b>] ? __rpc_execute+0x225/0x225
      [ 2866.132281]  [<ffffffff81083a6d>] worker_thread+0x236/0x317
      [ 2866.132281]  [<ffffffff81083837>] ? process_scheduled_works+0x2f/0x2f
      [ 2866.132281]  [<ffffffff8108b7b8>] kthread+0x9a/0xa2
      [ 2866.132281]  [<ffffffff81a12184>] kernel_thread_helper+0x4/0x10
      [ 2866.132281]  [<ffffffff81a0a4b0>] ? retint_restore_args+0x13/0x13
      [ 2866.132281]  [<ffffffff8108b71e>] ? __init_kthread_worker+0x5a/0x5a
      [ 2866.132281]  [<ffffffff81a12180>] ? gs_change+0x13/0x13
      [ 2866.308506] IPv4: Attempt to release TCP socket in state 1 ffff880019ec0000
      [ 2866.309689] =============================================================================
      [ 2866.310254] BUG TCP (Not tainted): Object already free
      [ 2866.310254] -----------------------------------------------------------------------------
      [ 2866.310254]
      
      The bug comes from the fact that timer set in sk_reset_timer() can run
      before we actually do the sock_hold(). socket refcount reaches zero and
      we free the socket too soon.
      
      timer handler is not allowed to reduce socket refcnt if socket is owned
      by the user, or we need to change sk_reset_timer() implementation.
      
      We should take a reference on the socket in case TCP_DELACK_TIMER_DEFERRED
      or TCP_DELACK_TIMER_DEFERRED bit are set in tsq_flags
      
      Also fix a typo in tcp_delack_timer(), where TCP_WRITE_TIMER_DEFERRED
      was used instead of TCP_DELACK_TIMER_DEFERRED.
      
      For consistency, use same socket refcount change for TCP_MTU_REDUCED_DEFERRED,
      even if not fired from a timer.
      Reported-by: NFengguang Wu <fengguang.wu@intel.com>
      Tested-by: NFengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      144d56e9
  14. 20 8月, 2012 1 次提交
  15. 16 8月, 2012 1 次提交
  16. 15 8月, 2012 5 次提交
新手
引导
客服 返回
顶部