1. 19 Jun 2016, 1 commit
  2. 08 Jun 2016, 1 commit
    • RDS: TCP: fix race windows in send-path quiescence by rds_tcp_accept_one() · 9c79440e
      Committed by Sowmini Varadhan
      The send path needs to be quiesced before resetting callbacks from
      rds_tcp_accept_one(), and commit eb192840 ("RDS:TCP: Synchronize
      rds_tcp_accept_one with rds_send_xmit when resetting t_sock") achieves
      this using the c_state and RDS_IN_XMIT bit following the pattern
      used by rds_conn_shutdown(). However this leaves the possibility
      of a race window as shown in the sequence below
          take t_conn_lock in rds_tcp_conn_connect
          send outgoing syn to peer
          drop t_conn_lock in rds_tcp_conn_connect
          incoming from peer triggers rds_tcp_accept_one, conn is
      	marked CONNECTING
          wait for RDS_IN_XMIT to quiesce any rds_send_xmit threads
          call rds_tcp_reset_callbacks
          [.. race-window where incoming syn-ack can cause the conn
      	to be marked UP from rds_tcp_state_change ..]
          lock_sock called from rds_tcp_reset_callbacks, and we set
      	t_sock to null
      As soon as the conn is marked UP in the race-window above, rds_send_xmit()
      threads will proceed to rds_tcp_xmit and may encounter a null-pointer
      deref on the t_sock.
      
      Given that rds_tcp_state_change() is invoked in softirq context, whereas
      rds_tcp_reset_callbacks() is in workq context, and testing for RDS_IN_XMIT
      after lock_sock could result in a deadlock with tcp_sendmsg, this
      commit fixes the race by using a new c_state, RDS_TCP_RESETTING, which
      will prevent a transition to RDS_CONN_UP from rds_tcp_state_change().
      Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9c79440e
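
      To picture the fix, here is a minimal userspace sketch of the pattern described above; conn_state_change(), conn_reset_callbacks() and CONN_RESETTING are simplified stand-ins for the RDS symbols, not the actual kernel code:

          #include <stdio.h>

          enum conn_state { CONN_DOWN, CONN_CONNECTING, CONN_RESETTING, CONN_UP };

          struct conn {
                  enum conn_state state;
                  void *t_sock;
          };

          /* Socket state-change callback (softirq context in the real driver). */
          static void conn_state_change(struct conn *c, int syn_acked)
          {
                  if (!syn_acked)
                          return;
                  /* The race window closed by the patch: while callbacks are being
                   * reset, refuse to mark the connection UP, otherwise a sender
                   * would dereference a t_sock that is about to become NULL. */
                  if (c->state == CONN_RESETTING) {
                          printf("syn-ack during reset: ignored, stays quiesced\n");
                          return;
                  }
                  c->state = CONN_UP;
          }

          /* Callback-reset path (workqueue context in the real driver). */
          static void conn_reset_callbacks(struct conn *c)
          {
                  c->state = CONN_RESETTING;  /* blocks the UP transition above */
                  c->t_sock = NULL;           /* now safe: no sender can be in xmit */
          }

          int main(void)
          {
                  struct conn c = { .state = CONN_CONNECTING, .t_sock = &c };

                  conn_reset_callbacks(&c);
                  conn_state_change(&c, 1);   /* racing syn-ack no longer marks UP */
                  printf("state=%d t_sock=%p\n", (int)c.state, c.t_sock);
                  return 0;
          }

      The point is only the ordering: the resetting state is set before t_sock is cleared, so a late syn-ack can no longer promote the connection to UP.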
  3. 20 May 2016, 1 commit
  4. 04 May 2016, 1 commit
    • RDS: TCP: Synchronize accept() and connect() paths on t_conn_lock. · bd7c5f98
      Committed by Sowmini Varadhan
      An arbitration scheme for duelling SYNs is implemented as part of
      commit 241b2719 ("RDS-TCP: Reset tcp callbacks if re-using an
      outgoing socket in rds_tcp_accept_one()") which ensures that both nodes
      involved will arrive at the same arbitration decision. However, this
      needs to be synchronized with the outgoing SYN generated by
      rds_tcp_conn_connect(). This commit achieves the synchronization
      through the t_conn_lock mutex in struct rds_tcp_connection.
      
      The rds_conn_state is checked in rds_tcp_conn_connect() after acquiring
      the t_conn_lock mutex.  A SYN is sent out only if the RDS connection is
      not already UP (UP would indicate that rds_tcp_accept_one() has already
      completed the 3-way handshake, so no SYN needs to be generated).
      
      Similarly, the rds_conn_state is checked in rds_tcp_accept_one() after
      acquiring the t_conn_lock mutex. The only acceptable states (to
      allow continuation of the arbitration logic) are UP (i.e., outgoing SYN
      was SYN-ACKed by peer after it sent us the SYN) or CONNECTING (we sent
      outgoing SYN before we saw incoming SYN).
      Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bd7c5f98
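
      A minimal userspace sketch of the locking scheme described above; conn_connect_path() and conn_accept_path() stand in for rds_tcp_conn_connect() and rds_tcp_accept_one(), and the pthread mutex models t_conn_lock (an illustration, not the kernel code):

          #include <pthread.h>
          #include <stdio.h>

          enum conn_state { CONN_DOWN, CONN_CONNECTING, CONN_UP };

          struct conn {
                  pthread_mutex_t t_conn_lock;
                  enum conn_state state;
          };

          /* Active side: send a SYN only if accept() has not already brought us UP. */
          static void conn_connect_path(struct conn *c)
          {
                  pthread_mutex_lock(&c->t_conn_lock);
                  if (c->state != CONN_UP) {
                          c->state = CONN_CONNECTING;
                          printf("connect path: sending outgoing SYN\n");
                  } else {
                          printf("connect path: already UP, no SYN needed\n");
                  }
                  pthread_mutex_unlock(&c->t_conn_lock);
          }

          /* Passive side: only UP or CONNECTING may continue the duelling-SYN arbitration. */
          static void conn_accept_path(struct conn *c)
          {
                  pthread_mutex_lock(&c->t_conn_lock);
                  if (c->state == CONN_UP || c->state == CONN_CONNECTING)
                          printf("accept path: run arbitration for duelling SYNs\n");
                  else
                          printf("accept path: unexpected state %d, drop\n", (int)c->state);
                  pthread_mutex_unlock(&c->t_conn_lock);
          }

          int main(void)
          {
                  struct conn c = { .state = CONN_DOWN };

                  pthread_mutex_init(&c.t_conn_lock, NULL);
                  conn_connect_path(&c);  /* outgoing SYN, state -> CONNECTING */
                  conn_accept_path(&c);   /* incoming SYN handled under the same lock */
                  pthread_mutex_destroy(&c.t_conn_lock);
                  return 0;
          }

      Because both paths check the connection state under the same lock, the two nodes cannot each conclude that the other's SYN should win.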
  5. 08 Aug 2015, 2 commits
  6. 10 May 2015, 1 commit
    • net/rds: RDS-TCP: Always create a new rds_sock for an incoming connection. · f711a6ae
      Committed by Sowmini Varadhan
      When running RDS over TCP, the active (client) side connects to the
      listening ("passive") side at the RDS_TCP_PORT.  After the connection
      is established, if the client side reboots (potentially without even
      sending a FIN) the server still has a TCP socket in the established
      state.  If the server side now receives a new SYN from the client
      with a different client port, TCP will create a new socket-pair, but
      the RDS layer will incorrectly pull up the old rds_connection (which
      is still associated with the stale t_sock and RDS socket state).
      
      This patch corrects this behavior by having rds_tcp_accept_one()
      always create a new connection for an incoming TCP SYN.
      The rds and tcp state associated with the old socket-pair is cleaned
      up via the rds_tcp_state_change() callback which would typically be
      invoked in most cases when the client-TCP sends a FIN on TCP restart,
      triggering a transition to CLOSE_WAIT state. In the rarer event of client
      death without a FIN, TCP_KEEPALIVE probes on the socket will detect
      the stale socket, and the TCP transition to CLOSE state will trigger
      the RDS state cleanup.
      Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f711a6ae
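
      A small userspace sketch of the cleanup path described above, with tcp_state_change() and conn_teardown() as illustrative stand-ins for rds_tcp_state_change() and the RDS teardown it triggers (not the actual kernel implementation):

          #include <stdio.h>

          enum tcp_state { TCP_ESTABLISHED, TCP_CLOSE_WAIT, TCP_CLOSE };

          struct stale_conn {
                  enum tcp_state sk_state;
                  int has_rds_state;      /* old rds_connection/t_sock still attached */
          };

          static void conn_teardown(struct stale_conn *c)
          {
                  printf("dropping stale RDS/TCP state for the old socket pair\n");
                  c->has_rds_state = 0;
          }

          /* TCP state-change callback: the stale state is cleaned up here,
           * not in accept(), which simply builds a fresh connection. */
          static void tcp_state_change(struct stale_conn *c, enum tcp_state new_state)
          {
                  c->sk_state = new_state;
                  switch (new_state) {
                  case TCP_CLOSE_WAIT:    /* rebooted client came back and sent a FIN */
                  case TCP_CLOSE:         /* or keepalive probes declared the peer dead */
                          conn_teardown(c);
                          break;
                  default:
                          break;
                  }
          }

          int main(void)
          {
                  struct stale_conn c = { TCP_ESTABLISHED, 1 };

                  tcp_state_change(&c, TCP_CLOSE_WAIT);
                  return 0;
          }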
  7. 04 Oct 2014, 1 commit
    • net/rds: do proper house keeping if connection fails in rds_tcp_conn_connect · eb74cc97
      Committed by Herton R. Krzesinski
      I see two problems if the sock->ops->connect attempt fails in
      rds_tcp_conn_connect. The first issue is that we don't remove the
      rds_tcp_connection item that rds_tcp_set_callbacks previously added to
      rds_tcp_tc_list, which means that on the next reconnect attempt for the
      same rds_connection, when rds_tcp_conn_connect is called we can again call
      rds_tcp_set_callbacks, resulting in duplicated items on rds_tcp_tc_list
      and leading to list corruption; to avoid this, just make sure we properly
      call rds_tcp_restore_callbacks before we exit. The second issue
      is that we should also release the sock properly, by setting sock = NULL
      only if we are returning without error.
      Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      eb74cc97
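
      A minimal userspace sketch of the corrected error path; conn_connect(), set_callbacks(), restore_callbacks() and release_sockobj() are hypothetical stand-ins for the kernel helpers, not the real rds_tcp_conn_connect():

          #include <stdio.h>
          #include <stdlib.h>

          struct sockobj { int fd; };

          static void set_callbacks(struct sockobj *s)     { (void)s; printf("callbacks set, conn added to tc_list\n"); }
          static void restore_callbacks(struct sockobj *s) { (void)s; printf("callbacks restored, conn removed from tc_list\n"); }
          static void release_sockobj(struct sockobj *s)   { printf("socket released\n"); free(s); }
          static int  try_connect(struct sockobj *s)       { (void)s; return -1; /* simulate connect() failure */ }

          static int conn_connect(void)
          {
                  struct sockobj *sock = calloc(1, sizeof(*sock));
                  int ret;

                  set_callbacks(sock);
                  ret = try_connect(sock);
                  if (ret == 0) {
                          /* Success: ownership handed to the connection; the
                           * common exit below must not release it. */
                          sock = NULL;
                  } else {
                          /* Failure: undo the callback setup so a later
                           * reconnect cannot add a duplicate tc_list entry. */
                          restore_callbacks(sock);
                  }

                  /* Because sock stays non-NULL on error, the common exit
                   * path releases it instead of leaking it. */
                  if (sock)
                          release_sockobj(sock);
                  return ret;
          }

          int main(void)
          {
                  return conn_connect() ? 1 : 0;
          }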
  8. 23 Aug 2012, 1 commit
  9. 25 Sep 2010, 1 commit
    • net: fix a lockdep splat · f064af1e
      Committed by Eric Dumazet
      We have for each socket :
      
      One spinlock (sk_slock.slock)
      One rwlock (sk_callback_lock)
      
      Possible scenarios are :
      
      (A) (this is used in net/sunrpc/xprtsock.c)
      read_lock(&sk->sk_callback_lock) (without blocking BH)
      <BH>
      spin_lock(&sk->sk_slock.slock);
      ...
      read_lock(&sk->sk_callback_lock);
      ...
      
      (B)
      write_lock_bh(&sk->sk_callback_lock)
      stuff
      write_unlock_bh(&sk->sk_callback_lock)
      
      (C)
      spin_lock_bh(&sk->sk_slock)
      ...
      write_lock_bh(&sk->sk_callback_lock)
      stuff
      write_unlock_bh(&sk->sk_callback_lock)
      spin_unlock_bh(&sk->sk_slock)
      
      This (C) case conflicts with (A) :
      
      CPU1 [A]                         CPU2 [C]
      read_lock(callback_lock)
      <BH>                             spin_lock_bh(slock)
      <wait to spin_lock(slock)>
                                       <wait to write_lock_bh(callback_lock)>
      
      We have one problematic (C) use case in inet_csk_listen_stop() :
      
      local_bh_disable();
      bh_lock_sock(child); // spin_lock_bh(&sk->sk_slock)
      WARN_ON(sock_owned_by_user(child));
      ...
      sock_orphan(child); // write_lock_bh(&sk->sk_callback_lock)
      
      lockdep is not happy with this, as reported by Tetsuo Handa.
      
      It seems the only way to deal with this is to use read_lock_bh(callbacklock)
      everywhere.
      
      Thanks to Jarek for pointing out a bug in my first attempt and suggesting
      this solution.
      Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      CC: Jarek Poplawski <jarkao2@gmail.com>
      Tested-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f064af1e
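
      A single-threaded userspace model of the resulting reader pattern; read_lock_bh_model() is an illustrative stand-in that pairs "disable BH" with taking the reader lock, the way read_lock_bh() does in the actual fix:

          #include <pthread.h>
          #include <stdbool.h>
          #include <stdio.h>

          static pthread_rwlock_t callback_lock = PTHREAD_RWLOCK_INITIALIZER;
          static bool bh_disabled;

          /* Fixed reader pattern: take the callback lock only with "BH" disabled,
           * so a softirq that grabs the socket lock can never interrupt us while
           * we hold the read lock (the interleaving that deadlocked against
           * scenario (C) above). */
          static void read_lock_bh_model(void)
          {
                  bh_disabled = true;                     /* local_bh_disable() */
                  pthread_rwlock_rdlock(&callback_lock);
          }

          static void read_unlock_bh_model(void)
          {
                  pthread_rwlock_unlock(&callback_lock);
                  bh_disabled = false;                    /* local_bh_enable() */
          }

          static void sk_callback_reader(void)
          {
                  read_lock_bh_model();
                  printf("reading sk callbacks, BH disabled: %s\n",
                         bh_disabled ? "yes" : "no");
                  read_unlock_bh_model();
          }

          int main(void)
          {
                  sk_callback_reader();
                  return 0;
          }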
  10. 09 Sep 2010, 1 commit
  11. 18 May 2010, 1 commit
  12. 04 Feb 2010, 1 commit
  13. 24 Aug 2009, 1 commit