1. 08 Mar, 2017 1 commit
    • rds: tcp: Sequence teardown of listen and acceptor sockets to avoid races · b21dd450
      Sowmini Varadhan committed
      Commit a93d01f5 ("RDS: TCP: avoid bad page reference in
      rds_tcp_listen_data_ready") added the function
      rds_tcp_listen_sock_def_readable() to handle the case when a
      partially set-up acceptor socket drops into rds_tcp_listen_data_ready().
      However, if the listen socket (rtn->rds_tcp_listen_sock) is itself going
      through a tear-down via rds_tcp_listen_stop(), the ready callback will be
      NULL and we would hit a panic of the form
        BUG: unable to handle kernel NULL pointer dereference at   (null)
        IP:           (null)
         :
        ? rds_tcp_listen_data_ready+0x59/0xb0 [rds_tcp]
        tcp_data_queue+0x39d/0x5b0
        tcp_rcv_established+0x2e5/0x660
        tcp_v4_do_rcv+0x122/0x220
        tcp_v4_rcv+0x8b7/0x980
          :
      In the above case, it is not fatal to encounter a NULL value for
      ready; we should just drop the packet and let the flush of the
      acceptor thread finish gracefully.
      
      In general, the tear-down sequence for the listen() and accept() sockets
      that is ensured by this commit is:
           rtn->rds_tcp_listen_sock = NULL; /* prevent any new accepts */
           In rds_tcp_listen_stop():
               serialize with, and prevent, further callbacks using lock_sock()
               flush rds_wq
               flush acceptor workq
               sock_release(listen socket)
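
      The NULL check described above can be illustrated with a minimal userspace sketch. It is not the kernel code: `struct fake_sock`, `saved_ready`, `fake_def_readable()` and `listen_data_ready()` are hypothetical names standing in for the socket, the saved callback pointer, and rds_tcp_listen_data_ready(). The sketch only shows the idea that a NULL callback during teardown means "drop the event" rather than "call through the pointer".

```c
#include <stddef.h>

/* Hypothetical stand-in for struct sock: only the saved callback matters. */
struct fake_sock {
    void (*saved_ready)(struct fake_sock *sk); /* NULL once teardown has begun */
    int delivered;                              /* how many times the callback ran */
};

/* Hypothetical stand-in for the default data-ready handler. */
static void fake_def_readable(struct fake_sock *sk)
{
    sk->delivered++;
}

/* Returns 1 if the saved callback ran, 0 if the event was dropped. */
static int listen_data_ready(struct fake_sock *sk)
{
    void (*ready)(struct fake_sock *sk) = sk->saved_ready;

    if (ready == NULL)
        return 0;   /* listener is going away: drop instead of calling NULL */

    ready(sk);
    return 1;
}
```

      With `saved_ready` pointing at `fake_def_readable`, a call to `listen_data_ready()` runs the callback once; after `saved_ready` is set to NULL, further calls return 0 and leave `delivered` unchanged.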
      Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 16 Jul, 2016 1 commit
    • RDS: TCP: avoid bad page reference in rds_tcp_listen_data_ready · a93d01f5
      Sowmini Varadhan committed
      As the existing comments in rds_tcp_listen_data_ready() indicate,
      it is possible under some race-windows to get to this function with the
      accept() socket. If that happens, we could run into a sequence whereby
      
         thread 1				thread 2
      
      rds_tcp_accept_one() thread
      sets up new_sock via ->accept().
      The sk_user_data is now
      sock_def_readable
      					data comes in for new_sock,
      					->sk_data_ready is called, and
      					we land in rds_tcp_listen_data_ready
      rds_tcp_set_callbacks()
      takes the sk_callback_lock and
      sets up sk_user_data to be the cp
      					read_lock sk_callback_lock
      					ready = cp
      					unlock sk_callback_lock
      					page fault on ready
      
      In the above sequence, we end up with a panic on a bad page reference
      when trying to execute (*ready)(). Instead we need to call
      sock_def_readable() safely, which is what this patch achieves.
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 02 Jul, 2016 5 commits
  4. 19 Jun, 2016 1 commit
  5. 08 Jun, 2016 1 commit
  6. 04 May, 2016 1 commit
    • RDS: TCP: Synchronize accept() and connect() paths on t_conn_lock. · bd7c5f98
      Sowmini Varadhan committed
      An arbitration scheme for duelling SYNs is implemented as part of
      commit 241b2719 ("RDS-TCP: Reset tcp callbacks if re-using an
      outgoing socket in rds_tcp_accept_one()") which ensures that both nodes
      involved will arrive at the same arbitration decision. However, this
      needs to be synchronized with an outgoing SYN to be generated by
      rds_tcp_conn_connect(). This commit achieves the synchronization
      through the t_conn_lock mutex in struct rds_tcp_connection.
      
      The rds_conn_state is checked in rds_tcp_conn_connect() after acquiring
      the t_conn_lock mutex.  A SYN is sent out only if the RDS connection is
      not already UP (an UP would indicate that rds_tcp_accept_one() has
      completed 3WH, so no SYN needs to be generated).
      
      Similarly, the rds_conn_state is checked in rds_tcp_accept_one() after
      acquiring the t_conn_lock mutex. The only acceptable states (to
      allow continuation of the arbitration logic) are UP (i.e., outgoing SYN
      was SYN-ACKed by peer after it sent us the SYN) or CONNECTING (we sent
      outgoing SYN before we saw incoming SYN).
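
      The two state checks described above can be sketched in userspace C. This is an illustration under assumptions, not the kernel implementation: a pthread mutex stands in for the kernel mutex, and `struct fake_conn`, `enum conn_state`, `connect_should_send_syn()` and `accept_may_arbitrate()` are hypothetical names. Only the decision logic (check the connection state while holding t_conn_lock) follows the commit message.

```c
#include <pthread.h>

/* Hypothetical subset of the RDS connection states named in the text. */
enum conn_state { CONN_DOWN, CONN_CONNECTING, CONN_UP };

/* Hypothetical stand-in for struct rds_tcp_connection. */
struct fake_conn {
    pthread_mutex_t t_conn_lock; /* serializes the accept and connect paths */
    enum conn_state state;
};

/* Connect path: send a SYN only if the connection is not already UP. */
static int connect_should_send_syn(struct fake_conn *c)
{
    int send;

    pthread_mutex_lock(&c->t_conn_lock);
    send = (c->state != CONN_UP);
    pthread_mutex_unlock(&c->t_conn_lock);
    return send;
}

/* Accept path: arbitration may continue only from UP or CONNECTING. */
static int accept_may_arbitrate(struct fake_conn *c)
{
    int ok;

    pthread_mutex_lock(&c->t_conn_lock);
    ok = (c->state == CONN_UP || c->state == CONN_CONNECTING);
    pthread_mutex_unlock(&c->t_conn_lock);
    return ok;
}
```

      Because both helpers read the state under the same lock, neither path can act on a state that the other path is in the middle of changing.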
      Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 08 Aug, 2015 1 commit
    • RDS-TCP: Support multiple RDS-TCP listen endpoints, one per netns. · 467fa153
      Sowmini Varadhan committed
      Register pernet subsys init/stop functions that will set up
      and tear down per-net RDS-TCP listen endpoints. Unregister
      pernet subsys functions on 'modprobe -r' to clean up these
      endpoints.
      
      Enable keepalive on both accept and connect socket endpoints.
      The keepalive timer expiration will ensure that client socket
      endpoints will be removed as appropriate from the netns when
      an interface is removed from a namespace.
      
      Register a device notifier callback that will clean up all
      sockets (and thus avoid the need to wait for keepalive timeout)
      when the loopback device is unregistered from the netns indicating
      that the netns is getting deleted.
      Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  8. 24 Nov, 2014 1 commit
  9. 12 Apr, 2014 1 commit
    • net: Fix use after free by removing length arg from sk_data_ready callbacks. · 676d2369
      David S. Miller committed
      Several spots in the kernel perform a sequence like:
      
      	skb_queue_tail(&sk->sk_receive_queue, skb);
      	sk->sk_data_ready(sk, skb->len);
      
      But at the moment we place the SKB onto the socket receive queue it
      can be consumed and freed up.  So this skb->len access potentially
      touches freed memory.
      
      Furthermore, the skb->len can be modified by the consumer so it is
      possible that the value isn't accurate.
      
      And finally, no implementation of this callback actually uses
      the length argument.  And since nobody cared about its
      value, lots of call sites pass in arbitrary values such as '0' and
      even '1'.
      
      So just remove the length argument from the callback, that way there
      is no confusion whatsoever and all of these use-after-free cases get
      fixed as a side effect.
      
      Based upon a patch by Eric Dumazet and his suggestion to audit this
      issue tree-wide.
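
      The resulting calling convention can be sketched in plain C. This is an illustrative userspace model, not kernel code: `struct fake_skb`, `struct fake_sk`, `fake_data_ready()` and `queue_and_notify()` are hypothetical names, and the single `queue_head` pointer is a simplification of the socket receive queue. The point it shows is that the notifier takes only the socket, so the producer never reads the buffer after handing it over.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff. */
struct fake_skb {
    size_t len;
};

/* Hypothetical stand-in for struct sock with a one-slot receive queue. */
struct fake_sk {
    struct fake_skb *queue_head;
    void (*data_ready)(struct fake_sk *sk); /* no length argument */
    int wakeups;
};

/* Hypothetical data-ready handler: just records that it was called. */
static void fake_data_ready(struct fake_sk *sk)
{
    sk->wakeups++;
}

/* Queue the buffer, then notify without touching skb again. */
static void queue_and_notify(struct fake_sk *sk, struct fake_skb *skb)
{
    sk->queue_head = skb;   /* from here on a consumer may own and free skb */
    sk->data_ready(sk);     /* callback needs only the socket */
}
```

      A handler that wants the amount of queued data can inspect the queue itself under whatever locking the consumer uses, instead of trusting a value sampled by the producer.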
      Signed-off-by: David S. Miller <davem@davemloft.net>
  10. 21 Oct, 2010 1 commit
  11. 09 Sep, 2010 3 commits
  12. 24 Aug, 2009 1 commit