1. 09 Mar, 2018 2 commits
  2. 08 Mar, 2018 2 commits
  3. 28 Feb, 2018 1 commit
  4. 17 Feb, 2018 3 commits
    • rds: zerocopy Tx support. · 0cebacce
      Committed by Sowmini Varadhan
      If the MSG_ZEROCOPY flag is passed to rds_sendmsg() and the
      SO_ZEROCOPY socket option has been set on the PF_RDS socket, the
      application pages sent down with rds_sendmsg() are pinned.
      
      The pinning uses the accounting infrastructure added by
      commit a91dbff5 ("sock: ulimit on MSG_ZEROCOPY pages").
      
      The payload bytes in the message must not be modified while
      the message is pinned. A multi-threaded
      application using this infrastructure may thus need to be notified
      about send-completion so that it can free/reuse the buffers
      passed to rds_sendmsg(). Notification of send-completion will
      identify each message-buffer by a cookie that the application
      must specify as ancillary data to rds_sendmsg().
      The ancillary data in this case has cmsg_level == SOL_RDS
      and cmsg_type == RDS_CMSG_ZCOPY_COOKIE.
      Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
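As an illustration of the ancillary data described in this commit message, here is a minimal userspace sketch that attaches a cookie to a message before it is handed to sendmsg(). The helper name `rds_zcopy_prepare` is hypothetical, and the fallback values for `SOL_RDS` and `RDS_CMSG_ZCOPY_COOKIE` are assumptions taken from the Linux uapi headers, used only when `<linux/rds.h>` has not been included.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Fallbacks, assumed to match <linux/socket.h> and <linux/rds.h>. */
#ifndef SOL_RDS
#define SOL_RDS 276
#endif
#ifndef RDS_CMSG_ZCOPY_COOKIE
#define RDS_CMSG_ZCOPY_COOKIE 12
#endif

/*
 * Hypothetical helper: point msg at one payload buffer and attach a
 * 32-bit zerocopy cookie as ancillary data. The caller still has to
 * fill in msg_name/msg_namelen with the destination address.
 */
static void rds_zcopy_prepare(struct msghdr *msg, struct iovec *iov,
                              void *payload, size_t len,
                              void *ctrl, size_t ctrl_len, uint32_t cookie)
{
    struct cmsghdr *cmsg;

    /* The control buffer must hold one cmsg carrying the cookie. */
    assert(ctrl_len >= CMSG_SPACE(sizeof(cookie)));

    memset(msg, 0, sizeof(*msg));
    memset(ctrl, 0, ctrl_len);

    iov->iov_base = payload;
    iov->iov_len = len;
    msg->msg_iov = iov;
    msg->msg_iovlen = 1;
    msg->msg_control = ctrl;
    msg->msg_controllen = CMSG_SPACE(sizeof(cookie));

    cmsg = CMSG_FIRSTHDR(msg);
    cmsg->cmsg_level = SOL_RDS;
    cmsg->cmsg_type = RDS_CMSG_ZCOPY_COOKIE;
    cmsg->cmsg_len = CMSG_LEN(sizeof(cookie));
    memcpy(CMSG_DATA(cmsg), &cookie, sizeof(cookie));
}
```

A caller would first enable SO_ZEROCOPY on the PF_RDS socket with setsockopt(), then pass the prepared msghdr to sendmsg() with the MSG_ZEROCOPY flag, and leave the payload untouched until the completion for that cookie arrives.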
    • rds: support for zcopy completion notification · 01883eda
      Committed by Sowmini Varadhan
      RDS removes a datagram (rds_message) from the retransmit queue when
      an ACK is received. The ACK indicates that the receiver has queued
      the RDS datagram, so that the sender can safely forget the datagram.
      When all references to the rds_message are quiesced, rds_message_purge
      is called to release resources used by the rds_message.
      
      If the datagram to be removed had pinned pages set up, add
      an entry to the rs->rs_znotify_queue so that the notification
      will be sent up via rds_rm_zerocopy_callback() when the
      rds_message is eventually freed by rds_message_purge.
      
      rds_rm_zerocopy_callback() attempts to batch the number of cookies
      sent with each notification to a max of SO_EE_ORIGIN_MAX_ZCOOKIES.
      This is achieved by checking the tail skb in the sk_error_queue:
      if this has room for one more cookie, the cookie from the
      current notification is added; else a new skb is added to the
      sk_error_queue. Every invocation of rds_rm_zerocopy_callback() will
      trigger a ->sk_error_report to notify the application.
      Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
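The batching rule described above (append to the tail entry if it has room, otherwise queue a new entry) can be modelled in plain userspace C. This is only a sketch of the idea, not the kernel code: `struct note` stands in for an skb on sk_error_queue, all names are hypothetical, and the limit of 8 is an assumed placeholder for SO_EE_ORIGIN_MAX_ZCOOKIES.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ZCOOKIES 8 /* assumed stand-in for SO_EE_ORIGIN_MAX_ZCOOKIES */

/* One queued notification; models an skb on the error queue. */
struct note {
    uint32_t cookies[MAX_ZCOOKIES];
    size_t num;
    struct note *next;
};

struct note_queue {
    struct note *head;
    struct note *tail;
    size_t len; /* number of queued notes */
};

/*
 * Add a cookie: reuse the tail note if it still has room, otherwise
 * allocate and queue a new note. Returns 0 on success, -1 if the
 * allocation fails.
 */
static int note_queue_add(struct note_queue *q, uint32_t cookie)
{
    struct note *n;

    assert(q != NULL);
    n = q->tail;
    if (n == NULL || n->num == MAX_ZCOOKIES) {
        n = calloc(1, sizeof(*n));
        if (n == NULL)
            return -1;
        if (q->tail != NULL)
            q->tail->next = n;
        else
            q->head = n;
        q->tail = n;
        q->len++;
    }
    n->cookies[n->num++] = cookie;
    return 0;
}
```

In this model, ten completions produce two queued notes: the first carries eight cookies and the second carries the remaining two.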
    • rds: hold a sock ref from rds_message to the rds_sock · ea8994cb
      Committed by Sowmini Varadhan
      The existing model holds a reference from the rds_sock to the
      rds_message, but the rds_message does not itself hold a reference
      on the rds_sock. Instead the m_rs field in the rds_message is
      assigned when the message is queued on the sock, and nulled when
      the message is dequeued from the sock.
      
      We want to be able to notify userspace when the rds_message
      is actually freed (from rds_message_purge(), after the refcounts
      to the rds_message go to 0). At the time that rds_message_purge()
      is called, the message is no longer on the rds_sock retransmit
      queue. Thus the explicit reference for the m_rs is needed to
      send a notification that will signal to userspace that
      it is now safe to free/reuse any pages that may have
      been pinned down for zerocopy.
      
      This patch manages the m_rs assignment in the rds_message with
      the necessary refcount book-keeping.
      Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
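The refcount book-keeping this commit message describes can be pictured with a toy userspace model: the message takes a reference on the socket when m_rs is assigned and drops it only at purge time, which is what keeps the socket valid for the final notification. Every type and function name below is hypothetical and much simpler than the kernel's sock_hold()/sock_put().

```c
#include <assert.h>
#include <stddef.h>

/* Toy socket: a reference count plus a counter of notifications sent. */
struct sock_model {
    int refcnt;
    int notifications;
};

/* Toy message: back-pointer to its socket and a "pages pinned" flag. */
struct msg_model {
    struct sock_model *rs;
    int pinned;
};

static void sock_hold_model(struct sock_model *s)
{
    s->refcnt++;
}

static void sock_put_model(struct sock_model *s)
{
    assert(s->refcnt > 0);
    s->refcnt--;
}

/* Assigning the socket to the message takes a reference on it. */
static void msg_set_sock(struct msg_model *m, struct sock_model *s)
{
    sock_hold_model(s);
    m->rs = s;
}

/*
 * Purge: the socket is guaranteed to be alive here because the message
 * holds its own reference, so the notification can be delivered before
 * the reference is dropped.
 */
static void msg_purge(struct msg_model *m)
{
    if (m->rs == NULL)
        return;
    if (m->pinned) {
        m->rs->notifications++;
        m->pinned = 0;
    }
    sock_put_model(m->rs);
    m->rs = NULL;
}
```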
  5. 05 Jul, 2017 1 commit
  6. 18 Nov, 2016 1 commit
    • RDS: TCP: Track peer's connection generation number · 905dd418
      Committed by Sowmini Varadhan
      The RDS transport has to be able to distinguish between
      two types of failure events:
      (a) when the transport fails (e.g., TCP connection reset)
          but the RDS socket/connection layer on both sides stays
          the same
      (b) when the peer's RDS layer itself resets (e.g., due to module
          reload or machine reboot at the peer)
      In case (a) both sides must reconnect and continue the RDS messaging
      without any message loss or disruption to the message sequence numbers,
      and this is achieved by rds_send_path_reset().
      
      In case (b) we should reset all rds_connection state to the
      new incarnation of the peer. Examples of state that needs to
      be reset are next expected rx sequence number from, or messages to be
      retransmitted to, the new incarnation of the peer.
      
      To achieve this, the RDS handshake probe added as part of
      commit 5916e2c1 ("RDS: TCP: Enable multipath RDS for TCP")
      is enhanced so that sender and receiver of the RDS ping-probe
      will add a generation number as part of the RDS_EXTHDR_GEN_NUM
      extension header. Each peer stores local and remote generation
      numbers as part of each rds_connection. Changes in generation
      number will be detected via incoming handshake probe ping
      request or response and will allow the receiver to reset rds_connection
      state.
      Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
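The generation-number check described above amounts to "remember the peer's last generation number and reset per-connection state when a different one shows up". A minimal userspace sketch of that rule follows. The struct, its field names, and the convention that 0 means "no generation number supplied" are assumptions made for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy per-connection state; field names are hypothetical. */
struct conn_state {
    uint32_t peer_gen_num;    /* last generation seen from the peer, 0 = none */
    uint64_t next_rx_seq;     /* next expected receive sequence number */
    unsigned retrans_pending; /* messages waiting to be retransmitted */
};

/*
 * Process the generation number carried by an incoming handshake probe.
 * Returns true if the peer turned out to be a new incarnation and the
 * connection state was reset (case (b)); a plain transport failure
 * (case (a)) leaves the generation number, and the state, unchanged.
 */
static bool conn_peer_gen_update(struct conn_state *c, uint32_t gen)
{
    bool reset = false;

    assert(c != NULL);
    if (gen == 0)
        return false; /* probe carried no generation number */

    if (c->peer_gen_num != 0 && c->peer_gen_num != gen) {
        c->next_rx_seq = 0;
        c->retrans_pending = 0;
        reset = true;
    }
    c->peer_gen_num = gen;
    return reset;
}
```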
  7. 16 Jul, 2016 1 commit
  8. 08 Feb, 2015 1 commit
  9. 16 Dec, 2014 1 commit
  10. 24 Nov, 2014 2 commits
  11. 05 Mar, 2013 2 commits
    • rds: simplify a warning message · 7dac1b51
      Committed by Cong Wang
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
      Signed-off-by: Cong Wang <amwang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rds: limit the size allocated by rds_message_alloc() · ece6b0a2
      Committed by Cong Wang
      Dave Jones reported the following bug:
      
      "When fed mangled socket data, rds will trust what userspace gives it,
      and tries to allocate enormous amounts of memory larger than what
      kmalloc can satisfy."
      
      WARNING: at mm/page_alloc.c:2393 __alloc_pages_nodemask+0xa0d/0xbe0()
      Hardware name: GA-MA78GM-S2H
      Modules linked in: vmw_vsock_vmci_transport vmw_vmci vsock fuse bnep dlci bridge 8021q garp stp mrp binfmt_misc l2tp_ppp l2tp_core rfcomm s
      Pid: 24652, comm: trinity-child2 Not tainted 3.8.0+ #65
      Call Trace:
       [<ffffffff81044155>] warn_slowpath_common+0x75/0xa0
       [<ffffffff8104419a>] warn_slowpath_null+0x1a/0x20
       [<ffffffff811444ad>] __alloc_pages_nodemask+0xa0d/0xbe0
       [<ffffffff8100a196>] ? native_sched_clock+0x26/0x90
       [<ffffffff810b2128>] ? trace_hardirqs_off_caller+0x28/0xc0
       [<ffffffff810b21cd>] ? trace_hardirqs_off+0xd/0x10
       [<ffffffff811861f8>] alloc_pages_current+0xb8/0x180
       [<ffffffff8113eaaa>] __get_free_pages+0x2a/0x80
       [<ffffffff811934fe>] kmalloc_order_trace+0x3e/0x1a0
       [<ffffffff81193955>] __kmalloc+0x2f5/0x3a0
       [<ffffffff8104df0c>] ? local_bh_enable_ip+0x7c/0xf0
       [<ffffffffa0401ab3>] rds_message_alloc+0x23/0xb0 [rds]
       [<ffffffffa04043a1>] rds_sendmsg+0x2b1/0x990 [rds]
       [<ffffffff810b21cd>] ? trace_hardirqs_off+0xd/0x10
       [<ffffffff81564620>] sock_sendmsg+0xb0/0xe0
       [<ffffffff810b2052>] ? get_lock_stats+0x22/0x70
       [<ffffffff810b24be>] ? put_lock_stats.isra.23+0xe/0x40
       [<ffffffff81567f30>] sys_sendto+0x130/0x180
       [<ffffffff810b872d>] ? trace_hardirqs_on+0xd/0x10
       [<ffffffff816c547b>] ? _raw_spin_unlock_irq+0x3b/0x60
       [<ffffffff816cd767>] ? sysret_check+0x1b/0x56
       [<ffffffff810b8695>] ? trace_hardirqs_on_caller+0x115/0x1a0
       [<ffffffff81341d8e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
       [<ffffffff816cd742>] system_call_fastpath+0x16/0x1b
      ---[ end trace eed6ae990d018c8b ]---
      Reported-by: Dave Jones <davej@redhat.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
      Signed-off-by: Cong Wang <amwang@redhat.com>
      Acked-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
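The general remedy for this class of bug is to reject a caller-supplied length before it reaches the allocator. The sketch below shows that pattern in userspace C. It is not the kernel patch: `MSG_ALLOC_MAX` is a hypothetical cap standing in for whatever limit the allocator can satisfy, and the struct is a simplified stand-in for rds_message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical upper bound on a single allocation (4 MiB). */
#define MSG_ALLOC_MAX ((size_t)1 << 22)

/* Simplified message: fixed header followed by extra_len bytes. */
struct message {
    size_t extra_len;
    unsigned char extra[];
};

/*
 * Allocate a message with extra_len trailing bytes. The bound is
 * checked by subtraction so that sizeof(header) + extra_len can never
 * overflow; oversized requests fail with NULL instead of reaching the
 * allocator.
 */
static struct message *message_alloc(size_t extra_len)
{
    struct message *m;

    assert(sizeof(struct message) < MSG_ALLOC_MAX);
    if (extra_len > MSG_ALLOC_MAX - sizeof(struct message))
        return NULL;

    m = calloc(1, sizeof(*m) + extra_len);
    if (m != NULL)
        m->extra_len = extra_len;
    return m;
}
```

Callers then treat NULL as an ordinary allocation failure, so a mangled length from userspace produces an error return rather than an allocator warning.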
  12. 01 Nov, 2011 1 commit
  13. 09 Nov, 2010 1 commit
  14. 31 Oct, 2010 1 commit
    • RDS: Let rds_message_alloc_sgs() return NULL · d139ff09
      Committed by Andy Grover
      Even with the previous fix, we still are reading the iovecs once
      to determine SGs needed, and then again later on. Preallocating
      space for sg lists as part of rds_message seemed like a good idea
      but it might be better to not do this. While working to redo that
      code, this patch attempts to protect against userspace rewriting
      the rds_iovec array between the first and second accesses.
      
      The consequences of this would be either a too-small or too-large
      sg list array. Too large is not an issue. This patch changes all
      callers of rds_message_alloc_sgs() to handle running out of preallocated
      sgs, and fail gracefully.
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
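The failure mode discussed here can be modelled as a message that carries a fixed pool of preallocated scatterlist slots, with a sub-allocator that returns NULL once the pool runs out. The userspace sketch below illustrates that contract under simplified assumptions. The type and function names are hypothetical and the real rds_message layout differs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified scatterlist entry. */
struct sg_model {
    void *addr;
    size_t len;
};

/* Message with a pool of total_sgs entries sized at allocation time. */
struct msg_model {
    unsigned total_sgs;
    unsigned used_sgs;
    struct sg_model sgs[];
};

/* Allocate a message whose pool holds total_sgs entries. */
static struct msg_model *msg_new(unsigned total_sgs)
{
    struct msg_model *m;

    m = calloc(1, sizeof(*m) + (size_t)total_sgs * sizeof(struct sg_model));
    if (m != NULL)
        m->total_sgs = total_sgs;
    return m;
}

/*
 * Hand out nents consecutive entries from the pool, or NULL if fewer
 * than nents remain (for example because the second read of the
 * user-supplied array asked for more than the first one did).
 */
static struct sg_model *msg_alloc_sgs(struct msg_model *m, unsigned nents)
{
    struct sg_model *first;

    assert(m != NULL && m->used_sgs <= m->total_sgs);
    if (nents == 0 || nents > m->total_sgs - m->used_sgs)
        return NULL;

    first = &m->sgs[m->used_sgs];
    m->used_sgs += nents;
    return first;
}
```

Every caller checks the return value and unwinds with an error when it is NULL, which is the graceful failure this commit message asks for.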
  15. 21 Oct, 2010 1 commit
  16. 09 Sep, 2010 19 commits