1. 14 3月, 2020 2 次提交
    • D
      rxrpc: Fix call interruptibility handling · e138aa7d
      David Howells 提交于
      Fix the interruptibility of kernel-initiated client calls so that they're
      either only interruptible when they're waiting for a call slot to come
      available or they're not interruptible at all.  Either way, they're not
      interruptible during transmission.
      
      This should help prevent StoreData calls from being interrupted when
      writeback is in progress.  It doesn't, however, handle interruption during
      the receive phase.
      
      Userspace-initiated calls are still interruptable.  After the signal has
      been handled, sendmsg() will return the amount of data copied out of the
      buffer and userspace can perform another sendmsg() call to continue
      transmission.
      
      Fixes: bc5e3a54 ("rxrpc: Use MSG_WAITALL to tell sendmsg() to temporarily ignore signals")
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      e138aa7d
    • D
      rxrpc: Abstract out the calculation of whether there's Tx space · 158fe666
      David Howells 提交于
      Abstract out the calculation of there being sufficient Tx buffer space.
      This is reproduced several times in the rxrpc sendmsg code.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      158fe666
  2. 07 10月, 2019 2 次提交
    • D
      rxrpc: Fix call crypto state cleanup · 91fcfbe8
      David Howells 提交于
      Fix the cleanup of the crypto state on a call after the call has been
      disconnected.  As the call has been disconnected, its connection ref has
      been discarded and so we can't go through that to get to the security ops
      table.
      
      Fix this by caching the security ops pointer in the rxrpc_call struct and
      using that when freeing the call security state.  Also use this in other
      places we're dealing with call-specific security.
      
      The symptoms look like:
      
          BUG: KASAN: use-after-free in rxrpc_release_call+0xb2d/0xb60
          net/rxrpc/call_object.c:481
          Read of size 8 at addr ffff888062ffeb50 by task syz-executor.5/4764
      
      Fixes: 1db88c53 ("rxrpc: Fix -Wframe-larger-than= warnings from on-stack crypto")
      Reported-by: syzbot+eed305768ece6682bb7f@syzkaller.appspotmail.com
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      91fcfbe8
    • D
      rxrpc: Fix call ref leak · c48fc11b
      David Howells 提交于
      When sendmsg() finds a call to continue on with, if the call is in an
      inappropriate state, it doesn't release the ref it just got on that call
      before returning an error.
      
      This causes the following symptom to show up with kasan:
      
      	BUG: KASAN: use-after-free in rxrpc_send_keepalive+0x8a2/0x940
      	net/rxrpc/output.c:635
      	Read of size 8 at addr ffff888064219698 by task kworker/0:3/11077
      
      where line 635 is:
      
      	whdr.epoch	= htonl(peer->local->rxnet->epoch);
      
      The local endpoint (which cannot be pinned by the call) has been released,
      but not the peer (which is pinned by the call).
      
      Fix this by releasing the call in the error path.
      
      Fixes: 37411cad ("rxrpc: Fix potential NULL-pointer exception")
      Reported-by: syzbot+d850c266e3df14da1d31@syzkaller.appspotmail.com
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      c48fc11b
  3. 27 8月, 2019 2 次提交
  4. 30 7月, 2019 1 次提交
    • D
      rxrpc: Fix the lack of notification when sendmsg() fails on a DATA packet · c69565ee
      David Howells 提交于
      Fix the fact that a notification isn't sent to the recvmsg side to indicate
      a call failed when sendmsg() fails to transmit a DATA packet with the error
      ENETUNREACH, EHOSTUNREACH or ECONNREFUSED.
      
      Without this notification, the afs client just sits there waiting for the
      call to complete in some manner (which it's not now going to do), which
      also pins the rxrpc call in place.
      
      This can be seen if the client has a scope-level IPv6 address, but not a
      global-level IPv6 address, and we try and transmit an operation to a
      server's IPv6 address.
      
      Looking in /proc/net/rxrpc/calls shows completed calls just sat there with
      an abort code of RX_USER_ABORT and an error code of -ENETUNREACH.
      
      Fixes: c54e43d7 ("rxrpc: Fix missing start of call timeout")
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Reviewed-by: NMarc Dionne <marc.dionne@auristor.com>
      Reviewed-by: NJeffrey Altman <jaltman@auristor.com>
      c69565ee
  5. 24 5月, 2019 1 次提交
  6. 16 5月, 2019 1 次提交
    • D
      rxrpc: Allow the kernel to mark a call as being non-interruptible · b960a34b
      David Howells 提交于
      Allow kernel services using AF_RXRPC to indicate that a call should be
      non-interruptible.  This allows kafs to make things like lock-extension and
      writeback data storage calls non-interruptible.
      
      If this is set, signals will be ignored for operations on that call where
      possible - such as waiting to get a call channel on an rxrpc connection.
      
      It doesn't prevent UDP sendmsg from being interrupted, but that will be
      handled by packet retransmission.
      
      rxrpc_kernel_recv_data() isn't affected by this since that never waits,
      preferring instead to return -EAGAIN and leave the waiting to the caller.
      
      Userspace initiated calls can't be set to be uninterruptible at this time.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      b960a34b
  7. 13 4月, 2019 1 次提交
  8. 16 1月, 2019 1 次提交
    • D
      Revert "rxrpc: Allow failed client calls to be retried" · e122d845
      David Howells 提交于
      The changes introduced to allow rxrpc calls to be retried creates an issue
      when it comes to refcounting afs_call structs.  The problem is that when
      rxrpc_send_data() queues the last packet for an asynchronous call, the
      following sequence can occur:
      
       (1) The notify_end_tx callback is invoked which causes the state in the
           afs_call to be changed from AFS_CALL_CL_REQUESTING or
           AFS_CALL_SV_REPLYING.
      
       (2) afs_deliver_to_call() can then process event notifications from rxrpc
           on the async_work queue.
      
       (3) Delivery of events, such as an abort from the server, can cause the
           afs_call state to be changed to AFS_CALL_COMPLETE on async_work.
      
       (4) For an asynchronous call, afs_process_async_call() notes that the call
           is complete and tried to clean up all the refs on async_work.
      
       (5) rxrpc_send_data() might return the amount of data transferred
           (success) or an error - which could in turn reflect a local error or a
           received error.
      
      Synchronising the clean up after rxrpc_kernel_send_data() returns an error
      with the asynchronous cleanup is then tricky to get right.
      
      Mostly revert commit c038a58c.  The two API
      functions the original commit added aren't currently used.  This makes
      rxrpc_kernel_send_data() always return successfully if it queued the data
      it was given.
      
      Note that this doesn't affect synchronous calls since their Rx notification
      function merely pokes a wait queue and does not refcounting.  The
      asynchronous call notification function *has* to do refcounting and pass a
      ref over the work item to avoid the need to sync the workqueue in call
      cleanup.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e122d845
  9. 11 5月, 2018 1 次提交
    • D
      rxrpc: Fix missing start of call timeout · c54e43d7
      David Howells 提交于
      The expect_rx_by call timeout is supposed to be set when a call is started
      to indicate that we need to receive a packet by that point.  This is
      currently put back every time we receive a packet, but it isn't started
      when we first send a packet.  Without this, the call may wait forever if
      the server doesn't deign to reply.
      
      Fix this by setting the timeout upon a successful UDP sendmsg call for the
      first DATA packet.  The timeout is initiated only for initial transmission
      and not for subsequent retries as we don't want the retry mechanism to
      extend the timeout indefinitely.
      
      Fixes: a158bdd3 ("rxrpc: Fix call timeouts")
      Reported-by: NMarc Dionne <marc.dionne@auristor.com>
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      c54e43d7
  10. 31 3月, 2018 3 次提交
    • D
      rxrpc: Fix leak of rxrpc_peer objects · 17226f12
      David Howells 提交于
      When a new client call is requested, an rxrpc_conn_parameters struct object
      is passed in with a bunch of parameters set, such as the local endpoint to
      use.  A pointer to the target peer record is also placed in there by
      rxrpc_get_client_conn() - and this is removed if and only if a new
      connection object is allocated.  Thus it leaks if a new connection object
      isn't allocated.
      
      Fix this by putting any peer object attached to the rxrpc_conn_parameters
      object in the function that allocated it.
      
      Fixes: 19ffa01c ("rxrpc: Use structs to hold connection params and protocol info")
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      17226f12
    • D
      rxrpc: Fix checker warnings and errors · 88f2a825
      David Howells 提交于
      Fix various issues detected by checker.
      
      Errors:
      
       (*) rxrpc_discard_prealloc() should be using rcu_assign_pointer to set
           call->socket.
      
      Warnings:
      
       (*) rxrpc_service_connection_reaper() should be passing NULL rather than 0 to
           trace_rxrpc_conn() as the where argument.
      
       (*) rxrpc_disconnect_client_call() should get its net pointer via the
           call->conn rather than call->sock to avoid a warning about accessing
           an RCU pointer without protection.
      
       (*) Proc seq start/stop functions need annotation as they pass locks
           between the functions.
      
      False positives:
      
       (*) Checker doesn't correctly handle of seq-retry lock context balance in
           rxrpc_find_service_conn_rcu().
      
       (*) Checker thinks execution may proceed past the BUG() in
           rxrpc_publish_service_conn().
      
       (*) Variable length array warnings from SKCIPHER_REQUEST_ON_STACK() in
           rxkad.c.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      88f2a825
    • D
      rxrpc: Fix Tx ring annotation after initial Tx failure · 03877bf6
      David Howells 提交于
      rxrpc calls have a ring of packets that are awaiting ACK or retransmission
      and a parallel ring of annotations that tracks the state of those packets.
      If the initial transmission of a packet on the underlying UDP socket fails
      then the packet annotation is marked for resend - but the setting of this
      mark accidentally erases the last-packet mark also stored in the same
      annotation slot.  If this happens, a call won't switch out of the Tx phase
      when all the packets have been transmitted.
      
      Fix this by retaining the last-packet mark and only altering the packet
      state.
      
      Fixes: 248f219c ("rxrpc: Rewrite the data and ack handling code")
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      03877bf6
  11. 28 3月, 2018 1 次提交
    • D
      rxrpc, afs: Use debug_ids rather than pointers in traces · a25e21f0
      David Howells 提交于
      In rxrpc and afs, use the debug_ids that are monotonically allocated to
      various objects as they're allocated rather than pointers as kernel
      pointers are now hashed making them less useful.  Further, the debug ids
      aren't reused anywhere nearly as quickly.
      
      In addition, allow kernel services that use rxrpc, such as afs, to take
      numbers from the rxrpc counter, assign them to their own call struct and
      pass them in to rxrpc for both client and service calls so that the trace
      lines for each will have the same ID tag.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      a25e21f0
  12. 29 11月, 2017 1 次提交
  13. 24 11月, 2017 6 次提交
    • D
      rxrpc: Add a timeout for detecting lost ACKs/lost DATA · bd1fdf8c
      David Howells 提交于
      Add an extra timeout that is set/updated when we send a DATA packet that
      has the request-ack flag set.  This allows us to detect if we don't get an
      ACK in response to the latest flagged packet.
      
      The ACK packet is adjudged to have been lost if it doesn't turn up within
      2*RTT of the transmission.
      
      If the timeout occurs, we schedule the sending of a PING ACK to find out
      the state of the other side.  If a new DATA packet is ready to go sooner,
      we cancel the sending of the ping and set the request-ack flag on that
      instead.
      
      If we get back a PING-RESPONSE ACK that indicates a lower tx_top than what
      we had at the time of the ping transmission, we adjudge all the DATA
      packets sent between the response tx_top and the ping-time tx_top to have
      been lost and retransmit immediately.
      
      Rather than sending a PING ACK, we could just pick a DATA packet and
      speculatively retransmit that with request-ack set.  It should result in
      either a REQUESTED ACK or a DUPLICATE ACK which we can then use in lieu the
      a PING-RESPONSE ACK mentioned above.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      bd1fdf8c
    • D
      rxrpc: Express protocol timeouts in terms of RTT · beb8e5e4
      David Howells 提交于
      Express protocol timeouts for data retransmission and deferred ack
      generation in terms on RTT rather than specified timeouts once we have
      sufficient RTT samples.
      
      For the moment, this requires just one RTT sample to be able to use this
      for ack deferral and two for data retransmission.
      
      The data retransmission timeout is set at RTT*1.5 and the ACK deferral
      timeout is set at RTT.
      
      Note that the calculated timeout is limited to a minimum of 4ns to make
      sure it doesn't happen too quickly.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      beb8e5e4
    • D
      rxrpc: Fix call timeouts · a158bdd3
      David Howells 提交于
      Fix the rxrpc call expiration timeouts and make them settable from
      userspace.  By analogy with other rx implementations, there should be three
      timeouts:
      
       (1) "Normal timeout"
      
           This is set for all calls and is triggered if we haven't received any
           packets from the peer in a while.  It is measured from the last time
           we received any packet on that call.  This is not reset by any
           connection packets (such as CHALLENGE/RESPONSE packets).
      
           If a service operation takes a long time, the server should generate
           PING ACKs at a duration that's substantially less than the normal
           timeout so is to keep both sides alive.  This is set at 1/6 of normal
           timeout.
      
       (2) "Idle timeout"
      
           This is set only for a service call and is triggered if we stop
           receiving the DATA packets that comprise the request data.  It is
           measured from the last time we received a DATA packet.
      
       (3) "Hard timeout"
      
           This can be set for a call and specified the maximum lifetime of that
           call.  It should not be specified by default.  Some operations (such
           as volume transfer) take a long time.
      
      Allow userspace to set/change the timeouts on a call with sendmsg, using a
      control message:
      
      	RXRPC_SET_CALL_TIMEOUTS
      
      The data to the message is a number of 32-bit words, not all of which need
      be given:
      
      	u32 hard_timeout;	/* sec from first packet */
      	u32 idle_timeout;	/* msec from packet Rx */
      	u32 normal_timeout;	/* msec from data Rx */
      
      This can be set in combination with any other sendmsg() that affects a
      call.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      a158bdd3
    • D
      rxrpc: Split the call params from the operation params · 48124178
      David Howells 提交于
      When rxrpc_sendmsg() parses the control message buffer, it places the
      parameters extracted into a structure, but lumps together call parameters
      (such as user call ID) with operation parameters (such as whether to send
      data, send an abort or accept a call).
      
      Split the call parameters out into their own structure, a copy of which is
      then embedded in the operation parameters struct.
      
      The call parameters struct is then passed down into the places that need it
      instead of passing the individual parameters.  This allows for extra call
      parameters to be added.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      48124178
    • D
      rxrpc: Don't set upgrade by default in sendmsg() · 48ca2463
      David Howells 提交于
      Don't set upgrade by default when creating a call from sendmsg().  This is
      a holdover from when I was testing the code.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      48ca2463
    • D
      rxrpc: The mutex lock returned by rxrpc_accept_call() needs releasing · 03a6c822
      David Howells 提交于
      The caller of rxrpc_accept_call() must release the lock on call->user_mutex
      returned by that function.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      03a6c822
  14. 24 10月, 2017 1 次提交
  15. 18 10月, 2017 1 次提交
    • D
      rxrpc: Use MSG_WAITALL to tell sendmsg() to temporarily ignore signals · bc5e3a54
      David Howells 提交于
      Make AF_RXRPC accept MSG_WAITALL as a flag to sendmsg() to tell it to
      ignore signals whilst loading up the message queue, provided progress is
      being made in emptying the queue at the other side.
      
      Progress is defined as the base of the transmit window having being
      advanced within 2 RTT periods.  If the period is exceeded with no progress,
      sendmsg() will return anyway, indicating how much data has been copied, if
      any.
      
      Once the supplied buffer is entirely decanted, the sendmsg() will return.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      bc5e3a54
  16. 29 8月, 2017 3 次提交
    • D
      rxrpc: Allow failed client calls to be retried · c038a58c
      David Howells 提交于
      Allow a client call that failed on network error to be retried, provided
      that the Tx queue still holds DATA packet 1.  This allows an operation to
      be submitted to another server or another address for the same server
      without having to repackage and re-encrypt the data so far processed.
      
      Two new functions are provided:
      
       (1) rxrpc_kernel_check_call() - This is used to find out the completion
           state of a call to guess whether it can be retried and whether it
           should be retried.
      
       (2) rxrpc_kernel_retry_call() - Disconnect the call from its current
           connection, reset the state and submit it as a new client call to a
           new address.  The new address need not match the previous address.
      
      A call may be retried even if all the data hasn't been loaded into it yet;
      a partially constructed will be retained at the same point it was at when
      an error condition was detected.  msg_data_left() can be used to find out
      how much data was packaged before the error occurred.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      c038a58c
    • D
      rxrpc: Add notification of end-of-Tx phase · e833251a
      David Howells 提交于
      Add a callback to rxrpc_kernel_send_data() so that a kernel service can get
      a notification that the AF_RXRPC call has transitioned out the Tx phase and
      is now waiting for a reply or a final ACK.
      
      This is called from AF_RXRPC with the call state lock held so the
      notification is guaranteed to come before any reply is passed back.
      
      Further, modify the AFS filesystem to make use of this so that we don't have
      to change the afs_call state before sending the last bit of data.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      e833251a
    • D
      rxrpc: Don't negate call->error before returning it · bd2db2d2
      David Howells 提交于
      call->error is stored as 0 or a negative error code.  Don't negate this
      value (ie. make it positive) before returning it from a kernel function
      (though it should still be negated before passing to userspace through a
      control message).
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      bd2db2d2
  17. 16 6月, 2017 1 次提交
    • J
      networking: convert many more places to skb_put_zero() · b080db58
      Johannes Berg 提交于
      There were many places that my previous spatch didn't find,
      as pointed out by yuan linyu in various patches.
      
      The following spatch found many more and also removes the
      now unnecessary casts:
      
          @@
          identifier p, p2;
          expression len;
          expression skb;
          type t, t2;
          @@
          (
          -p = skb_put(skb, len);
          +p = skb_put_zero(skb, len);
          |
          -p = (t)skb_put(skb, len);
          +p = skb_put_zero(skb, len);
          )
          ... when != p
          (
          p2 = (t2)p;
          -memset(p2, 0, len);
          |
          -memset(p, 0, len);
          )
      
          @@
          type t, t2;
          identifier p, p2;
          expression skb;
          @@
          t *p;
          ...
          (
          -p = skb_put(skb, sizeof(t));
          +p = skb_put_zero(skb, sizeof(t));
          |
          -p = (t *)skb_put(skb, sizeof(t));
          +p = skb_put_zero(skb, sizeof(t));
          )
          ... when != p
          (
          p2 = (t2)p;
          -memset(p2, 0, sizeof(*p));
          |
          -memset(p, 0, sizeof(*p));
          )
      
          @@
          expression skb, len;
          @@
          -memset(skb_put(skb, len), 0, len);
          +skb_put_zero(skb, len);
      
      Apply it to the tree (with one manual fixup to keep the
      comment in vxlan.c, which spatch removed.)
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b080db58
  18. 08 6月, 2017 2 次提交
    • D
      rxrpc: Provide a cmsg to specify the amount of Tx data for a call · e754eba6
      David Howells 提交于
      Provide a control message that can be specified on the first sendmsg() of a
      client call or the first sendmsg() of a service response to indicate the
      total length of the data to be transmitted for that call.
      
      Currently, because the length of the payload of an encrypted DATA packet is
      encrypted in front of the data, the packet cannot be encrypted until we
      know how much data it will hold.
      
      By specifying the length at the beginning of the transmit phase, each DATA
      packet length can be set before we start loading data from userspace (where
      several sendmsg() calls may contribute to a particular packet).
      
      An error will be returned if too little or too much data is presented in
      the Tx phase.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      e754eba6
    • D
      rxrpc: Consolidate sendmsg parameters · 3ab26a6f
      David Howells 提交于
      Consolidate the sendmsg control message parameters into a struct rather
      than passing them individually through the argument list of
      rxrpc_sendmsg_cmsg().  This makes it easier to add more parameters.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      3ab26a6f
  19. 05 6月, 2017 1 次提交
    • D
      rxrpc: Add service upgrade support for client connections · 4e255721
      David Howells 提交于
      Make it possible for a client to use AuriStor's service upgrade facility.
      
      The client does this by adding an RXRPC_UPGRADE_SERVICE control message to
      the first sendmsg() of a call.  This takes no parameters.
      
      When recvmsg() starts returning data from the call, the service ID field in
      the returned msg_name will reflect the result of the upgrade attempt.  If
      the upgrade was ignored, srx_service will match what was set in the
      sendmsg(); if the upgrade happened the srx_service will be altered to
      indicate the service the server upgraded to.
      
      Note that:
      
       (1) The choice of upgrade service is up to the server
      
       (2) Further client calls to the same server that would share a connection
           are blocked if an upgrade probe is in progress.
      
       (3) This should only be used to probe the service.  Clients should then
           use the returned service ID in all subsequent communications with that
           server (and not set the upgrade).  Note that the kernel will not
           retain this information should the connection expire from its cache.
      
       (4) If a server that supports upgrading is replaced by one that doesn't,
           whilst a connection is live, and if the replacement is running, say,
           OpenAFS 1.6.4 or older or an older IBM AFS, then the replacement
           server will not respond to packets sent to the upgraded connection.
      
           At this point, calls will time out and the server must be reprobed.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      4e255721
  20. 06 4月, 2017 3 次提交
    • D
      rxrpc: Trace protocol errors in received packets · fb46f6ee
      David Howells 提交于
      Add a tracepoint (rxrpc_rx_proto) to record protocol errors in received
      packets.  The following changes are made:
      
       (1) Add a function, __rxrpc_abort_eproto(), to note a protocol error on a
           call and mark the call aborted.  This is wrapped by
           rxrpc_abort_eproto() that makes the why string usable in trace.
      
       (2) Add trace_rxrpc_rx_proto() or rxrpc_abort_eproto() to protocol error
           generation points, replacing rxrpc_abort_call() with the latter.
      
       (3) Only send an abort packet in rxkad_verify_packet*() if we actually
           managed to abort the call.
      
      Note that a trace event is also emitted if a kernel user (e.g. afs) tries
      to send data through a call when it's not in the transmission phase, though
      it's not technically a receive event.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      fb46f6ee
    • D
      rxrpc: Note a successfully aborted kernel operation · 84a4c09c
      David Howells 提交于
      Make rxrpc_kernel_abort_call() return an indication as to whether it
      actually aborted the operation or not so that kafs can trace the failure of
      the operation.  Note that 'success' in this context means changing the
      state of the call, not necessarily successfully transmitting an ABORT
      packet.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      84a4c09c
    • D
      rxrpc: Use negative error codes in rxrpc_call struct · 3a92789a
      David Howells 提交于
      Use negative error codes in struct rxrpc_call::error because that's what
      the kernel normally deals with and to make the code consistent.  We only
      turn them positive when transcribing into a cmsg for userspace recvmsg.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      3a92789a
  21. 10 3月, 2017 1 次提交
  22. 08 3月, 2017 1 次提交
  23. 04 3月, 2017 1 次提交
  24. 02 3月, 2017 2 次提交
    • I
      sched/headers: Prepare to move signal wakeup & sigpending methods from... · 174cd4b1
      Ingo Molnar 提交于
      sched/headers: Prepare to move signal wakeup & sigpending methods from <linux/sched.h> into <linux/sched/signal.h>
      
      Fix up affected files that include this signal functionality via sched.h.
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      174cd4b1
    • D
      rxrpc: Fix deadlock between call creation and sendmsg/recvmsg · 540b1c48
      David Howells 提交于
      All the routines by which rxrpc is accessed from the outside are serialised
      by means of the socket lock (sendmsg, recvmsg, bind,
      rxrpc_kernel_begin_call(), ...) and this presents a problem:
      
       (1) If a number of calls on the same socket are in the process of
           connection to the same peer, a maximum of four concurrent live calls
           are permitted before further calls need to wait for a slot.
      
       (2) If a call is waiting for a slot, it is deep inside sendmsg() or
           rxrpc_kernel_begin_call() and the entry function is holding the socket
           lock.
      
       (3) sendmsg() and recvmsg() or the in-kernel equivalents are prevented
           from servicing the other calls as they need to take the socket lock to
           do so.
      
       (4) The socket is stuck until a call is aborted and makes its slot
           available to the waiter.
      
      Fix this by:
      
       (1) Provide each call with a mutex ('user_mutex') that arbitrates access
           by the users of rxrpc separately for each specific call.
      
       (2) Make rxrpc_sendmsg() and rxrpc_recvmsg() unlock the socket as soon as
           they've got a call and taken its mutex.
      
           Note that I'm returning EWOULDBLOCK from recvmsg() if MSG_DONTWAIT is
           set but someone else has the lock.  Should I instead only return
           EWOULDBLOCK if there's nothing currently to be done on a socket, and
           sleep in this particular instance because there is something to be
           done, but we appear to be blocked by the interrupt handler doing its
           ping?
      
       (3) Make rxrpc_new_client_call() unlock the socket after allocating a new
           call, locking its user mutex and adding it to the socket's call tree.
           The call is returned locked so that sendmsg() can add data to it
           immediately.
      
           From the moment the call is in the socket tree, it is subject to
           access by sendmsg() and recvmsg() - even if it isn't connected yet.
      
       (4) Lock new service calls in the UDP data_ready handler (in
           rxrpc_new_incoming_call()) because they may already be in the socket's
           tree and the data_ready handler makes them live immediately if a user
           ID has already been preassigned.
      
           Note that the new call is locked before any notifications are sent
           that it is live, so doing mutex_trylock() *ought* to always succeed.
           Userspace is prevented from doing sendmsg() on calls that are in a
           too-early state in rxrpc_do_sendmsg().
      
       (5) Make rxrpc_new_incoming_call() return the call with the user mutex
           held so that a ping can be scheduled immediately under it.
      
           Note that it might be worth moving the ping call into
           rxrpc_new_incoming_call() and then we can drop the mutex there.
      
       (6) Make rxrpc_accept_call() take the lock on the call it is accepting and
           release the socket after adding the call to the socket's tree.  This
           is slightly tricky as we've dequeued the call by that point and have
           to requeue it.
      
           Note that requeuing emits a trace event.
      
       (7) Make rxrpc_kernel_send_data() and rxrpc_kernel_recv_data() take the
           new mutex immediately and don't bother with the socket mutex at all.
      
      This patch has the nice bonus that calls on the same socket are now to some
      extent parallelisable.
      
      Note that we might want to move rxrpc_service_prealloc() calls out from the
      socket lock and give it its own lock, so that we don't hang progress in
      other calls because we're waiting for the allocator.
      
      We probably also want to avoid calling rxrpc_notify_socket() from within
      the socket lock (rxrpc_accept_call()).
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Tested-by: NMarc Dionne <marc.c.dionne@auristor.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      540b1c48