1. 29 8月, 2017 7 次提交
    • D
      rxrpc: Allow failed client calls to be retried · c038a58c
      David Howells 提交于
      Allow a client call that failed on network error to be retried, provided
      that the Tx queue still holds DATA packet 1.  This allows an operation to
      be submitted to another server or another address for the same server
      without having to repackage and re-encrypt the data so far processed.
      
      Two new functions are provided:
      
       (1) rxrpc_kernel_check_call() - This is used to find out the completion
           state of a call to guess whether it can be retried and whether it
           should be retried.
      
       (2) rxrpc_kernel_retry_call() - Disconnect the call from its current
           connection, reset the state and submit it as a new client call to a
           new address.  The new address need not match the previous address.
      
      A call may be retried even if all the data hasn't been loaded into it yet;
      a partially constructed will be retained at the same point it was at when
      an error condition was detected.  msg_data_left() can be used to find out
      how much data was packaged before the error occurred.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      c038a58c
    • D
      rxrpc: Add notification of end-of-Tx phase · e833251a
      David Howells 提交于
      Add a callback to rxrpc_kernel_send_data() so that a kernel service can get
      a notification that the AF_RXRPC call has transitioned out the Tx phase and
      is now waiting for a reply or a final ACK.
      
      This is called from AF_RXRPC with the call state lock held so the
      notification is guaranteed to come before any reply is passed back.
      
      Further, modify the AFS filesystem to make use of this so that we don't have
      to change the afs_call state before sending the last bit of data.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      e833251a
    • D
      rxrpc: Remove some excess whitespace · 3ec0efde
      David Howells 提交于
      Remove indentation from some blank lines.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      3ec0efde
    • D
      rxrpc: Don't negate call->error before returning it · bd2db2d2
      David Howells 提交于
      call->error is stored as 0 or a negative error code.  Don't negate this
      value (ie. make it positive) before returning it from a kernel function
      (though it should still be negated before passing to userspace through a
      control message).
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      bd2db2d2
    • D
      rxrpc: Fix IPv6 support · 7b674e39
      David Howells 提交于
      Fix IPv6 support in AF_RXRPC in the following ways:
      
       (1) When extracting the address from a received IPv4 packet, if the local
           transport socket is open for IPv6 then fill out the sockaddr_rxrpc
           struct for an IPv4-mapped-to-IPv6 AF_INET6 transport address instead
           of an AF_INET one.
      
       (2) When sending CHALLENGE or RESPONSE packets, the transport length needs
           to be set from the sockaddr_rxrpc::transport_len field rather than
           sizeof() on the IPv4 transport address.
      
       (3) When processing an IPv4 ICMP packet received by an IPv6 socket, set up
           the address correctly before searching for the affected peer.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      7b674e39
    • D
      rxrpc: Use correct timestamp from Kerberos 5 ticket · 0a378585
      David Howells 提交于
      When an XDR-encoded Kerberos 5 ticket is added as an rxrpc-type key, the
      expiry time should be drawn from the k5 part of the token union (which was
      what was filled in), rather than the kad part of the union.
      Reported-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      0a378585
    • B
      net: rxrpc: Replace time_t type with time64_t type · 10674a03
      Baolin Wang 提交于
      Since the 'expiry' variable of 'struct key_preparsed_payload' has been
      changed to 'time64_t' type, which is year 2038 safe on 32bits system.
      
      In net/rxrpc subsystem, we need convert 'u32' type to 'time64_t' type
      when copying ticket expires time to 'prep->expiry', then this patch
      introduces two helper functions to help convert 'u32' to 'time64_t'
      type.
      
      This patch also uses ktime_get_real_seconds() to get current time instead
      of get_seconds() which is not year 2038 safe on 32bits system.
      Signed-off-by: NBaolin Wang <baolin.wang@linaro.org>
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      10674a03
  2. 19 8月, 2017 1 次提交
    • D
      rxrpc: Fix oops when discarding a preallocated service call · 9a19bad7
      David Howells 提交于
      rxrpc_service_prealloc_one() doesn't set the socket pointer on any new call
      it preallocates, but does add it to the rxrpc net namespace call list.
      This, however, causes rxrpc_put_call() to oops when the call is discarded
      when the socket is closed.  rxrpc_put_call() needs the socket to be able to
      reach the namespace so that it can use a lock held therein.
      
      Fix this by setting a call's socket pointer immediately before discarding
      it.
      
      This can be triggered by unloading the kafs module, resulting in an oops
      like the following:
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
      IP: rxrpc_put_call+0x1e2/0x32d
      PGD 0
      P4D 0
      Oops: 0000 [#1] SMP
      Modules linked in: kafs(E-)
      CPU: 3 PID: 3037 Comm: rmmod Tainted: G            E   4.12.0-fscache+ #213
      Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014
      task: ffff8803fc92e2c0 task.stack: ffff8803fef74000
      RIP: 0010:rxrpc_put_call+0x1e2/0x32d
      RSP: 0018:ffff8803fef77e08 EFLAGS: 00010282
      RAX: 0000000000000000 RBX: ffff8803fab99ac0 RCX: 000000000000000f
      RDX: ffffffff81c50a40 RSI: 000000000000000c RDI: ffff8803fc92ea88
      RBP: ffff8803fef77e30 R08: ffff8803fc87b941 R09: ffffffff82946d20
      R10: ffff8803fef77d10 R11: 00000000000076fc R12: 0000000000000005
      R13: ffff8803fab99c20 R14: 0000000000000001 R15: ffffffff816c6aee
      FS:  00007f915a059700(0000) GS:ffff88041fb80000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000030 CR3: 00000003fef39000 CR4: 00000000001406e0
      Call Trace:
       rxrpc_discard_prealloc+0x325/0x341
       rxrpc_listen+0xf9/0x146
       kernel_listen+0xb/0xd
       afs_close_socket+0x3e/0x173 [kafs]
       afs_exit+0x1f/0x57 [kafs]
       SyS_delete_module+0x10f/0x19a
       do_syscall_64+0x8a/0x149
       entry_SYSCALL64_slow_path+0x25/0x25
      
      Fixes: 2baec2c3 ("rxrpc: Support network namespacing")
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9a19bad7
  3. 21 7月, 2017 1 次提交
    • D
      rxrpc: Move the packet.h include file into net/rxrpc/ · ddc6c70f
      David Howells 提交于
      Move the protocol description header file into net/rxrpc/ and rename it to
      protocol.h.  It's no longer necessary to expose it as packets are no longer
      exposed to kernel services (such as AFS) that use the facility.
      
      The abort codes are transferred to the UAPI header instead as we pass these
      back to userspace and also to kernel services.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      ddc6c70f
  4. 01 7月, 2017 3 次提交
  5. 16 6月, 2017 2 次提交
    • J
      networking: convert many more places to skb_put_zero() · b080db58
      Johannes Berg 提交于
      There were many places that my previous spatch didn't find,
      as pointed out by yuan linyu in various patches.
      
      The following spatch found many more and also removes the
      now unnecessary casts:
      
          @@
          identifier p, p2;
          expression len;
          expression skb;
          type t, t2;
          @@
          (
          -p = skb_put(skb, len);
          +p = skb_put_zero(skb, len);
          |
          -p = (t)skb_put(skb, len);
          +p = skb_put_zero(skb, len);
          )
          ... when != p
          (
          p2 = (t2)p;
          -memset(p2, 0, len);
          |
          -memset(p, 0, len);
          )
      
          @@
          type t, t2;
          identifier p, p2;
          expression skb;
          @@
          t *p;
          ...
          (
          -p = skb_put(skb, sizeof(t));
          +p = skb_put_zero(skb, sizeof(t));
          |
          -p = (t *)skb_put(skb, sizeof(t));
          +p = skb_put_zero(skb, sizeof(t));
          )
          ... when != p
          (
          p2 = (t2)p;
          -memset(p2, 0, sizeof(*p));
          |
          -memset(p, 0, sizeof(*p));
          )
      
          @@
          expression skb, len;
          @@
          -memset(skb_put(skb, len), 0, len);
          +skb_put_zero(skb, len);
      
      Apply it to the tree (with one manual fixup to keep the
      comment in vxlan.c, which spatch removed.)
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b080db58
    • D
      rxrpc: Fix several cases where a padded len isn't checked in ticket decode · 5f2f9765
      David Howells 提交于
      This fixes CVE-2017-7482.
      
      When a kerberos 5 ticket is being decoded so that it can be loaded into an
      rxrpc-type key, there are several places in which the length of a
      variable-length field is checked to make sure that it's not going to
      overrun the available data - but the data is padded to the nearest
      four-byte boundary and the code doesn't check for this extra.  This could
      lead to the size-remaining variable wrapping and the data pointer going
      over the end of the buffer.
      
      Fix this by making the various variable-length data checks use the padded
      length.
      Reported-by: N石磊 <shilei-c@360.cn>
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Reviewed-by: NMarc Dionne <marc.c.dionne@auristor.com>
      Reviewed-by: NDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5f2f9765
  6. 15 6月, 2017 1 次提交
    • D
      rxrpc: Cache the congestion window setting · f7aec129
      David Howells 提交于
      Cache the congestion window setting that was determined during a call's
      transmission phase when it finishes so that it can be used by the next call
      to the same peer, thereby shortcutting the slow-start algorithm.
      
      The value is stored in the rxrpc_peer struct and is accessed without
      locking.  Each call takes the value that happens to be there when it starts
      and just overwrites the value when it finishes.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f7aec129
  7. 08 6月, 2017 3 次提交
    • D
      rxrpc: Provide a cmsg to specify the amount of Tx data for a call · e754eba6
      David Howells 提交于
      Provide a control message that can be specified on the first sendmsg() of a
      client call or the first sendmsg() of a service response to indicate the
      total length of the data to be transmitted for that call.
      
      Currently, because the length of the payload of an encrypted DATA packet is
      encrypted in front of the data, the packet cannot be encrypted until we
      know how much data it will hold.
      
      By specifying the length at the beginning of the transmit phase, each DATA
      packet length can be set before we start loading data from userspace (where
      several sendmsg() calls may contribute to a particular packet).
      
      An error will be returned if too little or too much data is presented in
      the Tx phase.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      e754eba6
    • D
      rxrpc: Consolidate sendmsg parameters · 3ab26a6f
      David Howells 提交于
      Consolidate the sendmsg control message parameters into a struct rather
      than passing them individually through the argument list of
      rxrpc_sendmsg_cmsg().  This makes it easier to add more parameters.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      3ab26a6f
    • D
      rxrpc: Provide a getsockopt call to query what cmsgs types are supported · 515559ca
      David Howells 提交于
      Provide a getsockopt() call that can query what cmsg types are supported by
      AF_RXRPC.
      515559ca
  8. 05 6月, 2017 6 次提交
    • D
      rxrpc: Add service upgrade support for client connections · 4e255721
      David Howells 提交于
      Make it possible for a client to use AuriStor's service upgrade facility.
      
      The client does this by adding an RXRPC_UPGRADE_SERVICE control message to
      the first sendmsg() of a call.  This takes no parameters.
      
      When recvmsg() starts returning data from the call, the service ID field in
      the returned msg_name will reflect the result of the upgrade attempt.  If
      the upgrade was ignored, srx_service will match what was set in the
      sendmsg(); if the upgrade happened the srx_service will be altered to
      indicate the service the server upgraded to.
      
      Note that:
      
       (1) The choice of upgrade service is up to the server
      
       (2) Further client calls to the same server that would share a connection
           are blocked if an upgrade probe is in progress.
      
       (3) This should only be used to probe the service.  Clients should then
           use the returned service ID in all subsequent communications with that
           server (and not set the upgrade).  Note that the kernel will not
           retain this information should the connection expire from its cache.
      
       (4) If a server that supports upgrading is replaced by one that doesn't,
           whilst a connection is live, and if the replacement is running, say,
           OpenAFS 1.6.4 or older or an older IBM AFS, then the replacement
           server will not respond to packets sent to the upgraded connection.
      
           At this point, calls will time out and the server must be reprobed.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      4e255721
    • D
      rxrpc: Implement service upgrade · 4722974d
      David Howells 提交于
      Implement AuriStor's service upgrade facility.  There are three problems
      that this is meant to deal with:
      
       (1) Various of the standard AFS RPC calls have IPv4 addresses in their
           requests and/or replies - but there's no room for including IPv6
           addresses.
      
       (2) Definition of IPv6-specific RPC operations in the standard operation
           sets has not yet been achieved.
      
       (3) One could envision the creation a new service on the same port that as
           the original service.  The new service could implement improved
           operations - and the client could try this first, falling back to the
           original service if it's not there.
      
           Unfortunately, certain servers ignore packets addressed to a service
           they don't implement and don't respond in any way - not even with an
           ABORT.  This means that the client must then wait for the call timeout
           to occur.
      
      What service upgrade does is to see if the connection is marked as being
      'upgradeable' and if so, change the service ID in the server and thus the
      request and reply formats.  Note that the upgrade isn't mandatory - a
      server that supports only the original call set will ignore the upgrade
      request.
      
      In the protocol, the procedure is then as follows:
      
       (1) To request an upgrade, the first DATA packet in a new connection must
           have the userStatus set to 1 (this is normally 0).  The userStatus
           value is normally ignored by the server.
      
       (2) If the server doesn't support upgrading, the reply packets will
           contain the same service ID as for the first request packet.
      
       (3) If the server does support upgrading, all future reply packets on that
           connection will contain the new service ID and the new service ID will
           be applied to *all* further calls on that connection as well.
      
       (4) The RPC op used to probe the upgrade must take the same request data
           as the shadow call in the upgrade set (but may return a different
           reply).  GetCapability RPC ops were added to all standard sets for
           just this purpose.  Ops where the request formats differ cannot be
           used for probing.
      
       (5) The client must wait for completion of the probe before sending any
           further RPC ops to the same destination.  It should then use the
           service ID that recvmsg() reported back in all future calls.
      
       (6) The shadow service must have call definitions for all the operation
           IDs defined by the original service.
      
      
      To support service upgrading, a server should:
      
       (1) Call bind() twice on its AF_RXRPC socket before calling listen().
           Each bind() should supply a different service ID, but the transport
           addresses must be the same.  This allows the server to receive
           requests with either service ID.
      
       (2) Enable automatic upgrading by calling setsockopt(), specifying
           RXRPC_UPGRADEABLE_SERVICE and passing in a two-member array of
           unsigned shorts as the argument:
      
      	unsigned short optval[2];
      
           This specifies a pair of service IDs.  They must be different and must
           match the service IDs bound to the socket.  Member 0 is the service ID
           to upgrade from and member 1 is the service ID to upgrade to.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      4722974d
    • D
      rxrpc: Permit multiple service binding · 28036f44
      David Howells 提交于
      Permit bind() to be called on an AF_RXRPC socket more than once (currently
      maximum twice) to bind multiple listening services to it.  There are some
      restrictions:
      
       (1) All bind() calls involved must have a non-zero service ID.
      
       (2) The service IDs must all be different.
      
       (3) The rest of the address (notably the transport part) must be the same
           in all (a single UDP socket is shared).
      
       (4) This must be done before listen() or sendmsg() is called.
      
      This allows someone to connect to the service socket with different service
      IDs and lays the foundation for service upgrading.
      
      The service ID used by an incoming call can be extracted from the msg_name
      returned by recvmsg().
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      28036f44
    • D
      rxrpc: Separate the connection's protocol service ID from the lookup ID · 68d6d1ae
      David Howells 提交于
      Keep the rxrpc_connection struct's idea of the service ID that is exposed
      in the protocol separate from the service ID that's used as a lookup key.
      
      This allows the protocol service ID on a client connection to get upgraded
      without making the connection unfindable for other client calls that also
      would like to use the upgraded connection.
      
      The connection's actual service ID is then returned through recvmsg() by
      way of msg_name.
      
      Whilst we're at it, we get rid of the last_service_id field from each
      channel.  The service ID is per-connection, not per-call and an entire
      connection is upgraded in one go.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      68d6d1ae
    • J
    • C
      rxrpc: remove redundant proc_remove call · 1820dd06
      Colin Ian King 提交于
      The proc_remove call is dead code as it occurs after a return and
      hence can never be called. Remove it.
      
      Detected by CoverityScan, CID#1437743 ("Logically dead code")
      Signed-off-by: NColin Ian King <colin.king@canonical.com>
      Acked-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1820dd06
  9. 26 5月, 2017 1 次提交
    • D
      rxrpc: Support network namespacing · 2baec2c3
      David Howells 提交于
      Support network namespacing in AF_RXRPC with the following changes:
      
       (1) All the local endpoint, peer and call lists, locks, counters, etc. are
           moved into the per-namespace record.
      
       (2) All the connection tracking is moved into the per-namespace record
           with the exception of the client connection ID tree, which is kept
           global so that connection IDs are kept unique per-machine.
      
       (3) Each namespace gets its own epoch.  This allows each network namespace
           to pretend to be a separate client machine.
      
       (4) The /proc/net/rxrpc_xxx files are now called /proc/net/rxrpc/xxx and
           the contents reflect the namespace.
      
      fs/afs/ should be okay with this patch as it explicitly requires the current
      net namespace to be init_net to permit a mount to proceed at the moment.  It
      will, however, need updating so that cells, IP addresses and DNS records are
      per-namespace also.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2baec2c3
  10. 06 4月, 2017 7 次提交
    • D
      rxrpc: Trace client call connection · 89ca6948
      David Howells 提交于
      Add a tracepoint (rxrpc_connect_call) to log the combination of rxrpc_call
      pointer, afs_call pointer/user data and wire call parameters to make it
      easier to match the tracebuffer contents to captured network packets.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      89ca6948
    • D
      rxrpc: Trace changes in a call's receive window size · 740586d2
      David Howells 提交于
      Add a tracepoint (rxrpc_rx_rwind_change) to log changes in a call's receive
      window size as imposed by the peer through an ACK packet.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      740586d2
    • D
      rxrpc: Trace received aborts · 005ede28
      David Howells 提交于
      Add a tracepoint (rxrpc_rx_abort) to record received aborts.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      005ede28
    • D
      rxrpc: Trace protocol errors in received packets · fb46f6ee
      David Howells 提交于
      Add a tracepoint (rxrpc_rx_proto) to record protocol errors in received
      packets.  The following changes are made:
      
       (1) Add a function, __rxrpc_abort_eproto(), to note a protocol error on a
           call and mark the call aborted.  This is wrapped by
           rxrpc_abort_eproto() that makes the why string usable in trace.
      
       (2) Add trace_rxrpc_rx_proto() or rxrpc_abort_eproto() to protocol error
           generation points, replacing rxrpc_abort_call() with the latter.
      
       (3) Only send an abort packet in rxkad_verify_packet*() if we actually
           managed to abort the call.
      
      Note that a trace event is also emitted if a kernel user (e.g. afs) tries
      to send data through a call when it's not in the transmission phase, though
      it's not technically a receive event.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      fb46f6ee
    • D
      rxrpc: Handle temporary errors better in rxkad security · ef68622d
      David Howells 提交于
      In the rxkad security module, when we encounter a temporary error (such as
      ENOMEM) from which we could conceivably recover, don't abort the
      connection, but rather permit retransmission of the relevant packets to
      induce a retry.
      
      Note that I'm leaving some places that could be merged together to insert
      tracing in the next patch.
      
      Signed-off-by; David Howells <dhowells@redhat.com>
      ef68622d
    • D
      rxrpc: Note a successfully aborted kernel operation · 84a4c09c
      David Howells 提交于
      Make rxrpc_kernel_abort_call() return an indication as to whether it
      actually aborted the operation or not so that kafs can trace the failure of
      the operation.  Note that 'success' in this context means changing the
      state of the call, not necessarily successfully transmitting an ABORT
      packet.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      84a4c09c
    • D
      rxrpc: Use negative error codes in rxrpc_call struct · 3a92789a
      David Howells 提交于
      Use negative error codes in struct rxrpc_call::error because that's what
      the kernel normally deals with and to make the code consistent.  We only
      turn them positive when transcribing into a cmsg for userspace recvmsg.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      3a92789a
  11. 17 3月, 2017 1 次提交
    • D
      rxrpc: Ignore BUSY packets on old calls · 4d4a6ac7
      David Howells 提交于
      If we receive a BUSY packet for a call we think we've just completed, the
      packet is handed off to the connection processor to deal with - but the
      connection processor doesn't expect a BUSY packet and so flags a protocol
      error.
      
      Fix this by simply ignoring the BUSY packet for the moment.
      
      The symptom of this may appear as a system call failing with EPROTO.  This
      may be triggered by pressing ctrl-C under some circumstances.
      
      This comes about we abort calls due to interruption by a signal (which we
      shouldn't do, but that's going to be a large fix and mostly in fs/afs/).
      What happens is that we abort the call and may also abort follow up calls
      too (this needs offloading somehoe).  So we see a transmission of something
      like the following sequence of packets:
      
      	DATA for call N
      	ABORT call N
      	DATA for call N+1
      	ABORT call N+1
      
      in very quick succession on the same channel.  However, the peer may have
      deferred the processing of the ABORT from the call N to a background thread
      and thus sees the DATA message from the call N+1 coming in before it has
      cleared the channel.  Thus it sends a BUSY packet[*].
      
      [*] Note that some implementations (OpenAFS, for example) mark the BUSY
          packet with one plus the callNumber of the call prior to call N.
          Ordinarily, this would be call N, but there's no requirement for the
          calls on a channel to be numbered strictly sequentially (the number is
          required to increase).
      
          This is wrong and means that the callNumber in the BUSY packet should
          be ignored (it really ought to be N+1 since that's what it's in
          response to).
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4d4a6ac7
  12. 11 3月, 2017 1 次提交
    • D
      rxrpc: Wake up the transmitter if Rx window size increases on the peer · 702f2ac8
      David Howells 提交于
      The RxRPC ACK packet may contain an extension that includes the peer's
      current Rx window size for this call.  We adjust the local Tx window size
      to match.  However, the transmitter can stall if the receive window is
      reduced to 0 by the peer and then reopened.
      
      This is because the normal way that the transmitter is re-energised is by
      dropping something out of our Tx queue and thus making space.  When a
      single gap is made, the transmitter is woken up.  However, because there's
      nothing in the Tx queue at this point, this doesn't happen.
      
      To fix this, perform a wake_up() any time we see the peer's Rx window size
      increasing.
      
      The observable symptom is that calls start failing on ETIMEDOUT and the
      following:
      
      	kAFS: SERVER DEAD state=-62
      
      appears in dmesg.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      702f2ac8
  13. 10 3月, 2017 1 次提交
  14. 08 3月, 2017 1 次提交
  15. 04 3月, 2017 1 次提交
  16. 02 3月, 2017 2 次提交
    • I
      sched/headers: Prepare to move signal wakeup & sigpending methods from... · 174cd4b1
      Ingo Molnar 提交于
      sched/headers: Prepare to move signal wakeup & sigpending methods from <linux/sched.h> into <linux/sched/signal.h>
      
      Fix up affected files that include this signal functionality via sched.h.
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      174cd4b1
    • D
      rxrpc: Fix deadlock between call creation and sendmsg/recvmsg · 540b1c48
      David Howells 提交于
      All the routines by which rxrpc is accessed from the outside are serialised
      by means of the socket lock (sendmsg, recvmsg, bind,
      rxrpc_kernel_begin_call(), ...) and this presents a problem:
      
       (1) If a number of calls on the same socket are in the process of
           connection to the same peer, a maximum of four concurrent live calls
           are permitted before further calls need to wait for a slot.
      
       (2) If a call is waiting for a slot, it is deep inside sendmsg() or
           rxrpc_kernel_begin_call() and the entry function is holding the socket
           lock.
      
       (3) sendmsg() and recvmsg() or the in-kernel equivalents are prevented
           from servicing the other calls as they need to take the socket lock to
           do so.
      
       (4) The socket is stuck until a call is aborted and makes its slot
           available to the waiter.
      
      Fix this by:
      
       (1) Provide each call with a mutex ('user_mutex') that arbitrates access
           by the users of rxrpc separately for each specific call.
      
       (2) Make rxrpc_sendmsg() and rxrpc_recvmsg() unlock the socket as soon as
           they've got a call and taken its mutex.
      
           Note that I'm returning EWOULDBLOCK from recvmsg() if MSG_DONTWAIT is
           set but someone else has the lock.  Should I instead only return
           EWOULDBLOCK if there's nothing currently to be done on a socket, and
           sleep in this particular instance because there is something to be
           done, but we appear to be blocked by the interrupt handler doing its
           ping?
      
       (3) Make rxrpc_new_client_call() unlock the socket after allocating a new
           call, locking its user mutex and adding it to the socket's call tree.
           The call is returned locked so that sendmsg() can add data to it
           immediately.
      
           From the moment the call is in the socket tree, it is subject to
           access by sendmsg() and recvmsg() - even if it isn't connected yet.
      
       (4) Lock new service calls in the UDP data_ready handler (in
           rxrpc_new_incoming_call()) because they may already be in the socket's
           tree and the data_ready handler makes them live immediately if a user
           ID has already been preassigned.
      
           Note that the new call is locked before any notifications are sent
           that it is live, so doing mutex_trylock() *ought* to always succeed.
           Userspace is prevented from doing sendmsg() on calls that are in a
           too-early state in rxrpc_do_sendmsg().
      
       (5) Make rxrpc_new_incoming_call() return the call with the user mutex
           held so that a ping can be scheduled immediately under it.
      
           Note that it might be worth moving the ping call into
           rxrpc_new_incoming_call() and then we can drop the mutex there.
      
       (6) Make rxrpc_accept_call() take the lock on the call it is accepting and
           release the socket after adding the call to the socket's tree.  This
           is slightly tricky as we've dequeued the call by that point and have
           to requeue it.
      
           Note that requeuing emits a trace event.
      
       (7) Make rxrpc_kernel_send_data() and rxrpc_kernel_recv_data() take the
           new mutex immediately and don't bother with the socket mutex at all.
      
      This patch has the nice bonus that calls on the same socket are now to some
      extent parallelisable.
      
      Note that we might want to move rxrpc_service_prealloc() calls out from the
      socket lock and give it its own lock, so that we don't hang progress in
      other calls because we're waiting for the allocator.
      
      We probably also want to avoid calling rxrpc_notify_socket() from within
      the socket lock (rxrpc_accept_call()).
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Tested-by: NMarc Dionne <marc.c.dionne@auristor.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      540b1c48
  17. 27 2月, 2017 1 次提交
    • D
      rxrpc: Kernel calls get stuck in recvmsg · d7e15835
      David Howells 提交于
      Calls made through the in-kernel interface can end up getting stuck because
      of a missed variable update in a loop in rxrpc_recvmsg_data().  The problem
      is like this:
      
       (1) A new packet comes in and doesn't cause a notification to be given to
           the client as there's still another packet in the ring - the
           assumption being that if the client will keep drawing off data until
           the ring is empty.
      
       (2) The client is in rxrpc_recvmsg_data(), inside the big while loop that
           iterates through the packets.  This copies the window pointers into
           variables rather than using the information in the call struct
           because:
      
           (a) MSG_PEEK might be in effect;
      
           (b) we need a barrier after reading call->rx_top to pair with the
           	 barrier in the softirq routine that loads the buffer.
      
       (3) The reading of call->rx_top is done outside of the loop, and top is
           never updated whilst we're in the loop.  This means that even through
           there's a new packet available, we don't see it and may return -EFAULT
           to the caller - who will happily return to the scheduler and await the
           next notification.
      
       (4) No further notifications are forthcoming until there's an abort as the
           ring isn't empty.
      
      The fix is to move the read of call->rx_top inside the loop - but it needs
      to be done before the condition is checked.
      Reported-by: NMarc Dionne <marc.dionne@auristor.com>
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Tested-by: NMarc Dionne <marc.dionne@auristor.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d7e15835