1. 22 Jun, 2016 1 commit
    • rxrpc: Use structs to hold connection params and protocol info · 19ffa01c
      David Howells committed
      Define and use a structure to hold connection parameters.  This makes it
      easier to pass multiple connection parameters around.
      
      Define and use a structure to hold protocol information used to hash a
      connection for lookup on incoming packet.  Most of these fields will be
      disposed of eventually, including the duplicate local pointer.
      
      Whilst we're at it rename "proto" to "family" when referring to a protocol
      family.
      Signed-off-by: David Howells <dhowells@redhat.com>
  2. 15 Jun, 2016 5 commits
    • rxrpc: Rework local endpoint management · 4f95dd78
      David Howells committed
      Rework the local RxRPC endpoint management.
      
      Local endpoint objects are maintained in a flat list as before.  This
      should be okay as there shouldn't be more than one per open AF_RXRPC socket
      (there can be fewer as local endpoints can be shared if their local service
      ID is 0 and they share the same local transport parameters).
      
      Changes:
      
       (1) Local endpoints may now only be shared if they have local service ID 0
           (ie. they're not being used for listening).
      
           This prevents a scenario where process A is listening on the Cache
           Manager port and process B contacts a fileserver - which may then
           attempt to send CM requests back to B.  But if A and B are sharing a
           local endpoint, A will get the CM requests meant for B.
      
       (2) We use a mutex to handle lookups and don't provide RCU-only lookups
           since we only expect to access the list when opening a socket or
           destroying an endpoint.
      
           The local endpoint object is pointed to by the transport socket's
           sk_user_data for the life of the transport socket - allowing us to
           refer to it directly from the sk_data_ready and sk_error_report
           callbacks.
      
       (3) atomic_inc_not_zero() now exists and can be used to only share a local
           endpoint if the last reference hasn't yet gone.
      
       (4) We can remove rxrpc_local_lock - a spinlock that had to be taken with
           BH processing disabled given that we assume sk_user_data won't change
           under us.
      
       (5) The transport socket is shut down before we clear the sk_user_data
           pointer so that we can be sure that the transport socket's callbacks
           won't be invoked once the RCU destruction is scheduled.
      
       (6) Local endpoints have a work item that handles both destruction and
           event processing.  This means that destruction doesn't then need to
           wait for event processing.  The event queues can then be cleared after
           the transport socket is shut down.
      
       (7) Local endpoints are no longer available for resurrection beyond the
           life of the sockets that had them open.  As soon as their last ref
           goes, they are scheduled for destruction and may not have their usage
           count moved from 0.
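
Points (3) and (7) rely on an increment-unless-zero operation.  A minimal
userspace sketch of that idea using C11 atomics follows; it is not the
kernel's atomic_inc_not_zero(), and struct local_endpoint, local_get_maybe()
and local_put() are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a local endpoint; only the usage count matters. */
struct local_endpoint {
	atomic_int usage;
};

/*
 * Take a reference only if the count is still non-zero.  Returns false once
 * the last reference has gone, in which case the caller has to set up a new
 * endpoint rather than resurrect this one.
 */
bool local_get_maybe(struct local_endpoint *local)
{
	int old;

	assert(local != NULL);
	old = atomic_load(&local->usage);
	while (old != 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&local->usage, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if this was the last one. */
bool local_put(struct local_endpoint *local)
{
	assert(local != NULL);
	return atomic_fetch_sub(&local->usage, 1) == 1;
}
```

Once local_put() has returned true, every later local_get_maybe() fails, so
the usage count can never move from 0 again.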
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Separate local endpoint event handling out into its own file · 87563616
      David Howells committed
      Separate local endpoint event handling out into its own file preparatory to
      overhauling the object management aspect (which remains in the original
      file).
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Use the peer record to distribute network errors · f66d7490
      David Howells committed
      Use the peer record to distribute network errors rather than the transport
      object (which I want to get rid of).  An error from a particular peer
      terminates all calls on that peer.
      
      For future consideration:
      
       (1) For ICMP-induced errors it might be worth trying to extract the RxRPC
           header from the offending packet, if one is returned attached to the
           ICMP packet, to better direct the error.
      
           This may be overkill, though, since an ICMP packet would be expected
           to be relating to the destination port, machine or network.  RxRPC
           ABORT and BUSY packets give notice at RxRPC level.
      
       (2) To also abort connection-level communications (such as CHALLENGE
           packets) where indicated by an error - but that requires some revamping
           of the connection event handling first.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Rename rxrpc_UDP_error_report() to rxrpc_error_report() · abe89ef0
      David Howells committed
      Rename rxrpc_UDP_error_report() to rxrpc_error_report() as it might get
      called for something other than UDP.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Rework peer object handling to use hash table and RCU · be6e6707
      David Howells committed
      Rework peer object handling to use a hash table instead of a flat list and
      to use RCU.  Peer objects are no longer destroyed by passing them to a
      workqueue to process, but rather are just passed to the RCU garbage
      collector as kfree'able objects.
      
      The hash function uses the local endpoint plus all the components of the
      remote address, except for the RxRPC service ID.  Peers thus represent a
      UDP port on the remote machine as contacted by a UDP port on this machine.
      
      The RCU read lock is used to handle non-creating lookups so that they can
      be called from bottom half context in the sk_error_report handler without
      having to lock the hash table against modification.
      rxrpc_lookup_peer_rcu() *does* take a reference on the peer object as in
      the future, this will be passed to a work item for error distribution in
      the error_report path and this function will cease being used in the
      data_ready path.
      
      Creating lookups are done under spinlock rather than mutex as they might be
      set up due to an external stimulus if the local endpoint is a server.
      
      Captured network error messages (ICMP) are handled with respect to this
      struct and MTU size and RTT are cached here.
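
The choice of hash inputs can be sketched as below.  This is only an
illustration of "local endpoint plus remote address, minus service ID"; the
struct layout, the IPv4-only address, the multiplier 31 and the name
peer_hash_key() are assumptions made for the example, not a copy of the
kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lookup key for a peer. */
struct peer_key {
	const void *local;      /* local endpoint the peer is contacted through */
	uint16_t    family;     /* address family of the remote address */
	uint16_t    port;       /* remote UDP port */
	uint32_t    addr;       /* remote IPv4 address */
	uint16_t    service_id; /* RxRPC service ID: deliberately not hashed */
};

/* Mix the local endpoint and the transport-level address components. */
unsigned long peer_hash_key(const struct peer_key *key)
{
	unsigned long h;

	assert(key != NULL);
	h = (unsigned long)(uintptr_t)key->local;
	h = h * 31 + key->family;
	h = h * 31 + key->port;
	h = h * 31 + (key->addr >> 16);
	h = h * 31 + (key->addr & 0xffff);
	return h;
}
```

Two keys that differ only in service_id hash identically, which matches the
statement that a peer represents a remote UDP port rather than a service.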
      Signed-off-by: David Howells <dhowells@redhat.com>
  3. 13 Jun, 2016 1 commit
  4. 11 Jun, 2016 1 commit
    • rxrpc: Limit the listening backlog · 0e119b41
      David Howells committed
      Limit the socket incoming call backlog queue size so that a remote client
      can't pump in sufficient new calls that the server runs out of memory.  Note
      that this is partially theoretical at the moment since whilst the number of
      calls is limited, the number of packets trying to set up new calls is not.
      This will be addressed in a later patch.
      
      If the caller of listen() specifies a backlog of INT_MAX, then they get the
      current maximum; anything else greater than max_backlog or anything
      negative incurs EINVAL.
      
      The limit on the maximum queue size can be set by:
      
      	echo N >/proc/sys/net/rxrpc/max_backlog
      
      where 4<=N<=32.
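
The backlog rule described above can be sketched as a small standalone
function.  check_backlog() is a hypothetical name and the function is a
userspace illustration of the stated rule, not the code in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/*
 * Map the backlog passed to listen() onto the value to use.
 * max_backlog is the current sysctl setting (4 <= max_backlog <= 32).
 * Returns the effective backlog, or -EINVAL if the request is rejected.
 */
int check_backlog(int backlog, int max_backlog)
{
	assert(max_backlog >= 4 && max_backlog <= 32);

	if (backlog == INT_MAX)
		return max_backlog;	/* "give me the current maximum" */
	if (backlog < 0 || backlog > max_backlog)
		return -EINVAL;
	return backlog;
}
```

With max_backlog at 32, a request of INT_MAX yields 32, a request of 5
yields 5, and a request of 33 or -1 yields -EINVAL.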
      
      Further, set the default backlog to 0, requiring listen() to be called
      before we start actually queueing new calls.  Whilst this is kind of a
      change in the UAPI, the caller can't actually *accept* new calls anyway
      unless they've first called listen() to put the socket into the LISTENING
      state - thus the aforementioned new calls would otherwise just sit there,
      eating up kernel memory.  (Note that sockets that don't have a non-zero
      service ID bound don't get incoming calls anyway.)
      
      Given that the default backlog is now 0, make the AFS filesystem call
      kernel_listen() to set the maximum backlog for itself.
      
      Possible improvements include:
      
       (1) Trimming a too-large backlog to max_backlog when listen is called.
      
       (2) Trimming the backlog value whenever the value is used so that changes
           to max_backlog are applied to an open socket automatically.  Note that
           the AFS filesystem opens one socket and keeps it open for extended
           periods, so would miss out on changes to max_backlog.
      
       (3) Having a separate setting for the AFS filesystem.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 10 Jun, 2016 1 commit
    • rxrpc: Simplify connect() implementation and simplify sendmsg() op · 2341e077
      David Howells committed
      Simplify the RxRPC connect() implementation.  It will just note the
      destination address it is given, and if a sendmsg() comes along with no
      address, this will be assigned as the address.  No transport struct will be
      held internally, which will allow us to remove this later.
      
      Simplify sendmsg() also.  Whilst a call is active, userspace refers to it
      by a private unique user ID specified in a control message.  When sendmsg()
      sees a user ID that doesn't map to an extant call, it creates a new call
      for that user ID and attempts to add it.  If, when we try to add it, the
      user ID is now registered, we now reject the message with -EEXIST.  We
      should never see this situation unless two threads are racing, trying to
      create a call with the same ID - which would be an error.
      
      It also isn't required to provide sendmsg() with an address - provided the
      control message data holds a user ID that maps to a currently active call.
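
The -EEXIST behaviour can be illustrated with a toy registry of user call
IDs.  The fixed-size table, the -ENOSPC return for a full table and the names
struct call_table and call_table_add() are assumptions made for this sketch;
the real socket keeps its calls in its own data structure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_CALLS 8	/* arbitrary capacity for the illustration */

/* Hypothetical per-socket table of user call IDs with active calls. */
struct call_table {
	unsigned long ids[MAX_CALLS];
	size_t nr;
};

/*
 * Register a new call under user_call_id.  If the ID is already present
 * (for instance because another thread won the race to create it), the
 * request is rejected with -EEXIST instead of being merged into that call.
 */
int call_table_add(struct call_table *t, unsigned long user_call_id)
{
	size_t i;

	assert(t != NULL);
	for (i = 0; i < t->nr; i++)
		if (t->ids[i] == user_call_id)
			return -EEXIST;
	if (t->nr == MAX_CALLS)
		return -ENOSPC;
	t->ids[t->nr++] = user_call_id;
	return 0;
}
```

Adding the same ID twice returns 0 the first time and -EEXIST the second.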
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 04 Jun, 2016 1 commit
    • rxrpc: Use pr_<level> and pr_fmt, reduce object size a few KB · 9b6d5398
      Joe Perches committed
      Use the more common kernel logging style and reduce object size.
      
      The logging message prefix changes from a mixture of
      "RxRPC:" and "RXRPC:" to "af_rxrpc: ".
      
      $ size net/rxrpc/built-in.o*
         text	   data	    bss	    dec	    hex	filename
        64172	   1972	   8304	  74448	  122d0	net/rxrpc/built-in.o.new
        67512	   1972	   8304	  77788	  12fdc	net/rxrpc/built-in.o.old
      
      Miscellanea:
      
      o Consolidate the ASSERT macros to use a single pr_err call with
        decimal and hexadecimal output and a stringified #OP argument
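
A userspace sketch of the consolidated assertion style is shown below.  It
uses snprintf() and fprintf() in place of pr_err(), reports instead of
halting, and the names format_cmp_failure() and CHECKCMP() as well as the
exact message text are assumptions for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Every message gets the module prefix, as pr_fmt() arranges in the kernel. */
#define pr_fmt(fmt) "af_rxrpc: " fmt

/*
 * Format a failed comparison once, showing both operands in decimal and
 * hexadecimal plus the operator text.  Returns snprintf()'s result.
 */
int format_cmp_failure(char *buf, size_t size, unsigned long x,
		       const char *op, unsigned long y)
{
	return snprintf(buf, size,
			pr_fmt("Assertion failed - %lu(0x%lx) %s %lu(0x%lx) is false"),
			x, x, op, y, y);
}

/*
 * Stand-in for an ASSERTCMP-style macro: #OP stringifies the operator so
 * that a single format string serves every comparison.
 */
#define CHECKCMP(X, OP, Y)						\
	do {								\
		unsigned long _x = (unsigned long)(X);			\
		unsigned long _y = (unsigned long)(Y);			\
		if (!(_x OP _y)) {					\
			char _msg[128];					\
			format_cmp_failure(_msg, sizeof(_msg), _x, #OP, _y); \
			fprintf(stderr, "%s\n", _msg);			\
		}							\
	} while (0)
```

Sharing one format string across all comparison macros is what allows the
duplicated per-operator strings to disappear from the object file.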
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 12 Apr, 2016 6 commits
  8. 14 Mar, 2016 1 commit
  9. 04 Mar, 2016 4 commits
    • rxrpc: Adjust some whitespace and comments · b4f1342f
      David Howells committed
      Remove some excess whitespace, insert some missing spaces and adjust a
      couple of comments.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Keep the skb private record of the Rx header in host byte order · 0d12f8a4
      David Howells committed
      Currently, a copy of the Rx packet header is placed in the sk_buff
      private data so that we can advance the pointer into the buffer,
      potentially discarding the original.  At the moment, this copy is held in
      network byte order, but this means we're doing a lot of unnecessary
      translations.
      
      The reasons it was done this way are that we need the values in network
      byte order occasionally and we can use the copy, slightly modified, as part
      of an iov array when sending an ack or an abort packet.
      
      However, it seems more reasonable on review that it would be better kept in
      host byte order and that we make up a new header when we want to send
      another packet.
      
      To this end, rename the original header struct to rxrpc_wire_header (with
      BE fields) and institute a variant called rxrpc_host_header that has host
      order fields.  Change the struct in the sk_buff private data into an
      rxrpc_host_header and translate the values when filling it in.
      
      This further allows us to keep values kept in various structures in host
      byte order rather than network byte order and allows removal of some fields
      that are byteswapped duplicates.
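
The split between a wire-order and a host-order header can be sketched in
userspace as follows.  The field list is an illustrative subset and the
struct and function names are hypothetical; only the pattern of translating
once on receive and building a fresh wire header on transmit is taken from
the description above.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative subset of an Rx header as seen on the wire (big-endian). */
struct wire_header {
	uint32_t epoch;
	uint32_t cid;
	uint32_t call_number;
	uint32_t seq;
	uint32_t serial;
	uint16_t service_id;
};

/* The same fields in host byte order, as kept alongside a received packet. */
struct host_header {
	uint32_t epoch;
	uint32_t cid;
	uint32_t call_number;
	uint32_t seq;
	uint32_t serial;
	uint16_t service_id;
};

/* Translate once when the packet arrives. */
void wire_to_host(const struct wire_header *w, struct host_header *h)
{
	assert(w != NULL && h != NULL);
	h->epoch       = ntohl(w->epoch);
	h->cid         = ntohl(w->cid);
	h->call_number = ntohl(w->call_number);
	h->seq         = ntohl(w->seq);
	h->serial      = ntohl(w->serial);
	h->service_id  = ntohs(w->service_id);
}

/* Make up a new wire header when another packet has to be sent. */
void host_to_wire(const struct host_header *h, struct wire_header *w)
{
	assert(w != NULL && h != NULL);
	w->epoch       = htonl(h->epoch);
	w->cid         = htonl(h->cid);
	w->call_number = htonl(h->call_number);
	w->seq         = htonl(h->seq);
	w->serial      = htonl(h->serial);
	w->service_id  = htons(h->service_id);
}
```

Code that compares or increments sequence and serial numbers can then work
on the host_header fields directly, without a byte swap at each use.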
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Rename call events to begin RXRPC_CALL_EV_ · 4c198ad1
      David Howells committed
      Rename call event names to begin RXRPC_CALL_EV_ to distinguish them from the
      flags.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Convert call flag and event numbers into enums · 5b8848d1
      David Howells committed
      Convert call flag and event numbers into enums and move their definitions
      outside of the struct.
      
      Also move the call state enum outside of the struct and add an extra
      element to count the number of states.
      Signed-off-by: David Howells <dhowells@redhat.com>
  10. 27 Jan, 2016 1 commit
  11. 21 Sep, 2015 1 commit
  12. 01 Apr, 2015 1 commit
    • RxRPC: Handle VERSION Rx protocol packets · 44ba0698
      David Howells committed
      Handle VERSION Rx protocol packets.  We should respond to a VERSION packet
      with a string indicating the Rx version.  This is a maximum of 64 characters
      and is padded out to 65 chars with NUL bytes.
      
      Note that other AFS clients use the version request as a NAT keepalive so we
      need to handle it rather than returning an abort.
      
      The standard formulation seems to be:
      
      	<project> <version> built <yyyy>-<mm>-<dd>
      
      for example:
      
      	" OpenAFS 1.6.2 built  2013-05-07 "
      
      (note the three extra spaces) as obtained with:
      
      	rxdebug grand.mit.edu -version
      
      from the openafs package.
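
The reply payload layout (up to 64 characters, NUL-padded to 65 bytes) can
be sketched as below.  build_version_reply() is a hypothetical helper and
any version string passed to it is an assumption; the string the kernel
actually sends is not stated in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define VERSION_MAX_CHARS  64	/* longest version string allowed */
#define VERSION_REPLY_SIZE 65	/* payload size after NUL padding */

/*
 * Fill out[] with at most 64 characters of version, padding the remainder
 * of the 65-byte payload with NUL bytes.  Returns the number of version
 * characters copied.
 */
size_t build_version_reply(char out[VERSION_REPLY_SIZE], const char *version)
{
	size_t len;

	assert(out != NULL && version != NULL);
	len = strlen(version);
	if (len > VERSION_MAX_CHARS)
		len = VERSION_MAX_CHARS;	/* truncate over-long strings */
	memset(out, 0, VERSION_REPLY_SIZE);
	memcpy(out, version, len);
	return len;
}
```

Because the buffer is one byte longer than the maximum string, the payload
always ends in at least one NUL byte.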
      Signed-off-by: David Howells <dhowells@redhat.com>
  13. 03 Mar, 2015 1 commit
  14. 12 Apr, 2014 1 commit
    • net: Fix use after free by removing length arg from sk_data_ready callbacks. · 676d2369
      David S. Miller committed
      Several spots in the kernel perform a sequence like:
      
      	skb_queue_tail(&sk->sk_receive_queue, skb);
      	sk->sk_data_ready(sk, skb->len);
      
      But at the moment we place the SKB onto the socket receive queue it
      can be consumed and freed up.  So this skb->len access is potentially
      to freed up memory.
      
      Furthermore, the skb->len can be modified by the consumer so it is
      possible that the value isn't accurate.
      
      And finally, no actual implementation of this callback actually uses
      the length argument.  And since nobody actually cared about its
      value, lots of call sites pass arbitrary values in such as '0' and
      even '1'.
      
      So just remove the length argument from the callback, that way there
      is no confusion whatsoever and all of these use-after-free cases get
      fixed as a side effect.
      
      Based upon a patch by Eric Dumazet and his suggestion to audit this
      issue tree-wide.
      Signed-off-by: David S. Miller <davem@davemloft.net>
  15. 04 Mar, 2014 1 commit
  16. 27 Feb, 2014 2 commits
  17. 20 Oct, 2013 1 commit
  18. 16 Apr, 2012 1 commit
  19. 13 Aug, 2010 1 commit
  20. 16 Sep, 2009 1 commit
  21. 15 Sep, 2009 1 commit
  22. 04 Apr, 2008 1 commit
  23. 31 Mar, 2008 1 commit
  24. 06 Mar, 2008 1 commit
  25. 01 Feb, 2008 1 commit
  26. 27 Apr, 2007 2 commits
    • [RXRPC]: Remove bogus atomic_* overrides. · 411faf58
      David S. Miller committed
      These are done with CPP defines which several platforms
      use for their atomic.h implementation, which floods the
      build with warnings and breaks the build.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [AF_RXRPC]: Add an interface to the AF_RXRPC module for the AFS filesystem to use · 651350d1
      David Howells committed
      Add an interface to the AF_RXRPC module so that the AFS filesystem module can
      more easily make use of the services available.  AFS still opens a socket but
      then uses the action functions in lieu of sendmsg() and registers an intercept
      function to grab messages before they're queued on the socket Rx queue.
      
      This permits AFS (or whatever) to:
      
       (1) Avoid the overhead of using the recvmsg() call.
      
       (2) Use different keys directly on individual client calls on one socket
           rather than having to open a whole slew of sockets, one for each key it
           might want to use.
      
       (3) Avoid calling request_key() at the point of issue of a call or opening of
           a socket.  This is done instead by AFS at the point of open(), unlink() or
           other VFS operation and the key handed through.
      
       (4) Request the use of something other than GFP_KERNEL to allocate memory.
      
      Furthermore:
      
       (*) The socket buffer markings used by RxRPC are made available for AFS so
           that it can interpret the cooked RxRPC messages itself.
      
       (*) rxgen (un)marshalling abort codes are made available.
      
      
      The following documentation for the kernel interface is added to
      Documentation/networking/rxrpc.txt:
      
      =========================
      AF_RXRPC KERNEL INTERFACE
      =========================
      
      The AF_RXRPC module also provides an interface for use by in-kernel utilities
      such as the AFS filesystem.  This permits such a utility to:
      
       (1) Use different keys directly on individual client calls on one socket
           rather than having to open a whole slew of sockets, one for each key it
           might want to use.
      
       (2) Avoid having RxRPC call request_key() at the point of issue of a call or
           opening of a socket.  Instead the utility is responsible for requesting a
           key at the appropriate point.  AFS, for instance, would do this during VFS
           operations such as open() or unlink().  The key is then handed through
           when the call is initiated.
      
       (3) Request the use of something other than GFP_KERNEL to allocate memory.
      
       (4) Avoid the overhead of using the recvmsg() call.  RxRPC messages can be
           intercepted before they get put into the socket Rx queue and the socket
           buffers manipulated directly.
      
      To use the RxRPC facility, a kernel utility must still open an AF_RXRPC socket,
      bind an address as appropriate and listen if it's to be a server socket, but
      then it passes this to the kernel interface functions.
      
      The kernel interface functions are as follows:
      
       (*) Begin a new client call.
      
      	struct rxrpc_call *
      	rxrpc_kernel_begin_call(struct socket *sock,
      				struct sockaddr_rxrpc *srx,
      				struct key *key,
      				unsigned long user_call_ID,
      				gfp_t gfp);
      
           This allocates the infrastructure to make a new RxRPC call and assigns
           call and connection numbers.  The call will be made on the UDP port that
           the socket is bound to.  The call will go to the destination address of a
           connected client socket unless an alternative is supplied (srx is
           non-NULL).
      
           If a key is supplied then this will be used to secure the call instead of
           the key bound to the socket with the RXRPC_SECURITY_KEY sockopt.  Calls
           secured in this way will still share connections if at all possible.
      
           The user_call_ID is equivalent to that supplied to sendmsg() in the
           control data buffer.  It is entirely feasible to use this to point to a
           kernel data structure.
      
           If this function is successful, an opaque reference to the RxRPC call is
           returned.  The caller now holds a reference on this and it must be
           properly ended.
      
       (*) End a client call.
      
      	void rxrpc_kernel_end_call(struct rxrpc_call *call);
      
           This is used to end a previously begun call.  The user_call_ID is expunged
           from AF_RXRPC's knowledge and will not be seen again in association with
           the specified call.
      
       (*) Send data through a call.
      
      	int rxrpc_kernel_send_data(struct rxrpc_call *call, struct msghdr *msg,
      				   size_t len);
      
           This is used to supply either the request part of a client call or the
           reply part of a server call.  msg.msg_iovlen and msg.msg_iov specify the
           data buffers to be used.  msg_iov may not be NULL and must point
           exclusively to in-kernel virtual addresses.  msg.msg_flags may be given
           MSG_MORE if there will be subsequent data sends for this call.
      
           The msg must not specify a destination address, control data or any flags
           other than MSG_MORE.  len is the total amount of data to transmit.
      
       (*) Abort a call.
      
      	void rxrpc_kernel_abort_call(struct rxrpc_call *call, u32 abort_code);
      
           This is used to abort a call if it's still in an abortable state.  The
           abort code specified will be placed in the ABORT message sent.
      
       (*) Intercept received RxRPC messages.
      
      	typedef void (*rxrpc_interceptor_t)(struct sock *sk,
      					    unsigned long user_call_ID,
      					    struct sk_buff *skb);
      
      	void
      	rxrpc_kernel_intercept_rx_messages(struct socket *sock,
      					   rxrpc_interceptor_t interceptor);
      
           This installs an interceptor function on the specified AF_RXRPC socket.
           All messages that would otherwise wind up in the socket's Rx queue are
           then diverted to this function.  Note that care must be taken to process
           the messages in the right order to maintain DATA message sequentiality.
      
           The interceptor function itself is provided with the address of the socket
           handling the incoming message, the ID assigned by the kernel utility
           to the call and the socket buffer containing the message.
      
           The skb->mark field indicates the type of message:
      
      	MARK				MEANING
      	===============================	=======================================
      	RXRPC_SKB_MARK_DATA		Data message
      	RXRPC_SKB_MARK_FINAL_ACK	Final ACK received for an incoming call
      	RXRPC_SKB_MARK_BUSY		Client call rejected as server busy
      	RXRPC_SKB_MARK_REMOTE_ABORT	Call aborted by peer
      	RXRPC_SKB_MARK_NET_ERROR	Network error detected
      	RXRPC_SKB_MARK_LOCAL_ERROR	Local error encountered
      	RXRPC_SKB_MARK_NEW_CALL		New incoming call awaiting acceptance
      
           The remote abort message can be probed with rxrpc_kernel_get_abort_code().
           The two error messages can be probed with rxrpc_kernel_get_error_number().
           A new call can be accepted with rxrpc_kernel_accept_call().
      
           Data messages can have their contents extracted with the usual bunch of
           socket buffer manipulation functions.  A data message can be determined to
           be the last one in a sequence with rxrpc_kernel_is_data_last().  When a
           data message has been used up, rxrpc_kernel_data_delivered() should be
           called on it.
      
           Non-data messages should be handed to rxrpc_kernel_free_skb() to dispose
           of.  It is possible to get extra refs on all types of message for later
           freeing, but this may pin the state of a call until the message is finally
           freed.
      
       (*) Accept an incoming call.
      
      	struct rxrpc_call *
      	rxrpc_kernel_accept_call(struct socket *sock,
      				 unsigned long user_call_ID);
      
           This is used to accept an incoming call and to assign it a call ID.  This
           function is similar to rxrpc_kernel_begin_call() and calls accepted must
           be ended in the same way.
      
           If this function is successful, an opaque reference to the RxRPC call is
           returned.  The caller now holds a reference on this and it must be
           properly ended.
      
       (*) Reject an incoming call.
      
      	int rxrpc_kernel_reject_call(struct socket *sock);
      
           This is used to reject the first incoming call on the socket's queue with
           a BUSY message.  -ENODATA is returned if there were no incoming calls.
           Other errors may be returned if the call had been aborted (-ECONNABORTED)
           or had timed out (-ETIME).
      
       (*) Record the delivery of a data message and free it.
      
      	void rxrpc_kernel_data_delivered(struct sk_buff *skb);
      
           This is used to record a data message as having been delivered and to
           update the ACK state for the call.  The socket buffer will be freed.
      
       (*) Free a message.
      
      	void rxrpc_kernel_free_skb(struct sk_buff *skb);
      
           This is used to free a non-DATA socket buffer intercepted from an AF_RXRPC
           socket.
      
       (*) Determine if a data message is the last one on a call.
      
      	bool rxrpc_kernel_is_data_last(struct sk_buff *skb);
      
           This is used to determine if a socket buffer holds the last data message
           to be received for a call (true will be returned if it does, false
           if not).
      
           The data message will be part of the reply on a client call and the
           request on an incoming call.  In the latter case there will be more
           messages, but in the former case there will not.
      
       (*) Get the abort code from an abort message.
      
      	u32 rxrpc_kernel_get_abort_code(struct sk_buff *skb);
      
           This is used to extract the abort code from a remote abort message.
      
       (*) Get the error number from a local or network error message.
      
      	int rxrpc_kernel_get_error_number(struct sk_buff *skb);
      
           This is used to extract the error number from a message indicating either
           a local error occurred or a network error occurred.
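
The per-mark handling described for the interceptor can be summarised as a
dispatch table.  The sketch below is a userspace model only: the enum values
are stand-ins for the RXRPC_SKB_MARK_* constants (their numbering is
assumed), and interceptor_step() is a hypothetical helper that names the
documented follow-up call rather than making it.

```c
#include <assert.h>

/* Stand-ins for the RXRPC_SKB_MARK_* values; the numbering is assumed. */
enum skb_mark {
	MARK_DATA,
	MARK_FINAL_ACK,
	MARK_BUSY,
	MARK_REMOTE_ABORT,
	MARK_NET_ERROR,
	MARK_LOCAL_ERROR,
	MARK_NEW_CALL,
};

/* What the interceptor is expected to do with a message of each type. */
enum skb_step {
	STEP_CONSUME_DATA,	/* extract payload, rxrpc_kernel_data_delivered() */
	STEP_READ_ABORT_CODE,	/* rxrpc_kernel_get_abort_code(), then free */
	STEP_READ_ERROR_NUMBER,	/* rxrpc_kernel_get_error_number(), then free */
	STEP_ACCEPT_CALL,	/* rxrpc_kernel_accept_call(), then free */
	STEP_FREE_ONLY,		/* rxrpc_kernel_free_skb() */
	STEP_UNKNOWN,		/* mark not covered by the table above */
};

enum skb_step interceptor_step(enum skb_mark mark)
{
	switch (mark) {
	case MARK_DATA:
		return STEP_CONSUME_DATA;
	case MARK_REMOTE_ABORT:
		return STEP_READ_ABORT_CODE;
	case MARK_NET_ERROR:
	case MARK_LOCAL_ERROR:
		return STEP_READ_ERROR_NUMBER;
	case MARK_NEW_CALL:
		return STEP_ACCEPT_CALL;
	case MARK_FINAL_ACK:
	case MARK_BUSY:
		return STEP_FREE_ONLY;
	}
	return STEP_UNKNOWN;
}
```

Only data messages go through rxrpc_kernel_data_delivered(); every other
type ends with rxrpc_kernel_free_skb() once any code or error number has
been read from it.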
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>