1. 02 4月, 2017 1 次提交
  2. 31 3月, 2017 1 次提交
    • X
      sctp: alloc stream info when initializing asoc · 3dbcc105
      Xin Long 提交于
      When sending a msg without asoc established, sctp will send INIT packet
      first and then enqueue chunks.
      
      Before receiving INIT_ACK, stream info is not yet alloced. But enqueuing
      chunks needs to access stream info, like out stream state and out stream
      cnt.
      
      This patch is to fix it by allocing out stream info when initializing an
      asoc, allocing in stream and re-allocing out stream when processing init.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3dbcc105
  3. 29 3月, 2017 1 次提交
    • X
      sctp: change to save MSG_MORE flag into assoc · f9ba3501
      Xin Long 提交于
      David Laight noticed the support for MSG_MORE with datamsg->force_delay
      didn't really work as we expected, as the first msg with MSG_MORE set
      would always block the following chunks' dequeuing.
      
      This Patch is to rewrite it by saving the MSG_MORE flag into assoc as
      David Laight suggested.
      
      asoc->force_delay is used to save MSG_MORE flag before a msg is sent.
      All chunks in queue would not be sent out if asoc->force_delay is set
      by the msg with MSG_MORE flag, until a new msg without MSG_MORE flag
      clears asoc->force_delay.
      
      Note that this change would not affect the flush is generated by other
      triggers, like asoc->state != ESTABLISHED, queue size > pmtu etc.
      
      v1->v2:
        Not clear asoc->force_delay after sending the msg with MSG_MORE flag.
      
      Fixes: 4ea0c32f ("sctp: add support for MSG_MORE")
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NDavid Laight <david.laight@aculab.com>
      Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f9ba3501
  4. 23 3月, 2017 1 次提交
  5. 22 3月, 2017 2 次提交
  6. 10 3月, 2017 1 次提交
    • D
      net: Work around lockdep limitation in sockets that use sockets · cdfbabfb
      David Howells 提交于
      Lockdep issues a circular dependency warning when AFS issues an operation
      through AF_RXRPC from a context in which the VFS/VM holds the mmap_sem.
      
      The theory lockdep comes up with is as follows:
      
       (1) If the pagefault handler decides it needs to read pages from AFS, it
           calls AFS with mmap_sem held and AFS begins an AF_RXRPC call, but
           creating a call requires the socket lock:
      
      	mmap_sem must be taken before sk_lock-AF_RXRPC
      
       (2) afs_open_socket() opens an AF_RXRPC socket and binds it.  rxrpc_bind()
           binds the underlying UDP socket whilst holding its socket lock.
           inet_bind() takes its own socket lock:
      
      	sk_lock-AF_RXRPC must be taken before sk_lock-AF_INET
      
       (3) Reading from a TCP socket into a userspace buffer might cause a fault
           and thus cause the kernel to take the mmap_sem, but the TCP socket is
           locked whilst doing this:
      
      	sk_lock-AF_INET must be taken before mmap_sem
      
      However, lockdep's theory is wrong in this instance because it deals only
      with lock classes and not individual locks.  The AF_INET lock in (2) isn't
      really equivalent to the AF_INET lock in (3) as the former deals with a
      socket entirely internal to the kernel that never sees userspace.  This is
      a limitation in the design of lockdep.
      
      Fix the general case by:
      
       (1) Double up all the locking keys used in sockets so that one set are
           used if the socket is created by userspace and the other set is used
           if the socket is created by the kernel.
      
       (2) Store the kern parameter passed to sk_alloc() in a variable in the
           sock struct (sk_kern_sock).  This informs sock_lock_init(),
           sock_init_data() and sk_clone_lock() as to the lock keys to be used.
      
           Note that the child created by sk_clone_lock() inherits the parent's
           kern setting.
      
       (3) Add a 'kern' parameter to ->accept() that is analogous to the one
           passed in to ->create() that distinguishes whether kernel_accept() or
           sys_accept4() was the caller and can be passed to sk_alloc().
      
           Note that a lot of accept functions merely dequeue an already
           allocated socket.  I haven't touched these as the new socket already
           exists before we get the parameter.
      
           Note also that there are a couple of places where I've made the accepted
           socket unconditionally kernel-based:
      
      	irda_accept()
      	rds_rcp_accept_one()
      	tcp_accept_from_sock()
      
           because they follow a sock_create_kern() and accept off of that.
      
      Whilst creating this, I noticed that lustre and ocfs don't create sockets
      through sock_create_kern() and thus they aren't marked as for-kernel,
      though they appear to be internal.  I wonder if these should do that so
      that they use the new set of lock keys.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cdfbabfb
  7. 02 3月, 2017 2 次提交
  8. 28 2月, 2017 2 次提交
  9. 27 2月, 2017 1 次提交
  10. 25 2月, 2017 1 次提交
    • M
      sctp: deny peeloff operation on asocs with threads sleeping on it · dfcb9f4f
      Marcelo Ricardo Leitner 提交于
      commit 2dcab598 ("sctp: avoid BUG_ON on sctp_wait_for_sndbuf")
      attempted to avoid a BUG_ON call when the association being used for a
      sendmsg() is blocked waiting for more sndbuf and another thread did a
      peeloff operation on such asoc, moving it to another socket.
      
      As Ben Hutchings noticed, then in such case it would return without
      locking back the socket and would cause two unlocks in a row.
      
      Further analysis also revealed that it could allow a double free if the
      application managed to peeloff the asoc that is created during the
      sendmsg call, because then sctp_sendmsg() would try to free the asoc
      that was created only for that call.
      
      This patch takes another approach. It will deny the peeloff operation
      if there is a thread sleeping on the asoc, so this situation doesn't
      exist anymore. This avoids the issues described above and also honors
      the syscalls that are already being handled (it can be multiple sendmsg
      calls).
      
      Joint work with Xin Long.
      
      Fixes: 2dcab598 ("sctp: avoid BUG_ON on sctp_wait_for_sndbuf")
      Cc: Alexander Popov <alex.popov@linux.com>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Signed-off-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dfcb9f4f
  11. 20 2月, 2017 10 次提交
  12. 10 2月, 2017 5 次提交
  13. 08 2月, 2017 3 次提交
  14. 07 2月, 2017 1 次提交
  15. 27 1月, 2017 1 次提交
  16. 26 1月, 2017 2 次提交
    • X
      sctp: sctp gso should set feature with NETIF_F_SG when calling skb_segment · 5207f399
      Xin Long 提交于
      Now sctp gso puts segments into skb's frag_list, then processes these
      segments in skb_segment. But skb_segment handles them only when gs is
      enabled, as it's in the same branch with skb's frags.
      
      Although almost all the NICs support sg other than some old ones, but
      since commit 1e16aa3d ("net: gso: use feature flag argument in all
      protocol gso handlers"), features &= skb->dev->hw_enc_features, and
      xfrm_output_gso call skb_segment with features = 0, which means sctp
      gso would call skb_segment with sg = 0, and skb_segment would not work
      as expected.
      
      This patch is to fix it by setting features param with NETIF_F_SG when
      calling skb_segment so that it can go the right branch to process the
      skb's frag_list.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5207f399
    • X
      sctp: sctp_addr_id2transport should verify the addr before looking up assoc · 6f29a130
      Xin Long 提交于
      sctp_addr_id2transport is a function for sockopt to look up assoc by
      address. As the address is from userspace, it can be a v4-mapped v6
      address. But in sctp protocol stack, it always handles a v4-mapped
      v6 address as a v4 address. So it's necessary to convert it to a v4
      address before looking up assoc by address.
      
      This patch is to fix it by calling sctp_verify_addr in which it can do
      this conversion before calling sctp_endpoint_lookup_assoc, just like
      what sctp_sendmsg and __sctp_connect do for the address from users.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6f29a130
  17. 25 1月, 2017 2 次提交
    • C
      net: sctp: fix array overrun read on sctp_timer_tbl · 14640869
      Colin Ian King 提交于
      Table sctp_timer_tbl is missing a TIMEOUT_RECONF string so
      add this in. Also compare timeout with the size of the array
      sctp_timer_tbl rather than SCTP_EVENT_TIMEOUT_MAX.  Also add
      a build time check that SCTP_EVENT_TIMEOUT_MAX is correct
      so we don't ever get this kind of mismatch between the table
      and SCTP_EVENT_TIMEOUT_MAX in the future.
      
      Kudos to Marcelo Ricardo Leitner for spotting the missing string
      and suggesting the build time sanity check.
      
      Fixes CoverityScan CID#1397639 ("Out-of-bounds read")
      
      Fixes: 7b9438de ("sctp: add stream reconf timer")
      Signed-off-by: NColin Ian King <colin.king@canonical.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Reviewed-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      14640869
    • K
      Introduce a sysctl that modifies the value of PROT_SOCK. · 4548b683
      Krister Johansen 提交于
      Add net.ipv4.ip_unprivileged_port_start, which is a per namespace sysctl
      that denotes the first unprivileged inet port in the namespace.  To
      disable all privileged ports set this to zero.  It also checks for
      overlap with the local port range.  The privileged and local range may
      not overlap.
      
      The use case for this change is to allow containerized processes to bind
      to priviliged ports, but prevent them from ever being allowed to modify
      their container's network configuration.  The latter is accomplished by
      ensuring that the network namespace is not a child of the user
      namespace.  This modification was needed to allow the container manager
      to disable a namespace's priviliged port restrictions without exposing
      control of the network namespace to processes in the user namespace.
      Signed-off-by: NKrister Johansen <kjlx@templeofstupid.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4548b683
  18. 21 1月, 2017 2 次提交
  19. 19 1月, 2017 1 次提交