1. 17 2月, 2016 1 次提交
  2. 08 2月, 2016 1 次提交
  3. 25 1月, 2016 1 次提交
  4. 11 1月, 2016 1 次提交
  5. 05 1月, 2016 1 次提交
    • R
      af_unix: Fix splice-bind deadlock · c845acb3
      Rainer Weikusat 提交于
      On 2015/11/06, Dmitry Vyukov reported a deadlock involving the splice
      system call and AF_UNIX sockets,
      
      http://lists.openwall.net/netdev/2015/11/06/24
      
      The situation was analyzed as
      
      (a while ago) A: socketpair()
      B: splice() from a pipe to /mnt/regular_file
      	does sb_start_write() on /mnt
      C: try to freeze /mnt
      	wait for B to finish with /mnt
      A: bind() try to bind our socket to /mnt/new_socket_name
      	lock our socket, see it not bound yet
      	decide that it needs to create something in /mnt
      	try to do sb_start_write() on /mnt, block (it's
      	waiting for C).
      D: splice() from the same pipe to our socket
      	lock the pipe, see that socket is connected
      	try to lock the socket, block waiting for A
      B:	get around to actually feeding a chunk from
      	pipe to file, try to lock the pipe.  Deadlock.
      
      on 2015/11/10 by Al Viro,
      
      http://lists.openwall.net/netdev/2015/11/10/4
      
      The patch fixes this by removing the kern_path_create related code from
      unix_mknod and executing it as part of unix_bind prior acquiring the
      readlock of the socket in question. This means that A (as used above)
      will sb_start_write on /mnt before it acquires the readlock, hence, it
      won't indirectly block B which first did a sb_start_write and then
      waited for a thread trying to acquire the readlock. Consequently, A
      being blocked by C waiting for B won't cause a deadlock anymore
      (effectively, both A and B acquire two locks in opposite order in the
      situation described above).
      
      Dmitry Vyukov(<dvyukov@google.com>) tested the original patch.
      Signed-off-by: NRainer Weikusat <rweikusat@mobileactivedefense.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c845acb3
  6. 18 12月, 2015 1 次提交
  7. 07 12月, 2015 1 次提交
    • R
      af_unix: fix unix_dgram_recvmsg entry locking · 64874280
      Rainer Weikusat 提交于
      The current unix_dgram_recvsmg code acquires the u->readlock mutex in
      order to protect access to the peek offset prior to calling
      __skb_recv_datagram for actually receiving data. This implies that a
      blocking reader will go to sleep with this mutex held if there's
      presently no data to return to userspace. Two non-desirable side effects
      of this are that a later non-blocking read call on the same socket will
      block on the ->readlock mutex until the earlier blocking call releases it
      (or the readers is interrupted) and that later blocking read calls
      will wait longer than the effective socket read timeout says they
      should: The timeout will only start 'ticking' once such a reader hits
      the schedule_timeout in wait_for_more_packets (core.c) while the time it
      already had to wait until it could acquire the mutex is unaccounted for.
      
      The patch avoids both by using the __skb_try_recv_datagram and
      __skb_wait_for_more packets functions created by the first patch to
      implement a unix_dgram_recvmsg read loop which releases the readlock
      mutex prior to going to sleep and reacquires it as needed
      afterwards. Non-blocking readers will thus immediately return with
      -EAGAIN if there's no data available regardless of any concurrent
      blocking readers and all blocking readers will end up sleeping via
      schedule_timeout, thus honouring the configured socket receive timeout.
      Signed-off-by: NRainer Weikusat <rweikusat@mobileactivedefense.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      64874280
  8. 02 12月, 2015 2 次提交
  9. 01 12月, 2015 2 次提交
  10. 24 11月, 2015 1 次提交
    • R
      unix: avoid use-after-free in ep_remove_wait_queue · 7d267278
      Rainer Weikusat 提交于
      Rainer Weikusat <rweikusat@mobileactivedefense.com> writes:
      An AF_UNIX datagram socket being the client in an n:1 association with
      some server socket is only allowed to send messages to the server if the
      receive queue of this socket contains at most sk_max_ack_backlog
      datagrams. This implies that prospective writers might be forced to go
      to sleep despite none of the message presently enqueued on the server
      receive queue were sent by them. In order to ensure that these will be
      woken up once space becomes again available, the present unix_dgram_poll
      routine does a second sock_poll_wait call with the peer_wait wait queue
      of the server socket as queue argument (unix_dgram_recvmsg does a wake
      up on this queue after a datagram was received). This is inherently
      problematic because the server socket is only guaranteed to remain alive
      for as long as the client still holds a reference to it. In case the
      connection is dissolved via connect or by the dead peer detection logic
      in unix_dgram_sendmsg, the server socket may be freed despite "the
      polling mechanism" (in particular, epoll) still has a pointer to the
      corresponding peer_wait queue. There's no way to forcibly deregister a
      wait queue with epoll.
      
      Based on an idea by Jason Baron, the patch below changes the code such
      that a wait_queue_t belonging to the client socket is enqueued on the
      peer_wait queue of the server whenever the peer receive queue full
      condition is detected by either a sendmsg or a poll. A wake up on the
      peer queue is then relayed to the ordinary wait queue of the client
      socket via wake function. The connection to the peer wait queue is again
      dissolved if either a wake up is about to be relayed or the client
      socket reconnects or a dead peer is detected or the client socket is
      itself closed. This enables removing the second sock_poll_wait from
      unix_dgram_poll, thus avoiding the use-after-free, while still ensuring
      that no blocked writer sleeps forever.
      Signed-off-by: NRainer Weikusat <rweikusat@mobileactivedefense.com>
      Fixes: ec0d215f ("af_unix: fix 'poll for write'/connected DGRAM sockets")
      Reviewed-by: NJason Baron <jbaron@akamai.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7d267278
  11. 18 11月, 2015 1 次提交
  12. 17 11月, 2015 1 次提交
  13. 16 11月, 2015 1 次提交
    • H
      af-unix: fix use-after-free with concurrent readers while splicing · 73ed5d25
      Hannes Frederic Sowa 提交于
      During splicing an af-unix socket to a pipe we have to drop all
      af-unix socket locks. While doing so we allow another reader to enter
      unix_stream_read_generic which can read, copy and finally free another
      skb. If exactly this skb is just in process of being spliced we get a
      use-after-free report by kasan.
      
      First, we must make sure to not have a free while the skb is used during
      the splice operation. We simply increment its use counter before unlocking
      the reader lock.
      
      Stream sockets have the nice characteristic that we don't care about
      zero length writes and they never reach the peer socket's queue. That
      said, we can take the UNIXCB.consumed field as the indicator if the
      skb was already freed from the socket's receive queue. If the skb was
      fully consumed after we locked the reader side again we know it has been
      dropped by a second reader. We indicate a short read to user space and
      abort the current splice operation.
      
      This bug has been found with syzkaller
      (http://github.com/google/syzkaller) by Dmitry Vyukov.
      
      Fixes: 2b514574 ("net: af_unix: implement splice for stream af_unix sockets")
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      73ed5d25
  14. 25 10月, 2015 1 次提交
  15. 05 10月, 2015 1 次提交
  16. 30 9月, 2015 1 次提交
  17. 11 6月, 2015 1 次提交
  18. 27 5月, 2015 1 次提交
  19. 25 5月, 2015 2 次提交
  20. 11 5月, 2015 1 次提交
  21. 16 4月, 2015 2 次提交
  22. 03 3月, 2015 1 次提交
  23. 29 1月, 2015 1 次提交
    • C
      net: remove sock_iocb · 7cc05662
      Christoph Hellwig 提交于
      The sock_iocb structure is allocate on stack for each read/write-like
      operation on sockets, and contains various fields of which only the
      embedded msghdr and sometimes a pointer to the scm_cookie is ever used.
      Get rid of the sock_iocb and put a msghdr directly on the stack and pass
      the scm_cookie explicitly to netlink_mmap_sendmsg.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7cc05662
  24. 10 12月, 2014 1 次提交
    • A
      put iov_iter into msghdr · c0371da6
      Al Viro 提交于
      Note that the code _using_ ->msg_iter at that point will be very
      unhappy with anything other than unshifted iovec-backed iov_iter.
      We still need to convert users to proper primitives.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      c0371da6
  25. 24 11月, 2014 1 次提交
  26. 06 11月, 2014 1 次提交
    • D
      net: Add and use skb_copy_datagram_msg() helper. · 51f3d02b
      David S. Miller 提交于
      This encapsulates all of the skb_copy_datagram_iovec() callers
      with call argument signature "skb, offset, msghdr->msg_iov, length".
      
      When we move to iov_iters in the networking, the iov_iter object will
      sit in the msghdr.
      
      Having a helper like this means there will be less places to touch
      during that transformation.
      
      Based upon descriptions and patch from Al Viro.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      51f3d02b
  27. 17 5月, 2014 1 次提交
  28. 18 4月, 2014 1 次提交
  29. 12 4月, 2014 1 次提交
    • D
      net: Fix use after free by removing length arg from sk_data_ready callbacks. · 676d2369
      David S. Miller 提交于
      Several spots in the kernel perform a sequence like:
      
      	skb_queue_tail(&sk->s_receive_queue, skb);
      	sk->sk_data_ready(sk, skb->len);
      
      But at the moment we place the SKB onto the socket receive queue it
      can be consumed and freed up.  So this skb->len access is potentially
      to freed up memory.
      
      Furthermore, the skb->len can be modified by the consumer so it is
      possible that the value isn't accurate.
      
      And finally, no actual implementation of this callback actually uses
      the length argument.  And since nobody actually cared about it's
      value, lots of call sites pass arbitrary values in such as '0' and
      even '1'.
      
      So just remove the length argument from the callback, that way there
      is no confusion whatsoever and all of these use-after-free cases get
      fixed as a side effect.
      
      Based upon a patch by Eric Dumazet and his suggestion to audit this
      issue tree-wide.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      676d2369
  30. 27 3月, 2014 1 次提交
  31. 07 3月, 2014 1 次提交
    • A
      net: unix socket code abuses csum_partial · 0a13404d
      Anton Blanchard 提交于
      The unix socket code is using the result of csum_partial to
      hash into a lookup table:
      
      	unix_hash_fold(csum_partial(sunaddr, len, 0));
      
      csum_partial is only guaranteed to produce something that can be
      folded into a checksum, as its prototype explains:
      
       * returns a 32-bit number suitable for feeding into itself
       * or csum_tcpudp_magic
      
      The 32bit value should not be used directly.
      
      Depending on the alignment, the ppc64 csum_partial will return
      different 32bit partial checksums that will fold into the same
      16bit checksum.
      
      This difference causes the following testcase (courtesy of
      Gustavo) to sometimes fail:
      
      #include <sys/socket.h>
      #include <stdio.h>
      
      int main()
      {
      	int fd = socket(PF_LOCAL, SOCK_STREAM|SOCK_CLOEXEC, 0);
      
      	int i = 1;
      	setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &i, 4);
      
      	struct sockaddr addr;
      	addr.sa_family = AF_LOCAL;
      	bind(fd, &addr, 2);
      
      	listen(fd, 128);
      
      	struct sockaddr_storage ss;
      	socklen_t sslen = (socklen_t)sizeof(ss);
      	getsockname(fd, (struct sockaddr*)&ss, &sslen);
      
      	fd = socket(PF_LOCAL, SOCK_STREAM|SOCK_CLOEXEC, 0);
      
      	if (connect(fd, (struct sockaddr*)&ss, sslen) == -1){
      		perror(NULL);
      		return 1;
      	}
      	printf("OK\n");
      	return 0;
      }
      
      As suggested by davem, fix this by using csum_fold to fold the
      partial 32bit checksum into a 16bit checksum before using it.
      Signed-off-by: NAnton Blanchard <anton@samba.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0a13404d
  32. 19 1月, 2014 1 次提交
  33. 18 12月, 2013 1 次提交
  34. 11 12月, 2013 1 次提交
  35. 07 12月, 2013 1 次提交
  36. 21 11月, 2013 1 次提交
    • H
      net: rework recvmsg handler msg_name and msg_namelen logic · f3d33426
      Hannes Frederic Sowa 提交于
      This patch now always passes msg->msg_namelen as 0. recvmsg handlers must
      set msg_namelen to the proper size <= sizeof(struct sockaddr_storage)
      to return msg_name to the user.
      
      This prevents numerous uninitialized memory leaks we had in the
      recvmsg handlers and makes it harder for new code to accidentally leak
      uninitialized memory.
      
      Optimize for the case recvfrom is called with NULL as address. We don't
      need to copy the address at all, so set it to NULL before invoking the
      recvmsg handler. We can do so, because all the recvmsg handlers must
      cope with the case a plain read() is called on them. read() also sets
      msg_name to NULL.
      
      Also document these changes in include/linux/net.h as suggested by David
      Miller.
      
      Changes since RFC:
      
      Set msg->msg_name = NULL if user specified a NULL in msg_name but had a
      non-null msg_namelen in verify_iovec/verify_compat_iovec. This doesn't
      affect sendto as it would bail out earlier while trying to copy-in the
      address. It also more naturally reflects the logic by the callers of
      verify_iovec.
      
      With this change in place I could remove "
      if (!uaddr || msg_sys->msg_namelen == 0)
      	msg->msg_name = NULL
      ".
      
      This change does not alter the user visible error logic as we ignore
      msg_namelen as long as msg_name is NULL.
      
      Also remove two unnecessary curly brackets in ___sys_recvmsg and change
      comments to netdev style.
      
      Cc: David Miller <davem@davemloft.net>
      Suggested-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f3d33426