1. 23 May, 2017 1 commit
  2. 22 Mar, 2017 1 commit
  3. 10 Mar, 2017 1 commit
    • net: Work around lockdep limitation in sockets that use sockets · cdfbabfb
      David Howells committed
      Lockdep issues a circular dependency warning when AFS issues an operation
      through AF_RXRPC from a context in which the VFS/VM holds the mmap_sem.
      
      The theory lockdep comes up with is as follows:
      
       (1) If the pagefault handler decides it needs to read pages from AFS, it
           calls AFS with mmap_sem held and AFS begins an AF_RXRPC call, but
           creating a call requires the socket lock:
      
      	mmap_sem must be taken before sk_lock-AF_RXRPC
      
       (2) afs_open_socket() opens an AF_RXRPC socket and binds it.  rxrpc_bind()
           binds the underlying UDP socket whilst holding its socket lock.
           inet_bind() takes its own socket lock:
      
      	sk_lock-AF_RXRPC must be taken before sk_lock-AF_INET
      
       (3) Reading from a TCP socket into a userspace buffer might cause a fault
           and thus cause the kernel to take the mmap_sem, but the TCP socket is
           locked whilst doing this:
      
      	sk_lock-AF_INET must be taken before mmap_sem
      
      However, lockdep's theory is wrong in this instance because it deals only
      with lock classes and not individual locks.  The AF_INET lock in (2) isn't
      really equivalent to the AF_INET lock in (3) as the former deals with a
      socket entirely internal to the kernel that never sees userspace.  This is
      a limitation in the design of lockdep.
      
      Fix the general case by:
      
       (1) Double up all the locking keys used in sockets so that one set are
           used if the socket is created by userspace and the other set is used
           if the socket is created by the kernel.
      
       (2) Store the kern parameter passed to sk_alloc() in a variable in the
           sock struct (sk_kern_sock).  This informs sock_lock_init(),
           sock_init_data() and sk_clone_lock() as to the lock keys to be used.
      
           Note that the child created by sk_clone_lock() inherits the parent's
           kern setting.
      
       (3) Add a 'kern' parameter to ->accept() that is analogous to the one
           passed in to ->create() that distinguishes whether kernel_accept() or
           sys_accept4() was the caller and can be passed to sk_alloc().
      
           Note that a lot of accept functions merely dequeue an already
           allocated socket.  I haven't touched these as the new socket already
           exists before we get the parameter.
      
           Note also that there are a couple of places where I've made the accepted
           socket unconditionally kernel-based:
      
      	irda_accept()
      	rds_tcp_accept_one()
      	tcp_accept_from_sock()
      
           because they follow a sock_create_kern() and accept off of that.
      
      Whilst creating this, I noticed that lustre and ocfs don't create sockets
      through sock_create_kern() and thus they aren't marked as for-kernel,
      though they appear to be internal.  I wonder if these should do that so
      that they use the new set of lock keys.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
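The key doubling described in fix steps (1) and (2) of the commit above can be modelled in userspace roughly as follows. This is a sketch, not the kernel code: `struct lock_class_key`, `struct sock` and `sock_lock_init()` are heavily simplified stand-ins, and `user_keys`, `kern_keys`, `AF_MAX_MODEL`, `sk_alloc_model()` and `sk_clone_model()` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for lockdep's lock_class_key: its address identifies a class. */
struct lock_class_key { char dummy; };

#define AF_MAX_MODEL 4  /* hypothetical: number of address families modelled */

/* One key set for userspace-created sockets, one for kernel-created sockets. */
static struct lock_class_key user_keys[AF_MAX_MODEL];
static struct lock_class_key kern_keys[AF_MAX_MODEL];

/* Simplified sock: only the fields this sketch needs. */
struct sock {
    int family;
    bool sk_kern_sock;                        /* copy of sk_alloc()'s kern argument */
    const struct lock_class_key *lock_class;  /* class the socket lock belongs to */
};

/* Pick the lock class from the family and from who created the socket. */
static void sock_lock_init(struct sock *sk)
{
    assert(sk->family >= 0 && sk->family < AF_MAX_MODEL);
    sk->lock_class = sk->sk_kern_sock ? &kern_keys[sk->family]
                                      : &user_keys[sk->family];
}

/* Model of sk_alloc(): record kern, then initialise the lock class. */
static struct sock sk_alloc_model(int family, bool kern)
{
    struct sock sk = { .family = family, .sk_kern_sock = kern, .lock_class = NULL };

    sock_lock_init(&sk);
    return sk;
}

/* Model of sk_clone_lock(): the child inherits the parent's kern setting. */
static struct sock sk_clone_model(const struct sock *parent)
{
    return sk_alloc_model(parent->family, parent->sk_kern_sock);
}
```

In this model, two sockets of the same family land in different lock classes when one is created with `kern` set and the other is not, which is the property that lets lockdep tell the internal AF_INET lock in (2) apart from the userspace-visible one in (3).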
  4. 02 Mar, 2017 1 commit
  5. 27 Sep, 2016 1 commit
  6. 02 Aug, 2016 2 commits
  7. 27 Jun, 2016 1 commit
  8. 06 May, 2016 1 commit
    • VSOCK: do not disconnect socket when peer has shutdown SEND only · dedc58e0
      Ian Campbell committed
      The peer may be expecting a reply having sent a request and then done a
      shutdown(SHUT_WR), so tearing down the whole socket at this point seems
      wrong and breaks for me with a client which does a SHUT_WR.
      
      Looking at other socket families' stream_recvmsg callbacks, doing a
      shutdown here does not seem to be the norm, and removing it does not seem
      to have had any adverse effects that I can see.
      
      I'm using Stefan's RFC virtio transport patches; I'm unsure of the impact
      on the vmci transport.
      Signed-off-by: Ian Campbell <ian.campbell@docker.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Stefan Hajnoczi <stefanha@redhat.com>
      Cc: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
      Cc: Andy King <acking@vmware.com>
      Cc: Dmitry Torokhov <dtor@vmware.com>
      Cc: Jorgen Hansen <jhansen@vmware.com>
      Cc: Adit Ranadive <aditr@vmware.com>
      Cc: netdev@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
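The request-then-half-close pattern that the commit above wants to keep working looks roughly like this from userspace. The sketch uses an AF_UNIX stream socket pair as a stand-in for AF_VSOCK so that it runs without a hypervisor; `half_close_round_trip()` is a hypothetical helper name, and both ends live in one process purely for brevity.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Half-close demo on an AF_UNIX stream pair (stand-in for AF_VSOCK).
 * Returns the number of reply bytes the client received, or -1 on error.
 */
static ssize_t half_close_round_trip(char *reply, size_t reply_len)
{
    int sv[2];
    char req[16];
    ssize_t n = -1;

    assert(reply != NULL && reply_len > 0);

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    /* Client: send the request, then close only the sending direction. */
    if (write(sv[0], "ping", 4) != 4 || shutdown(sv[0], SHUT_WR) < 0)
        goto out;

    /* Server: read the request, then see EOF because the peer shut down SEND. */
    if (read(sv[1], req, sizeof(req)) != 4 || read(sv[1], req, sizeof(req)) != 0)
        goto out;

    /* Server: the socket must still be usable in the other direction. */
    if (write(sv[1], "pong", 4) != 4)
        goto out;

    /* Client: receiving still works after SHUT_WR. */
    n = read(sv[0], reply, reply_len);
out:
    close(sv[0]);
    close(sv[1]);
    return n;
}
```

If the receiving side tore down the whole socket on seeing the peer's SEND shutdown, the server's reply write in this flow would fail and the client would never get its answer.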
  9. 23 Mar, 2016 2 commits
  10. 13 Feb, 2016 1 commit
    • vsock: Fix blocking ops call in prepare_to_wait · 59888180
      Laura Abbott committed
      We received a bug report from someone using vmware:
      
      WARNING: CPU: 3 PID: 660 at kernel/sched/core.c:7389
      __might_sleep+0x7d/0x90()
      do not call blocking ops when !TASK_RUNNING; state=1 set at
      [<ffffffff810fa68d>] prepare_to_wait+0x2d/0x90
      Modules linked in: vmw_vsock_vmci_transport vsock snd_seq_midi
      snd_seq_midi_event snd_ens1371 iosf_mbi gameport snd_rawmidi
      snd_ac97_codec ac97_bus snd_seq coretemp snd_seq_device snd_pcm
      snd_timer snd soundcore ppdev crct10dif_pclmul crc32_pclmul
      ghash_clmulni_intel vmw_vmci vmw_balloon i2c_piix4 shpchp parport_pc
      parport acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc btrfs
      xor raid6_pq 8021q garp stp llc mrp crc32c_intel serio_raw mptspi vmwgfx
      drm_kms_helper ttm drm scsi_transport_spi mptscsih e1000 ata_generic
      mptbase pata_acpi
      CPU: 3 PID: 660 Comm: vmtoolsd Not tainted
      4.2.0-0.rc1.git3.1.fc23.x86_64 #1
      Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop
      Reference Platform, BIOS 6.00 05/20/2014
       0000000000000000 0000000049e617f3 ffff88006ac37ac8 ffffffff818641f5
       0000000000000000 ffff88006ac37b20 ffff88006ac37b08 ffffffff810ab446
       ffff880068009f40 ffffffff81c63bc0 0000000000000061 0000000000000000
      Call Trace:
       [<ffffffff818641f5>] dump_stack+0x4c/0x65
       [<ffffffff810ab446>] warn_slowpath_common+0x86/0xc0
       [<ffffffff810ab4d5>] warn_slowpath_fmt+0x55/0x70
       [<ffffffff8112551d>] ? debug_lockdep_rcu_enabled+0x1d/0x20
       [<ffffffff810fa68d>] ? prepare_to_wait+0x2d/0x90
       [<ffffffff810fa68d>] ? prepare_to_wait+0x2d/0x90
       [<ffffffff810da2bd>] __might_sleep+0x7d/0x90
       [<ffffffff812163b3>] __might_fault+0x43/0xa0
       [<ffffffff81430477>] copy_from_iter+0x87/0x2a0
       [<ffffffffa039460a>] __qp_memcpy_to_queue+0x9a/0x1b0 [vmw_vmci]
       [<ffffffffa0394740>] ? qp_memcpy_to_queue+0x20/0x20 [vmw_vmci]
       [<ffffffffa0394757>] qp_memcpy_to_queue_iov+0x17/0x20 [vmw_vmci]
       [<ffffffffa0394d50>] qp_enqueue_locked+0xa0/0x140 [vmw_vmci]
       [<ffffffffa039593f>] vmci_qpair_enquev+0x4f/0xd0 [vmw_vmci]
       [<ffffffffa04847bb>] vmci_transport_stream_enqueue+0x1b/0x20
      [vmw_vsock_vmci_transport]
       [<ffffffffa047ae05>] vsock_stream_sendmsg+0x2c5/0x320 [vsock]
       [<ffffffff810fabd0>] ? wake_atomic_t_function+0x70/0x70
       [<ffffffff81702af8>] sock_sendmsg+0x38/0x50
       [<ffffffff81702ff4>] SYSC_sendto+0x104/0x190
       [<ffffffff8126e25a>] ? vfs_read+0x8a/0x140
       [<ffffffff817042ee>] SyS_sendto+0xe/0x10
       [<ffffffff8186d9ae>] entry_SYSCALL_64_fastpath+0x12/0x76
      
      transport->stream_enqueue may call copy_to_user so it should
      not be called inside a prepare_to_wait. Narrow the scope of
      the prepare_to_wait to avoid the bad call. This also applies
      to vsock_stream_recvmsg.
      Reported-by: Vinson Lee <vlee@freedesktop.org>
      Tested-by: Vinson Lee <vlee@freedesktop.org>
      Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
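The effect of narrowing the prepare_to_wait scope can be illustrated with a toy userspace model. Nothing below is kernel code: the task state is a plain variable, the `*_model` functions are hypothetical stand-ins for `prepare_to_wait()`, `finish_wait()` and `transport->stream_enqueue()`, and the "warning" is just a counter that mimics the `__might_sleep()` check seen in the trace.

```c
#include <assert.h>
#include <stdbool.h>

enum task_state { TASK_RUNNING, TASK_INTERRUPTIBLE };

static enum task_state state = TASK_RUNNING;
static int might_sleep_warnings;

/* Stand-ins for prepare_to_wait()/finish_wait(): they only flip the state. */
static void prepare_to_wait_model(void) { state = TASK_INTERRUPTIBLE; }
static void finish_wait_model(void)     { state = TASK_RUNNING; }

/* Stand-in for transport->stream_enqueue(), which may fault and sleep. */
static void stream_enqueue_model(void)
{
    if (state != TASK_RUNNING)
        might_sleep_warnings++;  /* "do not call blocking ops when !TASK_RUNNING" */
}

/* Wide scope: the enqueue runs inside the prepare/finish window. */
static int send_wide_scope(void)
{
    might_sleep_warnings = 0;
    prepare_to_wait_model();
    stream_enqueue_model();
    finish_wait_model();
    return might_sleep_warnings;
}

/* Narrow scope: only the actual wait sits inside the window. */
static int send_narrow_scope(bool have_space)
{
    might_sleep_warnings = 0;
    if (!have_space) {
        prepare_to_wait_model();
        /* the real code would sleep here until space becomes available */
        finish_wait_model();
    }
    stream_enqueue_model();
    assert(state == TASK_RUNNING);
    return might_sleep_warnings;
}
```

In the model, `send_wide_scope()` trips the check once, while `send_narrow_scope()` never does, whether or not it had to wait first.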
  11. 09 Dec, 2015 1 commit
  12. 04 Dec, 2015 1 commit
  13. 02 Nov, 2015 1 commit
    • VSOCK: define VSOCK_SS_LISTEN once only · ea3803c1
      Stefan Hajnoczi committed
      The SS_LISTEN socket state is defined by both af_vsock.c and
      vmci_transport.c.  This is risky since the value could be changed in one
      file and the other would be out of sync.
      
      Rename from SS_LISTEN to VSOCK_SS_LISTEN since the constant is not part
      of enum socket_state (SS_CONNECTED, ...).  This way it is clear that the
      constant is vsock-specific.
      
      The big text reflow in af_vsock.c was necessary to keep to the maximum
      line length.  Text is unchanged except for s/SS_LISTEN/VSOCK_SS_LISTEN/.
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  14. 21 Oct, 2015 1 commit
  15. 11 May, 2015 1 commit
  16. 03 Mar, 2015 1 commit
  17. 24 Nov, 2014 1 commit
  18. 06 May, 2014 1 commit
  19. 21 Nov, 2013 1 commit
    • net: rework recvmsg handler msg_name and msg_namelen logic · f3d33426
      Hannes Frederic Sowa committed
      This patch now always passes msg->msg_namelen as 0. recvmsg handlers must
      set msg_namelen to the proper size <= sizeof(struct sockaddr_storage)
      to return msg_name to the user.
      
      This prevents numerous uninitialized memory leaks we had in the
      recvmsg handlers and makes it harder for new code to accidentally leak
      uninitialized memory.
      
      Optimize for the case where recvfrom is called with NULL as the address.
      We don't need to copy the address at all, so set it to NULL before
      invoking the recvmsg handler. We can do so because all the recvmsg
      handlers must cope with the case where a plain read() is called on them.
      read() also sets msg_name to NULL.
      
      Also document these changes in include/linux/net.h as suggested by David
      Miller.
      
      Changes since RFC:
      
      Set msg->msg_name = NULL if user specified a NULL in msg_name but had a
      non-null msg_namelen in verify_iovec/verify_compat_iovec. This doesn't
      affect sendto as it would bail out earlier while trying to copy-in the
      address. It also more naturally reflects the logic by the callers of
      verify_iovec.
      
      With this change in place I could remove "
      if (!uaddr || msg_sys->msg_namelen == 0)
      	msg->msg_name = NULL
      ".
      
      This change does not alter the user visible error logic as we ignore
      msg_namelen as long as msg_name is NULL.
      
      Also remove two unnecessary curly brackets in ___sys_recvmsg and change
      comments to netdev style.
      
      Cc: David Miller <davem@davemloft.net>
      Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
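The contract described in the commit above (the core passes msg_namelen as 0, and a handler raises it only when it has actually filled in msg_name) can be sketched with a small userspace model. `struct msghdr_model`, `core_recvmsg_model()` and the two handlers are hypothetical names for illustration; the 0xAA fill merely stands in for stale kernel memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ADDR_MAX 128  /* stand-in for sizeof(struct sockaddr_storage) */

struct msghdr_model {
    void *msg_name;   /* kernel-side address buffer, or NULL */
    int msg_namelen;  /* set by the handler when it fills msg_name */
};

typedef void (*recvmsg_fn)(struct msghdr_model *msg);

static const char peer_addr[] = "peer";  /* 5 bytes including the NUL */

/* Handler that reports a source address. */
static void handler_fills_name(struct msghdr_model *msg)
{
    if (msg->msg_name) {
        memcpy(msg->msg_name, peer_addr, sizeof(peer_addr));
        msg->msg_namelen = (int)sizeof(peer_addr);
    }
}

/* Handler that never touches msg_name or msg_namelen. */
static void handler_ignores_name(struct msghdr_model *msg)
{
    (void)msg;
}

/* Core path: returns the address length reported back to the caller. */
static int core_recvmsg_model(recvmsg_fn handler, char *uaddr, size_t uaddr_len)
{
    char kaddr[ADDR_MAX];
    struct msghdr_model msg;

    memset(kaddr, 0xAA, sizeof(kaddr));   /* pretend this is stale memory */
    msg.msg_name = uaddr ? kaddr : NULL;  /* NULL address: nothing to copy */
    msg.msg_namelen = 0;                  /* handlers must opt in */

    handler(&msg);

    assert(msg.msg_namelen >= 0 && (size_t)msg.msg_namelen <= sizeof(kaddr));
    if (uaddr && msg.msg_namelen > 0) {
        size_t n = (size_t)msg.msg_namelen;

        if (n > uaddr_len)
            n = uaddr_len;
        memcpy(uaddr, kaddr, n);          /* copy only what the handler set */
    }
    return msg.msg_namelen;
}
```

Because the length starts at 0, a handler that forgets about the address causes no bytes of the stale buffer to be copied out, which is the leak-hardening effect the commit message describes.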
  20. 06 Aug, 2013 1 commit
  21. 28 Jul, 2013 1 commit
  22. 24 Jun, 2013 2 commits
  23. 25 Apr, 2013 2 commits
    • VSOCK: Drop bogus __init annotation from vsock_init_tables() · 22ee3b57
      Geert Uytterhoeven committed
      If gcc (e.g. 4.1.2) decides not to inline vsock_init_tables(), this will
      cause a section mismatch:
      
      WARNING: net/vmw_vsock/vsock.o(.text+0x1bc): Section mismatch in reference from the function __vsock_core_init() to the function .init.text:vsock_init_tables()
      The function __vsock_core_init() references
      the function __init vsock_init_tables().
      This is often because __vsock_core_init lacks a __init
      annotation or the annotation of vsock_init_tables is wrong.
      
      This may cause crashes if VSOCKETS=y and VMWARE_VMCI_VSOCKETS=m.
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • VSOCK: Fix misc device registration · 6ad0b2f7
      Asias He committed
      When we call vsock_core_init to init VSOCK the second time,
      vsock_device.minor still points to the old dynamically allocated minor
      number. misc_register will allocate it for us successfully as if we were
      asking for a static one. However, when another user calls misc_register
      to allocate a dynamic minor number, it will be given the one used by
      vsock_core_init(), causing this:
      
        [  405.470687] WARNING: at fs/sysfs/dir.c:536 sysfs_add_one+0xcc/0xf0()
        [  405.470689] Hardware name: OptiPlex 790
        [  405.470690] sysfs: cannot create duplicate filename '/dev/char/10:54'
      
      Always set vsock_device.minor to MISC_DYNAMIC_MINOR before we
      register.
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Andy King <acking@vmware.com>
      Cc: Dmitry Torokhov <dtor@vmware.com>
      Cc: Reilly Grant <grantr@vmware.com>
      Cc: netdev@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Asias He <asias@redhat.com>
      Acked-by: Dmitry Torokhov <dtor@vmware.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
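The stale-minor problem described in the commit above can be reproduced in a deliberately simplified model of the misc device allocator. This is an assumption-laden sketch rather than the kernel implementation: `misc_register_model()` and `misc_deregister_model()` only track a small pool of dynamic minors, a static request is accepted without touching that pool, and the pool size and the `reinit_collides()` helper are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MISC_DYNAMIC_MINOR 255  /* "please pick a minor for me" */
#define DYNAMIC_MINORS 64       /* size of the modelled dynamic pool */

struct miscdevice_model { int minor; };

static bool dynamic_in_use[DYNAMIC_MINORS];

static int misc_register_model(struct miscdevice_model *d)
{
    if (d->minor == MISC_DYNAMIC_MINOR) {
        for (int i = 0; i < DYNAMIC_MINORS; i++) {
            if (!dynamic_in_use[i]) {
                dynamic_in_use[i] = true;
                d->minor = i;   /* the device now remembers this minor */
                return 0;
            }
        }
        return -1;              /* pool exhausted */
    }
    /* Static request: accepted as-is, the dynamic pool is not updated. */
    return 0;
}

static void misc_deregister_model(struct miscdevice_model *d)
{
    if (d->minor >= 0 && d->minor < DYNAMIC_MINORS)
        dynamic_in_use[d->minor] = false;
}

/* Returns true if a second device ends up sharing the vsock device's minor. */
static bool reinit_collides(bool reset_minor_before_register)
{
    struct miscdevice_model vsock = { MISC_DYNAMIC_MINOR };
    struct miscdevice_model other = { MISC_DYNAMIC_MINOR };

    memset(dynamic_in_use, 0, sizeof(dynamic_in_use));

    misc_register_model(&vsock);        /* first init: dynamic allocation */
    misc_deregister_model(&vsock);      /* teardown frees the pool slot */

    if (reset_minor_before_register)
        vsock.minor = MISC_DYNAMIC_MINOR;   /* the fix */
    misc_register_model(&vsock);        /* second init */

    misc_register_model(&other);        /* unrelated dynamic user */
    return vsock.minor == other.minor;
}
```

In this model, skipping the reset makes the second registration look like a static request for the old number, so the next dynamic user is handed the same minor; resetting to `MISC_DYNAMIC_MINOR` first avoids the clash.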
  24. 08 Apr, 2013 1 commit
  25. 03 Apr, 2013 1 commit
  26. 19 Feb, 2013 3 commits
  27. 11 Feb, 2013 1 commit
    • VSOCK: Introduce VM Sockets · d021c344
      Andy King committed
      VM Sockets allows communication between virtual machines and the hypervisor.
      User level applications both in a virtual machine and on the host can use the
      VM Sockets API, which facilitates fast and efficient communication between
      guest virtual machines and their host.  A socket address family, designed to be
      compatible with UDP and TCP at the interface level, is provided.
      
      Today, VM Sockets is used by various VMware Tools components inside the guest
      for zero-config, network-less access to VMware host services.  In addition to
      this, VMware's users are using VM Sockets for various applications, where
      network access of the virtual machine is restricted or non-existent.  Examples
      of this are VMs communicating with device proxies for proprietary hardware
      running as host applications and automated testing of applications running
      within virtual machines.
      
      The VMware VM Sockets are similar to other socket types, such as the
      Berkeley UNIX socket interface.  The VM Sockets module supports both
      connection-oriented stream sockets like TCP, and connectionless datagram
      sockets like UDP. The VM Sockets protocol family is defined as "AF_VSOCK"
      and the socket operations are split for SOCK_DGRAM and SOCK_STREAM.
      
      For additional information about the use of VM Sockets, please refer to the
      VM Sockets Programming Guide available at:
      
      https://www.vmware.com/support/developer/vmci-sdk/
      
      Signed-off-by: George Zhang <georgezhang@vmware.com>
      Signed-off-by: Dmitry Torokhov <dtor@vmware.com>
      Signed-off-by: Andy King <acking@vmware.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>