1. 02 8月, 2017 1 次提交
  2. 26 7月, 2017 1 次提交
  3. 25 7月, 2017 1 次提交
  4. 05 7月, 2017 1 次提交
  5. 14 6月, 2017 1 次提交
  6. 23 5月, 2017 1 次提交
  7. 22 5月, 2017 2 次提交
    • M
      net: allow simultaneous SW and HW transmit timestamping · b50a5c70
      Miroslav Lichvar 提交于
      Add SOF_TIMESTAMPING_OPT_TX_SWHW option to allow an outgoing packet to
      be looped to the socket's error queue with a software timestamp even
      when a hardware transmit timestamp is expected to be provided by the
      driver.
      
      Applications using this option will receive two separate messages from
      the error queue, one with a software timestamp and the other with a
      hardware timestamp. As the hardware timestamp is saved to the shared skb
      info, which may happen before the first message with software timestamp
      is received by the application, the hardware timestamp is copied to the
      SCM_TIMESTAMPING control message only when the skb has no software
      timestamp or it is an incoming packet.
      
      While changing sw_tx_timestamp(), inline it in skb_tx_timestamp() as
      there are no other users.
      
      CC: Richard Cochran <richardcochran@gmail.com>
      CC: Willem de Bruijn <willemb@google.com>
      Signed-off-by: NMiroslav Lichvar <mlichvar@redhat.com>
      Acked-by: NWillem de Bruijn <willemb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b50a5c70
    • M
      net: add new control message for incoming HW-timestamped packets · aad9c8c4
      Miroslav Lichvar 提交于
      Add SOF_TIMESTAMPING_OPT_PKTINFO option to request a new control message
      for incoming packets with hardware timestamps. It contains the index of
      the real interface which received the packet and the length of the
      packet at layer 2.
      
      The index is useful with bonding, bridges and other interfaces, where
      IP_PKTINFO doesn't allow applications to determine which PHC made the
      timestamp. With the L2 length (and link speed) it is possible to
      transpose preamble timestamps to trailer timestamps, which are used in
      the NTP protocol.
      
      While this information could be provided by two new socket options
      independently from timestamping, it doesn't look like they would be very
      useful. With this option any performance impact is limited to hardware
      timestamping.
      
      Use dev_get_by_napi_id() to get the device and its index. On kernels
      with disabled CONFIG_NET_RX_BUSY_POLL or drivers not using NAPI, a zero
      index will be returned in the control message.
      
      CC: Richard Cochran <richardcochran@gmail.com>
      Acked-by: NWillem de Bruijn <willemb@google.com>
      Signed-off-by: NMiroslav Lichvar <mlichvar@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      aad9c8c4
  8. 18 4月, 2017 1 次提交
  9. 07 4月, 2017 1 次提交
  10. 22 3月, 2017 2 次提交
  11. 10 3月, 2017 2 次提交
    • D
      net: Work around lockdep limitation in sockets that use sockets · cdfbabfb
      David Howells 提交于
      Lockdep issues a circular dependency warning when AFS issues an operation
      through AF_RXRPC from a context in which the VFS/VM holds the mmap_sem.
      
      The theory lockdep comes up with is as follows:
      
       (1) If the pagefault handler decides it needs to read pages from AFS, it
           calls AFS with mmap_sem held and AFS begins an AF_RXRPC call, but
           creating a call requires the socket lock:
      
      	mmap_sem must be taken before sk_lock-AF_RXRPC
      
       (2) afs_open_socket() opens an AF_RXRPC socket and binds it.  rxrpc_bind()
           binds the underlying UDP socket whilst holding its socket lock.
           inet_bind() takes its own socket lock:
      
      	sk_lock-AF_RXRPC must be taken before sk_lock-AF_INET
      
       (3) Reading from a TCP socket into a userspace buffer might cause a fault
           and thus cause the kernel to take the mmap_sem, but the TCP socket is
           locked whilst doing this:
      
      	sk_lock-AF_INET must be taken before mmap_sem
      
      However, lockdep's theory is wrong in this instance because it deals only
      with lock classes and not individual locks.  The AF_INET lock in (2) isn't
      really equivalent to the AF_INET lock in (3) as the former deals with a
      socket entirely internal to the kernel that never sees userspace.  This is
      a limitation in the design of lockdep.
      
      Fix the general case by:
      
       (1) Double up all the locking keys used in sockets so that one set are
           used if the socket is created by userspace and the other set is used
           if the socket is created by the kernel.
      
       (2) Store the kern parameter passed to sk_alloc() in a variable in the
           sock struct (sk_kern_sock).  This informs sock_lock_init(),
           sock_init_data() and sk_clone_lock() as to the lock keys to be used.
      
           Note that the child created by sk_clone_lock() inherits the parent's
           kern setting.
      
       (3) Add a 'kern' parameter to ->accept() that is analogous to the one
           passed in to ->create() that distinguishes whether kernel_accept() or
           sys_accept4() was the caller and can be passed to sk_alloc().
      
           Note that a lot of accept functions merely dequeue an already
           allocated socket.  I haven't touched these as the new socket already
           exists before we get the parameter.
      
           Note also that there are a couple of places where I've made the accepted
           socket unconditionally kernel-based:
      
      	irda_accept()
      	rds_rcp_accept_one()
      	tcp_accept_from_sock()
      
           because they follow a sock_create_kern() and accept off of that.
      
      Whilst creating this, I noticed that lustre and ocfs don't create sockets
      through sock_create_kern() and thus they aren't marked as for-kernel,
      though they appear to be internal.  I wonder if these should do that so
      that they use the new set of lock keys.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cdfbabfb
    • A
      net: initialize msg.msg_flags in recvfrom · 9f138fa6
      Alexander Potapenko 提交于
      KMSAN reports a use of uninitialized memory in put_cmsg() because
      msg.msg_flags in recvfrom haven't been initialized properly.
      The flag values don't affect the result on this path, but it's still a
      good idea to initialize them explicitly.
      Signed-off-by: NAlexander Potapenko <glider@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9f138fa6
  12. 22 2月, 2017 1 次提交
    • M
      net: socket: fix recvmmsg not returning error from sock_error · e623a9e9
      Maxime Jayat 提交于
      Commit 34b88a68 ("net: Fix use after free in the recvmmsg exit path"),
      changed the exit path of recvmmsg to always return the datagrams
      variable and modified the error paths to set the variable to the error
      code returned by recvmsg if necessary.
      
      However in the case sock_error returned an error, the error code was
      then ignored, and recvmmsg returned 0.
      
      Change the error path of recvmmsg to correctly return the error code
      of sock_error.
      
      The bug was triggered by using recvmmsg on a CAN interface which was
      not up. Linux 4.6 and later return 0 in this case while earlier
      releases returned -ENETDOWN.
      
      Fixes: 34b88a68 ("net: Fix use after free in the recvmmsg exit path")
      Signed-off-by: NMaxime Jayat <maxime.jayat@mobile-devices.fr>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e623a9e9
  13. 11 1月, 2017 1 次提交
  14. 10 1月, 2017 1 次提交
  15. 05 1月, 2017 1 次提交
  16. 02 1月, 2017 1 次提交
  17. 26 12月, 2016 1 次提交
    • T
      ktime: Get rid of the union · 2456e855
      Thomas Gleixner 提交于
      ktime is a union because the initial implementation stored the time in
      scalar nanoseconds on 64 bit machine and in a endianess optimized timespec
      variant for 32bit machines. The Y2038 cleanup removed the timespec variant
      and switched everything to scalar nanoseconds. The union remained, but
      become completely pointless.
      
      Get rid of the union and just keep ktime_t as simple typedef of type s64.
      
      The conversion was done with coccinelle and some manual mopping up.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      2456e855
  18. 25 12月, 2016 1 次提交
  19. 11 12月, 2016 1 次提交
  20. 09 12月, 2016 1 次提交
  21. 30 11月, 2016 1 次提交
    • F
      tcp: SOF_TIMESTAMPING_OPT_STATS option for SO_TIMESTAMPING · 1c885808
      Francis Yan 提交于
      This patch exports the sender chronograph stats via the socket
      SO_TIMESTAMPING channel. Currently we can instrument how long a
      particular application unit of data was queued in TCP by tracking
      SOF_TIMESTAMPING_TX_SOFTWARE and SOF_TIMESTAMPING_TX_SCHED. Having
      these sender chronograph stats exported simultaneously along with
      these timestamps allow further breaking down the various sender
      limitation.  For example, a video server can tell if a particular
      chunk of video on a connection takes a long time to deliver because
      TCP was experiencing small receive window. It is not possible to
      tell before this patch without packet traces.
      
      To prepare these stats, the user needs to set
      SOF_TIMESTAMPING_OPT_STATS and SOF_TIMESTAMPING_OPT_TSONLY flags
      while requesting other SOF_TIMESTAMPING TX timestamps. When the
      timestamps are available in the error queue, the stats are returned
      in a separate control message of type SCM_TIMESTAMPING_OPT_STATS,
      in a list of TLVs (struct nlattr) of types: TCP_NLA_BUSY_TIME,
      TCP_NLA_RWND_LIMITED, TCP_NLA_SNDBUF_LIMITED. Unit is microsecond.
      Signed-off-by: NFrancis Yan <francisyyan@gmail.com>
      Signed-off-by: NYuchung Cheng <ycheng@google.com>
      Signed-off-by: NSoheil Hassas Yeganeh <soheil@google.com>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1c885808
  22. 17 11月, 2016 1 次提交
    • A
      xattr: Fix setting security xattrs on sockfs · 4a590153
      Andreas Gruenbacher 提交于
      The IOP_XATTR flag is set on sockfs because sockfs supports getting the
      "system.sockprotoname" xattr.  Since commit 6c6ef9f2, this flag is checked for
      setxattr support as well.  This is wrong on sockfs because security xattr
      support there is supposed to be provided by security_inode_setsecurity.  The
      smack security module relies on socket labels (xattrs).
      
      Fix this by adding a security xattr handler on sockfs that returns
      -EAGAIN, and by checking for -EAGAIN in setxattr.
      
      We cannot simply check for -EOPNOTSUPP in setxattr because there are
      filesystems that neither have direct security xattr support nor support
      via security_inode_setsecurity.  A more proper fix might be to move the
      call to security_inode_setsecurity into sockfs, but it's not clear to me
      if that is safe: we would end up calling security_inode_post_setxattr after
      that as well.
      Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      4a590153
  23. 10 11月, 2016 1 次提交
  24. 05 11月, 2016 1 次提交
    • L
      net: core: Add a UID field to struct sock. · 86741ec2
      Lorenzo Colitti 提交于
      Protocol sockets (struct sock) don't have UIDs, but most of the
      time, they map 1:1 to userspace sockets (struct socket) which do.
      
      Various operations such as the iptables xt_owner match need
      access to the "UID of a socket", and do so by following the
      backpointer to the struct socket. This involves taking
      sk_callback_lock and doesn't work when there is no socket
      because userspace has already called close().
      
      Simplify this by adding a sk_uid field to struct sock whose value
      matches the UID of the corresponding struct socket. The semantics
      are as follows:
      
      1. Whenever sk_socket is non-null: sk_uid is the same as the UID
         in sk_socket, i.e., matches the return value of sock_i_uid.
         Specifically, the UID is set when userspace calls socket(),
         fchown(), or accept().
      2. When sk_socket is NULL, sk_uid is defined as follows:
         - For a socket that no longer has a sk_socket because
           userspace has called close(): the previous UID.
         - For a cloned socket (e.g., an incoming connection that is
           established but on which userspace has not yet called
           accept): the UID of the socket it was cloned from.
         - For a socket that has never had an sk_socket: UID 0 inside
           the user namespace corresponding to the network namespace
           the socket belongs to.
      
      Kernel sockets created by sock_create_kern are a special case
      of #1 and sk_uid is the user that created them. For kernel
      sockets created at network namespace creation time, such as the
      per-processor ICMP and TCP sockets, this is the user that created
      the network namespace.
      Signed-off-by: NLorenzo Colitti <lorenzo@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      86741ec2
  25. 31 10月, 2016 1 次提交
    • A
      net: add an ioctl to get a socket network namespace · c62cce2c
      Andrey Vagin 提交于
      Each socket operates in a network namespace where it has been created,
      so if we want to dump and restore a socket, we have to know its network
      namespace.
      
      We have a socket_diag to get information about sockets, it doesn't
      report sockets which are not bound or connected.
      
      This patch introduces a new socket ioctl, which is called SIOCGSKNS
      and used to get a file descriptor for a socket network namespace.
      
      A task must have CAP_NET_ADMIN in a target network namespace to
      use this ioctl.
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NAndrei Vagin <avagin@openvz.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c62cce2c
  26. 08 10月, 2016 1 次提交
  27. 07 10月, 2016 2 次提交
  28. 20 5月, 2016 1 次提交
  29. 29 4月, 2016 1 次提交
  30. 11 4月, 2016 1 次提交
  31. 08 4月, 2016 1 次提交
  32. 05 4月, 2016 1 次提交
    • S
      sock: enable timestamping using control messages · c14ac945
      Soheil Hassas Yeganeh 提交于
      Currently, SOL_TIMESTAMPING can only be enabled using setsockopt.
      This is very costly when users want to sample writes to gather
      tx timestamps.
      
      Add support for enabling SO_TIMESTAMPING via control messages by
      using tsflags added in `struct sockcm_cookie` (added in the previous
      patches in this series) to set the tx_flags of the last skb created in
      a sendmsg. With this patch, the timestamp recording bits in tx_flags
      of the skbuff is overridden if SO_TIMESTAMPING is passed in a cmsg.
      
      Please note that this is only effective for overriding the recording
      timestamps flags. Users should enable timestamp reporting (e.g.,
      SOF_TIMESTAMPING_SOFTWARE | SOF_TIMESTAMPING_OPT_ID) using
      socket options and then should ask for SOF_TIMESTAMPING_TX_*
      using control messages per sendmsg to sample timestamps for each
      write.
      Signed-off-by: NSoheil Hassas Yeganeh <soheil@google.com>
      Acked-by: NWillem de Bruijn <willemb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c14ac945
  33. 29 3月, 2016 1 次提交
  34. 15 3月, 2016 1 次提交
  35. 14 3月, 2016 1 次提交
  36. 10 3月, 2016 1 次提交
    • T
      net: Add MSG_BATCH flag · f092276d
      Tom Herbert 提交于
      Add a new msg flag called MSG_BATCH. This flag is used in sendmsg to
      indicate that more messages will follow (i.e. a batch of messages is
      being sent). This is similar to MSG_MORE except that the following
      messages are not merged into one packet, they are sent individually.
      sendmmsg is updated so that each contained message except for the
      last one is marked as MSG_BATCH.
      
      MSG_BATCH is a performance optimization in cases where a socket
      implementation can benefit by transmitting packets in a batch.
      Signed-off-by: NTom Herbert <tom@herbertland.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f092276d