1. 05 4月, 2016 1 次提交
    • S
      sock: enable timestamping using control messages · c14ac945
      Soheil Hassas Yeganeh 提交于
      Currently, SOL_TIMESTAMPING can only be enabled using setsockopt.
      This is very costly when users want to sample writes to gather
      tx timestamps.
      
      Add support for enabling SO_TIMESTAMPING via control messages by
      using tsflags added in `struct sockcm_cookie` (added in the previous
      patches in this series) to set the tx_flags of the last skb created in
      a sendmsg. With this patch, the timestamp recording bits in tx_flags
      of the skbuff is overridden if SO_TIMESTAMPING is passed in a cmsg.
      
      Please note that this is only effective for overriding the recording
      timestamps flags. Users should enable timestamp reporting (e.g.,
      SOF_TIMESTAMPING_SOFTWARE | SOF_TIMESTAMPING_OPT_ID) using
      socket options and then should ask for SOF_TIMESTAMPING_TX_*
      using control messages per sendmsg to sample timestamps for each
      write.
      Signed-off-by: NSoheil Hassas Yeganeh <soheil@google.com>
      Acked-by: NWillem de Bruijn <willemb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c14ac945
  2. 15 3月, 2016 1 次提交
  3. 14 3月, 2016 1 次提交
  4. 10 3月, 2016 3 次提交
  5. 15 1月, 2016 1 次提交
    • V
      kmemcg: account certain kmem allocations to memcg · 5d097056
      Vladimir Davydov 提交于
      Mark those kmem allocations that are known to be easily triggered from
      userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
      memcg.  For the list, see below:
      
       - threadinfo
       - task_struct
       - task_delay_info
       - pid
       - cred
       - mm_struct
       - vm_area_struct and vm_region (nommu)
       - anon_vma and anon_vma_chain
       - signal_struct
       - sighand_struct
       - fs_struct
       - files_struct
       - fdtable and fdtable->full_fds_bits
       - dentry and external_name
       - inode for all filesystems. This is the most tedious part, because
         most filesystems overwrite the alloc_inode method.
      
      The list is far from complete, so feel free to add more objects.
      Nevertheless, it should be close to "account everything" approach and
      keep most workloads within bounds.  Malevolent users will be able to
      breach the limit, but this was possible even with the former "account
      everything" approach (simply because it did not account everything in
      fact).
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: NVladimir Davydov <vdavydov@virtuozzo.com>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5d097056
  6. 11 1月, 2016 1 次提交
  7. 31 12月, 2015 1 次提交
    • N
      net, socket, socket_wq: fix missing initialization of flags · 574aab1e
      Nicolai Stange 提交于
      Commit ceb5d58b ("net: fix sock_wake_async() rcu protection") from
      the current 4.4 release cycle introduced a new flags member in
      struct socket_wq and moved SOCKWQ_ASYNC_NOSPACE and SOCKWQ_ASYNC_WAITDATA
      from struct socket's flags member into that new place.
      
      Unfortunately, the new flags field is never initialized properly, at least
      not for the struct socket_wq instance created in sock_alloc_inode().
      
      One particular issue I encountered because of this is that my GNU Emacs
      failed to draw anything on my desktop -- i.e. what I got is a transparent
      window, including the title bar. Bisection lead to the commit mentioned
      above and further investigation by means of strace told me that Emacs
      is indeed speaking to my Xorg through an O_ASYNC AF_UNIX socket. This is
      reproducible 100% of times and the fact that properly initializing the
      struct socket_wq ->flags fixes the issue leads me to the conclusion that
      somehow SOCKWQ_ASYNC_WAITDATA got set in the uninitialized ->flags,
      preventing my Emacs from receiving any SIGIO's due to data becoming
      available and it got stuck.
      
      Make sock_alloc_inode() set the newly created struct socket_wq's ->flags
      member to zero.
      
      Fixes: ceb5d58b ("net: fix sock_wake_async() rcu protection")
      Signed-off-by: NNicolai Stange <nicstange@gmail.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      574aab1e
  8. 16 12月, 2015 1 次提交
  9. 02 12月, 2015 2 次提交
    • E
      net: fix sock_wake_async() rcu protection · ceb5d58b
      Eric Dumazet 提交于
      Dmitry provided a syzkaller (http://github.com/google/syzkaller)
      triggering a fault in sock_wake_async() when async IO is requested.
      
      Said program stressed af_unix sockets, but the issue is generic
      and should be addressed in core networking stack.
      
      The problem is that by the time sock_wake_async() is called,
      we should not access the @flags field of 'struct socket',
      as the inode containing this socket might be freed without
      further notice, and without RCU grace period.
      
      We already maintain an RCU protected structure, "struct socket_wq"
      so moving SOCKWQ_ASYNC_NOSPACE & SOCKWQ_ASYNC_WAITDATA into it
      is the safe route.
      
      It also reduces number of cache lines needing dirtying, so might
      provide a performance improvement anyway.
      
      In followup patches, we might move remaining flags (SOCK_NOSPACE,
      SOCK_PASSCRED, SOCK_PASSSEC) to save 8 bytes and let 'struct socket'
      being mostly read and let it being shared between cpus.
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ceb5d58b
    • E
      net: rename SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA · 9cd3e072
      Eric Dumazet 提交于
      This patch is a cleanup to make following patch easier to
      review.
      
      Goal is to move SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA
      from (struct socket)->flags to a (struct socket_wq)->flags
      to benefit from RCU protection in sock_wake_async()
      
      To ease backports, we rename both constants.
      
      Two new helpers, sk_set_bit(int nr, struct sock *sk)
      and sk_clear_bit(int net, struct sock *sk) are added so that
      following patch can change their implementation.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9cd3e072
  10. 29 9月, 2015 1 次提交
  11. 11 5月, 2015 2 次提交
  12. 16 4月, 2015 1 次提交
  13. 12 4月, 2015 3 次提交
  14. 09 4月, 2015 3 次提交
  15. 24 3月, 2015 1 次提交
  16. 21 3月, 2015 1 次提交
  17. 14 3月, 2015 1 次提交
  18. 13 3月, 2015 1 次提交
  19. 03 3月, 2015 1 次提交
  20. 02 3月, 2015 1 次提交
    • E
      net: move skb->dropcount to skb->cb[] · 744d5a3e
      Eyal Birger 提交于
      Commit 97775007 ("af_packet: add interframe drop cmsg (v6)")
      unionized skb->mark and skb->dropcount in order to allow recording
      of the socket drop count while maintaining struct sk_buff size.
      
      skb->dropcount was introduced since there was no available room
      in skb->cb[] in packet sockets. However, its introduction led to
      the inability to export skb->mark, or any other aliased field to
      userspace if so desired.
      
      Moving the dropcount metric to skb->cb[] eliminates this problem
      at the expense of 4 bytes less in skb->cb[] for protocol families
      using it.
      Signed-off-by: NEyal Birger <eyal.birger@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      744d5a3e
  21. 04 2月, 2015 2 次提交
  22. 29 1月, 2015 1 次提交
    • C
      net: remove sock_iocb · 7cc05662
      Christoph Hellwig 提交于
      The sock_iocb structure is allocate on stack for each read/write-like
      operation on sockets, and contains various fields of which only the
      embedded msghdr and sometimes a pointer to the scm_cookie is ever used.
      Get rid of the sock_iocb and put a msghdr directly on the stack and pass
      the scm_cookie explicitly to netlink_mmap_sendmsg.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7cc05662
  23. 28 1月, 2015 1 次提交
  24. 18 1月, 2015 1 次提交
  25. 16 1月, 2015 1 次提交
  26. 19 12月, 2014 1 次提交
  27. 11 12月, 2014 1 次提交
    • A
      make default ->i_fop have ->open() fail with ENXIO · bd9b51e7
      Al Viro 提交于
      As it is, default ->i_fop has NULL ->open() (along with all other methods).
      The only case where it matters is reopening (via procfs symlink) a file that
      didn't get its ->f_op from ->i_fop - anything else will have ->i_fop assigned
      to something sane (default would fail on read/write/ioctl/etc.).
      
      	Unfortunately, such case exists - alloc_file() users, especially
      anon_get_file() ones.  There we have tons of opened files of very different
      kinds sharing the same inode.  As the result, attempt to reopen those via
      procfs succeeds and you get a descriptor you can't do anything with.
      
      	Moreover, in case of sockets we set ->i_fop that will only be used
      on such reopen attempts - and put a failing ->open() into it to make sure
      those do not succeed.
      
      	It would be simpler to put such ->open() into default ->i_fop and leave
      it unchanged both for anon inode (as we do anyway) and for socket ones.  Result:
      	* everything going through do_dentry_open() works as it used to
      	* sock_no_open() kludge is gone
      	* attempts to reopen anon-inode files fail as they really ought to
      	* ditto for aio_private_file()
      	* ditto for perfmon - this one actually tried to imitate sock_no_open()
      trick, but failed to set ->i_fop, so in the current tree reopens succeed and
      yield completely useless descriptor.  Intent clearly had been to fail with
      -ENXIO on such reopens; now it actually does.
      	* everything else that used alloc_file() keeps working - it has ->i_fop
      set for its inodes anyway
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      bd9b51e7
  28. 10 12月, 2014 2 次提交
  29. 20 11月, 2014 2 次提交