1. 18 2月, 2020 1 次提交
  2. 10 12月, 2019 1 次提交
  3. 15 6月, 2019 1 次提交
  4. 12 6月, 2019 1 次提交
  5. 31 5月, 2019 1 次提交
  6. 20 5月, 2019 1 次提交
  7. 13 4月, 2019 1 次提交
  8. 22 2月, 2019 1 次提交
  9. 20 1月, 2019 1 次提交
  10. 15 12月, 2018 1 次提交
  11. 16 10月, 2018 1 次提交
    • D
      netlink: Add answer_flags to netlink_callback · 22e6c58b
      David Ahern 提交于
      With dump filtering we need a way to ensure the NLM_F_DUMP_FILTERED
      flag is set on a message back to the user if the data returned is
      influenced by some input attributes. Normally this can be done as
      messages are added to the skb, but if the filter results in no data
      being returned, the user could be confused as to why.
      
      This patch adds answer_flags to the netlink_callback allowing dump
      handlers to set the NLM_F_DUMP_FILTERED at a minimum in the
      NLMSG_DONE message ensuring the flag gets back to the user.
      
      The netlink_callback space is initialized to 0 via a memset in
      __netlink_dump_start, so init of the new answer_flags is covered.
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      22e6c58b
  12. 09 10月, 2018 2 次提交
    • D
      netlink: Add new socket option to enable strict checking on dumps · 89d35528
      David Ahern 提交于
      Add a new socket option, NETLINK_DUMP_STRICT_CHK, that userspace
      can use via setsockopt to request strict checking of headers and
      attributes on dump requests.
      
      To get dump features such as kernel side filtering based on data in
      the header or attributes appended to the dump request, userspace
      must call setsockopt() for NETLINK_DUMP_STRICT_CHK and a non-zero
      value. Since the netlink sock and its flags are private to the
      af_netlink code, the strict checking flag is passed to dump handlers
      via a flag in the netlink_callback struct.
      
      For old userspace on new kernel there is no impact as all of the data
      checks in later patches are wrapped in a check on the new strict flag.
      
      For new userspace on old kernel, the setsockopt will fail and even if
      new userspace sets data in the headers and appended attributes the
      kernel will silently ignore it. Moving forward when the setsockopt
      succeeds, the new userspace on old kernel means the dump request can
      pass an attribute the kernel does not understand. The dump will then
      fail as the older kernel does not understand it.
      
      New userspace on new kernel setting the socket option gets the benefit
      of the improved data dump.
      
      Kernel side the NETLINK_DUMP_STRICT_CHK uapi is converted to a generic
      NETLINK_F_STRICT_CHK flag which can potentially be leveraged for tighter
      checking on the NEW, DEL, and SET commands.
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Acked-by: NChristian Brauner <christian@brauner.io>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      89d35528
    • D
      netlink: Pass extack to dump handlers · 4a19edb6
      David Ahern 提交于
      Declare extack in netlink_dump and pass to dump handlers via
      netlink_callback. Add any extack message after the dump_done_errno
      allowing error messages to be returned. This will be useful when
      strict checking is done on dump requests, returning why the dump
      fails EINVAL.
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Acked-by: NChristian Brauner <christian@brauner.io>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4a19edb6
  13. 12 9月, 2018 1 次提交
  14. 06 9月, 2018 1 次提交
    • D
      netlink: Make groups check less stupid in netlink_bind() · 428f944b
      Dmitry Safonov 提交于
      As Linus noted, the test for 0 is needless, groups type can follow the
      usual kernel style and 8*sizeof(unsigned long) is BITS_PER_LONG:
      
      > The code [..] isn't technically incorrect...
      > But it is stupid.
      > Why stupid? Because the test for 0 is pointless.
      >
      > Just doing
      >        if (nlk->ngroups < 8*sizeof(groups))
      >                groups &= (1UL << nlk->ngroups) - 1;
      >
      > would have been fine and more understandable, since the "mask by shift
      > count" already does the right thing for a ngroups value of 0. Now that
      > test for zero makes me go "what's special about zero?". It turns out
      > that the answer to that is "nothing".
      [..]
      > The type of "groups" is kind of silly too.
      >
      > Yeah, "long unsigned int" isn't _technically_ wrong. But we normally
      > call that type "unsigned long".
      
      Cleanup my piece of pointlessness.
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Cc: netdev@vger.kernel.org
      Fairly-blamed-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NDmitry Safonov <dima@arista.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      428f944b
  15. 05 8月, 2018 1 次提交
    • D
      netlink: Don't shift on 64 for ngroups · 91874ecf
      Dmitry Safonov 提交于
      It's legal to have 64 groups for netlink_sock.
      
      As user-supplied nladdr->nl_groups is __u32, it's possible to subscribe
      only to first 32 groups.
      
      The check for correctness of .bind() userspace supplied parameter
      is done by applying mask made from ngroups shift. Which broke Android
      as they have 64 groups and the shift for mask resulted in an overflow.
      
      Fixes: 61f4b237 ("netlink: Don't shift with UB on nlk->ngroups")
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Cc: netdev@vger.kernel.org
      Cc: stable@vger.kernel.org
      Reported-and-Tested-by: NNathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: NDmitry Safonov <dima@arista.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      91874ecf
  16. 02 8月, 2018 1 次提交
    • J
      netlink: Fix spectre v1 gadget in netlink_create() · bc5b6c0b
      Jeremy Cline 提交于
      'protocol' is a user-controlled value, so sanitize it after the bounds
      check to avoid using it for speculative out-of-bounds access to arrays
      indexed by it.
      
      This addresses the following accesses detected with the help of smatch:
      
      * net/netlink/af_netlink.c:654 __netlink_create() warn: potential
        spectre issue 'nlk_cb_mutex_keys' [w]
      
      * net/netlink/af_netlink.c:654 __netlink_create() warn: potential
        spectre issue 'nlk_cb_mutex_key_strings' [w]
      
      * net/netlink/af_netlink.c:685 netlink_create() warn: potential spectre
        issue 'nl_table' [w] (local cap)
      
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: NJeremy Cline <jcline@redhat.com>
      Reviewed-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bc5b6c0b
  17. 31 7月, 2018 1 次提交
  18. 30 7月, 2018 1 次提交
  19. 25 7月, 2018 1 次提交
  20. 29 6月, 2018 1 次提交
    • L
      Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL · a11e1d43
      Linus Torvalds 提交于
      The poll() changes were not well thought out, and completely
      unexplained.  They also caused a huge performance regression, because
      "->poll()" was no longer a trivial file operation that just called down
      to the underlying file operations, but instead did at least two indirect
      calls.
      
      Indirect calls are sadly slow now with the Spectre mitigation, but the
      performance problem could at least be largely mitigated by changing the
      "->get_poll_head()" operation to just have a per-file-descriptor pointer
      to the poll head instead.  That gets rid of one of the new indirections.
      
      But that doesn't fix the new complexity that is completely unwarranted
      for the regular case.  The (undocumented) reason for the poll() changes
      was some alleged AIO poll race fixing, but we don't make the common case
      slower and more complex for some uncommon special case, so this all
      really needs way more explanations and most likely a fundamental
      redesign.
      
      [ This revert is a revert of about 30 different commits, not reverted
        individually because that would just be unnecessarily messy  - Linus ]
      
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a11e1d43
  21. 26 5月, 2018 1 次提交
  22. 16 5月, 2018 1 次提交
  23. 05 5月, 2018 1 次提交
  24. 08 4月, 2018 1 次提交
  25. 28 3月, 2018 1 次提交
  26. 26 3月, 2018 1 次提交
  27. 23 2月, 2018 1 次提交
    • J
      netlink: put module reference if dump start fails · b87b6194
      Jason A. Donenfeld 提交于
      Before, if cb->start() failed, the module reference would never be put,
      because cb->cb_running is intentionally false at this point. Users are
      generally annoyed by this because they can no longer unload modules that
      leak references. Also, it may be possible to tediously wrap a reference
      counter back to zero, especially since module.c still uses atomic_inc
      instead of refcount_inc.
      
      This patch expands the error path to simply call module_put if
      cb->start() fails.
      
      Fixes: 41c87425 ("netlink: do not set cb_running if dump's start() errs")
      Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b87b6194
  28. 13 2月, 2018 3 次提交
    • K
      net: Convert netlink_tap_net_ops · b86b47a3
      Kirill Tkhai 提交于
      These pernet_operations init just allocated net memory,
      and they obviously can be executed in parallel in any
      others.
      
      v3: New
      Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com>
      Acked-by: NAndrei Vagin <avagin@virtuozzo.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b86b47a3
    • K
      net: Convert netlink_net_ops · 194b95d2
      Kirill Tkhai 提交于
      The methods of netlink_net_ops create and destroy "netlink"
      file, which are not interesting for foreigh pernet_operations.
      So, netlink_net_ops may safely be made async.
      Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com>
      Acked-by: NAndrei Vagin <avagin@virtuozzo.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      194b95d2
    • D
      net: make getname() functions return length rather than use int* parameter · 9b2c45d4
      Denys Vlasenko 提交于
      Changes since v1:
      Added changes in these files:
          drivers/infiniband/hw/usnic/usnic_transport.c
          drivers/staging/lustre/lnet/lnet/lib-socket.c
          drivers/target/iscsi/iscsi_target_login.c
          drivers/vhost/net.c
          fs/dlm/lowcomms.c
          fs/ocfs2/cluster/tcp.c
          security/tomoyo/network.c
      
      Before:
      All these functions either return a negative error indicator,
      or store length of sockaddr into "int *socklen" parameter
      and return zero on success.
      
      "int *socklen" parameter is awkward. For example, if caller does not
      care, it still needs to provide on-stack storage for the value
      it does not need.
      
      None of the many FOO_getname() functions of various protocols
      ever used old value of *socklen. They always just overwrite it.
      
      This change drops this parameter, and makes all these functions, on success,
      return length of sockaddr. It's always >= 0 and can be differentiated
      from an error.
      
      Tests in callers are changed from "if (err)" to "if (err < 0)", where needed.
      
      rpc_sockname() lost "int buflen" parameter, since its only use was
      to be passed to kernel_getsockname() as &buflen and subsequently
      not used in any way.
      
      Userspace API is not changed.
      
          text    data     bss      dec     hex filename
      30108430 2633624  873672 33615726 200ef6e vmlinux.before.o
      30108109 2633612  873672 33615393 200ee21 vmlinux.o
      Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com>
      CC: David S. Miller <davem@davemloft.net>
      CC: linux-kernel@vger.kernel.org
      CC: netdev@vger.kernel.org
      CC: linux-bluetooth@vger.kernel.org
      CC: linux-decnet-user@lists.sourceforge.net
      CC: linux-wireless@vger.kernel.org
      CC: linux-rdma@vger.kernel.org
      CC: linux-sctp@vger.kernel.org
      CC: linux-nfs@vger.kernel.org
      CC: linux-x25@vger.kernel.org
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9b2c45d4
  29. 19 1月, 2018 1 次提交
  30. 17 1月, 2018 1 次提交
    • A
      net: delete /proc THIS_MODULE references · 96890d62
      Alexey Dobriyan 提交于
      /proc has been ignoring struct file_operations::owner field for 10 years.
      Specifically, it started with commit 786d7e16
      ("Fix rmmod/read/write races in /proc entries"). Notice the chunk where
      inode->i_fop is initialized with proxy struct file_operations for
      regular files:
      
      	-               if (de->proc_fops)
      	-                       inode->i_fop = de->proc_fops;
      	+               if (de->proc_fops) {
      	+                       if (S_ISREG(inode->i_mode))
      	+                               inode->i_fop = &proc_reg_file_ops;
      	+                       else
      	+                               inode->i_fop = de->proc_fops;
      	+               }
      
      VFS stopped pinning module at this point.
      Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      96890d62
  31. 16 1月, 2018 1 次提交
  32. 12 12月, 2017 1 次提交
    • K
      netlink: Add netns check on taps · 93c64764
      Kevin Cernekee 提交于
      Currently, a nlmon link inside a child namespace can observe systemwide
      netlink activity.  Filter the traffic so that nlmon can only sniff
      netlink messages from its own netns.
      
      Test case:
      
          vpnns -- bash -c "ip link add nlmon0 type nlmon; \
                            ip link set nlmon0 up; \
                            tcpdump -i nlmon0 -q -w /tmp/nlmon.pcap -U" &
          sudo ip xfrm state add src 10.1.1.1 dst 10.1.1.2 proto esp \
              spi 0x1 mode transport \
              auth sha1 0x6162633132330000000000000000000000000000 \
              enc aes 0x00000000000000000000000000000000
          grep --binary abc123 /tmp/nlmon.pcap
      Signed-off-by: NKevin Cernekee <cernekee@chromium.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      93c64764
  33. 11 12月, 2017 3 次提交
    • C
      netlink: convert netlink tap spinlock to mutex · b1042d35
      Cong Wang 提交于
      Both netlink_add_tap() and netlink_remove_tap() are
      called in process context, no need to bother spinlock.
      
      Note, in fact, currently we always hold RTNL when calling
      these two functions, so we don't need any other lock at
      all, but keeping this lock doesn't harm anything.
      
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b1042d35
    • C
      netlink: make netlink tap per netns · 25e3f70f
      Cong Wang 提交于
      nlmon device is not supposed to capture netlink events from
      other netns, so instead of filtering events, we can simply
      make netlink tap itself per netns.
      
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Kevin Cernekee <cernekee@chromium.org>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      25e3f70f
    • T
      rhashtable: Change rhashtable_walk_start to return void · 97a6ec4a
      Tom Herbert 提交于
      Most callers of rhashtable_walk_start don't care about a resize event
      which is indicated by a return value of -EAGAIN. So calls to
      rhashtable_walk_start are wrapped wih code to ignore -EAGAIN. Something
      like this is common:
      
             ret = rhashtable_walk_start(rhiter);
             if (ret && ret != -EAGAIN)
                     goto out;
      
      Since zero and -EAGAIN are the only possible return values from the
      function this check is pointless. The condition never evaluates to true.
      
      This patch changes rhashtable_walk_start to return void. This simplifies
      code for the callers that ignore -EAGAIN. For the few cases where the
      caller cares about the resize event, particularly where the table can be
      walked in mulitple parts for netlink or seq file dump, the function
      rhashtable_walk_start_check has been added that returns -EAGAIN on a
      resize event.
      Signed-off-by: NTom Herbert <tom@quantonium.net>
      Acked-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      97a6ec4a
  34. 14 11月, 2017 1 次提交
  35. 13 11月, 2017 1 次提交
    • J
      af_netlink: ensure that NLMSG_DONE never fails in dumps · 0642840b
      Jason A. Donenfeld 提交于
      The way people generally use netlink_dump is that they fill in the skb
      as much as possible, breaking when nla_put returns an error. Then, they
      get called again and start filling out the next skb, and again, and so
      forth. The mechanism at work here is the ability for the iterative
      dumping function to detect when the skb is filled up and not fill it
      past the brim, waiting for a fresh skb for the rest of the data.
      
      However, if the attributes are small and nicely packed, it is possible
      that a dump callback function successfully fills in attributes until the
      skb is of size 4080 (libmnl's default page-sized receive buffer size).
      The dump function completes, satisfied, and then, if it happens to be
      that this is actually the last skb, and no further ones are to be sent,
      then netlink_dump will add on the NLMSG_DONE part:
      
        nlh = nlmsg_put_answer(skb, cb, NLMSG_DONE, sizeof(len), NLM_F_MULTI);
      
      It is very important that netlink_dump does this, of course. However, in
      this example, that call to nlmsg_put_answer will fail, because the
      previous filling by the dump function did not leave it enough room. And
      how could it possibly have done so? All of the nla_put variety of
      functions simply check to see if the skb has enough tailroom,
      independent of the context it is in.
      
      In order to keep the important assumptions of all netlink dump users, it
      is therefore important to give them an skb that has this end part of the
      tail already reserved, so that the call to nlmsg_put_answer does not
      fail. Otherwise, library authors are forced to find some bizarre sized
      receive buffer that has a large modulo relative to the common sizes of
      messages received, which is ugly and buggy.
      
      This patch thus saves the NLMSG_DONE for an additional message, for the
      case that things are dangerously close to the brim. This requires
      keeping track of the errno from ->dump() across calls.
      Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0642840b