1. 27 6月, 2017 2 次提交
  2. 26 6月, 2017 4 次提交
  3. 25 6月, 2017 3 次提交
  4. 24 6月, 2017 6 次提交
  5. 23 6月, 2017 2 次提交
  6. 22 6月, 2017 4 次提交
  7. 21 6月, 2017 18 次提交
    • D
      net: introduce SO_PEERGROUPS getsockopt · 28b5ba2a
      David Herrmann 提交于
      This adds the new getsockopt(2) option SO_PEERGROUPS on SOL_SOCKET to
      retrieve the auxiliary groups of the remote peer. It is designed to
      naturally extend SO_PEERCRED. That is, the underlying data is from the
      same credentials. Regarding its syntax, it is based on SO_PEERSEC. That
      is, if the provided buffer is too small, ERANGE is returned and @optlen
      is updated. Otherwise, the information is copied, @optlen is set to the
      actual size, and 0 is returned.
      
      While SO_PEERCRED (and thus `struct ucred') already returns the primary
      group, it lacks the auxiliary group vector. However, nearly all access
      controls (including kernel side VFS and SYSVIPC, but also user-space
      polkit, DBus, ...) consider the entire set of groups, rather than just
      the primary group. But this is currently not possible with pure
      SO_PEERCRED. Instead, user-space has to work around this and query the
      system database for the auxiliary groups of a UID retrieved via
      SO_PEERCRED.
      
      Unfortunately, there is no race-free way to query the auxiliary groups
      of the PID/UID retrieved via SO_PEERCRED. Hence, the current user-space
      solution is to use getgrouplist(3p), which itself falls back to NSS and
      whatever is configured in nsswitch.conf(3). This effectively checks
      which groups we *would* assign to the user if it logged in *now*. On
      normal systems it is as easy as reading /etc/group, but with NSS it can
      resort to quering network databases (eg., LDAP), using IPC or network
      communication.
      
      Long story short: Whenever we want to use auxiliary groups for access
      checks on IPC, we need further IPC to talk to the user/group databases,
      rather than just relying on SO_PEERCRED and the incoming socket. This
      is unfortunate, and might even result in dead-locks if the database
      query uses the same IPC as the original request.
      
      So far, those recursions / dead-locks have been avoided by using
      primitive IPC for all crucial NSS modules. However, we want to avoid
      re-inventing the wheel for each NSS module that might be involved in
      user/group queries. Hence, we would preferably make DBus (and other IPC
      that supports access-management based on groups) work without resorting
      to the user/group database. This new SO_PEERGROUPS ioctl would allow us
      to make dbus-daemon work without ever calling into NSS.
      
      Cc: Michal Sekletar <msekleta@redhat.com>
      Cc: Simon McVittie <simon.mcvittie@collabora.co.uk>
      Reviewed-by: NTom Gundersen <teg@jklm.no>
      Signed-off-by: NDavid Herrmann <dh.herrmann@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      28b5ba2a
    • P
      udp: prefetch rmem_alloc in udp_queue_rcv_skb() · dd99e425
      Paolo Abeni 提交于
      On UDP packets processing, if the BH is the bottle-neck, it
      always sees a cache miss while updating rmem_alloc; try to
      avoid it prefetching the value as soon as we have the socket
      available.
      
      Performances under flood with multiple NIC rx queues used are
      unaffected, but when a single NIC rx queue is in use, this
      gives ~10% performance improvement.
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dd99e425
    • J
      ip6mr: add netlink notifications on mrt6msg cache reports · dd12d15c
      Julien Gomes 提交于
      Add Netlink notifications on cache reports in ip6mr, in addition to the
      existing mrt6msg sent to mroute6_sk.
      Send RTM_NEWCACHEREPORT notifications to RTNLGRP_IPV6_MROUTE_R.
      
      MSGTYPE, MIF_ID, SRC_ADDR and DST_ADDR Netlink attributes contain the
      same data as their equivalent fields in the mrt6msg header.
      PKT attribute is the packet sent to mroute6_sk, without the added
      mrt6msg header.
      Suggested-by: NRyan Halbrook <halbrook@arista.com>
      Signed-off-by: NJulien Gomes <julien@arista.com>
      Reviewed-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dd12d15c
    • J
      ipmr: add netlink notifications on igmpmsg cache reports · 5a645dd8
      Julien Gomes 提交于
      Add Netlink notifications on cache reports in ipmr, in addition to the
      existing igmpmsg sent to mroute_sk.
      Send RTM_NEWCACHEREPORT notifications to RTNLGRP_IPV4_MROUTE_R.
      
      MSGTYPE, VIF_ID, SRC_ADDR and DST_ADDR Netlink attributes contain the
      same data as their equivalent fields in the igmpmsg header.
      PKT attribute is the packet sent to mroute_sk, without the added igmpmsg
      header.
      Suggested-by: NRyan Halbrook <halbrook@arista.com>
      Signed-off-by: NJulien Gomes <julien@arista.com>
      Reviewed-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5a645dd8
    • J
      rtnetlink: add restricted rtnl groups for ipv4 and ipv6 mroute · 5f729eaa
      Julien Gomes 提交于
      Add RTNLGRP_{IPV4,IPV6}_MROUTE_R as two new restricted groups for the
      NETLINK_ROUTE family.
      Binding to these groups specifically requires CAP_NET_ADMIN to allow
      multicast of sensitive messages (e.g. mroute cache reports).
      Suggested-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NJulien Gomes <julien@arista.com>
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5f729eaa
    • A
      tcp: md5: hide unused variable · 083a0326
      Arnd Bergmann 提交于
      Changing from a memcpy to per-member comparison left the
      size variable unused:
      
      net/ipv4/tcp_ipv4.c: In function 'tcp_md5_do_lookup':
      net/ipv4/tcp_ipv4.c:910:15: error: unused variable 'size' [-Werror=unused-variable]
      
      This does not show up when CONFIG_IPV6 is enabled, but the
      variable can be removed either way, along with the now unused
      assignment.
      
      Fixes: 6797318e ("tcp: md5: add an address prefix for key lookup")
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      083a0326
    • W
      igmp: add a missing spin_lock_init() · b4846fc3
      WANG Cong 提交于
      Andrey reported a lockdep warning on non-initialized
      spinlock:
      
       INFO: trying to register non-static key.
       the code is fine but needs lockdep annotation.
       turning off the locking correctness validator.
       CPU: 1 PID: 4099 Comm: a.out Not tainted 4.12.0-rc6+ #9
       Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
       Call Trace:
        __dump_stack lib/dump_stack.c:16
        dump_stack+0x292/0x395 lib/dump_stack.c:52
        register_lock_class+0x717/0x1aa0 kernel/locking/lockdep.c:755
        ? 0xffffffffa0000000
        __lock_acquire+0x269/0x3690 kernel/locking/lockdep.c:3255
        lock_acquire+0x22d/0x560 kernel/locking/lockdep.c:3855
        __raw_spin_lock_bh ./include/linux/spinlock_api_smp.h:135
        _raw_spin_lock_bh+0x36/0x50 kernel/locking/spinlock.c:175
        spin_lock_bh ./include/linux/spinlock.h:304
        ip_mc_clear_src+0x27/0x1e0 net/ipv4/igmp.c:2076
        igmpv3_clear_delrec+0xee/0x4f0 net/ipv4/igmp.c:1194
        ip_mc_destroy_dev+0x4e/0x190 net/ipv4/igmp.c:1736
      
      We miss a spin_lock_init() in igmpv3_add_delrec(), probably
      because previously we never use it on this code path. Since
      we already unlink it from the global mc_tomb list, it is
      probably safe not to acquire this spinlock here. It does not
      harm to have it although, to avoid conditional locking.
      
      Fixes: c38b7d32 ("igmp: acquire pmc lock for ip_mc_clear_src()")
      Reported-by: NAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b4846fc3
    • S
      rtnetlink: add IFLA_GROUP to ifla_policy · db833d40
      Serhey Popovych 提交于
      Network interface groups support added while ago, however
      there is no IFLA_GROUP attribute description in policy
      and netlink message size calculations until now.
      
      Add IFLA_GROUP attribute to the policy.
      
      Fixes: cbda10fa ("net_device: add support for network device groups")
      Signed-off-by: NSerhey Popovych <serhe.popovych@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      db833d40
    • S
      ipv6: Do not leak throw route references · 07f61557
      Serhey Popovych 提交于
      While commit 73ba57bf ("ipv6: fix backtracking for throw routes")
      does good job on error propagation to the fib_rules_lookup()
      in fib rules core framework that also corrects throw routes
      handling, it does not solve route reference leakage problem
      happened when we return -EAGAIN to the fib_rules_lookup()
      and leave routing table entry referenced in arg->result.
      
      If rule with matched throw route isn't last matched in the
      list we overwrite arg->result losing reference on throw
      route stored previously forever.
      
      We also partially revert commit ab997ad4 ("ipv6: fix the
      incorrect return value of throw route") since we never return
      routing table entry with dst.error == -EAGAIN when
      CONFIG_IPV6_MULTIPLE_TABLES is on. Also there is no point
      to check for RTF_REJECT flag since it is always set throw
      route.
      
      Fixes: 73ba57bf ("ipv6: fix backtracking for throw routes")
      Signed-off-by: NSerhey Popovych <serhe.popovych@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      07f61557
    • X
      sctp: handle errors when updating asoc · 5ee8aa68
      Xin Long 提交于
      It's a bad thing not to handle errors when updating asoc. The memory
      allocation failure in any of the functions called in sctp_assoc_update()
      would cause sctp to work unexpectedly.
      
      This patch is to fix it by aborting the asoc and reporting the error when
      any of these functions fails.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5ee8aa68
    • X
      sctp: uncork the old asoc before changing to the new one · 8cd5c25f
      Xin Long 提交于
      local_cork is used to decide if it should uncork asoc outq after processing
      some cmds, and it is set when replying or sending msgs. local_cork should
      always have the same value with current asoc q->cork in some way.
      
      The thing is when changing to a new asoc by cmd SET_ASOC, local_cork may
      not be consistent with the current asoc any more. The cmd seqs can be:
      
        SCTP_CMD_UPDATE_ASSOC (asoc)
        SCTP_CMD_REPLY (asoc)
        SCTP_CMD_SET_ASOC (new_asoc)
        SCTP_CMD_DELETE_TCB (new_asoc)
        SCTP_CMD_SET_ASOC (asoc)
        SCTP_CMD_REPLY (asoc)
      
      The 1st REPLY makes OLD asoc q->cork and local_cork both are 1, and the cmd
      DELETE_TCB clears NEW asoc q->cork and local_cork. After asoc goes back to
      OLD asoc, q->cork is still 1 while local_cork is 0. The 2nd REPLY will not
      set local_cork because q->cork is already set and it can't be uncorked and
      sent out because of this.
      
      To keep local_cork consistent with the current asoc q->cork, this patch is
      to uncork the old asoc if local_cork is set before changing to the new one.
      
      Note that the above cmd seqs will be used in the next patch when updating
      asoc and handling errors in it.
      Suggested-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8cd5c25f
    • X
      dccp: call inet_add_protocol after register_pernet_subsys in dccp_v6_init · a0f9a4c2
      Xin Long 提交于
      Patch "call inet_add_protocol after register_pernet_subsys in dccp_v4_init"
      fixed a null pointer dereference issue for dccp_ipv4 module.
      
      The same fix is needed for dccp_ipv6 module.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a0f9a4c2
    • X
      dccp: call inet_add_protocol after register_pernet_subsys in dccp_v4_init · d5494acb
      Xin Long 提交于
      Now dccp_ipv4 works as a kernel module. During loading this module, if
      one dccp packet is being recieved after inet_add_protocol but before
      register_pernet_subsys in which v4_ctl_sk is initialized, a null pointer
      dereference may be triggered because of init_net.dccp.v4_ctl_sk is 0x0.
      
      Jianlin found this issue when the following call trace occurred:
      
      [  171.950177] BUG: unable to handle kernel NULL pointer dereference at 0000000000000110
      [  171.951007] IP: [<ffffffffc0558364>] dccp_v4_ctl_send_reset+0xc4/0x220 [dccp_ipv4]
      [...]
      [  171.984629] Call Trace:
      [  171.984859]  <IRQ>
      [  171.985061]
      [  171.985213]  [<ffffffffc0559a53>] dccp_v4_rcv+0x383/0x3f9 [dccp_ipv4]
      [  171.985711]  [<ffffffff815ca054>] ip_local_deliver_finish+0xb4/0x1f0
      [  171.986309]  [<ffffffff815ca339>] ip_local_deliver+0x59/0xd0
      [  171.986852]  [<ffffffff810cd7a4>] ? update_curr+0x104/0x190
      [  171.986956]  [<ffffffff815c9cda>] ip_rcv_finish+0x8a/0x350
      [  171.986956]  [<ffffffff815ca666>] ip_rcv+0x2b6/0x410
      [  171.986956]  [<ffffffff810c83b4>] ? task_cputime+0x44/0x80
      [  171.986956]  [<ffffffff81586f22>] __netif_receive_skb_core+0x572/0x7c0
      [  171.986956]  [<ffffffff810d2c51>] ? trigger_load_balance+0x61/0x1e0
      [  171.986956]  [<ffffffff81587188>] __netif_receive_skb+0x18/0x60
      [  171.986956]  [<ffffffff8158841e>] process_backlog+0xae/0x180
      [  171.986956]  [<ffffffff8158799d>] net_rx_action+0x16d/0x380
      [  171.986956]  [<ffffffff81090b7f>] __do_softirq+0xef/0x280
      [  171.986956]  [<ffffffff816b6a1c>] call_softirq+0x1c/0x30
      
      This patch is to move inet_add_protocol after register_pernet_subsys in
      dccp_v4_init, so that v4_ctl_sk is initialized before any incoming dccp
      packets are processed.
      Reported-by: NJianlin Shi <jishi@redhat.com>
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d5494acb
    • M
      vxlan: get rid of redundant vxlan_dev.flags · dc5321d7
      Matthias Schiffer 提交于
      There is no good reason to keep the flags twice in vxlan_dev and
      vxlan_config.
      Signed-off-by: NMatthias Schiffer <mschiffer@universe-factory.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dc5321d7
    • Y
    • Y
      net: introduce __skb_put_[zero, data, u8] · de77b966
      yuan linyu 提交于
      follow Johannes Berg, semantic patch file as below,
      @@
      identifier p, p2;
      expression len;
      expression skb;
      type t, t2;
      @@
      (
      -p = __skb_put(skb, len);
      +p = __skb_put_zero(skb, len);
      |
      -p = (t)__skb_put(skb, len);
      +p = __skb_put_zero(skb, len);
      )
      ... when != p
      (
      p2 = (t2)p;
      -memset(p2, 0, len);
      |
      -memset(p, 0, len);
      )
      
      @@
      identifier p;
      expression len;
      expression skb;
      type t;
      @@
      (
      -t p = __skb_put(skb, len);
      +t p = __skb_put_zero(skb, len);
      )
      ... when != p
      (
      -memset(p, 0, len);
      )
      
      @@
      type t, t2;
      identifier p, p2;
      expression skb;
      @@
      t *p;
      ...
      (
      -p = __skb_put(skb, sizeof(t));
      +p = __skb_put_zero(skb, sizeof(t));
      |
      -p = (t *)__skb_put(skb, sizeof(t));
      +p = __skb_put_zero(skb, sizeof(t));
      )
      ... when != p
      (
      p2 = (t2)p;
      -memset(p2, 0, sizeof(*p));
      |
      -memset(p, 0, sizeof(*p));
      )
      
      @@
      expression skb, len;
      @@
      -memset(__skb_put(skb, len), 0, len);
      +__skb_put_zero(skb, len);
      
      @@
      expression skb, len, data;
      @@
      -memcpy(__skb_put(skb, len), data, len);
      +__skb_put_data(skb, data, len);
      
      @@
      expression SKB, C, S;
      typedef u8;
      identifier fn = {__skb_put};
      fresh identifier fn2 = fn ## "_u8";
      @@
      - *(u8 *)fn(SKB, S) = C;
      + fn2(SKB, C);
      Signed-off-by: Nyuan linyu <Linyu.Yuan@alcatel-sbell.com.cn>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      de77b966
    • S
      net/core: remove explicit do_softirq() from busy_poll_stop() · fe420d87
      Sebastian Siewior 提交于
      Since commit 217f6974 ("net: busy-poll: allow preemption in
      sk_busy_loop()") there is an explicit do_softirq() invocation after
      local_bh_enable() has been invoked.
      I don't understand why we need this because local_bh_enable() will
      invoke do_softirq() once the softirq counter reached zero and we have
      softirq-related work pending.
      Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fe420d87
    • S
      fib_rules: Resolve goto rules target on delete · bdaf32c3
      Serhey Popovych 提交于
      We should avoid marking goto rules unresolved when their
      target is actually reachable after rule deletion.
      
      Consolder following sample scenario:
      
        # ip -4 ru sh
        0:      from all lookup local
        32000:  from all goto 32100
        32100:  from all lookup main
        32100:  from all lookup default
        32766:  from all lookup main
        32767:  from all lookup default
      
        # ip -4 ru del pref 32100 table main
        # ip -4 ru sh
        0:      from all lookup local
        32000:  from all goto 32100 [unresolved]
        32100:  from all lookup default
        32766:  from all lookup main
        32767:  from all lookup default
      
      After removal of first rule with preference 32100 we
      mark all goto rules as unreachable, even when rule with
      same preference as removed one still present.
      
      Check if next rule with same preference is available
      and make all rules with goto action pointing to it.
      Signed-off-by: NSerhey Popovych <serhe.popovych@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bdaf32c3
  8. 20 6月, 2017 1 次提交