1. 19 Jan, 2014 1 commit
  2. 07 Jan, 2014 2 commits
  3. 02 Jan, 2014 1 commit
  4. 01 Jan, 2014 2 commits
    • netlink: specify netlink packet direction for nlmon · 604d13c9
      Daniel Borkmann committed
      In order to facilitate development of netlink protocol dissectors,
      fill the unused field skb->pkt_type of the cloned skb with a hint
      of the address space of the new owner (receiver) socket, expressed
      as "to kernel" or "to user".
      
      At the time we invoke __netlink_deliver_tap_skb(), we already have
      set the new skb owner via netlink_skb_set_owner_r(), so we can use
      that for netlink_is_kernel() probing.
      
      In normal PF_PACKET network traffic, this field denotes if the
      packet is destined for us (PACKET_HOST), if it's broadcast
      (PACKET_BROADCAST), etc.
      
      As we only have 3 bits reserved, we can use the value (= 6) of
      PACKET_FASTROUTE as it's _not used_ anywhere in the whole kernel
      and not supported anywhere, and packets of such type were never
      exposed to user space, so there are no overlapping users of such
      kind. Thus, as wished, that seems the only way to make both
      PACKET_* values non-overlapping and therefore device agnostic.
      
      By using those two flags for netlink skbs on nlmon devices, they
      can be made available and picked up via sll_pkttype (previously
      unused in netlink context) in struct sockaddr_ll. We now have
      these two directions:
      
       - PACKET_USER (= 6)    ->  to user space
       - PACKET_KERNEL (= 7)  ->  to kernel space
      
      Partial `ip a` example strace for sa_family=AF_NETLINK with
      detected nl msg direction:
      
      syscall:                     direction:
      sendto(3,  ...) = 40         /* to kernel */
      recvmsg(3, ...) = 3404       /* to user */
      recvmsg(3, ...) = 1120       /* to user */
      recvmsg(3, ...) = 20         /* to user */
      sendto(3,  ...) = 40         /* to kernel */
      recvmsg(3, ...) = 168        /* to user */
      recvmsg(3, ...) = 144        /* to user */
      recvmsg(3, ...) = 20         /* to user */
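
      As a rough illustration of how a dissector might consume the two
      values listed above, here is a minimal user-space sketch. The
      DEMO_ constants and the nl_direction() helper are hypothetical
      names introduced for this example; only the numeric values 6 and 7
      come from the commit message.

```c
/* Values taken from the commit message; the DEMO_ prefix marks them as
 * local stand-ins rather than the kernel header definitions. */
#define DEMO_PACKET_USER   6  /* to user space */
#define DEMO_PACKET_KERNEL 7  /* to kernel space */

/* Hypothetical helper: map an sll_pkttype value (as found in
 * struct sockaddr_ll) to a human-readable direction string. */
static const char *nl_direction(unsigned char pkttype)
{
    switch (pkttype) {
    case DEMO_PACKET_USER:
        return "to user";
    case DEMO_PACKET_KERNEL:
        return "to kernel";
    default:
        return "unknown"; /* any other PACKET_* value */
    }
}
```

      A capture tool reading from an nlmon device could call such a
      helper on each received frame to annotate it, much like the
      strace listing above.
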
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: Jakub Zawadzki <darkjames-ws@darkjames.pl>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netlink: only do not deliver to tap when both sides are kernel sks · 73bfd370
      Daniel Borkmann committed
      We should also deliver packets to nlmon devices when we are in
      netlink_unicast_kernel(), and only one of the {src,dst} sockets
      is user sk and the other one kernel sk. That's e.g. the case in
      netlink diag, netlink route, etc. Still, forbid delivering messages
      from kernel to kernel sks.
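
      The delivery rule described above can be modelled as a simple
      predicate. This is only a sketch: deliver_to_tap() and its
      parameters are hypothetical names and do not reflect the actual
      kernel code.

```c
#include <stdbool.h>

/* Hypothetical predicate modelling the rule from the commit message:
 * skip the tap only when both endpoints are kernel sockets. */
static bool deliver_to_tap(bool src_is_kernel, bool dst_is_kernel)
{
    return !(src_is_kernel && dst_is_kernel);
}
```

      With this model, a user-to-kernel or kernel-to-user message is
      delivered to the tap, and only kernel-to-kernel traffic is
      filtered out.
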
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: Jakub Zawadzki <darkjames-ws@darkjames.pl>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 29 Nov, 2013 2 commits
    • genetlink/pmcraid: use proper genetlink multicast API · 5e53e689
      Johannes Berg committed
      The pmcraid driver is abusing the genetlink API and is using its
      family ID as the multicast group ID, which is invalid and may
      belong to somebody else (and likely will.)
      
      Make it use the correct API, but since this may already be used
      as-is by userspace, reserve a family ID for this code and also
      reserve that group ID to not break userspace assumptions.
      
      My previous patch broke event delivery in the driver as I missed
      that it wasn't using the right API and forgot to update it later
      in my series.
      
      While changing this, I noticed that the genetlink code could use
      the static group ID instead of a strcmp(), so also do that for
      the VFS_DQUOT family.
      
      Cc: Anil Ravindranath <anil_ravindranath@pmc-sierra.com>
      Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • genetlink: Fix uninitialized variable in genl_validate_assign_mc_groups() · 0f0e2159
      Geert Uytterhoeven committed
      net/netlink/genetlink.c: In function ‘genl_validate_assign_mc_groups’:
      net/netlink/genetlink.c:217: warning: ‘err’ may be used uninitialized in this function
      
      Commit 2a94fe48 ("genetlink: make multicast groups const, prevent
      abuse") split genl_register_mc_group() into multiple functions, but
      dropped the initialization of err.
      
      Initialize err to zero to fix this.
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 22 Nov, 2013 1 commit
    • genetlink: fix genlmsg_multicast() bug · 220815a9
      Johannes Berg committed
      Unfortunately, I introduced a tremendously stupid bug into
      genlmsg_multicast() when doing all those multicast group
      changes: it adjusts the group number, but then passes it
      to genlmsg_multicast_netns() which does that again.
      
      Somehow, my tests failed to catch this, so add a warning
      into genlmsg_multicast_netns() and remove the offending
      group ID adjustment.
      
      Also add a warning to the similar code in other functions
      so people who misuse them are more loudly warned.
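
      To make the double adjustment concrete, here is a simplified
      user-space model. All names (struct demo_family, its fields, and
      the demo_* functions) are hypothetical and chosen for
      illustration; the real helpers send a message rather than return
      a group ID.

```c
/* Hypothetical stand-in for a genetlink family; the field names are
 * illustrative and not taken from the commit message. */
struct demo_family {
    unsigned int mcgrp_offset; /* first global group ID owned by the family */
    unsigned int n_mcgrps;     /* number of groups the family registered */
};

/* Models the inner helper: takes a family-relative group index, rejects
 * out-of-range values (standing in for the added warning), and returns
 * the global group ID, or -1 if the index is invalid. */
static int demo_multicast_netns(const struct demo_family *f, unsigned int group)
{
    if (group >= f->n_mcgrps)
        return -1;
    return (int)(f->mcgrp_offset + group);
}

/* Buggy wrapper: adjusts the group, then calls a helper that adjusts again. */
static int demo_multicast_buggy(const struct demo_family *f, unsigned int group)
{
    return demo_multicast_netns(f, f->mcgrp_offset + group);
}

/* Fixed wrapper: passes the family-relative index through unchanged. */
static int demo_multicast_fixed(const struct demo_family *f, unsigned int group)
{
    return demo_multicast_netns(f, group);
}
```

      In this model, a family with a non-zero offset makes the buggy
      wrapper trip the range check, whereas a family whose offset
      happens to be 0 would hide the bug, which may be one way tests
      could miss it.
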
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 21 Nov, 2013 1 commit
    • net: rework recvmsg handler msg_name and msg_namelen logic · f3d33426
      Hannes Frederic Sowa committed
      This patch now always passes msg->msg_namelen as 0. recvmsg handlers must
      set msg_namelen to the proper size <= sizeof(struct sockaddr_storage)
      to return msg_name to the user.
      
      This prevents numerous uninitialized memory leaks we had in the
      recvmsg handlers and makes it harder for new code to accidentally leak
      uninitialized memory.
      
      Optimize for the case where recvfrom is called with NULL as the address. We
      don't need to copy the address at all, so set it to NULL before invoking the
      recvmsg handler. We can do so because all the recvmsg handlers must
      cope with the case where a plain read() is called on them. read() also sets
      msg_name to NULL.
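
      The contract described above can be sketched as a small user-space
      model. struct demo_msghdr, demo_prepare() and demo_handler() are
      hypothetical names that only mimic the roles of the socket layer
      and a recvmsg handler; the sketch assumes the caller's buffer is
      large enough for the address.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced stand-in for the two msghdr fields discussed. */
struct demo_msghdr {
    void *msg_name;   /* where to store the peer address, or NULL */
    int   msg_namelen; /* size of the stored address; 0 if none */
};

/* Caller side: always start with msg_namelen = 0, and leave msg_name
 * NULL when the user did not ask for an address. */
static void demo_prepare(struct demo_msghdr *msg, void *user_addr_buf)
{
    msg->msg_name = user_addr_buf;
    msg->msg_namelen = 0;
}

/* Handler side: copy the address only if one was requested, and report
 * its size by setting msg_namelen explicitly. */
static void demo_handler(struct demo_msghdr *msg, const void *addr, int addr_len)
{
    if (msg->msg_name) {
        memcpy(msg->msg_name, addr, (size_t)addr_len);
        msg->msg_namelen = addr_len;
    }
}
```

      Because msg_namelen starts at 0, a handler that forgets to set it
      returns no address at all instead of exposing stale memory.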
      
      Also document these changes in include/linux/net.h as suggested by David
      Miller.
      
      Changes since RFC:
      
      Set msg->msg_name = NULL if user specified a NULL in msg_name but had a
      non-null msg_namelen in verify_iovec/verify_compat_iovec. This doesn't
      affect sendto as it would bail out earlier while trying to copy-in the
      address. It also more naturally reflects the logic by the callers of
      verify_iovec.
      
      With this change in place I could remove "
      if (!uaddr || msg_sys->msg_namelen == 0)
      	msg->msg_name = NULL
      ".
      
      This change does not alter the user visible error logic as we ignore
      msg_namelen as long as msg_name is NULL.
      
      Also remove two unnecessary curly brackets in ___sys_recvmsg and change
      comments to netdev style.
      
      Cc: David Miller <davem@davemloft.net>
      Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  8. 20 Nov, 2013 8 commits
  9. 19 Nov, 2013 1 commit
  10. 16 Nov, 2013 1 commit
  11. 15 Nov, 2013 3 commits
  12. 07 Sep, 2013 1 commit
  13. 29 Aug, 2013 2 commits
  14. 23 Aug, 2013 1 commit
  15. 16 Aug, 2013 1 commit
  16. 13 Aug, 2013 1 commit
    • genetlink: fix family dump race · 58ad436f
      Johannes Berg committed
      When dumping generic netlink families, only the first dump call
      is locked with genl_lock(), which protects the list of families,
      and thus subsequent calls can access the data without locking,
      racing against family addition/removal. This can cause a crash.
      Fix it - the locking needs to be conditional because the first
      time around it's already locked.
      
      A similar bug was reported to me on an old kernel (3.4.47), but
      the exact scenario that happened there is no longer possible;
      on those kernels the first round wasn't locked either. Looking
      at the current code I found the race described above, which had
      also existed on the old kernel.
      
      Cc: stable@vger.kernel.org
      Reported-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  17. 03 Aug, 2013 1 commit
  18. 31 Jul, 2013 1 commit
    • genetlink: fix usage of NLM_F_EXCL or NLM_F_REPLACE · e1ee3673
      Pablo Neira committed
      Currently, it is not possible to use either NLM_F_EXCL or
      NLM_F_REPLACE from genetlink. This is due to this check in
      genl_family_rcv_msg:
      
      	if (nlh->nlmsg_flags & NLM_F_DUMP)
      
      NLM_F_DUMP is NLM_F_MATCH|NLM_F_ROOT. Thus, if NLM_F_EXCL or
      NLM_F_REPLACE flag is set, genetlink believes that you're
      requesting a dump and it calls the .dumpit callback.
      
      The solution that I propose is to refine this check to
      make it stricter:
      
      	if ((nlh->nlmsg_flags & NLM_F_DUMP) == NLM_F_DUMP)
      
      And given the combination NLM_F_REPLACE and NLM_F_EXCL does
      not make sense to me, it removes the ambiguity.
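
      The difference between the two checks can be reproduced in a few
      lines of user-space C. The flag values below mirror those in
      linux/netlink.h; the DEMO_ names and the two helper functions are
      hypothetical and exist only for this sketch.

```c
#include <stdbool.h>

/* Flag values mirror linux/netlink.h; the DEMO_ names are local. */
#define DEMO_NLM_F_ROOT    0x100 /* GET modifier */
#define DEMO_NLM_F_MATCH   0x200 /* GET modifier */
#define DEMO_NLM_F_DUMP    (DEMO_NLM_F_ROOT | DEMO_NLM_F_MATCH)
#define DEMO_NLM_F_REPLACE 0x100 /* NEW modifier, same bit as ROOT */
#define DEMO_NLM_F_EXCL    0x200 /* NEW modifier, same bit as MATCH */

/* Old check: true if any bit of the composite is set. */
static bool is_dump_loose(unsigned int flags)
{
    return (flags & DEMO_NLM_F_DUMP) != 0;
}

/* Refined check: true only if every bit of the composite is set. */
static bool is_dump_strict(unsigned int flags)
{
    return (flags & DEMO_NLM_F_DUMP) == DEMO_NLM_F_DUMP;
}
```

      With these definitions, a request carrying only NLM_F_EXCL is
      treated as a dump by the loose check but not by the strict one.
      Setting both NLM_F_REPLACE and NLM_F_EXCL still matches the strict
      check, which is the combination the message argues is not
      meaningful anyway.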
      
      There was a patch that tried to fix this some time ago (0ab03c2b
      netlink: test for all flags of the NLM_F_DUMP composite) but it
      tried to resolve this ambiguity in *all* existing netlink subsystems,
      not only genetlink. That patch was reverted since it broke iproute2,
      which is using NLM_F_ROOT to request the dump of the routing cache.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  19. 28 Jul, 2013 1 commit
    • genetlink: release cb_lock before requesting additional module · c74f2b26
      Stanislaw Gruszka committed
      Requesting an external module with cb_lock taken can result in
      a deadlock like the one shown below:
      
      [ 2458.111347] Showing all locks held in the system:
      [ 2458.111347] 1 lock held by NetworkManager/582:
      [ 2458.111347]  #0:  (cb_lock){++++++}, at: [<ffffffff8162bc79>] genl_rcv+0x19/0x40
      [ 2458.111347] 1 lock held by modprobe/603:
      [ 2458.111347]  #0:  (cb_lock){++++++}, at: [<ffffffff8162baa5>] genl_lock_all+0x15/0x30
      
      [ 2461.579457] SysRq : Show Blocked State
      [ 2461.580103]   task                        PC stack   pid father
      [ 2461.580103] NetworkManager  D ffff880034b84500  4040   582      1 0x00000080
      [ 2461.580103]  ffff8800197ff720 0000000000000046 00000000001d5340 ffff8800197fffd8
      [ 2461.580103]  ffff8800197fffd8 00000000001d5340 ffff880019631700 7fffffffffffffff
      [ 2461.580103]  ffff8800197ff880 ffff8800197ff878 ffff880019631700 ffff880019631700
      [ 2461.580103] Call Trace:
      [ 2461.580103]  [<ffffffff817355f9>] schedule+0x29/0x70
      [ 2461.580103]  [<ffffffff81731ad1>] schedule_timeout+0x1c1/0x360
      [ 2461.580103]  [<ffffffff810e69eb>] ? mark_held_locks+0xbb/0x140
      [ 2461.580103]  [<ffffffff817377ac>] ? _raw_spin_unlock_irq+0x2c/0x50
      [ 2461.580103]  [<ffffffff810e6b6d>] ? trace_hardirqs_on_caller+0xfd/0x1c0
      [ 2461.580103]  [<ffffffff81736398>] wait_for_completion_killable+0xe8/0x170
      [ 2461.580103]  [<ffffffff810b7fa0>] ? wake_up_state+0x20/0x20
      [ 2461.580103]  [<ffffffff81095825>] call_usermodehelper_exec+0x1a5/0x210
      [ 2461.580103]  [<ffffffff817362ed>] ? wait_for_completion_killable+0x3d/0x170
      [ 2461.580103]  [<ffffffff81095cc3>] __request_module+0x1b3/0x370
      [ 2461.580103]  [<ffffffff810e6b6d>] ? trace_hardirqs_on_caller+0xfd/0x1c0
      [ 2461.580103]  [<ffffffff8162c5c9>] ctrl_getfamily+0x159/0x190
      [ 2461.580103]  [<ffffffff8162d8a4>] genl_family_rcv_msg+0x1f4/0x2e0
      [ 2461.580103]  [<ffffffff8162d990>] ? genl_family_rcv_msg+0x2e0/0x2e0
      [ 2461.580103]  [<ffffffff8162da1e>] genl_rcv_msg+0x8e/0xd0
      [ 2461.580103]  [<ffffffff8162b729>] netlink_rcv_skb+0xa9/0xc0
      [ 2461.580103]  [<ffffffff8162bc88>] genl_rcv+0x28/0x40
      [ 2461.580103]  [<ffffffff8162ad6d>] netlink_unicast+0xdd/0x190
      [ 2461.580103]  [<ffffffff8162b149>] netlink_sendmsg+0x329/0x750
      [ 2461.580103]  [<ffffffff815db849>] sock_sendmsg+0x99/0xd0
      [ 2461.580103]  [<ffffffff810bb58f>] ? local_clock+0x5f/0x70
      [ 2461.580103]  [<ffffffff810e96e8>] ? lock_release_non_nested+0x308/0x350
      [ 2461.580103]  [<ffffffff815dbc6e>] ___sys_sendmsg+0x39e/0x3b0
      [ 2461.580103]  [<ffffffff810565af>] ? kvm_clock_read+0x2f/0x50
      [ 2461.580103]  [<ffffffff810218b9>] ? sched_clock+0x9/0x10
      [ 2461.580103]  [<ffffffff810bb2bd>] ? sched_clock_local+0x1d/0x80
      [ 2461.580103]  [<ffffffff810bb448>] ? sched_clock_cpu+0xa8/0x100
      [ 2461.580103]  [<ffffffff810e33ad>] ? trace_hardirqs_off+0xd/0x10
      [ 2461.580103]  [<ffffffff810bb58f>] ? local_clock+0x5f/0x70
      [ 2461.580103]  [<ffffffff810e3f7f>] ? lock_release_holdtime.part.28+0xf/0x1a0
      [ 2461.580103]  [<ffffffff8120fec9>] ? fget_light+0xf9/0x510
      [ 2461.580103]  [<ffffffff8120fe0c>] ? fget_light+0x3c/0x510
      [ 2461.580103]  [<ffffffff815dd1d2>] __sys_sendmsg+0x42/0x80
      [ 2461.580103]  [<ffffffff815dd222>] SyS_sendmsg+0x12/0x20
      [ 2461.580103]  [<ffffffff81741ad9>] system_call_fastpath+0x16/0x1b
      [ 2461.580103] modprobe        D ffff88000f2c8000  4632   603    602 0x00000080
      [ 2461.580103]  ffff88000f04fba8 0000000000000046 00000000001d5340 ffff88000f04ffd8
      [ 2461.580103]  ffff88000f04ffd8 00000000001d5340 ffff8800377d4500 ffff8800377d4500
      [ 2461.580103]  ffffffff81d0b260 ffffffff81d0b268 ffffffff00000000 ffffffff81d0b2b0
      [ 2461.580103] Call Trace:
      [ 2461.580103]  [<ffffffff817355f9>] schedule+0x29/0x70
      [ 2461.580103]  [<ffffffff81736d4d>] rwsem_down_write_failed+0xed/0x1a0
      [ 2461.580103]  [<ffffffff810bb200>] ? update_cpu_load_active+0x10/0xb0
      [ 2461.580103]  [<ffffffff8137b473>] call_rwsem_down_write_failed+0x13/0x20
      [ 2461.580103]  [<ffffffff8173492d>] ? down_write+0x9d/0xb2
      [ 2461.580103]  [<ffffffff8162baa5>] ? genl_lock_all+0x15/0x30
      [ 2461.580103]  [<ffffffff8162baa5>] genl_lock_all+0x15/0x30
      [ 2461.580103]  [<ffffffff8162cbb3>] genl_register_family+0x53/0x1f0
      [ 2461.580103]  [<ffffffffa01dc000>] ? 0xffffffffa01dbfff
      [ 2461.580103]  [<ffffffff8162d650>] genl_register_family_with_ops+0x20/0x80
      [ 2461.580103]  [<ffffffffa01dc000>] ? 0xffffffffa01dbfff
      [ 2461.580103]  [<ffffffffa017fe84>] nl80211_init+0x24/0xf0 [cfg80211]
      [ 2461.580103]  [<ffffffffa01dc000>] ? 0xffffffffa01dbfff
      [ 2461.580103]  [<ffffffffa01dc043>] cfg80211_init+0x43/0xdb [cfg80211]
      [ 2461.580103]  [<ffffffff810020fa>] do_one_initcall+0xfa/0x1b0
      [ 2461.580103]  [<ffffffff8105cb93>] ? set_memory_nx+0x43/0x50
      [ 2461.580103]  [<ffffffff810f75af>] load_module+0x1c6f/0x27f0
      [ 2461.580103]  [<ffffffff810f2c90>] ? store_uevent+0x40/0x40
      [ 2461.580103]  [<ffffffff810f82c6>] SyS_finit_module+0x86/0xb0
      [ 2461.580103]  [<ffffffff81741ad9>] system_call_fastpath+0x16/0x1b
      [ 2461.580103] Sched Debug Version: v0.10, 3.11.0-0.rc1.git4.1.fc20.x86_64 #1
      
      The problem started to happen after the net-pf-16-proto-16-family-nl80211
      alias name was added to the cfg80211 module by the commit below (though that
      commit itself is perfectly fine):
      
      commit fb4e1568
      Author: Marcel Holtmann <marcel@holtmann.org>
      Date:   Sun Apr 28 16:22:06 2013 -0700
      
          nl80211: Add generic netlink module alias for cfg80211/nl80211
      Reported-and-tested-by: Jeff Layton <jlayton@redhat.com>
      Reported-by: Richard W.M. Jones <rjones@redhat.com>
      Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
      Reviewed-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  20. 28 Jun, 2013 1 commit
    • netlink: fix splat in skb_clone with large messages · 3a36515f
      Pablo Neira committed
      Since (c05cdb1b netlink: allow large data transfers from user-space),
      netlink splats if it invokes skb_clone on large netlink skbs since:
      
      * skb_shared_info was not correctly initialized.
      * skb->destructor is not set in the cloned skb.
      
      This was spotted by trinity:
      
      [  894.990671] BUG: unable to handle kernel paging request at ffffc9000047b001
      [  894.991034] IP: [<ffffffff81a212c4>] skb_clone+0x24/0xc0
      [...]
      [  894.991034] Call Trace:
      [  894.991034]  [<ffffffff81ad299a>] nl_fib_input+0x6a/0x240
      [  894.991034]  [<ffffffff81c3b7e6>] ? _raw_read_unlock+0x26/0x40
      [  894.991034]  [<ffffffff81a5f189>] netlink_unicast+0x169/0x1e0
      [  894.991034]  [<ffffffff81a601e1>] netlink_sendmsg+0x251/0x3d0
      
      Fix it by:
      
      1) introducing a new netlink_skb_clone function that is used in nl_fib_input,
         that sets our special skb->destructor in the cloned skb. Moreover, handle
         the release of the large cloned skb head area in the destructor path.
      
      2) not allowing large skbuffs in the netlink broadcast path. I cannot find
         any reasonable use of the large data transfer using netlink in that path,
         moreover this helps to skip extra skb_clone handling.
      
      I found two more netlink clients that are cloning the skbs, but they are
      not in the sendmsg path. Therefore, the sole client cloning that I found
      seems to be the fib frontend.
      
      Thanks to Eric Dumazet for helping to address this issue.
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  21. 25 Jun, 2013 1 commit
    • net: netlink: virtual tap device management · bcbde0d4
      Daniel Borkmann committed
      Similarly to the networking receive path with ptype_all taps, we add
      the possibility to register netdevices that are for ARPHRD_NETLINK to
      the netlink subsystem, so that those can be used for netlink analyzers
      or debuggers. We do not offer a direct callback function as out-of-tree
      modules could do crap with it. Instead, a netdevice must be registered
      properly and only receives a clone, managed by the netlink layer. Symbols
      are exported as GPL-only.
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  22. 13 Jun, 2013 1 commit
  23. 11 Jun, 2013 2 commits
  24. 08 Jun, 2013 1 commit
    • netlink: allow large data transfers from user-space · c05cdb1b
      Pablo Neira Ayuso committed
      I can hit ENOBUFS in the sendmsg() path with a large batch that is
      composed of many netlink messages. Here that limit is 8 MBytes of
      skbuff data area as kmalloc does not manage to get more than that.
      
      While discussing atomic rule-set for nftables with Patrick McHardy,
      we decided to put all rule-set updates that need to be applied
      atomically in one single batch to simplify the existing approach.
      However, as explained above, the existing netlink code limits us
      to a maximum of ~20000 rules that fit in one single batch without
      hitting ENOBUFS. iptables does not have such limitation as it is
      using vmalloc.
      
      This patch adds netlink_alloc_large_skb() which is only used in
      the netlink_sendmsg() path. It uses alloc_skb if the memory
      requested is <= one memory page, that should be the common case
      for most subsystems, else vmalloc for higher memory allocations.
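
      The size-based choice described above can be sketched as follows.
      This is a user-space model under the assumption that the threshold
      is exactly one page, as the message states; enum demo_alloc and
      demo_pick_alloc() are hypothetical names, not kernel API.

```c
#include <stddef.h>

/* Hypothetical labels for the two allocation strategies. */
enum demo_alloc {
    DEMO_ALLOC_SKB, /* regular alloc_skb-style allocation */
    DEMO_VMALLOC    /* vmalloc-backed data area for large requests */
};

/* Pick the strategy from the requested size, following the rule in the
 * commit message: up to one page uses the regular path, anything
 * larger falls back to vmalloc. */
static enum demo_alloc demo_pick_alloc(size_t size, size_t page_size)
{
    return size <= page_size ? DEMO_ALLOC_SKB : DEMO_VMALLOC;
}
```

      With a 4096-byte page, a typical small request stays on the
      regular path, while a multi-megabyte rule-set batch would take
      the vmalloc path.
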
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  25. 05 Jun, 2013 1 commit
  26. 02 May, 2013 1 commit