1. 06 9月, 2015 1 次提交
  2. 08 4月, 2015 1 次提交
    • D
      netfilter: Pass socket pointer down through okfn(). · 7026b1dd
      David Miller 提交于
      On the output paths in particular, we have to sometimes deal with two
      socket contexts.  First, and usually skb->sk, is the local socket that
      generated the frame.
      
      And second, is potentially the socket used to control a tunneling
      socket, such as one the encapsulates using UDP.
      
      We do not want to disassociate skb->sk when encapsulating in order
      to fix this, because that would break socket memory accounting.
      
      The most extreme case where this can cause huge problems is an
      AF_PACKET socket transmitting over a vxlan device.  We hit code
      paths doing checks that assume they are dealing with an ipv4
      socket, but are actually operating upon the AF_PACKET one.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7026b1dd
  3. 03 4月, 2015 4 次提交
  4. 01 4月, 2015 3 次提交
  5. 30 3月, 2015 1 次提交
  6. 13 3月, 2015 1 次提交
    • E
      net: Introduce possible_net_t · 0c5c9fb5
      Eric W. Biederman 提交于
      Having to say
      > #ifdef CONFIG_NET_NS
      > 	struct net *net;
      > #endif
      
      in structures is a little bit wordy and a little bit error prone.
      
      Instead it is possible to say:
      > typedef struct {
      > #ifdef CONFIG_NET_NS
      >       struct net *net;
      > #endif
      > } possible_net_t;
      
      And then in a header say:
      
      > 	possible_net_t net;
      
      Which is cleaner and easier to use and easier to test, as the
      possible_net_t is always there no matter what the compile options.
      
      Further this allows read_pnet and write_pnet to be functions in all
      cases which is better at catching typos.
      
      This change adds possible_net_t, updates the definitions of read_pnet
      and write_pnet, updates optional struct net * variables that
      write_pnet uses on to have the type possible_net_t, and finally fixes
      up the b0rked users of read_pnet and write_pnet.
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0c5c9fb5
  7. 18 1月, 2015 1 次提交
    • J
      netlink: make nlmsg_end() and genlmsg_end() void · 053c095a
      Johannes Berg 提交于
      Contrary to common expectations for an "int" return, these functions
      return only a positive value -- if used correctly they cannot even
      return 0 because the message header will necessarily be in the skb.
      
      This makes the very common pattern of
      
        if (genlmsg_end(...) < 0) { ... }
      
      be a whole bunch of dead code. Many places also simply do
      
        return nlmsg_end(...);
      
      and the caller is expected to deal with it.
      
      This also commonly (at least for me) causes errors, because it is very
      common to write
      
        if (my_function(...))
          /* error condition */
      
      and if my_function() does "return nlmsg_end()" this is of course wrong.
      
      Additionally, there's not a single place in the kernel that actually
      needs the message length returned, and if anyone needs it later then
      it'll be very easy to just use skb->len there.
      
      Remove this, and make the functions void. This removes a bunch of dead
      code as described above. The patch adds lines because I did
      
      -	return nlmsg_end(...);
      +	nlmsg_end(...);
      +	return 0;
      
      I could have preserved all the function's return values by returning
      skb->len, but instead I've audited all the places calling the affected
      functions and found that none cared. A few places actually compared
      the return value with <= 0 in dump functionality, but that could just
      be changed to < 0 with no change in behaviour, so I opted for the more
      efficient version.
      
      One instance of the error I've made numerous times now is also present
      in net/phonet/pn_netlink.c in the route_dumpit() function - it didn't
      check for <0 or <=0 and thus broke out of the loop every single time.
      I've preserved this since it will (I think) have caused the messages to
      userspace to be formatted differently with just a single message for
      every SKB returned to userspace. It's possible that this isn't needed
      for the tools that actually use this, but I don't even know what they
      are so couldn't test that changing this behaviour would be acceptable.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      053c095a
  8. 20 11月, 2014 1 次提交
  9. 31 10月, 2014 1 次提交
  10. 25 8月, 2014 1 次提交
    • I
      ipv6: White-space cleansing : Line Layouts · 67ba4152
      Ian Morris 提交于
      This patch makes no changes to the logic of the code but simply addresses
      coding style issues as detected by checkpatch.
      
      Both objdump and diff -w show no differences.
      
      A number of items are addressed in this patch:
      * Multiple spaces converted to tabs
      * Spaces before tabs removed.
      * Spaces in pointer typing cleansed (char *)foo etc.
      * Remove space after sizeof
      * Ensure spacing around comparators such as if statements.
      Signed-off-by: NIan Morris <ipm@chirality.org.uk>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      67ba4152
  11. 16 7月, 2014 1 次提交
    • T
      net: set name_assign_type in alloc_netdev() · c835a677
      Tom Gundersen 提交于
      Extend alloc_netdev{,_mq{,s}}() to take name_assign_type as argument, and convert
      all users to pass NET_NAME_UNKNOWN.
      
      Coccinelle patch:
      
      @@
      expression sizeof_priv, name, setup, txqs, rxqs, count;
      @@
      
      (
      -alloc_netdev_mqs(sizeof_priv, name, setup, txqs, rxqs)
      +alloc_netdev_mqs(sizeof_priv, name, NET_NAME_UNKNOWN, setup, txqs, rxqs)
      |
      -alloc_netdev_mq(sizeof_priv, name, setup, count)
      +alloc_netdev_mq(sizeof_priv, name, NET_NAME_UNKNOWN, setup, count)
      |
      -alloc_netdev(sizeof_priv, name, setup)
      +alloc_netdev(sizeof_priv, name, NET_NAME_UNKNOWN, setup)
      )
      
      v9: move comments here from the wrong commit
      Signed-off-by: NTom Gundersen <teg@jklm.no>
      Reviewed-by: NDavid Herrmann <dh.herrmann@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c835a677
  12. 29 4月, 2014 1 次提交
  13. 17 4月, 2014 1 次提交
    • C
      ipv4, fib: pass LOOPBACK_IFINDEX instead of 0 to flowi4_iif · 6a662719
      Cong Wang 提交于
      As suggested by Julian:
      
      	Simply, flowi4_iif must not contain 0, it does not
      	look logical to ignore all ip rules with specified iif.
      
      because in fib_rule_match() we do:
      
              if (rule->iifindex && (rule->iifindex != fl->flowi_iif))
                      goto out;
      
      flowi4_iif should be LOOPBACK_IFINDEX by default.
      
      We need to move LOOPBACK_IFINDEX to include/net/flow.h:
      
      1) It is mostly used by flowi_iif
      
      2) Fix the following compile error if we use it in flow.h
      by the patches latter:
      
      In file included from include/linux/netfilter.h:277:0,
                       from include/net/netns/netfilter.h:5,
                       from include/net/net_namespace.h:21,
                       from include/linux/netdevice.h:43,
                       from include/linux/icmpv6.h:12,
                       from include/linux/ipv6.h:61,
                       from include/net/ipv6.h:16,
                       from include/linux/sunrpc/clnt.h:27,
                       from include/linux/nfs_fs.h:30,
                       from init/do_mounts.c:32:
      include/net/flow.h: In function ‘flowi4_init_output’:
      include/net/flow.h:84:32: error: ‘LOOPBACK_IFINDEX’ undeclared (first use in this function)
      
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Julian Anastasov <ja@ssi.bg>
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NCong Wang <cwang@twopensource.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6a662719
  14. 21 3月, 2014 1 次提交
  15. 15 1月, 2014 1 次提交
  16. 04 9月, 2013 1 次提交
  17. 25 7月, 2013 1 次提交
    • H
      ipv6: take rtnl_lock and mark mrt6 table as freed on namespace cleanup · 905a6f96
      Hannes Frederic Sowa 提交于
      Otherwise we end up dereferencing the already freed net->ipv6.mrt pointer
      which leads to a panic (from Srivatsa S. Bhat):
      
      BUG: unable to handle kernel paging request at ffff882018552020
      IP: [<ffffffffa0366b02>] ip6mr_sk_done+0x32/0xb0 [ipv6]
      PGD 290a067 PUD 207ffe0067 PMD 207ff1d067 PTE 8000002018552060
      Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
      Modules linked in: ebtable_nat ebtables nfs fscache nf_conntrack_ipv4 nf_defrag_ipv4 ipt_REJECT xt_CHECKSUM iptable_mangle iptable_filter ip_tables nfsd lockd nfs_acl exportfs auth_rpcgss autofs4 sunrpc 8021q garp bridge stp llc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter
      +ip6_tables ipv6 vfat fat vhost_net macvtap macvlan vhost tun kvm_intel kvm uinput iTCO_wdt iTCO_vendor_support cdc_ether usbnet mii microcode i2c_i801 i2c_core lpc_ich mfd_core shpchp ioatdma dca mlx4_core be2net wmi acpi_cpufreq mperf ext4 jbd2 mbcache dm_mirror dm_region_hash dm_log dm_mod
      CPU: 0 PID: 7 Comm: kworker/u33:0 Not tainted 3.11.0-rc1-ea45e-a #4
      Hardware name: IBM  -[8737R2A]-/00Y2738, BIOS -[B2E120RUS-1.20]- 11/30/2012
      Workqueue: netns cleanup_net
      task: ffff8810393641c0 ti: ffff881039366000 task.ti: ffff881039366000
      RIP: 0010:[<ffffffffa0366b02>]  [<ffffffffa0366b02>] ip6mr_sk_done+0x32/0xb0 [ipv6]
      RSP: 0018:ffff881039367bd8  EFLAGS: 00010286
      RAX: ffff881039367fd8 RBX: ffff882018552000 RCX: dead000000200200
      RDX: 0000000000000000 RSI: ffff881039367b68 RDI: ffff881039367b68
      RBP: ffff881039367bf8 R08: ffff881039367b68 R09: 2222222222222222
      R10: 2222222222222222 R11: 2222222222222222 R12: ffff882015a7a040
      R13: ffff882014eb89c0 R14: ffff8820289e2800 R15: 0000000000000000
      FS:  0000000000000000(0000) GS:ffff88103fc00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: ffff882018552020 CR3: 0000000001c0b000 CR4: 00000000000407f0
      Stack:
       ffff881039367c18 ffff882014eb89c0 ffff882015e28c00 0000000000000000
       ffff881039367c18 ffffffffa034d9d1 ffff8820289e2800 ffff882014eb89c0
       ffff881039367c58 ffffffff815bdecb ffffffff815bddf2 ffff882014eb89c0
      Call Trace:
       [<ffffffffa034d9d1>] rawv6_close+0x21/0x40 [ipv6]
       [<ffffffff815bdecb>] inet_release+0xfb/0x220
       [<ffffffff815bddf2>] ? inet_release+0x22/0x220
       [<ffffffffa032686f>] inet6_release+0x3f/0x50 [ipv6]
       [<ffffffff8151c1d9>] sock_release+0x29/0xa0
       [<ffffffff81525520>] sk_release_kernel+0x30/0x70
       [<ffffffffa034f14b>] icmpv6_sk_exit+0x3b/0x80 [ipv6]
       [<ffffffff8152fff9>] ops_exit_list+0x39/0x60
       [<ffffffff815306fb>] cleanup_net+0xfb/0x1a0
       [<ffffffff81075e3a>] process_one_work+0x1da/0x610
       [<ffffffff81075dc9>] ? process_one_work+0x169/0x610
       [<ffffffff81076390>] worker_thread+0x120/0x3a0
       [<ffffffff81076270>] ? process_one_work+0x610/0x610
       [<ffffffff8107da2e>] kthread+0xee/0x100
       [<ffffffff8107d940>] ? __init_kthread_worker+0x70/0x70
       [<ffffffff8162a99c>] ret_from_fork+0x7c/0xb0
       [<ffffffff8107d940>] ? __init_kthread_worker+0x70/0x70
      Code: 20 48 89 5d e8 4c 89 65 f0 4c 89 6d f8 66 66 66 66 90 4c 8b 67 30 49 89 fd e8 db 3c 1e e1 49 8b 9c 24 90 08 00 00 48 85 db 74 06 <4c> 39 6b 20 74 20 bb f3 ff ff ff e8 8e 3c 1e e1 89 d8 4c 8b 65
      RIP  [<ffffffffa0366b02>] ip6mr_sk_done+0x32/0xb0 [ipv6]
       RSP <ffff881039367bd8>
      CR2: ffff882018552020
      Reported-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Tested-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      905a6f96
  18. 24 7月, 2013 1 次提交
  19. 29 5月, 2013 1 次提交
  20. 29 3月, 2013 1 次提交
  21. 19 2月, 2013 2 次提交
  22. 28 1月, 2013 1 次提交
  23. 22 1月, 2013 1 次提交
    • N
      mcast: add multicast proxy support (IPv4 and IPv6) · 660b26dc
      Nicolas Dichtel 提交于
      This patch add the support of proxy multicast, ie being able to build a static
      multicast tree. It adds the support of (*,*) and (*,G) entries.
      
      The user should define an (*,*) entry which is not used for real forwarding.
      This entry defines the upstream in iif and contains all interfaces from the
      static tree in its oifs. It will be used to forward packet upstream when they
      come from an interface belonging to the static tree.
      Hence, the user should define (*,G) entries to build its static tree. Note that
      upstream interface must be part of oifs: packets are sent to all oifs
      interfaces except the input interface. This ensures to always join the whole
      static tree, even if the packet is not coming from the upstream interface.
      Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Acked-by: NDavid L Stevens <dlstevens@us.ibm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      660b26dc
  24. 05 12月, 2012 7 次提交
  25. 27 11月, 2012 1 次提交
  26. 26 11月, 2012 1 次提交
  27. 19 11月, 2012 1 次提交
    • E
      net: Allow userns root to control ipv6 · af31f412
      Eric W. Biederman 提交于
      Allow an unpriviled user who has created a user namespace, and then
      created a network namespace to effectively use the new network
      namespace, by reducing capable(CAP_NET_ADMIN) and
      capable(CAP_NET_RAW) calls to be ns_capable(net->user_ns,
      CAP_NET_ADMIN), or capable(net->user_ns, CAP_NET_RAW) calls.
      
      Settings that merely control a single network device are allowed.
      Either the network device is a logical network device where
      restrictions make no difference or the network device is hardware NIC
      that has been explicity moved from the initial network namespace.
      
      In general policy and network stack state changes are allowed while
      resource control is left unchanged.
      
      Allow the SIOCSIFADDR ioctl to add ipv6 addresses.
      Allow the SIOCDIFADDR ioctl to delete ipv6 addresses.
      Allow the SIOCADDRT ioctl to add ipv6 routes.
      Allow the SIOCDELRT ioctl to delete ipv6 routes.
      
      Allow creation of ipv6 raw sockets.
      
      Allow setting the IPV6_JOIN_ANYCAST socket option.
      Allow setting the IPV6_FL_A_RENEW parameter of the IPV6_FLOWLABEL_MGR
      socket option.
      
      Allow setting the IPV6_TRANSPARENT socket option.
      Allow setting the IPV6_HOPOPTS socket option.
      Allow setting the IPV6_RTHDRDSTOPTS socket option.
      Allow setting the IPV6_DSTOPTS socket option.
      Allow setting the IPV6_IPSEC_POLICY socket option.
      Allow setting the IPV6_XFRM_POLICY socket option.
      
      Allow sending packets with the IPV6_2292HOPOPTS control message.
      Allow sending packets with the IPV6_2292DSTOPTS control message.
      Allow sending packets with the IPV6_RTHDRDSTOPTS control message.
      
      Allow setting the multicast routing socket options on non multicast
      routing sockets.
      
      Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL, and SIOCDELTUNNEL ioctls for
      setting up, changing and deleting tunnels over ipv6.
      
      Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL, SIOCDELTUNNEL ioctls for
      setting up, changing and deleting ipv6 over ipv4 tunnels.
      
      Allow the SIOCADDPRL, SIOCDELPRL, SIOCCHGPRL ioctls for adding,
      deleting, and changing the potential router list for ISATAP tunnels.
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      af31f412
  28. 06 10月, 2012 1 次提交