1. 29 May, 2018 1 commit
  2. 25 May, 2018 1 commit
  3. 24 May, 2018 2 commits
  4. 18 May, 2018 1 commit
  5. 28 March, 2018 1 commit
  6. 13 February, 2018 1 commit
    • net: Convert pernet_subsys, registered from inet_init() · f84c6821
      Kirill Tkhai committed
      arp_net_ops just adds/removes /proc entry.
      
      devinet_ops allocates and frees duplicate of init_net tables
      and (un)registers sysctl entries.
      
      fib_net_ops allocates and frees pernet tables, creates/destroys
      netlink socket and (un)initializes /proc entries. Foreign
      pernet_operations do not touch them.
      
      ip_rt_proc_ops only modifies pernet /proc entries.
      
      xfrm_net_ops creates/destroys /proc entries, allocates/frees
      pernet statistics, hashes and tables, and (un)initializes
      sysctl files. These are not touched by foreign pernet_operations.
      
      xfrm4_net_ops allocates/frees private pernet memory, and
      configures sysctls.
      
      sysctl_route_ops creates/destroys sysctls.
      
      rt_genid_ops only initializes fields of just allocated net.
      
      ipv4_inetpeer_ops allocates/frees net private memory.
      
      igmp_net_ops just creates/destroys /proc files and a socket
      that no one else is interested in.
      
      tcp_sk_ops seems to be safe, because tcp_sk_init() does not
      depend on any other pernet_operations modifications. Iteration
      over hash table in inet_twsk_purge() is made under RCU lock,
      and it's safe to iterate the table this way. Removal from
      the table happens in inet_twsk_deschedule_put(), but this
      function is safe without any external locks, as it is synchronized
      internally. It is already used this way in many different
      contexts. So, it's safe to leave tcp_sk_exit_batch() unlocked.
      
      tcp_net_metrics_ops is synchronized on tcp_metrics_lock and safe.
      
      udplite4_net_ops only creates/destroys pernet /proc file.
      
      icmp_sk_ops creates percpu sockets, not touched by foreign
      pernet_operations.
      
      ipmr_net_ops creates/destroys pernet fib tables, (un)registers
      fib rules and /proc files. This seems to be safe to execute
      in parallel with foreign pernet_operations.
      
      af_inet_ops just sets up default parameters of newly created net.
      
      ipv4_mib_ops creates and destroys pernet percpu statistics.
      
      raw_net_ops, tcp4_net_ops, udp4_net_ops, ping_v4_net_ops
      and ip_proc_ops only create/destroy pernet /proc files.
      
      ip4_frags_ops creates and destroys sysctl file.
      
      So, it's safe to make the pernet_operations async.
      Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
      Acked-by: Andrei Vagin <avagin@virtuozzo.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 25 January, 2018 1 commit
  8. 21 December, 2017 1 commit
  9. 01 November, 2017 1 commit
  10. 22 September, 2017 1 commit
    • net: avoid a full fib lookup when rp_filter is disabled. · 6e617de8
      Paolo Abeni committed
      Since commit 1dced6a8 ("ipv4: Restore accept_local behaviour
      in fib_validate_source()") a full fib lookup is needed even if
      the rp_filter is disabled, if accept_local is false - which is
      the default.
      
      What we really need in the above scenario is just checking
      that the source IP address is not local, and in most cases we
      can do that in a cheaper way by looking up the ifaddr hash table.
      
      This commit adds a helper for such lookup, and uses it to
      validate the src address when rp_filter is disabled and no
      'local' routes are created by the user space in the relevant
      namespace.
      
      A new ipv4 netns flag is added to account for such routes.
      We need that to preserve the same behavior we had before this
      patch.
      
      It also drops the checks to bail early from __fib_validate_source,
      added by the commit 1dced6a8 ("ipv4: Restore accept_local
      behaviour in fib_validate_source()"); they do not give any
      measurable performance improvement: if we do the lookup, we are
      on a slower path.
      
      This improves UDP performance for unconnected sockets
      when rp_filter is disabled by 5% and also gives small but
      measurable performance improvement for TCP flood scenarios.
      
      v1 -> v2:
       - use the ifaddr lookup helper in __ip_dev_find(), as suggested
         by Eric
       - fall-back to full lookup if custom local routes are present
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
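The decision described in this commit message can be sketched in user space as follows. This is a simplified illustration, not the kernel code: `struct netns_sketch`, `validate_source()`, the verdict enum, and the hash function are hypothetical names chosen for the example, and the per-namespace flag is modeled as a plain boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ADDR_HASH_SIZE 256 /* illustrative bucket count */

/* One locally configured address, chained per hash bucket. */
struct local_addr {
    uint32_t addr;
    struct local_addr *next;
};

/* Hypothetical per-namespace state; field names are illustrative. */
struct netns_sketch {
    struct local_addr *addr_hash[ADDR_HASH_SIZE];
    bool rp_filter;               /* reverse-path filtering enabled */
    bool accept_local;            /* local source addresses are allowed */
    bool has_custom_local_routes; /* user space added 'local' routes */
};

enum src_verdict { SRC_ACCEPTED, SRC_REJECTED, SRC_NEEDS_FULL_LOOKUP };

static unsigned int addr_hashfn(uint32_t addr)
{
    return (addr ^ (addr >> 8) ^ (addr >> 16) ^ (addr >> 24)) % ADDR_HASH_SIZE;
}

static void add_local_addr(struct netns_sketch *ns, struct local_addr *la)
{
    unsigned int h;

    assert(ns != NULL && la != NULL);
    h = addr_hashfn(la->addr);
    la->next = ns->addr_hash[h];
    ns->addr_hash[h] = la;
}

/* Cheap check: walk one hash bucket instead of doing a route lookup. */
static bool addr_is_local(const struct netns_sketch *ns, uint32_t addr)
{
    const struct local_addr *la;

    for (la = ns->addr_hash[addr_hashfn(addr)]; la != NULL; la = la->next)
        if (la->addr == addr)
            return true;
    return false;
}

static enum src_verdict validate_source(const struct netns_sketch *ns,
                                        uint32_t src)
{
    if (ns->rp_filter)
        return SRC_NEEDS_FULL_LOOKUP; /* rp_filter needs the route */
    if (ns->accept_local)
        return SRC_ACCEPTED;          /* nothing left to check */
    if (ns->has_custom_local_routes)
        return SRC_NEEDS_FULL_LOOKUP; /* hash table alone is too optimistic */
    return addr_is_local(ns, src) ? SRC_REJECTED : SRC_ACCEPTED;
}
```

With one local address registered, a packet claiming that address as its source is rejected by the cheap path, any other source is accepted, and setting `has_custom_local_routes` sends every packet back to the full lookup.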
  11. 10 August, 2017 1 commit
  12. 04 August, 2017 1 commit
    • net: core: Make the FIB notification chain generic · 04b1d4e5
      Ido Schimmel committed
      The FIB notification chain is currently solely used by IPv4 code.
      However, we're going to introduce IPv6 FIB offload support, which
      requires these notifications as well.
      
      As explained in commit c3852ef7 ("ipv4: fib: Replay events when
      registering FIB notifier"), upon registration to the chain, the callee
      receives a full dump of the FIB tables and rules by traversing all the
      net namespaces. The integrity of the dump is ensured by a per-namespace
      sequence counter that is incremented whenever a change to the tables or
      rules occurs.
      
      In order to allow more address families to use the chain, each family is
      expected to register its fib_notifier_ops in its pernet init. These
      operations allow the common code to read the family's sequence counter
      as well as dump its tables and rules in the given net namespace.
      
      Additionally, a 'family' parameter is added to sent notifications, so
      that listeners could distinguish between the different families.
      
      Implement the common code that allows listeners to register to the chain
      and for address families to register their fib_notifier_ops. Subsequent
      patches will implement these operations in IPv6.
      
      In the future, ipmr and ip6mr will be extended to provide these
      notifications as well.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
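The registration scheme in this commit message (per-family operations that expose a sequence counter and a dump routine, with the dump checked for consistency against the counter) can be illustrated with a small user-space sketch. All names here (`fib_ops_sketch`, `register_family()`, `register_listener()`, the toy family and its retry limit) are assumptions made for the example and are not the kernel's API. Concurrent table changes are simulated by letting the toy dump routine bump its own counter.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FAMILIES 4
#define DUMP_MAX_RETRIES 5 /* illustrative retry limit */

typedef void (*emit_fn)(int family, int entry);

/* What each address family provides to the common code. */
struct fib_ops_sketch {
    int family;
    unsigned int (*seq_read)(void); /* current change counter */
    void (*dump)(emit_fn emit);     /* replay all entries */
};

static const struct fib_ops_sketch *families[MAX_FAMILIES];
static size_t nr_families;

static int register_family(const struct fib_ops_sketch *ops)
{
    assert(ops != NULL && ops->seq_read != NULL && ops->dump != NULL);
    if (nr_families == MAX_FAMILIES)
        return -1;
    families[nr_families++] = ops;
    return 0;
}

static unsigned int seq_sum(void)
{
    unsigned int sum = 0;
    size_t i;

    for (i = 0; i < nr_families; i++)
        sum += families[i]->seq_read();
    return sum;
}

/* Dump every family; accept the dump only if no counter moved meanwhile. */
static int register_listener(emit_fn emit)
{
    int retries;
    size_t i;

    for (retries = 0; retries < DUMP_MAX_RETRIES; retries++) {
        unsigned int before = seq_sum();

        for (i = 0; i < nr_families; i++)
            families[i]->dump(emit);
        if (seq_sum() == before)
            return 0;
    }
    return -1; /* tables kept changing; give up */
}

/* Toy family used to exercise the code above. */
static unsigned int toy_seq;
static int toy_pending_changes; /* simulated concurrent updates */
static int entries_seen;

static unsigned int toy_seq_read(void)
{
    return toy_seq;
}

static void toy_dump(emit_fn emit)
{
    emit(2, 100);
    emit(2, 101);
    if (toy_pending_changes > 0) { /* a change raced with this dump */
        toy_pending_changes--;
        toy_seq++;
    }
}

static void count_entries(int family, int entry)
{
    (void)family;
    (void)entry;
    entries_seen++;
}

static const struct fib_ops_sketch toy_ops = { 2, toy_seq_read, toy_dump };
```

Note that a retried dump delivers the same entries again, so a real listener would have to discard what it received during an inconsistent pass; this sketch only counts deliveries.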
  13. 21 July, 2017 1 commit
    • ipv4: initialize fib_trie prior to register_netdev_notifier call. · 8799a221
      Mahesh Bandewar committed
      Net stack initialization currently initializes fib-trie after the
      first netdevice_notifier() call. In fact, fib_trie initialization
      needs to happen before the first rtnl_register(). This does not cause any
      problem since there are no devices UP at this moment, but trying to bring 'lo'
      UP at initialization would make this assumption wrong and expose the issue.
      
      Fixes the following crash:
      
       Call Trace:
        ? alternate_node_alloc+0x76/0xa0
        fib_table_insert+0x1b7/0x4b0
        fib_magic.isra.17+0xea/0x120
        fib_add_ifaddr+0x7b/0x190
        fib_netdev_event+0xc0/0x130
        register_netdevice_notifier+0x1c1/0x1d0
        ip_fib_init+0x72/0x85
        ip_rt_init+0x187/0x1e9
        ip_init+0xe/0x1a
        inet_init+0x171/0x26c
        ? ipv4_offload_init+0x66/0x66
        do_one_initcall+0x43/0x160
        kernel_init_freeable+0x191/0x219
        ? rest_init+0x80/0x80
        kernel_init+0xe/0x150
        ret_from_fork+0x22/0x30
       Code: f6 46 23 04 74 86 4c 89 f7 e8 ae 45 01 00 49 89 c7 4d 85 ff 0f 85 7b ff ff ff 31 db eb 08 4c 89 ff e8 16 47 01 00 48 8b 44 24 38 <45> 8b 6e 14 4d 63 76 74 48 89 04 24 0f 1f 44 00 00 48 83 c4 08
       RIP: kmem_cache_alloc+0xcf/0x1c0 RSP: ffff9b1500017c28
       CR2: 0000000000000014
      
      Fixes: 7b1a74fd ("[NETNS]: Refactor fib initialization so it can handle multiple namespaces.")
      Fixes: 7f9b8052 ("[IPV4]: fib hash|trie initialization")
      Signed-off-by: Mahesh Bandewar <maheshb@google.com>
      Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  14. 05 July, 2017 1 commit
  15. 04 July, 2017 1 commit
  16. 30 May, 2017 2 commits
  17. 23 May, 2017 2 commits
  18. 17 May, 2017 1 commit
    • net: Improve handling of failures on link and route dumps · f6c5775f
      David Ahern committed
      In general, rtnetlink dumps do not anticipate failure to dump a single
      object (e.g., link or route) on a single pass. As both route and link
      objects have grown via more attributes, that is no longer a given.
      
      netlink dumps can handle a failure if the dump function returns an
      error; specifically, netlink_dump adds the return code to the response
      if it is <= 0 so userspace is notified of the failure. The missing
      piece is the rtnetlink dump functions returning the error.
      
      Fix route and link dump functions to return the errors if no object is
      added to an skb (detected by skb->len != 0). IPv6 route dumps
      (rt6_dump_route) already return the error; this patch updates IPv4 and
      link dumps. Other dump functions may need to be adjusted as well.
      Reported-by: Jan Moskyto Matejka <mq@ucw.cz>
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
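The return-value convention described in this commit message can be sketched with a fixed-size buffer standing in for the skb. This is a minimal user-space illustration, assuming hypothetical names (`dump_buf`, `put_object()`, `dump_pass()`) and a 64-byte buffer. A pass that makes progress reports the bytes written, and a pass that cannot fit even one object reports the error instead of 0, which the caller would otherwise read as the end of the dump.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define BUF_SIZE 64 /* illustrative per-pass buffer size */

/* Stand-in for the skb: only the fill level matters here. */
struct dump_buf {
    size_t len;
};

/* Append one object of the given size, or fail if it does not fit. */
static int put_object(struct dump_buf *buf, size_t size)
{
    assert(buf->len <= BUF_SIZE);
    if (size > BUF_SIZE - buf->len)
        return -EMSGSIZE;
    buf->len += size;
    return 0;
}

/*
 * One dump pass. *cursor is the index of the next object to dump and is
 * kept across passes so the caller can resume with an empty buffer.
 */
static int dump_pass(struct dump_buf *buf, const size_t *sizes,
                     size_t count, size_t *cursor)
{
    int err = 0;

    while (*cursor < count) {
        err = put_object(buf, sizes[*cursor]);
        if (err < 0)
            break;
        (*cursor)++;
    }

    /* Progress was made: report it and let the caller come back. */
    if (buf->len > 0)
        return (int)buf->len;

    /* Nothing was added: 0 when finished, the error otherwise. */
    return err;
}
```

With objects of 30, 30, and 100 bytes, the first pass returns 60, and the second pass, starting from an empty buffer, returns `-EMSGSIZE` because the third object can never fit.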
  19. 28 April, 2017 1 commit
  20. 18 April, 2017 1 commit
  21. 14 April, 2017 1 commit
  22. 23 March, 2017 1 commit
  23. 02 March, 2017 1 commit
  24. 27 February, 2017 1 commit
  25. 19 January, 2017 1 commit
    • lwtunnel: fix autoload of lwt modules · 9ed59592
      David Ahern committed
      Trying to add an mpls encap route when the MPLS modules are not loaded
      hangs. For example:
      
          CONFIG_MPLS=y
          CONFIG_NET_MPLS_GSO=m
          CONFIG_MPLS_ROUTING=m
          CONFIG_MPLS_IPTUNNEL=m
      
          $ ip route add 10.10.10.10/32 encap mpls 100 via inet 10.100.1.2
      
      The ip command hangs:
      root       880   826  0 21:25 pts/0    00:00:00 ip route add 10.10.10.10/32 encap mpls 100 via inet 10.100.1.2
      
          $ cat /proc/880/stack
          [<ffffffff81065a9b>] call_usermodehelper_exec+0xd6/0x134
          [<ffffffff81065efc>] __request_module+0x27b/0x30a
          [<ffffffff814542f6>] lwtunnel_build_state+0xe4/0x178
          [<ffffffff814aa1e4>] fib_create_info+0x47f/0xdd4
          [<ffffffff814ae451>] fib_table_insert+0x90/0x41f
          [<ffffffff814a8010>] inet_rtm_newroute+0x4b/0x52
          ...
      
      modprobe is trying to load rtnl-lwt-MPLS:
      
      root       881     5  0 21:25 ?        00:00:00 /sbin/modprobe -q -- rtnl-lwt-MPLS
      
      and it hangs after loading mpls_router:
      
          $ cat /proc/881/stack
          [<ffffffff81441537>] rtnl_lock+0x12/0x14
          [<ffffffff8142ca2a>] register_netdevice_notifier+0x16/0x179
          [<ffffffffa0033025>] mpls_init+0x25/0x1000 [mpls_router]
          [<ffffffff81000471>] do_one_initcall+0x8e/0x13f
          [<ffffffff81119961>] do_init_module+0x5a/0x1e5
          [<ffffffff810bd070>] load_module+0x13bd/0x17d6
          ...
      
      The problem is that lwtunnel_build_state is called with rtnl lock
      held, preventing mpls_init from registering.
      
      Given the potential references held by the time lwtunnel_build_state
      is called, it cannot drop the rtnl lock to load the module. So, extract the module
      loading code from lwtunnel_build_state into a new function to validate
      the encap type. The new function is called while converting the user
      request into a fib_config which is well before any table, device or
      fib entries are examined.
      
      Fixes: 745041e2 ("lwtunnel: autoload of lwt modules")
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
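The ordering problem in this commit message can be modeled in user space with a boolean standing in for the rtnl lock. The sketch below rests on several assumptions: every name is hypothetical, the would-be hang is represented by an `-EDEADLK` return, and the early validation step is assumed to release the lock temporarily while the module loads, which is only safe because no references are held yet.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define ENCAP_MAX 4 /* illustrative number of encap types */

static bool rtnl_sketch_held;           /* stand-in for the rtnl lock */
static bool ops_registered[ENCAP_MAX];  /* which encap modules are loaded */

static void rtnl_sketch_lock(void)
{
    assert(!rtnl_sketch_held);
    rtnl_sketch_held = true;
}

static void rtnl_sketch_unlock(void)
{
    assert(rtnl_sketch_held);
    rtnl_sketch_held = false;
}

/* Stand-in for module init, which needs the lock itself to register. */
static int load_encap_module(unsigned int type)
{
    if (rtnl_sketch_held)
        return -EDEADLK; /* the real system would hang here */
    rtnl_sketch_lock();
    ops_registered[type] = true;
    rtnl_sketch_unlock();
    return 0;
}

/* Early step: no table, device or fib references are held yet. */
static int validate_encap_type(unsigned int type)
{
    if (type >= ENCAP_MAX)
        return -EOPNOTSUPP;
    if (!ops_registered[type]) {
        rtnl_sketch_unlock(); /* safe only because nothing is referenced */
        load_encap_module(type);
        rtnl_sketch_lock();
    }
    return ops_registered[type] ? 0 : -EOPNOTSUPP;
}

/* Late step: runs under the lock and never tries to load anything. */
static int build_state(unsigned int type)
{
    assert(rtnl_sketch_held);
    if (type >= ENCAP_MAX || !ops_registered[type])
        return -EOPNOTSUPP;
    return 0;
}

static int add_route(unsigned int type)
{
    int err;

    rtnl_sketch_lock();
    err = validate_encap_type(type);
    if (err == 0)
        err = build_state(type);
    rtnl_sketch_unlock();
    return err;
}
```

Calling `load_encap_module()` directly while the lock is held returns `-EDEADLK`, which models the hang shown in the stack traces, whereas `add_route()` succeeds because loading happens in the validation step.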
  26. 03 January, 2017 1 commit
  27. 25 December, 2016 1 commit
  28. 04 December, 2016 1 commit
  29. 17 November, 2016 1 commit
  30. 05 November, 2016 1 commit
  31. 28 September, 2016 2 commits
  32. 07 September, 2016 1 commit
  33. 10 August, 2016 1 commit
  34. 07 May, 2016 1 commit
  35. 25 April, 2016 1 commit
  36. 23 March, 2016 1 commit