1. 25 1月, 2017 2 次提交
  2. 18 1月, 2017 1 次提交
  3. 12 1月, 2017 1 次提交
  4. 01 12月, 2016 1 次提交
  5. 18 11月, 2016 1 次提交
    • A
      netns: make struct pernet_operations::id unsigned int · c7d03a00
      Alexey Dobriyan 提交于
      Make struct pernet_operations::id unsigned.
      
      There are 2 reasons to do so:
      
      1)
      This field is really an index into an zero based array and
      thus is unsigned entity. Using negative value is out-of-bound
      access by definition.
      
      2)
      On x86_64 unsigned 32-bit data which are mixed with pointers
      via array indexing or offsets added or subtracted to pointers
      are preffered to signed 32-bit data.
      
      "int" being used as an array index needs to be sign-extended
      to 64-bit before being used.
      
      	void f(long *p, int i)
      	{
      		g(p[i]);
      	}
      
        roughly translates to
      
      	movsx	rsi, esi
      	mov	rdi, [rsi+...]
      	call 	g
      
      MOVSX is 3 byte instruction which isn't necessary if the variable is
      unsigned because x86_64 is zero extending by default.
      
      Now, there is net_generic() function which, you guessed it right, uses
      "int" as an array index:
      
      	static inline void *net_generic(const struct net *net, int id)
      	{
      		...
      		ptr = ng->ptr[id - 1];
      		...
      	}
      
      And this function is used a lot, so those sign extensions add up.
      
      Patch snipes ~1730 bytes on allyesconfig kernel (without all junk
      messing with code generation):
      
      	add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
      
      Unfortunately some functions actually grow bigger.
      This is a semmingly random artefact of code generation with register
      allocator being used differently. gcc decides that some variable
      needs to live in new r8+ registers and every access now requires REX
      prefix. Or it is shifted into r12, so [r12+0] addressing mode has to be
      used which is longer than [r8]
      
      However, overall balance is in negative direction:
      
      	add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
      	function                                     old     new   delta
      	nfsd4_lock                                  3886    3959     +73
      	tipc_link_build_proto_msg                   1096    1140     +44
      	mac80211_hwsim_new_radio                    2776    2808     +32
      	tipc_mon_rcv                                1032    1058     +26
      	svcauth_gss_legacy_init                     1413    1429     +16
      	tipc_bcbase_select_primary                   379     392     +13
      	nfsd4_exchange_id                           1247    1260     +13
      	nfsd4_setclientid_confirm                    782     793     +11
      		...
      	put_client_renew_locked                      494     480     -14
      	ip_set_sockfn_get                            730     716     -14
      	geneve_sock_add                              829     813     -16
      	nfsd4_sequence_done                          721     703     -18
      	nlmclnt_lookup_host                          708     686     -22
      	nfsd4_lockt                                 1085    1063     -22
      	nfs_get_client                              1077    1050     -27
      	tcf_bpf_init                                1106    1076     -30
      	nfsd4_encode_fattr                          5997    5930     -67
      	Total: Before=154856051, After=154854321, chg -0.00%
      Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c7d03a00
  6. 16 11月, 2016 7 次提交
  7. 10 11月, 2016 1 次提交
  8. 30 10月, 2016 1 次提交
  9. 21 10月, 2016 2 次提交
    • J
      net: use core MTU range checking in core net infra · 91572088
      Jarod Wilson 提交于
      geneve:
      - Merge __geneve_change_mtu back into geneve_change_mtu, set max_mtu
      - This one isn't quite as straight-forward as others, could use some
        closer inspection and testing
      
      macvlan:
      - set min/max_mtu
      
      tun:
      - set min/max_mtu, remove tun_net_change_mtu
      
      vxlan:
      - Merge __vxlan_change_mtu back into vxlan_change_mtu
      - Set max_mtu to IP_MAX_MTU and retain dynamic MTU range checks in
        change_mtu function
      - This one is also not as straight-forward and could use closer inspection
        and testing from vxlan folks
      
      bridge:
      - set max_mtu of IP_MAX_MTU and retain dynamic MTU range checks in
        change_mtu function
      
      openvswitch:
      - set min/max_mtu, remove internal_dev_change_mtu
      - note: max_mtu wasn't checked previously, it's been set to 65535, which
        is the largest possible size supported
      
      sch_teql:
      - set min/max_mtu (note: max_mtu previously unchecked, used max of 65535)
      
      macsec:
      - min_mtu = 0, max_mtu = 65535
      
      macvlan:
      - min_mtu = 0, max_mtu = 65535
      
      ntb_netdev:
      - min_mtu = 0, max_mtu = 65535
      
      veth:
      - min_mtu = 68, max_mtu = 65535
      
      8021q:
      - min_mtu = 0, max_mtu = 65535
      
      CC: netdev@vger.kernel.org
      CC: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      CC: Hannes Frederic Sowa <hannes@stressinduktion.org>
      CC: Tom Herbert <tom@herbertland.com>
      CC: Daniel Borkmann <daniel@iogearbox.net>
      CC: Alexander Duyck <alexander.h.duyck@intel.com>
      CC: Paolo Abeni <pabeni@redhat.com>
      CC: Jiri Benc <jbenc@redhat.com>
      CC: WANG Cong <xiyou.wangcong@gmail.com>
      CC: Roopa Prabhu <roopa@cumulusnetworks.com>
      CC: Pravin B Shelar <pshelar@ovn.org>
      CC: Sabrina Dubroca <sd@queasysnail.net>
      CC: Patrick McHardy <kaber@trash.net>
      CC: Stephen Hemminger <stephen@networkplumber.org>
      CC: Pravin Shelar <pshelar@nicira.com>
      CC: Maxim Krasnyansky <maxk@qti.qualcomm.com>
      Signed-off-by: NJarod Wilson <jarod@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      91572088
    • S
      net: add recursion limit to GRO · fcd91dd4
      Sabrina Dubroca 提交于
      Currently, GRO can do unlimited recursion through the gro_receive
      handlers.  This was fixed for tunneling protocols by limiting tunnel GRO
      to one level with encap_mark, but both VLAN and TEB still have this
      problem.  Thus, the kernel is vulnerable to a stack overflow, if we
      receive a packet composed entirely of VLAN headers.
      
      This patch adds a recursion counter to the GRO layer to prevent stack
      overflow.  When a gro_receive function hits the recursion limit, GRO is
      aborted for this skb and it is processed normally.  This recursion
      counter is put in the GRO CB, but could be turned into a percpu counter
      if we run out of space in the CB.
      
      Thanks to Vladimír Beneš <vbenes@redhat.com> for the initial bug report.
      
      Fixes: CVE-2016-7039
      Fixes: 9b174d88 ("net: Add Transparent Ethernet Bridging GRO support.")
      Fixes: 66e5133f ("vlan: Add GRO support for non hardware accelerated vlan")
      Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
      Reviewed-by: NJiri Benc <jbenc@redhat.com>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Acked-by: NTom Herbert <tom@herbertland.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fcd91dd4
  10. 11 9月, 2016 1 次提交
  11. 07 9月, 2016 1 次提交
  12. 05 9月, 2016 3 次提交
    • J
      vxlan: fix duplicated and wrong error messages · 3555621d
      Jiri Benc 提交于
      vxlan_dev_configure outputs error messages before returning, no need to
      print again the same mesages in vxlan_newlink. Also, vxlan_dev_configure may
      return a particular error code for a different reason than vxlan_newlink
      thinks.
      
      Move the remaining error messages into vxlan_dev_configure and let
      vxlan_newlink just pass on the error code.
      Signed-off-by: NJiri Benc <jbenc@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3555621d
    • J
      vxlan: reject multicast destination without an interface · 9b4cdd51
      Jiri Benc 提交于
      Currently, kernel accepts configurations such as:
      
        ip l a type vxlan dstport 4789 id 1 group 239.192.0.1
        ip l a type vxlan dstport 4789 id 1 group ff0e::110
      
      However, neither of those really works. In the IPv4 case, the interface
      cannot be brought up ("RTNETLINK answers: No such device"). This is because
      multicast join will be rejected without the interface being specified.
      
      In the IPv6 case, multicast wil be joined on the first interface found. This
      is not what the user wants as it depends on random factors (order of
      interfaces).
      
      Note that it's possible to add a local address but it doesn't solve
      anything. For IPv4, it's not considered in the multicast join (thus the same
      error as above is returned on ifup). This could be added but it wouldn't
      help for IPv6 anyway. For IPv6, we do need the interface.
      
      Just reject a configuration that sets multicast address and does not provide
      an interface. Nobody can depend on the previous behavior as it never worked.
      Signed-off-by: NJiri Benc <jbenc@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9b4cdd51
    • W
      vxlan: call peernet2id() in fdb notification · 38f507f1
      WANG Cong 提交于
      netns id should be already allocated each time we change
      netns, that is, in dev_change_net_namespace() (more precisely
      in rtnl_fill_ifinfo()). It is safe to just call peernet2id() here.
      
      Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Acked-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      38f507f1
  13. 02 9月, 2016 1 次提交
    • R
      rtnetlink: fdb dump: optimize by saving last interface markers · d297653d
      Roopa Prabhu 提交于
      fdb dumps spanning multiple skb's currently restart from the first
      interface again for every skb. This results in unnecessary
      iterations on the already visited interfaces and their fdb
      entries. In large scale setups, we have seen this to slow
      down fdb dumps considerably. On a system with 30k macs we
      see fdb dumps spanning across more than 300 skbs.
      
      To fix the problem, this patch replaces the existing single fdb
      marker with three markers: netdev hash entries, netdevs and fdb
      index to continue where we left off instead of restarting from the
      first netdev. This is consistent with link dumps.
      
      In the process of fixing the performance issue, this patch also
      re-implements fix done by
      commit 472681d5 ("net: ndo_fdb_dump should report -EMSGSIZE to rtnl_fdb_dump")
      (with an internal fix from Wilson Kok) in the following ways:
      - change ndo_fdb_dump handlers to return error code instead
      of the last fdb index
      - use cb->args strictly for dump frag markers and not error codes.
      This is consistent with other dump functions.
      
      Below results were taken on a system with 1000 netdevs
      and 35085 fdb entries:
      before patch:
      $time bridge fdb show | wc -l
      15065
      
      real    1m11.791s
      user    0m0.070s
      sys 1m8.395s
      
      (existing code does not return all macs)
      
      after patch:
      $time bridge fdb show | wc -l
      35085
      
      real    0m2.017s
      user    0m0.113s
      sys 0m1.942s
      Signed-off-by: NRoopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: NWilson Kok <wkok@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d297653d
  14. 27 8月, 2016 1 次提交
  15. 09 8月, 2016 2 次提交
  16. 12 7月, 2016 1 次提交
  17. 18 6月, 2016 4 次提交
  18. 15 6月, 2016 1 次提交
    • N
      ovs/vxlan: fix rtnl notifications on iface deletion · cf5da330
      Nicolas Dichtel 提交于
      The function vxlan_dev_create() (only used by ovs) never calls
      rtnl_configure_link(). The consequence is that dev->rtnl_link_state is
      never set to RTNL_LINK_INITIALIZED.
      During the deletion phase, the function rollback_registered_many() sends
      a RTM_DELLINK only if dev->rtnl_link_state is set to RTNL_LINK_INITIALIZED.
      
      Note that the function vxlan_dev_create() is moved after the rtnl stuff so
      that vxlan_dellink() can be called in this function.
      
      Fixes: dcc38c03 ("openvswitch: Re-add CONFIG_OPENVSWITCH_VXLAN")
      CC: Thomas Graf <tgraf@suug.ch>
      CC: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cf5da330
  19. 01 6月, 2016 1 次提交
  20. 21 5月, 2016 1 次提交
  21. 17 5月, 2016 1 次提交
  22. 07 5月, 2016 2 次提交
  23. 30 4月, 2016 1 次提交
  24. 22 4月, 2016 1 次提交
    • H
      vxlan: break dependency with netdev drivers · b7aade15
      Hannes Frederic Sowa 提交于
      Currently all drivers depend and autoload the vxlan module because how
      vxlan_get_rx_port is linked into them. Remove this dependency:
      
      By using a new event type in the netdevice notifier call chain we proxy
      the request from the drivers to flush and resetup the vxlan ports not
      directly via function call but by the already existing netdevice
      notifier call chain.
      
      I added a separate new event type, NETDEV_OFFLOAD_PUSH_VXLAN, to do so.
      We don't need to save those ids, as the event type field is an unsigned
      long and using specialized event types for this purpose seemed to be a
      more elegant way. This also comes in beneficial if in future we want to
      add offloading knobs for vxlan.
      
      Cc: Jesse Gross <jesse@kernel.org>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b7aade15
  25. 17 4月, 2016 1 次提交