1. 02 Nov, 2015 1 commit
    • ipv4: fix to not remove local route on link down · 4f823def
      Authored by Julian Anastasov
      When fib_netdev_event calls fib_disable_ip on NETDEV_DOWN event
      we should not delete the local routes if the local address
      is still present. The confusion comes from the fact that both
      fib_netdev_event and fib_inetaddr_event use the NETDEV_DOWN
      constant. Fix it by bringing back the 'force' variable.
      
      Steps to reproduce:
      modprobe dummy
      ifconfig dummy0 192.168.168.1 up
      ifconfig dummy0 down
      ip route list table local | grep dummy | grep host
      local 192.168.168.1 dev dummy0  proto kernel  scope host  src 192.168.168.1
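
The role of the 'force' variable described above can be sketched in user space. This is only a simplified model of the decision, not the kernel code: `struct route`, its fields and `disable_ip()` are hypothetical names chosen for illustration. It assumes that a link-down event passes `force == false` and that removal of the last local address passes `force == true`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a routing table entry. */
struct route {
    bool is_local;  /* entry from the "local" table (scope host) */
    bool dead;      /* marked for removal */
};

/*
 * Model of the decision only: without 'force' (link down) local routes
 * are kept; with 'force' (address removed) every route is marked dead.
 * Returns the number of routes newly marked dead.
 */
static size_t disable_ip(struct route *routes, size_t n, bool force)
{
    size_t removed = 0;

    for (size_t i = 0; i < n; i++) {
        if (routes[i].is_local && !force)
            continue;           /* keep the local route on link down */
        if (!routes[i].dead) {
            routes[i].dead = true;
            removed++;
        }
    }
    return removed;
}
```

With one local and one non-local route, a call with `force == false` should mark only the non-local route, which matches the expected outcome of the reproduction steps.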
      
      Fixes: 8a3d0316 ("net: track link-status of ipv4 nexthops")
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 23 Oct, 2015 1 commit
    • openvswitch: Fix egress tunnel info. · fc4099f1
      Authored by Pravin B Shelar
      While transitioning to netdev-based vports we broke the OVS
      feature that allows a user to retrieve tunnel packet egress
      information for lwtunnel devices. The following patch fixes it
      by introducing an ndo operation to get the tunnel egress info.
      The same ndo operation can be used for lwtunnel devices and compat
      ovs-tnl-vport devices. So after adding such a device operation
      we can remove the similar operation from ovs-vport.
      
      Fixes: 614732ea ("openvswitch: Use regular VXLAN net_device device").
      Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 17 Oct, 2015 1 commit
    • net: add pfmemalloc check in sk_add_backlog() · c7c49b8f
      Authored by Eric Dumazet
      Greg reported crashes hitting the following check in __sk_backlog_rcv()
      
      	BUG_ON(!sock_flag(sk, SOCK_MEMALLOC));
      
      The pfmemalloc bit is currently checked in sk_filter().
      
      This works correctly for TCP, because sk_filter() is run in
      tcp_v[46]_rcv() before hitting the prequeue or backlog checks.
      
      For UDP or other protocols, this does not work, because sk_filter()
      is run from sock_queue_rcv_skb(), which might be called _after_ backlog
      queuing if socket is owned by user by the time packet is processed by
      softirq handler.
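
The text above describes the problem but not the check itself. Below is a minimal user-space sketch of what the commit title suggests, assuming the backlog path refuses pfmemalloc packets for sockets that lack SOCK_MEMALLOC. The types, the function name and the `-ENOMEM` return value are illustrative assumptions, not the kernel code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for an skb and a socket. */
struct pkt_model {
    bool pfmemalloc;   /* buffer came from emergency memory reserves */
};

struct sock_model {
    bool memalloc;     /* models the SOCK_MEMALLOC flag */
    int backlog_len;   /* number of packets queued on the backlog */
};

/*
 * Check before queuing, so that a pfmemalloc packet can never reach the
 * backlog of a socket that is not allowed to consume the reserves.
 */
static int add_backlog(struct sock_model *sk, const struct pkt_model *p)
{
    if (p->pfmemalloc && !sk->memalloc)
        return -ENOMEM;        /* assumed error code; caller drops it */

    sk->backlog_len++;
    return 0;
}
```

Doing the check at queuing time means the result no longer depends on whether sk_filter() runs before or after the backlog, which is the ordering problem described for UDP.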
      
      Fixes: b4b9e355 ("netvm: set PF_MEMALLOC as appropriate during SKB processing")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Greg Thelen <gthelen@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 08 Oct, 2015 1 commit
  5. 05 Oct, 2015 1 commit
    • tcp/dccp: fix old style declarations · 8695a144
      Authored by Raanan Avargil
      I’m using the compilation flag -Werror=old-style-declaration, which
      requires the “inline” keyword to come at the beginning of the
      declaration.
      
      $ make drivers/net/ethernet/intel/e1000e/e1000e.ko
      ...
      include/net/inet_timewait_sock.h:116:1: error: ‘inline’ is not at
      beginning of declaration [-Werror=old-style-declaration]
      static void inline inet_twsk_schedule(struct inet_timewait_sock *tw, int
      timeo)
      
      include/net/inet_timewait_sock.h:121:1: error: ‘inline’ is not at
      beginning of declaration [-Werror=old-style-declaration]
      static void inline inet_twsk_reschedule(struct inet_timewait_sock *tw,
      int timeo)
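
The ordering that the compiler output above complains about can be shown with a small standalone function. `bump()` is a hypothetical helper used only for illustration; the rejected form is left in a comment so the block compiles with the flag enabled.

```c
/*
 * Rejected with -Werror=old-style-declaration, because 'inline' follows
 * the return type instead of sitting at the start of the declaration:
 *
 *     static void inline bump(int *v) { (*v)++; }
 */

/* Accepted: storage class, then 'inline', then the return type. */
static inline void bump(int *v)
{
    (*v)++;
}
```

The two forms declare the same function; only the order of the specifiers differs.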
      
      Fixes: ed2e9239 ("tcp/dccp: fix timewait races in timer handling")
      Signed-off-by: Raanan Avargil <raanan.avargil@intel.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 30 Sep, 2015 1 commit
  7. 25 Sep, 2015 1 commit
    • ipv4: send arp replies to the correct tunnel · 63d008a4
      Authored by Jiri Benc
      When using ip lwtunnels, the additional data for xmit (basically, the actual
      tunnel to use) are carried in ip_tunnel_info either in dst->lwtstate or in
      metadata dst. When replying to ARP requests, we need to send the reply to
      the same tunnel the request came from. This means we need to construct
      proper metadata dst for ARP replies.
      
      We could perform another route lookup to get a dst entry with the correct
      lwtstate. However, this won't always ensure that the outgoing tunnel is the
      same as the incoming one, and it won't work anyway for IPv4 duplicate
      address detection.
      
      The only thing to do is to "reverse" the ip_tunnel_info.
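
What "reversing" the tunnel info could look like is sketched below in user space. `struct tun_meta` is a hypothetical, trimmed-down stand-in for the addressing part of ip_tunnel_info, and the assumption is that a reply keeps the tunnel identifier and swaps the outer source and destination addresses. The real structure carries more fields than shown here.

```c
#include <stdint.h>

/* Hypothetical stand-in for the addressing part of tunnel metadata. */
struct tun_meta {
    uint32_t src;      /* outer source address */
    uint32_t dst;      /* outer destination address */
    uint64_t tun_id;   /* tunnel identifier, kept unchanged */
};

/* Build transmit metadata for a reply from the received metadata. */
static struct tun_meta tun_meta_reverse(const struct tun_meta *rx)
{
    struct tun_meta tx = *rx;

    tx.src = rx->dst;  /* we answer from the address that was targeted */
    tx.dst = rx->src;  /* and send back to the original sender */
    return tx;
}
```

Because the reply metadata is derived from the request alone, no route lookup is involved, which is the property the message above relies on.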
      Signed-off-by: Jiri Benc <jbenc@redhat.com>
      Acked-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  8. 22 Sep, 2015 1 commit
    • tcp/dccp: fix timewait races in timer handling · ed2e9239
      Authored by Eric Dumazet
      When creating a timewait socket, we need to arm the timer before
      allowing other cpus to find it. The signal allowing cpus to find
      the socket is setting tw_refcnt to a non-zero value.
      
      As we set tw_refcnt in __inet_twsk_hashdance(), we therefore need to
      call inet_twsk_schedule() first.
      
      This also means we need to remove tw_refcnt changes from
      inet_twsk_schedule() and let the caller handle it.
      
      Note that because we use mod_timer_pinned(), we have the guarantee
      that the timer won't expire before we set tw_refcnt as we run in BH context.
      
      To make things more readable I introduced inet_twsk_reschedule() helper.
      
      When rearming the timer, we can use mod_timer_pending() to make sure
      we do not rearm a canceled timer.
      
      Note: This bug can possibly trigger if packets of a flow can hit
      multiple cpus. This does not normally happen, unless flow steering
      is broken somehow. This explains why this bug was spotted ~5 months after
      its introduction.
      
      A similar fix is needed for SYN_RECV sockets in reqsk_queue_hash_req(),
      but will be provided in a separate patch for proper tracking.
      
      Fixes: 789f558c ("tcp/dccp: get rid of central timewait timer")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Ying Cai <ycai@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 21 Sep, 2015 2 commits
    • ip6tunnel: make rx/tx bytes counters consistent · 83cf9a25
      Authored by Nicolas Dichtel
      Like the previous patch, which fixes ipv4 tunnels, here is the ipv6 part.
      
      Before the patch, the external ipv6 header + gre header were included on
      tx.
      
      After the patch:
      $ ping -c1 192.168.6.121 ; ip -s l ls dev ip6gre1
      PING 192.168.6.121 (192.168.6.121) 56(84) bytes of data.
      64 bytes from 192.168.6.121: icmp_req=1 ttl=64 time=1.92 ms
      
      --- 192.168.6.121 ping statistics ---
      1 packets transmitted, 1 received, 0% packet loss, time 0ms
      rtt min/avg/max/mdev = 1.923/1.923/1.923/0.000 ms
      7: ip6gre1@NONE: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1440 qdisc noqueue state UNKNOWN mode DEFAULT group default
          link/gre6 20:01:06:60:30:08:c1:c3:00:00:00:00:00:00:01:23 peer 20:01:06:60:30:08:c1:c3:00:00:00:00:00:00:01:21
          RX: bytes  packets  errors  dropped overrun mcast
          84         1        0       0       0       0
          TX: bytes  packets  errors  dropped carrier collsns
          84         1        0       0       0       0
      $ ping -c1 192.168.1.121 ; ip -s l ls dev ip6tnl1
      PING 192.168.1.121 (192.168.1.121) 56(84) bytes of data.
      64 bytes from 192.168.1.121: icmp_req=1 ttl=64 time=2.28 ms
      
      --- 192.168.1.121 ping statistics ---
      1 packets transmitted, 1 received, 0% packet loss, time 0ms
      rtt min/avg/max/mdev = 2.288/2.288/2.288/0.000 ms
      8: ip6tnl1@NONE: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1452 qdisc noqueue state UNKNOWN mode DEFAULT group default
          link/tunnel6 2001:660:3008:c1c3::123 peer 2001:660:3008:c1c3::121
          RX: bytes  packets  errors  dropped overrun mcast
          84         1        0       0       0       0
          TX: bytes  packets  errors  dropped carrier collsns
          84         1        0       0       0       0
      Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Fix behaviour of unreachable, blackhole and prohibit routes · 0315e382
      Authored by Nikola Forró
      The ip-route(8) man page says the following about route types:
      
        unreachable - these destinations are unreachable.  Packets are
        discarded and the ICMP message host unreachable is generated.  The
        local senders get an EHOSTUNREACH error.
      
        blackhole - these destinations are unreachable.  Packets are
        discarded silently.  The local senders get an EINVAL error.
      
        prohibit - these destinations are unreachable.  Packets are discarded
        and the ICMP message communication administratively prohibited is
        generated.  The local senders get an EACCES error.
      
      In the inet6 address family, this was correct, except that local senders
      got an ENETUNREACH error instead of EHOSTUNREACH for an unreachable route.
      In the inet address family, all three route types generated the ICMP message
      net unreachable, and local senders got an ENETUNREACH error.
      
      In both address families all three route types now behave consistently
      with documentation.
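
The documented behaviour for local senders can be summarised as a small lookup. The enum and the function name are hypothetical and only restate the ip-route(8) excerpt quoted above; this is not the kernel code path.

```c
#include <errno.h>

/* Hypothetical names for the three route types discussed above. */
enum route_type {
    ROUTE_UNREACHABLE,
    ROUTE_BLACKHOLE,
    ROUTE_PROHIBIT
};

/* errno value a local sender gets, according to ip-route(8). */
static int local_sender_error(enum route_type type)
{
    switch (type) {
    case ROUTE_UNREACHABLE:
        return EHOSTUNREACH;
    case ROUTE_BLACKHOLE:
        return EINVAL;
    case ROUTE_PROHIBIT:
        return EACCES;
    }
    return 0;  /* not reached for valid enum values */
}
```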
      Signed-off-by: Nikola Forró <nforro@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  10. 18 Sep, 2015 2 commits
  11. 16 Sep, 2015 3 commits
  12. 10 Sep, 2015 1 commit
    • net: ipv6: use common fib_default_rule_pref · f53de1e9
      Authored by Phil Sutter
      This switches IPv6 policy routing to use the shared
      fib_default_rule_pref() function of IPv4 and DECnet. It is also used in
      multicast routing for IPv4 as well as IPv6.
      
      The motivation for this patch is a complaint about iproute2 behaving
      inconsistently between IPv4 and IPv6 when adding policy rules: formerly,
      IPv6 rules were assigned a fixed priority of 0x3FFF whereas for IPv4 the
      assigned priority value was decreased with each rule added.
      
      Since then all users of the default_pref field have been converted to
      assign the generic function fib_default_rule_pref(), fib_nl_newrule()
      may just use it directly instead. Therefore get rid of the function
      pointer altogether and make fib_default_rule_pref() static, as it's not
      used outside fib_rules.c anymore.
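
The "decreased with each rule added" behaviour can be modelled in user space. The sketch below assumes the shared helper returns one less than the priority of the second rule in the priority-ordered list, and 0 when there is no such rule. The array-based interface and the name `default_rule_pref()` are illustrative assumptions, not the kernel function.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * prefs[] holds the priorities of the existing rules in list order
 * (ascending). The new default priority sits just below the second rule.
 */
static uint32_t default_rule_pref(const uint32_t *prefs, size_t n)
{
    if (n >= 2 && prefs[1] != 0)
        return prefs[1] - 1;
    return 0;
}
```

Starting from rules at priorities 0, 32766 and 32767, the model yields 32765 for the first added rule and 32764 for the next one, so each new rule lands just above the previously added one.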
      Signed-off-by: Phil Sutter <phil@nwl.cc>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  13. 09 Sep, 2015 2 commits
    • memcg: move memcg_proto_active from sock.h · e752eb68
      Authored by Michal Hocko
      The only user is sock_update_memcg, which lives in memcontrol.c, so it
      doesn't make much sense to pollute sock.h with this inline helper.  Move it
      to memcontrol.c and open code it into its only caller.
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Cc: Vladimir Davydov <vdavydov@parallels.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memcg: export struct mem_cgroup · 33398cf2
      Authored by Michal Hocko
      The mem_cgroup structure is currently defined in mm/memcontrol.c, which
      means that code outside of this file has to use the external API even for
      trivial accesses.
      
      This patch exports mem_cgroup with its dependencies and makes some of the
      exported functions inlines.  This even helps to reduce the code size a bit
      (make defconfig + CONFIG_MEMCG=y)
      
        text      data     bss      dec       hex     filename
        12355346  1823792  1089536  15268674  e8fb42  vmlinux.before
        12354970  1823792  1089536  15268298  e8f9ca  vmlinux.after
      
      This is not much (376B) but better than nothing.
      
      We also save a function call in some hot paths like callers of
      mem_cgroup_count_vm_event which is used for accounting.
      
      The patch doesn't introduce any functional changes.
      
      [vdavykov@parallels.com: inline memcg_kmem_is_active]
      [vdavykov@parallels.com: do not expose type outside of CONFIG_MEMCG]
      [akpm@linux-foundation.org: memcontrol.h needs eventfd.h for eventfd_ctx]
      [akpm@linux-foundation.org: export mem_cgroup_from_task() to modules]
      Signed-off-by: Michal Hocko <mhocko@suse.cz>
      Reviewed-by: Vladimir Davydov <vdavydov@parallels.com>
      Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  14. 04 Sep, 2015 1 commit
    • mac80211: protect non-HT BSS when HT TDLS traffic exists · 22f66895
      Authored by Avri Altman
      HT TDLS traffic should be protected in a non-HT BSS to avoid
      collisions. Therefore, when TDLS peers join/leave, check if
      protection is (now) needed and set the ht_operation_mode of
      the virtual interface according to the HT capabilities of the
      TDLS peer(s).
      
      This works because a non-HT BSS connection never sets (or
      otherwise uses) the ht_operation_mode; it just means that
      drivers must be aware that this field applies to all HT
      traffic for this virtual interface, not just the traffic
      within the BSS. Document that.
      Signed-off-by: Avri Altman <avri.altman@intel.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
  15. 03 Sep, 2015 1 commit
    • netfilter: nf_conntrack: make nf_ct_zone_dflt built-in · 62da9865
      Authored by Daniel Borkmann
      Fengguang reported that some randconfig generated the following linker
      issue involving the nf_ct_zone_dflt object:
      
        [...]
        CC      init/version.o
        LD      init/built-in.o
        net/built-in.o: In function `ipv4_conntrack_defrag':
        nf_defrag_ipv4.c:(.text+0x93e95): undefined reference to `nf_ct_zone_dflt'
        net/built-in.o: In function `ipv6_defrag':
        nf_defrag_ipv6_hooks.c:(.text+0xe3ffe): undefined reference to `nf_ct_zone_dflt'
        make: *** [vmlinux] Error 1
      
      Given that configurations exist where we have a built-in part, which is
      accessing nf_ct_zone_dflt such as the two handlers nf_ct_defrag_user()
      and nf_ct6_defrag_user(), and a part that configures nf_conntrack as a
      module, we must move nf_ct_zone_dflt into a fixed, guaranteed built-in
      area when netfilter is configured in general.
      
      Therefore, split the more generic parts into a common header under
      include/linux/netfilter/ and move nf_ct_zone_dflt into the built-in
      section that already holds parts related to CONFIG_NF_CONNTRACK in the
      netfilter core. This fixes the issue on my side.
      
      Fixes: 308ac914 ("netfilter: nf_conntrack: push zone object into functions")
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  16. 02 Sep, 2015 10 commits
  17. 01 Sep, 2015 5 commits
  18. 31 Aug, 2015 3 commits
  19. 30 Aug, 2015 2 commits
    • vxlan: do not receive IPv4 packets on IPv6 socket · a43a9ef6
      Authored by Jiri Benc
      By default (subject to the sysctl settings), IPv6 sockets also listen for
      IPv4 traffic. Vxlan is not prepared for that and expects an IPv6 header in
      packets received through an IPv6 socket.
      
      In addition, it's currently not possible to have both IPv4 and IPv6 vxlan
      tunnel on the same port (unless bindv6only sysctl is enabled), as it's not
      possible to create and bind both IPv4 and IPv6 vxlan interfaces and there's
      no way to specify both IPv4 and IPv6 remote/group IP addresses.
      
      Set IPV6_V6ONLY on vxlan sockets to fix both of these issues. This is not
      done globally in udp_tunnel, as l2tp and tipc seem to work okay when
      receiving IPv4 packets on an IPv6 socket and people may rely on this behavior.
      The other tunnels (geneve and fou) do not support IPv6.
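
For comparison, this is how the same option is set from user space on an ordinary UDP socket. It is a sketch of the socket option only, not of the in-kernel vxlan change, and `open_v6only_udp_socket()` is a hypothetical helper name.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create an IPv6 UDP socket that refuses IPv4 (v4-mapped) traffic.
 * Returns the file descriptor, or -1 on failure.
 */
static int open_v6only_udp_socket(void)
{
    int on = 1;
    int fd = socket(AF_INET6, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;

    /* Must be set before bind() to take effect for the bound address. */
    if (setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &on, sizeof(on)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

With the option set, a separate IPv4 socket can be bound to the same port, which corresponds to the second issue described above.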
      Signed-off-by: Jiri Benc <jbenc@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ip_tunnels: record IP version in tunnel info · 7f9562a1
      Authored by Jiri Benc
      There's currently nothing preventing directing packets with IPv6
      encapsulation data to IPv4 tunnels (and vice versa). If this happens,
      IPv6 addresses are incorrectly interpreted as IPv4 ones.
      
      Track whether the given ip_tunnel_key contains IPv4 or IPv6 data. Store this
      in ip_tunnel_info. Reject packets at appropriate places if they are supposed
      to be encapsulated into an incompatible protocol.
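
The idea of recording the IP version and rejecting mismatches can be sketched as follows. `struct tunnel_info_model` and `tunnel_accepts()` are hypothetical names, and storing an address family value is an assumption made for illustration; the message above does not say how the kernel encodes it.

```c
#include <stdbool.h>
#include <sys/socket.h>  /* AF_INET, AF_INET6 */

/* Hypothetical stand-in for tunnel metadata that records its IP version. */
struct tunnel_info_model {
    unsigned short family;  /* AF_INET or AF_INET6 */
};

/*
 * A tunnel of family 'tunnel_family' only accepts encapsulation data of
 * the same family; anything else is rejected instead of being
 * misinterpreted.
 */
static bool tunnel_accepts(unsigned short tunnel_family,
                           const struct tunnel_info_model *info)
{
    return info->family == tunnel_family;
}
```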
      Signed-off-by: Jiri Benc <jbenc@redhat.com>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Acked-by: Thomas Graf <tgraf@suug.ch>
      Acked-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>