1. 15 11月, 2012 1 次提交
  2. 10 11月, 2012 2 次提交
  3. 28 9月, 2012 2 次提交
    • S
      tunnel: drop packet if ECN present with not-ECT · eccc1bb8
      stephen hemminger 提交于
      Linux tunnels were written before RFC6040 and therefore never
      implemented the corner case of ECN getting set in the outer header
      and the inner header not being ready for it.
      
      Section 4.2.  Default Tunnel Egress Behaviour.
       o If the inner ECN field is Not-ECT, the decapsulator MUST NOT
            propagate any other ECN codepoint onwards.  This is because the
            inner Not-ECT marking is set by transports that rely on dropped
            packets as an indication of congestion and would not understand or
            respond to any other ECN codepoint [RFC4774].  Specifically:
      
            *  If the inner ECN field is Not-ECT and the outer ECN field is
               CE, the decapsulator MUST drop the packet.
      
            *  If the inner ECN field is Not-ECT and the outer ECN field is
               Not-ECT, ECT(0), or ECT(1), the decapsulator MUST forward the
               outgoing packet with the ECN field cleared to Not-ECT.
      
      This patch moves the ECN decap logic out of the individual tunnels
      into a common place.
      
      It also adds logging to allow detecting broken systems that
      set ECN bits incorrectly when tunneling (or an intermediate
      router might be changing the header).
      
      Overloads rx_frame_error to keep track of ECN related error.
      
      Thanks to Chris Wright who caught this while reviewing the new VXLAN
      tunnel.
      
      This code was tested by injecting faulty logic in other end GRE
      to send incorrectly encapsulated packets.
      Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      eccc1bb8
    • S
      xfrm: remove extranous rcu_read_lock · b0558ef2
      stephen hemminger 提交于
      The handlers for xfrm_tunnel are always invoked with rcu read lock
      already.
      Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b0558ef2
  4. 21 7月, 2012 1 次提交
  5. 17 7月, 2012 1 次提交
    • D
      net: Pass optional SKB and SK arguments to dst_ops->{update_pmtu,redirect}() · 6700c270
      David S. Miller 提交于
      This will be used so that we can compose a full flow key.
      
      Even though we have a route in this context, we need more.  In the
      future the routes will be without destination address, source address,
      etc. keying.  One ipv4 route will cover entire subnets, etc.
      
      In this environment we have to have a way to possess persistent storage
      for redirects and PMTU information.  This persistent storage will exist
      in the FIB tables, and that's why we'll need to be able to rebuild a
      full lookup flow key here.  Using that flow key will do a fib_lookup()
      and create/update the persistent entry.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6700c270
  6. 12 7月, 2012 1 次提交
  7. 15 6月, 2012 1 次提交
    • D
      ipv4: Handle PMTU in all ICMP error handlers. · 36393395
      David S. Miller 提交于
      With ip_rt_frag_needed() removed, we have to explicitly update PMTU
      information in every ICMP error handler.
      
      Create two helper functions to facilitate this.
      
      1) ipv4_sk_update_pmtu()
      
         This updates the PMTU when we have a socket context to
         work with.
      
      2) ipv4_update_pmtu()
      
         Raw version, used when no socket context is available.  For this
         interface, we essentially just pass in explicit arguments for
         the flow identity information we would have extracted from the
         socket.
      
         And you'll notice that ipv4_sk_update_pmtu() is simply implemented
         in terms of ipv4_update_pmtu()
      
      Note that __ip_route_output_key() is used, rather than something like
      ip_route_output_flow() or ip_route_output_key().  This is because we
      absolutely do not want to end up with a route that does IPSEC
      encapsulation and the like.  Instead, we only want the route that
      would get us to the node described by the outermost IP header.
      Reported-by: NSteffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      36393395
  8. 16 4月, 2012 1 次提交
  9. 15 4月, 2012 1 次提交
  10. 12 3月, 2012 1 次提交
    • J
      net: Convert printks to pr_<level> · 058bd4d2
      Joe Perches 提交于
      Use a more current kernel messaging style.
      
      Convert a printk block to print_hex_dump.
      Coalesce formats, align arguments.
      Use %s, __func__ instead of embedding function names.
      
      Some messages that were prefixed with <foo>_close are
      now prefixed with <foo>_fini.  Some ah4 and esp messages
      are now not prefixed with "ip ".
      
      The intent of this patch is to later add something like
        #define pr_fmt(fmt) "IPv4: " fmt.
      to standardize the output messages.
      
      Text size is trivially reduced. (x86-32 allyesconfig)
      
      $ size net/ipv4/built-in.o*
         text	   data	    bss	    dec	    hex	filename
       887888	  31558	 249696	1169142	 11d6f6	net/ipv4/built-in.o.new
       887934	  31558	 249800	1169292	 11d78c	net/ipv4/built-in.o.old
      Signed-off-by: NJoe Perches <joe@perches.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      058bd4d2
  11. 25 1月, 2012 2 次提交
  12. 13 1月, 2012 1 次提交
  13. 13 12月, 2011 1 次提交
    • T
      ipip, sit: copy parms.name after register_netdevice · 72b36015
      Ted Feng 提交于
      Same fix as 731abb9c for ipip and sit tunnel.
      Commit 1c5cae81 removed an explicit call to dev_alloc_name in
      ipip_tunnel_locate and ipip6_tunnel_locate, because register_netdevice
      will now create a valid name, however the tunnel keeps a copy of the
      name in the private parms structure. Fix this by copying the name back
      after register_netdevice has successfully returned.
      
      This shows up if you do a simple tunnel add, followed by a tunnel show:
      
      $ sudo ip tunnel add mode ipip remote 10.2.20.211
      $ ip tunnel
      tunl0: ip/ip  remote any  local any  ttl inherit  nopmtudisc
      tunl%d: ip/ip  remote 10.2.20.211  local any  ttl inherit
      $ sudo ip tunnel add mode sit remote 10.2.20.212
      $ ip tunnel
      sit0: ipv6/ip  remote any  local any  ttl 64  nopmtudisc 6rd-prefix 2002::/16
      sit%d: ioctl 89f8 failed: No such device
      sit%d: ipv6/ip  remote 10.2.20.212  local any  ttl inherit
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NTed Feng <artisdom@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      72b36015
  14. 09 11月, 2011 1 次提交
  15. 02 8月, 2011 1 次提交
  16. 06 5月, 2011 1 次提交
  17. 05 5月, 2011 1 次提交
  18. 04 5月, 2011 1 次提交
  19. 23 4月, 2011 1 次提交
  20. 13 3月, 2011 1 次提交
  21. 10 3月, 2011 1 次提交
    • V
      net: don't allow CAP_NET_ADMIN to load non-netdev kernel modules · 8909c9ad
      Vasiliy Kulikov 提交于
      Since a8f80e8f any process with
      CAP_NET_ADMIN may load any module from /lib/modules/.  This doesn't mean
      that CAP_NET_ADMIN is a superset of CAP_SYS_MODULE as modules are
      limited to /lib/modules/**.  However, CAP_NET_ADMIN capability shouldn't
      allow anybody load any module not related to networking.
      
      This patch restricts an ability of autoloading modules to netdev modules
      with explicit aliases.  This fixes CVE-2011-1019.
      
      Arnd Bergmann suggested to leave untouched the old pre-v2.6.32 behavior
      of loading netdev modules by name (without any prefix) for processes
      with CAP_SYS_MODULE to maintain the compatibility with network scripts
      that use autoloading netdev modules by aliases like "eth0", "wlan0".
      
      Currently there are only three users of the feature in the upstream
      kernel: ipip, ip_gre and sit.
      
          root@albatros:~# capsh --drop=$(seq -s, 0 11),$(seq -s, 13 34) --
          root@albatros:~# grep Cap /proc/$$/status
          CapInh:	0000000000000000
          CapPrm:	fffffff800001000
          CapEff:	fffffff800001000
          CapBnd:	fffffff800001000
          root@albatros:~# modprobe xfs
          FATAL: Error inserting xfs
          (/lib/modules/2.6.38-rc6-00001-g2bf4ca3/kernel/fs/xfs/xfs.ko): Operation not permitted
          root@albatros:~# lsmod | grep xfs
          root@albatros:~# ifconfig xfs
          xfs: error fetching interface information: Device not found
          root@albatros:~# lsmod | grep xfs
          root@albatros:~# lsmod | grep sit
          root@albatros:~# ifconfig sit
          sit: error fetching interface information: Device not found
          root@albatros:~# lsmod | grep sit
          root@albatros:~# ifconfig sit0
          sit0      Link encap:IPv6-in-IPv4
      	      NOARP  MTU:1480  Metric:1
      
          root@albatros:~# lsmod | grep sit
          sit                    10457  0
          tunnel4                 2957  1 sit
      
      For CAP_SYS_MODULE module loading is still relaxed:
      
          root@albatros:~# grep Cap /proc/$$/status
          CapInh:	0000000000000000
          CapPrm:	ffffffffffffffff
          CapEff:	ffffffffffffffff
          CapBnd:	ffffffffffffffff
          root@albatros:~# ifconfig xfs
          xfs: error fetching interface information: Device not found
          root@albatros:~# lsmod | grep xfs
          xfs                   745319  0
      
      Reference: https://lkml.org/lkml/2011/2/24/203Signed-off-by: NVasiliy Kulikov <segoon@openwall.com>
      Signed-off-by: NMichael Tokarev <mjt@tls.msk.ru>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Acked-by: NKees Cook <kees.cook@canonical.com>
      Signed-off-by: NJames Morris <jmorris@namei.org>
      8909c9ad
  22. 03 3月, 2011 1 次提交
  23. 02 12月, 2010 1 次提交
  24. 18 11月, 2010 1 次提交
  25. 28 10月, 2010 1 次提交
  26. 06 10月, 2010 1 次提交
    • E
      net: add a core netdev->rx_dropped counter · caf586e5
      Eric Dumazet 提交于
      In various situations, a device provides a packet to our stack and we
      drop it before it enters protocol stack :
      - softnet backlog full (accounted in /proc/net/softnet_stat)
      - bad vlan tag (not accounted)
      - unknown/unregistered protocol (not accounted)
      
      We can handle a per-device counter of such dropped frames at core level,
      and automatically adds it to the device provided stats (rx_dropped), so
      that standard tools can be used (ifconfig, ip link, cat /proc/net/dev)
      
      This is a generalization of commit 8990f468 (net: rx_dropped
      accounting), thus reverting it.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      caf586e5
  27. 30 9月, 2010 2 次提交
    • E
      ipip: enable lockless xmits · 153f0943
      Eric Dumazet 提交于
      IPIP tunnels can benefit from lockless xmits, using NETIF_F_LLTX
      
      Bench on a 16 cpus machine (dual E5540 cpus), 16 threads sending
      10000000 UDP frames via one ipip tunnel (size:200 bytes per frame)
      
      Before patch :
      real	2m53.321s
      user	0m10.277s
      sys	46m0.597s
      
      After patch:
      real	0m32.063s
      user	0m9.237s
      sys	8m16.255s
      
      Last problem to solve is the contention on dst :
      
      16118.00 28.3% __ip_route_output_key         vmlinux
       6135.00 10.8% dst_release                   vmlinux
       3220.00  5.6% ip_finish_output              vmlinux
       2149.00  3.8% ip_route_output_flow          vmlinux
       1575.00  2.8% ip_append_data                vmlinux
       1481.00  2.6% ip_push_pending_frames        vmlinux
       1349.00  2.4% __xfrm_lookup                 vmlinux
       1216.00  2.1% csum_partial_copy_generic     vmlinux
       1208.00  2.1% udp_sendmsg                   vmlinux
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      153f0943
    • E
      ipip: fix percpu stats accounting · fada5636
      Eric Dumazet 提交于
      commit 3c97af99 (ipip: percpu stats accounting) forgot the fallback
      tunnel case (tunl0), and can crash pretty fast.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fada5636
  28. 28 9月, 2010 1 次提交
  29. 21 9月, 2010 1 次提交
  30. 16 9月, 2010 1 次提交
  31. 31 8月, 2010 1 次提交
  32. 11 6月, 2010 1 次提交
  33. 18 5月, 2010 1 次提交
  34. 30 3月, 2010 1 次提交
    • T
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo 提交于
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the followings.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and try to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         wildly available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build test were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable easily on most builds of
      the specific arch.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      5a0e3ad6
  35. 17 2月, 2010 1 次提交
  36. 18 1月, 2010 1 次提交