1. 20 Oct, 2011 5 commits
    • neigh: fix rcu splat in neigh_update() · e049f288
      roy.qing.li@gmail.com committed
      When using dst_get_neighbour() to get a neighbour, we need to hold
      rcu_read_lock(), since dst_get_neighbour() uses rcu_dereference().
      
      The bug was reported by Ari Savolainen <ari.m.savolainen@gmail.com>
      
      [  105.612095]
      [  105.612096] ===================================================
      [  105.612100] [ INFO: suspicious rcu_dereference_check() usage. ]
      [  105.612101] ---------------------------------------------------
      [  105.612103] include/net/dst.h:91 invoked rcu_dereference_check()
      without protection!
      [  105.612105]
      [  105.612106] other info that might help us debug this:
      [  105.612106]
      [  105.612108]
      [  105.612108] rcu_scheduler_active = 1, debug_locks = 0
      [  105.612110] 1 lock held by dnsmasq/2618:
      [  105.612111]  #0:  (rtnl_mutex){+.+.+.}, at: [<ffffffff815df8c7>]
      rtnl_lock+0x17/0x20
      [  105.612120]
      [  105.612121] stack backtrace:
      [  105.612123] Pid: 2618, comm: dnsmasq Not tainted 3.1.0-rc1 #41
      [  105.612125] Call Trace:
      [  105.612129]  [<ffffffff810ccdcb>] lockdep_rcu_dereference+0xbb/0xc0
      [  105.612132]  [<ffffffff815dc5a9>] neigh_update+0x4f9/0x5f0
      [  105.612135]  [<ffffffff815da001>] ? neigh_lookup+0xe1/0x220
      [  105.612139]  [<ffffffff81639298>] arp_req_set+0xb8/0x230
      [  105.612142]  [<ffffffff8163a59f>] arp_ioctl+0x1bf/0x310
      [  105.612146]  [<ffffffff810baa40>] ? lock_hrtimer_base.isra.26+0x30/0x60
      [  105.612150]  [<ffffffff8163fb75>] inet_ioctl+0x85/0x90
      [  105.612154]  [<ffffffff815b5520>] sock_do_ioctl+0x30/0x70
      [  105.612157]  [<ffffffff815b55d3>] sock_ioctl+0x73/0x280
      [  105.612162]  [<ffffffff811b7698>] do_vfs_ioctl+0x98/0x570
      [  105.612165]  [<ffffffff811a5c40>] ? fget_light+0x340/0x3a0
      [  105.612168]  [<ffffffff811b7bbf>] sys_ioctl+0x4f/0x80
      [  105.612172]  [<ffffffff816fdcab>] system_call_fastpath+0x16/0x1b
      Reported-by: Ari Savolainen <ari.m.savolainen@gmail.com>
      Signed-off-by: RongQing <roy.qing.li@gmail.com>
      Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • filter: use unsigned int to silence static checker warning · 4f25af27
      Dan Carpenter committed
      This is just a cleanup.
      
      My testing version of Smatch warns about this:
      net/core/filter.c +380 check_load_and_stores(6)
      	warn: check 'flen' for negative values
      
      flen comes from the user.  We try to clamp the values here between 1
      and BPF_MAXINSNS but the clamp doesn't work because it could be
      negative.  This is a bug, but it's not exploitable.
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
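The clamp problem described in the filter commit above can be reproduced in user space. The following is a minimal sketch, not the kernel code: the helper names are hypothetical, and only `BPF_MAXINSNS` (4096) is taken from the kernel headers. It shows why a signed `flen` slips past a "between 1 and BPF_MAXINSNS" check while an unsigned one does not.

```c
#ifndef BPF_MAXINSNS
#define BPF_MAXINSNS 4096
#endif

/* Hypothetical helper: the range check with a signed length.
 * A negative flen is neither 0 nor greater than the maximum,
 * so it is wrongly accepted. */
static inline int demo_flen_ok_signed(int flen)
{
    if (flen == 0 || flen > BPF_MAXINSNS)
        return 0;
    return 1;
}

/* Hypothetical helper: the same check with an unsigned length.
 * A negative input converts to a huge value and is rejected. */
static inline int demo_flen_ok_unsigned(unsigned int flen)
{
    if (flen == 0 || flen > BPF_MAXINSNS)
        return 0;
    return 1;
}
```

Calling `demo_flen_ok_signed(-1)` accepts the value, whereas `demo_flen_ok_unsigned((unsigned int)-1)` rejects it, which is the behaviour the type change is after.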
    • net: validate HWTSTAMP ioctl parameters · 4dc360c5
      Richard Cochran committed
      This patch adds a sanity check on the values provided by user space for
      the hardware time stamping configuration. If the values lie outside of
      the absolute limits, then the ioctl request will be denied.
      Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
      Signed-off-by: David S. Miller <davem@davemloft.net>
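A sanity check of the kind described in the HWTSTAMP commit above might look roughly like the sketch below. Everything here is a stand-in: the `demo_` enums, the struct, and the choice of error codes are assumptions made for illustration, not the kernel's actual definitions or the patch's code.

```c
#include <errno.h>

/* Stand-in enums; the real lists live in the kernel headers. */
enum demo_tx_type { DEMO_TX_OFF, DEMO_TX_ON, DEMO_TX_ONESTEP_SYNC };
enum demo_rx_filter { DEMO_FILTER_NONE, DEMO_FILTER_ALL, DEMO_FILTER_PTP_V2_EVENT };

/* Stand-in for the configuration passed in from user space. */
struct demo_hwtstamp_config {
    int flags;
    int tx_type;
    int rx_filter;
};

/* Reject any value outside the absolute limits of the enums
 * before a driver would ever see it. */
static inline int demo_hwtstamp_validate(const struct demo_hwtstamp_config *cfg)
{
    if (cfg->flags)            /* assumed: no flags are defined */
        return -EINVAL;
    if (cfg->tx_type < DEMO_TX_OFF || cfg->tx_type > DEMO_TX_ONESTEP_SYNC)
        return -ERANGE;
    if (cfg->rx_filter < DEMO_FILTER_NONE ||
        cfg->rx_filter > DEMO_FILTER_PTP_V2_EVENT)
        return -ERANGE;
    return 0;
}
```

The point of checking in one central place is that individual drivers no longer need to defend against out-of-range values themselves.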
    • net: Move rcu_barrier from rollback_registered_many to netdev_run_todo. · 850a545b
      Eric W. Biederman committed
      This patch moves the rcu_barrier from rollback_registered_many
      (inside the rtnl_lock) into netdev_run_todo (just outside the rtnl_lock).
      This allows us to gain the full benefit of synchronize_net calling
      synchronize_rcu_expedited when the rtnl_lock is held.
      
      The rcu_barrier in rollback_registered_many was originally a synchronize_net
      but was promoted to be a rcu_barrier() when it was found that people were
      unnecessarily hitting the 250ms wait in netdev_wait_allrefs().  Changing
      the rcu_barrier back to a synchronize_net is therefore safe.
      
      Since we only care about waiting for the rcu callbacks before we get
      to netdev_wait_allrefs() it is also safe to move the wait into
      netdev_run_todo.
      
      This was tested by creating and destroying 1000 tap devices and observing
      /proc/lock_stat.  /proc/lock_stat reports this change reduces the hold
      times of the rtnl_lock by a factor of 10.  There was no observable
      difference in the amount of time it takes to destroy a network device.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Allow skb_recycle_check to be done in stages · 3d153a7c
      Andy Fleming committed
      skb_recycle_check resets the skb if it's eligible for recycling.
      However, there are times when a driver might want to optionally
      manipulate the skb data with the skb before resetting the skb,
      but after it has determined eligibility.  We do this by splitting the
      eligibility check from the skb reset, creating two inline functions to
      accomplish that task.
      Signed-off-by: Andy Fleming <afleming@freescale.com>
      Acked-by: David Daney <david.daney@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 19 Oct, 2011 2 commits
    • net: add skb frag size accessors · 9e903e08
      Eric Dumazet committed
      To ease skb->truesize sanitization, it's better to be able to localize
      all references to skb frag sizes.
      
      Define accessors: skb_frag_size() to fetch the frag size, and
      skb_frag_size_{set|add|sub}() to manipulate it.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
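The accessor pattern named in the commit above can be sketched as follows. This is an illustration only: `struct demo_frag` and the `demo_frag_size*` helpers are hypothetical stand-ins that mirror the shape of `skb_frag_size()` and `skb_frag_size_{set|add|sub}()`, not the kernel definitions.

```c
/* Hypothetical stand-in for a page fragment descriptor. */
struct demo_frag {
    unsigned int page_offset;
    unsigned int size;
};

/* Fetch the fragment size. */
static inline unsigned int demo_frag_size(const struct demo_frag *frag)
{
    return frag->size;
}

/* Overwrite the fragment size. */
static inline void demo_frag_size_set(struct demo_frag *frag, unsigned int size)
{
    frag->size = size;
}

/* Grow the fragment by delta bytes. */
static inline void demo_frag_size_add(struct demo_frag *frag, int delta)
{
    frag->size += delta;
}

/* Shrink the fragment by delta bytes. */
static inline void demo_frag_size_sub(struct demo_frag *frag, int delta)
{
    frag->size -= delta;
}
```

With every read and write funnelled through these helpers, a later change to how the size is stored or accounted only has to touch the accessors rather than every caller.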
    • net: allow vlan traffic to be received under bond · 2425717b
      John Fastabend committed
      The following configuration used to work as I expected. At least
      we could use the fcoe interfaces to do MPIO and the bond0 iface
      to do load balancing or failover.
      
             ---eth2.228-fcoe
             |
      eth2 -----|
                |
                |---- bond0
                |
      eth3 -----|
             |
             ---eth3.228-fcoe
      
      This worked because of a change we added to allow inactive slaves
      to rx 'exact' matches. This functionality was kept intact with the
      rx_handler mechanism. However now the vlan interface attached to the
      active slave never receives traffic because the bonding rx_handler
      updates the skb->dev and goto's another_round. Previously, the
      vlan_do_receive() logic was called before the bonding rx_handler.
      
      Now by the time vlan_do_receive calls vlan_find_dev() the
      skb->dev is set to bond0 and it is clear no vlan is attached
      to this iface. The vlan lookup fails.
      
      This patch moves the VLAN check above the rx_handler. A VLAN
      tagged frame is now routed to the eth2.228-fcoe iface in the
      above schematic. Untagged frames continue to the bond0 as
      normal. This case also remains intact,
      
      eth2 --> bond0 --> vlan.228
      
      Here the skb is VLAN tagged but the vlan lookup fails on eth2
      causing the bonding rx_handler to be called. On the second
      pass the vlan lookup is on the bond0 iface and completes as
      expected.
      
      Putting a VLAN.228 on both the bond0 and eth2 device will
      result in eth2.228 receiving the skb. I don't think this is
      completely unexpected and was the result prior to the rx_handler
      result.
      
      Note, the same setup is also used for other storage traffic that
      MPIO is used with eg. iSCSI and similar setups can be contrived
      without storage protocols.
      Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
      Acked-by: Jesse Gross <jesse@nicira.com>
      Reviewed-by: Jiri Pirko <jpirko@redhat.com>
      Tested-by: Hans Schillstrom <hams.schillstrom@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 18 Oct, 2011 1 commit
  4. 17 Oct, 2011 1 commit
    • if_link: Add additional parameter to IFLA_VF_INFO for spoof checking · 5f8444a3
      Greg Rose committed
      Add configuration setting for drivers to turn spoof checking on or off
      for discrete VFs.
      
      v2 - Fix indentation problem, wrap the ifla_vf_info structure in
           #ifdef __KERNEL__ to prevent user space from accessing it, and
           change the function parameter for the spoof check setting netdev
           op from u8 to bool.
      v3 - Preset spoof check setting to -1 so that user space tools such
           as ip can detect that the driver didn't report a spoofcheck
           setting.  Prevents incorrect display of spoof check settings
           for drivers that don't report it.
      Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
  5. 14 Oct, 2011 1 commit
    • net: more accurate skb truesize · 87fb4b7b
      Eric Dumazet committed
      skb truesize currently accounts for sk_buff struct and part of skb head.
      kmalloc() roundings are also ignored.
      
      Considering that skb_shared_info is larger than sk_buff, it's time to
      take it into account for better memory accounting.
      
      This patch introduces SKB_TRUESIZE(X) macro to centralize various
      assumptions into a single place.
      
      At skb alloc phase, we put skb_shared_info struct at the exact end of
      skb head, to allow a better use of memory (lowering number of
      reallocations), since kmalloc() gives us power-of-two memory blocks.
      
      Unless SLUB/SLUB debug is active, both skb->head and skb_shared_info are
      aligned to cache lines, as before.
      
      Note: This patch might trigger performance regressions because of
      misconfigured protocol stacks, hitting per socket or global memory
      limits that were previously not reached. But it's a necessary step for
      more accurate memory accounting.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      CC: Andi Kleen <ak@linux.intel.com>
      CC: Ben Hutchings <bhutchings@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
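The idea of centralizing the truesize assumptions in one macro, as the commit above does with SKB_TRUESIZE(X), can be sketched in user space. The struct sizes and the 64-byte alignment below are made-up placeholders and all `DEMO_`/`demo_` names are hypothetical; the sketch only illustrates adding the aligned sizes of both bookkeeping structures to the payload size.

```c
#include <stddef.h>

/* Assumed cache line size for the sketch. */
#define DEMO_CACHE_BYTES 64

/* Round x up to a multiple of the cache line size. */
#define DEMO_ALIGN(x) \
    (((size_t)(x) + (DEMO_CACHE_BYTES - 1)) & ~((size_t)DEMO_CACHE_BYTES - 1))

/* Placeholder structs; the sizes are invented, chosen only so that
 * the shared info is larger than the buffer header. */
struct demo_sk_buff { char pad[232]; };
struct demo_shared_info { char pad[320]; };

/* One place that states what a buffer with x payload bytes costs:
 * payload plus both aligned bookkeeping structures. */
#define DEMO_TRUESIZE(x) \
    ((size_t)(x) + DEMO_ALIGN(sizeof(struct demo_sk_buff)) + \
     DEMO_ALIGN(sizeof(struct demo_shared_info)))
```

With these placeholder sizes, a 1024-byte payload is charged as 1024 + 256 + 320 bytes, so the shared info is no longer left out of the accounting.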
  6. 08 Oct, 2011 1 commit
  7. 04 Oct, 2011 1 commit
  8. 29 Sep, 2011 2 commits
    • net: rps: fix the support for PPPOE · 5dd17e08
      Changli Gao committed
      The upper protocol numbers of PPPOE are different, and should be treated
      specially.
      Signed-off-by: Changli Gao <xiaosuo@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • af_unix: dont send SCM_CREDENTIALS by default · 16e57262
      Eric Dumazet committed
      Since commit 7361c36c (af_unix: Allow credentials to work across
      user and pid namespaces) af_unix performance dropped a lot.
      
      This is because we now take a reference on pid and cred in each write()
      and release them in read(), which is usually done from another process,
      possibly on another cpu. This triggers false sharing.
      
      # Events: 154K cycles
      #
      # Overhead  Command       Shared Object        Symbol
      # ........  .......  ..................  .........................
      #
          10.40%  hackbench  [kernel.kallsyms]   [k] put_pid
           8.60%  hackbench  [kernel.kallsyms]   [k] unix_stream_recvmsg
           7.87%  hackbench  [kernel.kallsyms]   [k] unix_stream_sendmsg
           6.11%  hackbench  [kernel.kallsyms]   [k] do_raw_spin_lock
           4.95%  hackbench  [kernel.kallsyms]   [k] unix_scm_to_skb
           4.87%  hackbench  [kernel.kallsyms]   [k] pid_nr_ns
           4.34%  hackbench  [kernel.kallsyms]   [k] cred_to_ucred
           2.39%  hackbench  [kernel.kallsyms]   [k] unix_destruct_scm
           2.24%  hackbench  [kernel.kallsyms]   [k] sub_preempt_count
           1.75%  hackbench  [kernel.kallsyms]   [k] fget_light
           1.51%  hackbench  [kernel.kallsyms]   [k]
      __mutex_lock_interruptible_slowpath
           1.42%  hackbench  [kernel.kallsyms]   [k] sock_alloc_send_pskb
      
      This patch includes SCM_CREDENTIALS information in an af_unix message/skb
      only if requested by the sender (see man 7 unix for details on how to
      include ancillary data using the sendmsg() system call).
      
      Note: This might break buggy applications that expected SCM_CREDENTIALS
      from an unaware write() system call, and receiver not using SO_PASSCRED
      socket option.
      
      If SOCK_PASSCRED is set on source or destination socket, we still
      include credentials for mere write() syscalls.
      
      Performance boost in hackbench : more than 50% gain on a 16 thread
      machine (2 quad-core cpus, 2 threads per core)
      
      hackbench 20 thread 2000
      
      4.228 sec instead of 9.102 sec
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 22 Sep, 2011 1 commit
  10. 17 Sep, 2011 1 commit
  11. 16 Sep, 2011 4 commits
    • net: consolidate and fix ethtool_ops->get_settings calling · 4bc71cb9
      Jiri Pirko committed
      This patch does several things:
      - introduces __ethtool_get_settings which is called from ethtool code and
        from drivers as well. Put ASSERT_RTNL there.
      - dev_ethtool_get_settings() is replaced by __ethtool_get_settings()
      - changes calling in drivers so rtnl locking is respected. Previously,
        ->get_settings() was called unlocked in iboe_get_rate; this fixes it.
        prb_calc_retire_blk_tmo() in af_packet.c had the same problem, which
        is fixed by calling __dev_get_by_index() instead of dev_get_by_index()
        and holding rtnl_lock for both calls.
      - introduces rtnl_lock in bnx2fc_vport_create() and fcoe_vport_create()
        so bnx2fc_if_create() and fcoe_if_create() are called locked as they
        are from other places.
      - use __ethtool_get_settings() in bonding code
      Signed-off-by: Jiri Pirko <jpirko@redhat.com>
      
      v2->v3:
      	-removed dev_ethtool_get_settings()
      	-added ASSERT_RTNL into __ethtool_get_settings()
      	-prb_calc_retire_blk_tmo - use __dev_get_by_index() and lock
      	 around it and __ethtool_get_settings() call
      v1->v2:
              add missing export_symbol
      Reviewed-by: Ben Hutchings <bhutchings@solarflare.com> [except FCoE bits]
      Acked-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: linkwatch: allow vlans to get carrier changes faster · c37e0c99
      Eric Dumazet committed
      There is a time lag in IFF_RUNNING flag consistency between vlan and
      real devices when the real devices have a problem such as a broken link
      or cable.
      
      This degrades availability, for example by delaying failover in HA
      systems using vlans, since detection of the problem on the real device
      is delayed.
      
      We can avoid the linkwatch delay (~1 sec) for devices linked to other
      devices, since the delay is already applied for the realdev.
      
      Based on a previous patch from Mitsuo Hayasaka
      Reported-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: Jesse Gross <jesse@nicira.com>
      Tested-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: copy userspace buffers on device forwarding · 48c83012
      Michael S. Tsirkin committed
      dev_forward_skb loops an skb back into the host networking
      stack, which might hang on to the memory indefinitely.
      In particular, this can happen in macvtap in bridged mode.
      Copy the userspace fragments to avoid blocking the
      sender in that case.
      
      As this patch makes skb_copy_ubufs extern now,
      I also added some documentation and made the function clear
      the SKBTX_DEV_ZEROCOPY flag automatically instead
      of doing it in all callers. This can be made into a separate
      patch if people feel it's worth it.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Make flow cache namespace-aware · 0542b69e
      dpward committed
      flow_cache_lookup will return a cached object (or null pointer) that the
      resolver (i.e. xfrm_policy_lookup) previously found for another namespace
      using the same key/family/dir.  Instead, make the namespace part of what
      identifies entries in the cache.
      
      As before, flow_entry_valid will return 0 for entries where the namespace
      has been deleted, and they will be removed from the cache the next time
      flow_cache_gc_task is run.
      Reported-by: Andrew Dickinson <whydna@whydna.net>
      Signed-off-by: David Ward <david.ward@ll.mit.edu>
      Signed-off-by: David S. Miller <davem@davemloft.net>
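Making the namespace "part of what identifies entries in the cache", as the flow cache commit above puts it, amounts to comparing it alongside the existing key fields. The sketch below uses hypothetical `demo_` types and a hypothetical match helper; it is not the kernel's flow cache code.

```c
#include <string.h>

/* Hypothetical stand-ins for a network namespace and a flow key. */
struct demo_net { int id; };
struct demo_flow_key { unsigned int words[4]; };

/* A cache entry now remembers which namespace it was resolved for. */
struct demo_flow_entry {
    const struct demo_net *net;
    unsigned short family;
    unsigned char dir;
    struct demo_flow_key key;
    void *object;
};

/* An entry only matches if the namespace matches too, so a lookup from
 * one namespace can no longer return an object cached for another. */
static inline int demo_entry_matches(const struct demo_flow_entry *e,
                                     const struct demo_net *net,
                                     unsigned short family,
                                     unsigned char dir,
                                     const struct demo_flow_key *key)
{
    return e->net == net &&
           e->family == family &&
           e->dir == dir &&
           memcmp(&e->key, key, sizeof(*key)) == 0;
}
```

Two lookups with identical key, family and direction but different namespaces then resolve independently instead of sharing one cached result.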
  12. 27 Aug, 2011 1 commit
  13. 25 Aug, 2011 3 commits
    • arp: fix rcu lockdep splat in arp_process() · 20e6074e
      Eric Dumazet committed
      Dave Jones reported a lockdep splat triggered by an arp_process() call
      from parp_redo().
      
      Commit faa9dcf7 (arp: RCU changes) is the origin of the bug, since
      it assumed arp_process() was called under rcu_read_lock(), which is not
      true in this particular path.
      
      Instead of adding rcu_read_lock() in parp_redo(), I chose to add it in
      neigh_proxy_process() to take care of IPv6 side too.
      
       ===================================================
       [ INFO: suspicious rcu_dereference_check() usage. ]
       ---------------------------------------------------
       include/linux/inetdevice.h:209 invoked rcu_dereference_check() without
      protection!
      
       other info that might help us debug this:
      
       rcu_scheduler_active = 1, debug_locks = 0
       4 locks held by setfiles/2123:
        #0:  (&sb->s_type->i_mutex_key#13){+.+.+.}, at: [<ffffffff8114cbc4>]
      walk_component+0x1ef/0x3e8
        #1:  (&isec->lock){+.+.+.}, at: [<ffffffff81204bca>]
      inode_doinit_with_dentry+0x3f/0x41f
        #2:  (&tbl->proxy_timer){+.-...}, at: [<ffffffff8106a803>]
      run_timer_softirq+0x157/0x372
        #3:  (class){+.-...}, at: [<ffffffff8141f256>] neigh_proxy_process
      +0x36/0x103
      
       stack backtrace:
       Pid: 2123, comm: setfiles Tainted: G        W
      3.1.0-0.rc2.git7.2.fc16.x86_64 #1
       Call Trace:
        <IRQ>  [<ffffffff8108ca23>] lockdep_rcu_dereference+0xa7/0xaf
        [<ffffffff8146a0b7>] __in_dev_get_rcu+0x55/0x5d
        [<ffffffff8146a751>] arp_process+0x25/0x4d7
        [<ffffffff8146ac11>] parp_redo+0xe/0x10
        [<ffffffff8141f2ba>] neigh_proxy_process+0x9a/0x103
        [<ffffffff8106a8c4>] run_timer_softirq+0x218/0x372
        [<ffffffff8106a803>] ? run_timer_softirq+0x157/0x372
        [<ffffffff8141f220>] ? neigh_stat_seq_open+0x41/0x41
        [<ffffffff8108f2f0>] ? mark_held_locks+0x6d/0x95
        [<ffffffff81062bb6>] __do_softirq+0x112/0x25a
        [<ffffffff8150d27c>] call_softirq+0x1c/0x30
        [<ffffffff81010bf5>] do_softirq+0x4b/0xa2
        [<ffffffff81062f65>] irq_exit+0x5d/0xcf
        [<ffffffff8150dc11>] smp_apic_timer_interrupt+0x7c/0x8a
        [<ffffffff8150baf3>] apic_timer_interrupt+0x73/0x80
        <EOI>  [<ffffffff8108f439>] ? trace_hardirqs_on_caller+0x121/0x158
        [<ffffffff814fc285>] ? __slab_free+0x30/0x24c
        [<ffffffff814fc283>] ? __slab_free+0x2e/0x24c
        [<ffffffff81204e74>] ? inode_doinit_with_dentry+0x2e9/0x41f
        [<ffffffff81204e74>] ? inode_doinit_with_dentry+0x2e9/0x41f
        [<ffffffff81204e74>] ? inode_doinit_with_dentry+0x2e9/0x41f
        [<ffffffff81130cb0>] kfree+0x108/0x131
        [<ffffffff81204e74>] inode_doinit_with_dentry+0x2e9/0x41f
        [<ffffffff81204fc6>] selinux_d_instantiate+0x1c/0x1e
        [<ffffffff81200f4f>] security_d_instantiate+0x21/0x23
        [<ffffffff81154625>] d_instantiate+0x5c/0x61
        [<ffffffff811563ca>] d_splice_alias+0xbc/0xd2
        [<ffffffff811b17ff>] ext4_lookup+0xba/0xeb
        [<ffffffff8114bf1e>] d_alloc_and_lookup+0x45/0x6b
        [<ffffffff8114cbea>] walk_component+0x215/0x3e8
        [<ffffffff8114cdf8>] lookup_last+0x3b/0x3d
        [<ffffffff8114daf3>] path_lookupat+0x82/0x2af
        [<ffffffff8110fc53>] ? might_fault+0xa5/0xac
        [<ffffffff8110fc0a>] ? might_fault+0x5c/0xac
        [<ffffffff8114c564>] ? getname_flags+0x31/0x1ca
        [<ffffffff8114dd48>] do_path_lookup+0x28/0x97
        [<ffffffff8114df2c>] user_path_at+0x59/0x96
        [<ffffffff811467ad>] ? cp_new_stat+0xf7/0x10d
        [<ffffffff811469a6>] vfs_fstatat+0x44/0x6e
        [<ffffffff811469ee>] vfs_lstat+0x1e/0x20
        [<ffffffff81146b3d>] sys_newlstat+0x1a/0x33
        [<ffffffff8108f439>] ? trace_hardirqs_on_caller+0x121/0x158
        [<ffffffff812535fe>] ? trace_hardirqs_on_thunk+0x3a/0x3f
        [<ffffffff8150af82>] system_call_fastpath+0x16/0x1b
      Reported-by: Dave Jones <davej@redhat.com>
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: convert core to skb paged frag APIs · ea2ab693
      Ian Campbell committed
      Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
      Cc: netdev@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rps: support IPIP encapsulation · ec5efe79
      Eric Dumazet committed
      Skip IPIP header to get proper layer-4 information.
      
      Like GRE tunnels, this only works if rxhash is not already provided by
      the device itself (ethtool -K ethX rxhash off), to allow the kernel to
      compute a software rxhash.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  14. 23 Aug, 2011 1 commit
  15. 21 Aug, 2011 1 commit
  16. 19 Aug, 2011 2 commits
  17. 18 Aug, 2011 6 commits
  18. 12 Aug, 2011 2 commits
    • net: cleanup some rcu_dereference_raw · 33d480ce
      Eric Dumazet committed
      The RCU API has been completed, and rcu_access_pointer() or
      rcu_dereference_protected() are better than the generic
      rcu_dereference_raw().
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • neigh: reduce arp latency · cd28ca0a
      Eric Dumazet committed
      Remove the artificial HZ latency on arp resolution.
      
      Instead of firing a timer in one jiffy (up to 10 ms if HZ=100), let's
      send the ARP message immediately.
      
      Before patch :
      
      # arp -d 192.168.20.108 ; ping -c 3 192.168.20.108
      PING 192.168.20.108 (192.168.20.108) 56(84) bytes of data.
      64 bytes from 192.168.20.108: icmp_seq=1 ttl=64 time=9.91 ms
      64 bytes from 192.168.20.108: icmp_seq=2 ttl=64 time=0.065 ms
      64 bytes from 192.168.20.108: icmp_seq=3 ttl=64 time=0.061 ms
      
      After patch :
      
      $ arp -d 192.168.20.108 ; ping -c 3 192.168.20.108
      PING 192.168.20.108 (192.168.20.108) 56(84) bytes of data.
      64 bytes from 192.168.20.108: icmp_seq=1 ttl=64 time=0.152 ms
      64 bytes from 192.168.20.108: icmp_seq=2 ttl=64 time=0.064 ms
      64 bytes from 192.168.20.108: icmp_seq=3 ttl=64 time=0.074 ms
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  19. 11 Aug, 2011 2 commits
  20. 10 Aug, 2011 1 commit
  21. 07 Aug, 2011 1 commit
    • net: Compute protocol sequence numbers and fragment IDs using MD5. · 6e5714ea
      David S. Miller committed
      Computers have become a lot faster since we compromised on the
      partial MD4 hash which we use currently for performance reasons.
      
      MD5 is a much safer choice, and is in line with both RFC 1948 and
      other ISS generators (OpenBSD, Solaris, etc.)
      
      Furthermore, only having 24-bits of the sequence number be truly
      unpredictable is a very serious limitation.  So the periodic
      regeneration and 8-bit counter have been removed.  We compute and
      use a full 32-bit sequence number.
      
      For ipv6, DCCP was found to use a 32-bit truncated initial sequence
      number (it needs 48 bits) and that is fixed here as well.
      Reported-by: Dan Kaminsky <dan@doxpara.com>
      Tested-by: Willy Tarreau <w@1wt.eu>
      Signed-off-by: David S. Miller <davem@davemloft.net>