1. 03 Jan, 2014 5 commits
    • {pktgen, xfrm} Construct skb dst for tunnel mode transformation · cf93d47e
      Fan Du authored
      IPsec tunnel mode encapsulation needs to set the outer IP header
      with the right protocol/ttl/id values with regard to skb->dst->child.
      
      Performing a standard route lookup for every transmitted packet is
      the wrong approach. Instead, simply construct a dst that carries just
      the information needed to make tunnel mode encapsulation work.
      Signed-off-by: Fan Du <fan.du@windriver.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
    • {pktgen, xfrm} Using "pgset spi xxx" to specify SA for a given flow · de4aee7d
      Fan Du authored
      The user can set a specific SPI value to arm a pktgen flow with IPsec
      transformation, instead of looking up the SA by saddr/daddr. The reason
      for doing so is that the current state lookup scheme is slow and, most
      importantly, pktgen doesn't actually need to match any SA state
      address information; all it needs is the SA transformation shell to
      do the encapsulation.
      
      This option also gives the user a way to test existing SAs with
      pktgen without creating new ones.
      Signed-off-by: Fan Du <fan.du@windriver.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
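The "pgset spi xxx" form mentioned in the commit above boils down to writing a short text command to a pktgen device control file. The following is a minimal sketch of building that command string only; `format_spi_cmd` is a hypothetical helper name, and writing the value in decimal is an assumption about what pktgen expects rather than something the commit message states.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of pktgen): formats the text a "pgset"
 * style wrapper would send, e.g. "spi 256".
 * Assumption: the SPI is passed as a decimal number.
 * Returns the number of characters written, or -1 if buf is too small.
 */
int format_spi_cmd(char *buf, size_t len, uint32_t spi)
{
    int n;

    assert(buf != NULL);
    n = snprintf(buf, len, "spi %u", (unsigned int)spi);
    if (n < 0 || (size_t)n >= len)
        return -1; /* output error or truncated */
    return n;
}
```

A caller would pass the resulting string to whatever mechanism it already uses to configure the pktgen device; that part is deliberately left out here.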
    • {pktgen, xfrm} Correct xfrm_state_lock usage in xfrm_stateonly_find · 4ae770bf
      Fan Du authored
      Acquiring xfrm_state_lock in process context is expected to turn BH off,
      as this lock is also used in BH context, namely in the xfrm state timer handler.
      Otherwise LOCKDEP reports the messages below.
      
      [   81.422781] pktgen: Packet Generator for packet performance testing. Version: 2.74
      [   81.725194]
      [   81.725211] =========================================================
      [   81.725212] [ INFO: possible irq lock inversion dependency detected ]
      [   81.725215] 3.13.0-rc2+ #92 Not tainted
      [   81.725216] ---------------------------------------------------------
      [   81.725218] kpktgend_0/2780 just changed the state of lock:
      [   81.725220]  (xfrm_state_lock){+.+...}, at: [<ffffffff816dd751>] xfrm_stateonly_find+0x41/0x1f0
      [   81.725231] but this lock was taken by another, SOFTIRQ-safe lock in the past:
      [   81.725232]  (&(&x->lock)->rlock){+.-...}
      [   81.725232]
      [   81.725232] and interrupts could create inverse lock ordering between them.
      [   81.725232]
      [   81.725235]
      [   81.725235] other info that might help us debug this:
      [   81.725237]  Possible interrupt unsafe locking scenario:
      [   81.725237]
      [   81.725238]        CPU0                    CPU1
      [   81.725240]        ----                    ----
      [   81.725241]   lock(xfrm_state_lock);
      [   81.725243]                                local_irq_disable();
      [   81.725244]                                lock(&(&x->lock)->rlock);
      [   81.725246]                                lock(xfrm_state_lock);
      [   81.725248]   <Interrupt>
      [   81.725249]     lock(&(&x->lock)->rlock);
      [   81.725251]
      [   81.725251]  *** DEADLOCK ***
      [   81.725251]
      [   81.725254] no locks held by kpktgend_0/2780.
      [   81.725255]
      [   81.725255] the shortest dependencies between 2nd lock and 1st lock:
      [   81.725269]  -> (&(&x->lock)->rlock){+.-...} ops: 8 {
      [   81.725274]     HARDIRQ-ON-W at:
      [   81.725276]                       [<ffffffff8109a64b>] __lock_acquire+0x65b/0x1d70
      [   81.725282]                       [<ffffffff8109c3c7>] lock_acquire+0x97/0x130
      [   81.725284]                       [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70
      [   81.725289]                       [<ffffffff816dc3a3>] xfrm_timer_handler+0x43/0x290
      [   81.725292]                       [<ffffffff81059437>] __tasklet_hrtimer_trampoline+0x17/0x40
      [   81.725300]                       [<ffffffff8105a1b7>] tasklet_hi_action+0xd7/0xf0
      [   81.725303]                       [<ffffffff81059ac6>] __do_softirq+0xe6/0x2d0
      [   81.725305]                       [<ffffffff8105a026>] irq_exit+0x96/0xc0
      [   81.725308]                       [<ffffffff8177fd0a>] smp_apic_timer_interrupt+0x4a/0x60
      [   81.725313]                       [<ffffffff8177e96f>] apic_timer_interrupt+0x6f/0x80
      [   81.725316]                       [<ffffffff8100b7c6>] arch_cpu_idle+0x26/0x30
      [   81.725329]                       [<ffffffff810ace28>] cpu_startup_entry+0x88/0x2b0
      [   81.725333]                       [<ffffffff8102e5b0>] start_secondary+0x190/0x1f0
      [   81.725338]     IN-SOFTIRQ-W at:
      [   81.725340]                       [<ffffffff8109a61d>] __lock_acquire+0x62d/0x1d70
      [   81.725342]                       [<ffffffff8109c3c7>] lock_acquire+0x97/0x130
      [   81.725344]                       [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70
      [   81.725347]                       [<ffffffff816dc3a3>] xfrm_timer_handler+0x43/0x290
      [   81.725349]                       [<ffffffff81059437>] __tasklet_hrtimer_trampoline+0x17/0x40
      [   81.725352]                       [<ffffffff8105a1b7>] tasklet_hi_action+0xd7/0xf0
      [   81.725355]                       [<ffffffff81059ac6>] __do_softirq+0xe6/0x2d0
      [   81.725358]                       [<ffffffff8105a026>] irq_exit+0x96/0xc0
      [   81.725360]                       [<ffffffff8177fd0a>] smp_apic_timer_interrupt+0x4a/0x60
      [   81.725363]                       [<ffffffff8177e96f>] apic_timer_interrupt+0x6f/0x80
      [   81.725365]                       [<ffffffff8100b7c6>] arch_cpu_idle+0x26/0x30
      [   81.725368]                       [<ffffffff810ace28>] cpu_startup_entry+0x88/0x2b0
      [   81.725370]                       [<ffffffff8102e5b0>] start_secondary+0x190/0x1f0
      [   81.725373]     INITIAL USE at:
      [   81.725375]                      [<ffffffff8109a31a>] __lock_acquire+0x32a/0x1d70
      [   81.725385]                      [<ffffffff8109c3c7>] lock_acquire+0x97/0x130
      [   81.725388]                      [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70
      [   81.725390]                      [<ffffffff816dc3a3>] xfrm_timer_handler+0x43/0x290
      [   81.725394]                      [<ffffffff81059437>] __tasklet_hrtimer_trampoline+0x17/0x40
      [   81.725398]                      [<ffffffff8105a1b7>] tasklet_hi_action+0xd7/0xf0
      [   81.725401]                      [<ffffffff81059ac6>] __do_softirq+0xe6/0x2d0
      [   81.725404]                      [<ffffffff8105a026>] irq_exit+0x96/0xc0
      [   81.725407]                      [<ffffffff8177fd0a>] smp_apic_timer_interrupt+0x4a/0x60
      [   81.725409]                      [<ffffffff8177e96f>] apic_timer_interrupt+0x6f/0x80
      [   81.725412]                      [<ffffffff8100b7c6>] arch_cpu_idle+0x26/0x30
      [   81.725415]                      [<ffffffff810ace28>] cpu_startup_entry+0x88/0x2b0
      [   81.725417]                      [<ffffffff8102e5b0>] start_secondary+0x190/0x1f0
      [   81.725420]   }
      [   81.725421]   ... key      at: [<ffffffff8295b9c8>] __key.46349+0x0/0x8
      [   81.725445]   ... acquired at:
      [   81.725446]    [<ffffffff8109c3c7>] lock_acquire+0x97/0x130
      [   81.725449]    [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70
      [   81.725452]    [<ffffffff816dc057>] __xfrm_state_delete+0x37/0x140
      [   81.725454]    [<ffffffff816dc18c>] xfrm_state_delete+0x2c/0x50
      [   81.725456]    [<ffffffff816dc277>] xfrm_state_flush+0xc7/0x1b0
      [   81.725458]    [<ffffffffa005f6cc>] pfkey_flush+0x7c/0x100 [af_key]
      [   81.725465]    [<ffffffffa005efb7>] pfkey_process+0x1c7/0x1f0 [af_key]
      [   81.725468]    [<ffffffffa005f139>] pfkey_sendmsg+0x159/0x260 [af_key]
      [   81.725471]    [<ffffffff8162c16f>] sock_sendmsg+0xaf/0xc0
      [   81.725476]    [<ffffffff8162c99c>] SYSC_sendto+0xfc/0x130
      [   81.725479]    [<ffffffff8162cf3e>] SyS_sendto+0xe/0x10
      [   81.725482]    [<ffffffff8177dd12>] system_call_fastpath+0x16/0x1b
      [   81.725484]
      [   81.725486] -> (xfrm_state_lock){+.+...} ops: 11 {
      [   81.725490]    HARDIRQ-ON-W at:
      [   81.725493]                     [<ffffffff8109a64b>] __lock_acquire+0x65b/0x1d70
      [   81.725504]                     [<ffffffff8109c3c7>] lock_acquire+0x97/0x130
      [   81.725507]                     [<ffffffff81774e4b>] _raw_spin_lock_bh+0x3b/0x70
      [   81.725510]                     [<ffffffff816dc1df>] xfrm_state_flush+0x2f/0x1b0
      [   81.725513]                     [<ffffffffa005f6cc>] pfkey_flush+0x7c/0x100 [af_key]
      [   81.725516]                     [<ffffffffa005efb7>] pfkey_process+0x1c7/0x1f0 [af_key]
      [   81.725519]                     [<ffffffffa005f139>] pfkey_sendmsg+0x159/0x260 [af_key]
      [   81.725522]                     [<ffffffff8162c16f>] sock_sendmsg+0xaf/0xc0
      [   81.725525]                     [<ffffffff8162c99c>] SYSC_sendto+0xfc/0x130
      [   81.725527]                     [<ffffffff8162cf3e>] SyS_sendto+0xe/0x10
      [   81.725530]                     [<ffffffff8177dd12>] system_call_fastpath+0x16/0x1b
      [   81.725533]    SOFTIRQ-ON-W at:
      [   81.725534]                     [<ffffffff8109a67a>] __lock_acquire+0x68a/0x1d70
      [   81.725537]                     [<ffffffff8109c3c7>] lock_acquire+0x97/0x130
      [   81.725539]                     [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70
      [   81.725541]                     [<ffffffff816dd751>] xfrm_stateonly_find+0x41/0x1f0
      [   81.725544]                     [<ffffffffa008af03>] mod_cur_headers+0x793/0x7f0 [pktgen]
      [   81.725547]                     [<ffffffffa008bca2>] pktgen_thread_worker+0xd42/0x1880 [pktgen]
      [   81.725550]                     [<ffffffff81078f84>] kthread+0xe4/0x100
      [   81.725555]                     [<ffffffff8177dc6c>] ret_from_fork+0x7c/0xb0
      [   81.725565]    INITIAL USE at:
      [   81.725567]                    [<ffffffff8109a31a>] __lock_acquire+0x32a/0x1d70
      [   81.725569]                    [<ffffffff8109c3c7>] lock_acquire+0x97/0x130
      [   81.725572]                    [<ffffffff81774e4b>] _raw_spin_lock_bh+0x3b/0x70
      [   81.725574]                    [<ffffffff816dc1df>] xfrm_state_flush+0x2f/0x1b0
      [   81.725576]                    [<ffffffffa005f6cc>] pfkey_flush+0x7c/0x100 [af_key]
      [   81.725580]                    [<ffffffffa005efb7>] pfkey_process+0x1c7/0x1f0 [af_key]
      [   81.725583]                    [<ffffffffa005f139>] pfkey_sendmsg+0x159/0x260 [af_key]
      [   81.725586]                    [<ffffffff8162c16f>] sock_sendmsg+0xaf/0xc0
      [   81.725589]                    [<ffffffff8162c99c>] SYSC_sendto+0xfc/0x130
      [   81.725594]                    [<ffffffff8162cf3e>] SyS_sendto+0xe/0x10
      [   81.725597]                    [<ffffffff8177dd12>] system_call_fastpath+0x16/0x1b
      [   81.725599]  }
      [   81.725600]  ... key      at: [<ffffffff81cadef8>] xfrm_state_lock+0x18/0x50
      [   81.725606]  ... acquired at:
      [   81.725607]    [<ffffffff810995c0>] check_usage_backwards+0x110/0x150
      [   81.725609]    [<ffffffff81099e96>] mark_lock+0x196/0x2f0
      [   81.725611]    [<ffffffff8109a67a>] __lock_acquire+0x68a/0x1d70
      [   81.725614]    [<ffffffff8109c3c7>] lock_acquire+0x97/0x130
      [   81.725616]    [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70
      [   81.725627]    [<ffffffff816dd751>] xfrm_stateonly_find+0x41/0x1f0
      [   81.725629]    [<ffffffffa008af03>] mod_cur_headers+0x793/0x7f0 [pktgen]
      [   81.725632]    [<ffffffffa008bca2>] pktgen_thread_worker+0xd42/0x1880 [pktgen]
      [   81.725635]    [<ffffffff81078f84>] kthread+0xe4/0x100
      [   81.725637]    [<ffffffff8177dc6c>] ret_from_fork+0x7c/0xb0
      [   81.725640]
      [   81.725641]
      [   81.725641] stack backtrace:
      [   81.725645] CPU: 0 PID: 2780 Comm: kpktgend_0 Not tainted 3.13.0-rc2+ #92
      [   81.725647] Hardware name: innotek GmbH VirtualBox, BIOS VirtualBox 12/01/2006
      [   81.725649]  ffffffff82537b80 ffff880018199988 ffffffff8176af37 0000000000000007
      [   81.725652]  ffff8800181999f0 ffff8800181999d8 ffffffff81099358 ffffffff82537b80
      [   81.725655]  ffffffff81a32def ffff8800181999f4 0000000000000000 ffff880002cbeaa8
      [   81.725659] Call Trace:
      [   81.725664]  [<ffffffff8176af37>] dump_stack+0x46/0x58
      [   81.725667]  [<ffffffff81099358>] print_irq_inversion_bug.part.42+0x1e8/0x1f0
      [   81.725670]  [<ffffffff810995c0>] check_usage_backwards+0x110/0x150
      [   81.725672]  [<ffffffff81099e96>] mark_lock+0x196/0x2f0
      [   81.725675]  [<ffffffff810994b0>] ? check_usage_forwards+0x150/0x150
      [   81.725685]  [<ffffffff8109a67a>] __lock_acquire+0x68a/0x1d70
      [   81.725691]  [<ffffffff810899a5>] ? sched_clock_local+0x25/0x90
      [   81.725694]  [<ffffffff81089b38>] ? sched_clock_cpu+0xa8/0x120
      [   81.725697]  [<ffffffff8109a31a>] ? __lock_acquire+0x32a/0x1d70
      [   81.725699]  [<ffffffff816dd751>] ? xfrm_stateonly_find+0x41/0x1f0
      [   81.725702]  [<ffffffff8109c3c7>] lock_acquire+0x97/0x130
      [   81.725704]  [<ffffffff816dd751>] ? xfrm_stateonly_find+0x41/0x1f0
      [   81.725707]  [<ffffffff810899a5>] ? sched_clock_local+0x25/0x90
      [   81.725710]  [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70
      [   81.725712]  [<ffffffff816dd751>] ? xfrm_stateonly_find+0x41/0x1f0
      [   81.725715]  [<ffffffff810971ec>] ? lock_release_holdtime.part.26+0x1c/0x1a0
      [   81.725717]  [<ffffffff816dd751>] xfrm_stateonly_find+0x41/0x1f0
      [   81.725721]  [<ffffffffa008af03>] mod_cur_headers+0x793/0x7f0 [pktgen]
      [   81.725724]  [<ffffffffa008bca2>] pktgen_thread_worker+0xd42/0x1880 [pktgen]
      [   81.725727]  [<ffffffffa008ba71>] ? pktgen_thread_worker+0xb11/0x1880 [pktgen]
      [   81.725729]  [<ffffffff8109cf9d>] ? trace_hardirqs_on+0xd/0x10
      [   81.725733]  [<ffffffff81775410>] ? _raw_spin_unlock_irq+0x30/0x40
      [   81.725745]  [<ffffffff8151faa0>] ? e1000_clean+0x9d0/0x9d0
      [   81.725751]  [<ffffffff81094310>] ? __init_waitqueue_head+0x60/0x60
      [   81.725753]  [<ffffffff81094310>] ? __init_waitqueue_head+0x60/0x60
      [   81.725757]  [<ffffffffa008af60>] ? mod_cur_headers+0x7f0/0x7f0 [pktgen]
      [   81.725759]  [<ffffffff81078f84>] kthread+0xe4/0x100
      [   81.725762]  [<ffffffff81078ea0>] ? flush_kthread_worker+0x170/0x170
      [   81.725765]  [<ffffffff8177dc6c>] ret_from_fork+0x7c/0xb0
      [   81.725768]  [<ffffffff81078ea0>] ? flush_kthread_worker+0x170/0x170
      Signed-off-by: Fan Du <fan.du@windriver.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
    • {pktgen, xfrm} Add statistics counting when transforming · 6de9ace4
      Fan Du authored
      so that /proc/net/xfrm_stat can give the user a clue about what
      went wrong during transformation.
      Signed-off-by: Fan Du <fan.du@windriver.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
    • {pktgen, xfrm} Correct xfrm state lock usage when transforming · 0af0a413
      Fan Du authored
      The xfrm_state lock protects the state itself, i.e., VALID/DEAD and statistics,
      not the transforming procedure, as both mode/type output functions
      are reentrant.
      
      Another issue is that the state lock can be taken in BH context when the
      state timer fires. So when pktgen updates the state statistics after
      transformation, it should disable BH while holding the state lock.
      Otherwise LOCKDEP complains:
      
      [   62.354339] pktgen: Packet Generator for packet performance testing. Version: 2.74
      [   62.655444]
      [   62.655448] =================================
      [   62.655451] [ INFO: inconsistent lock state ]
      [   62.655455] 3.13.0-rc2+ #70 Not tainted
      [   62.655457] ---------------------------------
      [   62.655459] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
      [   62.655463] kpktgend_0/2764 [HC0[0]:SC0[0]:HE1:SE1] takes:
      [   62.655466]  (&(&x->lock)->rlock){+.?...}, at: [<ffffffffa00886f6>] pktgen_thread_worker+0x1796/0x1860 [pktgen]
      [   62.655479] {IN-SOFTIRQ-W} state was registered at:
      [   62.655484]   [<ffffffff8109a61d>] __lock_acquire+0x62d/0x1d70
      [   62.655492]   [<ffffffff8109c3c7>] lock_acquire+0x97/0x130
      [   62.655498]   [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70
      [   62.655505]   [<ffffffff816dc3a3>] xfrm_timer_handler+0x43/0x290
      [   62.655511]   [<ffffffff81059437>] __tasklet_hrtimer_trampoline+0x17/0x40
      [   62.655519]   [<ffffffff8105a1b7>] tasklet_hi_action+0xd7/0xf0
      [   62.655523]   [<ffffffff81059ac6>] __do_softirq+0xe6/0x2d0
      [   62.655526]   [<ffffffff8105a026>] irq_exit+0x96/0xc0
      [   62.655530]   [<ffffffff8177fd0a>] smp_apic_timer_interrupt+0x4a/0x60
      [   62.655537]   [<ffffffff8177e96f>] apic_timer_interrupt+0x6f/0x80
      [   62.655541]   [<ffffffff8100b7c6>] arch_cpu_idle+0x26/0x30
      [   62.655547]   [<ffffffff810ace28>] cpu_startup_entry+0x88/0x2b0
      [   62.655552]   [<ffffffff81761c3c>] rest_init+0xbc/0xd0
      [   62.655557]   [<ffffffff81ea5e5e>] start_kernel+0x3c4/0x3d1
      [   62.655583]   [<ffffffff81ea55a8>] x86_64_start_reservations+0x2a/0x2c
      [   62.655588]   [<ffffffff81ea569f>] x86_64_start_kernel+0xf5/0xfc
      [   62.655592] irq event stamp: 77
      [   62.655594] hardirqs last  enabled at (77): [<ffffffff810ab7f2>] vprintk_emit+0x1b2/0x520
      [   62.655597] hardirqs last disabled at (76): [<ffffffff810ab684>] vprintk_emit+0x44/0x520
      [   62.655601] softirqs last  enabled at (22): [<ffffffff81059b57>] __do_softirq+0x177/0x2d0
      [   62.655605] softirqs last disabled at (15): [<ffffffff8105a026>] irq_exit+0x96/0xc0
      [   62.655609]
      [   62.655609] other info that might help us debug this:
      [   62.655613]  Possible unsafe locking scenario:
      [   62.655613]
      [   62.655616]        CPU0
      [   62.655617]        ----
      [   62.655618]   lock(&(&x->lock)->rlock);
      [   62.655622]   <Interrupt>
      [   62.655623]     lock(&(&x->lock)->rlock);
      [   62.655626]
      [   62.655626]  *** DEADLOCK ***
      [   62.655626]
      [   62.655629] no locks held by kpktgend_0/2764.
      [   62.655631]
      [   62.655631] stack backtrace:
      [   62.655636] CPU: 0 PID: 2764 Comm: kpktgend_0 Not tainted 3.13.0-rc2+ #70
      [   62.655638] Hardware name: innotek GmbH VirtualBox, BIOS VirtualBox 12/01/2006
      [   62.655642]  ffffffff8216b7b0 ffff88001be43ab8 ffffffff8176af37 0000000000000007
      [   62.655652]  ffff88001c8d4fc0 ffff88001be43b18 ffffffff81766d78 0000000000000000
      [   62.655663]  ffff880000000001 ffff880000000001 ffffffff8101025f ffff88001be43b18
      [   62.655671] Call Trace:
      [   62.655680]  [<ffffffff8176af37>] dump_stack+0x46/0x58
      [   62.655685]  [<ffffffff81766d78>] print_usage_bug+0x1f1/0x202
      [   62.655691]  [<ffffffff8101025f>] ? save_stack_trace+0x2f/0x50
      [   62.655696]  [<ffffffff81099f8c>] mark_lock+0x28c/0x2f0
      [   62.655700]  [<ffffffff810994b0>] ? check_usage_forwards+0x150/0x150
      [   62.655704]  [<ffffffff8109a67a>] __lock_acquire+0x68a/0x1d70
      [   62.655712]  [<ffffffff81115b09>] ? irq_work_queue+0x69/0xb0
      [   62.655717]  [<ffffffff810ab7f2>] ? vprintk_emit+0x1b2/0x520
      [   62.655722]  [<ffffffff8109cec5>] ? trace_hardirqs_on_caller+0x105/0x1d0
      [   62.655730]  [<ffffffffa00886f6>] ? pktgen_thread_worker+0x1796/0x1860 [pktgen]
      [   62.655734]  [<ffffffff8109c3c7>] lock_acquire+0x97/0x130
      [   62.655741]  [<ffffffffa00886f6>] ? pktgen_thread_worker+0x1796/0x1860 [pktgen]
      [   62.655745]  [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70
      [   62.655752]  [<ffffffffa00886f6>] ? pktgen_thread_worker+0x1796/0x1860 [pktgen]
      [   62.655758]  [<ffffffffa00886f6>] pktgen_thread_worker+0x1796/0x1860 [pktgen]
      [   62.655766]  [<ffffffffa0087a79>] ? pktgen_thread_worker+0xb19/0x1860 [pktgen]
      [   62.655771]  [<ffffffff8109cf9d>] ? trace_hardirqs_on+0xd/0x10
      [   62.655777]  [<ffffffff81775410>] ? _raw_spin_unlock_irq+0x30/0x40
      [   62.655785]  [<ffffffff8151faa0>] ? e1000_clean+0x9d0/0x9d0
      [   62.655791]  [<ffffffff81094310>] ? __init_waitqueue_head+0x60/0x60
      [   62.655795]  [<ffffffff81094310>] ? __init_waitqueue_head+0x60/0x60
      [   62.655800]  [<ffffffffa0086f60>] ? mod_cur_headers+0x7f0/0x7f0 [pktgen]
      [   62.655806]  [<ffffffff81078f84>] kthread+0xe4/0x100
      [   62.655813]  [<ffffffff81078ea0>] ? flush_kthread_worker+0x170/0x170
      [   62.655819]  [<ffffffff8177dc6c>] ret_from_fork+0x7c/0xb0
      [   62.655824]  [<ffffffff81078ea0>] ? flush_kthread_worker+0x170/0x170
      Signed-off-by: Fan Du <fan.du@windriver.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
  2. 02 Jan, 2014 5 commits
  3. 20 Dec, 2013 9 commits
  4. 19 Dec, 2013 11 commits
  5. 18 Dec, 2013 10 commits
    • packet: deliver VLAN TPID to userspace · a0cdfcf3
      Atzm Watanabe authored
      This enables userspace to get the VLAN TPID as well as the VLAN TCI.
      Signed-off-by: Atzm Watanabe <atzm@stratosphere.co.jp>
      Acked-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • packet: fill the gap of TPACKET_ALIGNMENT with zeros · e4d26f4b
      Atzm Watanabe authored
      struct tpacket{2,3}_hdr is aligned to a multiple of TPACKET_ALIGNMENT.
      Explicitly defining and zeroing the resulting gap makes additional
      changes easier.
      Signed-off-by: Atzm Watanabe <atzm@stratosphere.co.jp>
      Acked-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • packet: make aligned size of struct tpacket{2,3}_hdr clear · 51846355
      Atzm Watanabe authored
      struct tpacket{2,3}_hdr is aligned to a multiple of TPACKET_ALIGNMENT.
      We may add members to them up to the current aligned size without forcing
      userspace to call getsockopt(..., PACKET_HDRLEN, ...).
      Signed-off-by: Atzm Watanabe <atzm@stratosphere.co.jp>
      Acked-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
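The alignment argument in the two tpacket commits above can be illustrated with a small userspace sketch. The `EX_` macros are local copies written for this example (the kernel's own definitions live in `<linux/if_packet.h>`); the alignment value of 16 and the 28-byte header size used in the note below are assumptions chosen for illustration, not sizes taken from the commits.

```c
#include <assert.h>
#include <stddef.h>

/* Local illustration macros; assumed to mirror TPACKET_ALIGNMENT / TPACKET_ALIGN. */
#define EX_TPACKET_ALIGNMENT ((size_t)16)
#define EX_TPACKET_ALIGN(x) \
    (((x) + EX_TPACKET_ALIGNMENT - 1) & ~(EX_TPACKET_ALIGNMENT - 1))

/*
 * Number of padding bytes between a header's raw size and its aligned
 * size. New members fit without changing the aligned header length as
 * long as they stay within this gap.
 */
size_t ex_align_gap(size_t raw_size)
{
    size_t aligned = EX_TPACKET_ALIGN(raw_size);

    assert(aligned >= raw_size);
    return aligned - raw_size;
}
```

For a hypothetical 28-byte header, `ex_align_gap(28)` yields 4, so up to four more bytes of members could be added before the aligned length grows from 32 to 48.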
    • net: Add utility function to copy skb hash · 3df7a74e
      Tom Herbert authored
      Adds skb_copy_hash to copy rxhash and l4_rxhash from one skb to another.
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Add utility functions to clear rxhash · 7539fadc
      Tom Herbert authored
      In several places 'skb->rxhash = 0' is being done to clear the
      rxhash value in an skb.  This does not clear l4_rxhash, which could
      still be set, so the rxhash wouldn't be recalculated on a subsequent
      call to skb_get_rxhash.  This patch adds an explicit function to clear
      all the rxhash-related information in the skb properly.
      
      skb_clear_hash_if_not_l4 clears the rxhash only if it is not marked as
      l4_rxhash.
      
      Fixed up places where 'skb->rxhash = 0' was being called.
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
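The difference between the two clearing behaviours described in the commit above can be sketched in userspace with a mock structure. `struct mock_skb` and the `mock_` functions are hypothetical stand-ins that model only the two fields the commit message names; they are not the kernel's `struct sk_buff` or its helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mock with only the fields named in the commit message. */
struct mock_skb {
    uint32_t rxhash;
    unsigned int l4_rxhash : 1;
};

/* Clear the hash value and the L4 marker together. */
void mock_clear_hash(struct mock_skb *skb)
{
    assert(skb != NULL);
    skb->rxhash = 0;
    skb->l4_rxhash = 0;
}

/* Clear the hash only when it is not marked as an L4 hash. */
void mock_clear_hash_if_not_l4(struct mock_skb *skb)
{
    assert(skb != NULL);
    if (!skb->l4_rxhash)
        mock_clear_hash(skb);
}
```

Clearing both fields in one place avoids the stale-flag situation the commit describes, where a zeroed hash with the L4 marker still set would not be recalculated.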
    • net: Change skb_get_rxhash to skb_get_hash · 3958afa1
      Tom Herbert authored
      Changing the name of the function as part of making the hash in skbuff
      a generic property, not just for the receive path.
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/hsr: using kfree_rcu() to simplify the code · 1aee6cc2
      Wei Yongjun authored
      The callback function of call_rcu() just calls kfree(), so we
      can use kfree_rcu() instead of call_rcu() + callback function.
      Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
      Acked-by: Arvid Brodin <arvid.brodin@alten.se>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • neigh: Netlink notification for administrative NUD state change · 53385d2d
      Bob Gilligan authored
      The neighbour code sends up an RTM_NEWNEIGH netlink notification if
      the NUD state of a neighbour cache entry is changed by a timer (e.g.
      from REACHABLE to STALE), even if the lladdr of the entry has not
      changed.
      
      But an administrative change to the NUD state of a neighbour cache
      entry that does not change the lladdr (e.g. via "ip -4 neigh change
      ...  nud ...") does not trigger a netlink notification.  This means
      that netlink listeners will not hear about administrative NUD state
      changes such as from a resolved state to PERMANENT.
      
      This patch changes the neighbour code to generate an RTM_NEWNEIGH
      message when the NUD state of an entry is changed administratively.
      Signed-off-by: Bob Gilligan <gilligan@aristanetworks.com>
      Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • pkt_sched: fq: more robust memory allocation · c3bd8549
      Eric Dumazet authored
      This patch brings NUMA support and automatic fallback to vmalloc()
      in case kmalloc() fails to allocate the FQ hash table.
      
      NUMA support depends on XPS being set up for the device before
      qdisc allocation. After an XPS change, it might be worth creating
      the qdisc hierarchy again.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: refine TSO splits · d4589926
      Eric Dumazet authored
      While investigating performance problems on small RPC workloads,
      I noticed linux TCP stack was always splitting the last TSO skb
      into two parts (skbs). One being a multiple of MSS, and a small one
      with the Push flag. This split is done even if TCP_NODELAY is set,
      or if no small packet is in flight.
      
      Example with request/response of 4K/4K
      
      IP A > B: . ack 68432 win 2783 <nop,nop,timestamp 6524593 6525001>
      IP A > B: . 65537:68433(2896) ack 69632 win 2783 <nop,nop,timestamp 6524593 6525001>
      IP A > B: P 68433:69633(1200) ack 69632 win 2783 <nop,nop,timestamp 6524593 6525001>
      IP B > A: . ack 68433 win 2768 <nop,nop,timestamp 6525001 6524593>
      IP B > A: . 69632:72528(2896) ack 69633 win 2768 <nop,nop,timestamp 6525001 6524593>
      IP B > A: P 72528:73728(1200) ack 69633 win 2768 <nop,nop,timestamp 6525001 6524593>
      IP A > B: . ack 72528 win 2783 <nop,nop,timestamp 6524593 6525001>
      IP A > B: . 69633:72529(2896) ack 73728 win 2783 <nop,nop,timestamp 6524593 6525001>
      IP A > B: P 72529:73729(1200) ack 73728 win 2783 <nop,nop,timestamp 6524593 6525001>
      
      We can avoid this split by including the Nagle tests at the right place.
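The 2896 + 1200 pattern in the trace above is what one would expect when a 4096-byte send is cut at the last MSS boundary. The following is a minimal arithmetic sketch of that cut, not the kernel's splitting code; `split_at_mss` is a hypothetical name, and the MSS of 1448 used in the note below is an assumption inferred from 2896 = 2 × 1448.

```c
#include <assert.h>

/* Result of cutting a payload at the last full-MSS boundary. */
struct tso_split {
    unsigned int full_mss_bytes; /* largest multiple of mss not exceeding len */
    unsigned int tail_bytes;     /* remainder carried by the small segment */
};

/* Hypothetical helper: pure arithmetic, no TCP state involved. */
struct tso_split split_at_mss(unsigned int len, unsigned int mss)
{
    struct tso_split s;

    assert(mss > 0);
    s.tail_bytes = len % mss;
    s.full_mss_bytes = len - s.tail_bytes;
    return s;
}
```

With `len = 4096` and the assumed `mss = 1448`, this gives 2896 bytes in full segments and a 1200-byte tail, matching the two data packets per direction in the trace.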
      
      Note : If some NIC had trouble sending TSO packets with a partial
      last segment, we would have hit the problem in GRO/forwarding workload already.
      
      tcp_minshall_update() is moved to tcp_output.c and is updated as we might
      feed a TSO packet with a partial last segment.
      
      This patch tremendously improves performance, as the traffic now looks
      like :
      
      IP A > B: . ack 98304 win 2783 <nop,nop,timestamp 6834277 6834685>
      IP A > B: P 94209:98305(4096) ack 98304 win 2783 <nop,nop,timestamp 6834277 6834685>
      IP B > A: . ack 98305 win 2768 <nop,nop,timestamp 6834686 6834277>
      IP B > A: P 98304:102400(4096) ack 98305 win 2768 <nop,nop,timestamp 6834686 6834277>
      IP A > B: . ack 102400 win 2783 <nop,nop,timestamp 6834279 6834686>
      IP A > B: P 98305:102401(4096) ack 102400 win 2783 <nop,nop,timestamp 6834279 6834686>
      IP B > A: . ack 102401 win 2768 <nop,nop,timestamp 6834687 6834279>
      IP B > A: P 102400:106496(4096) ack 102401 win 2768 <nop,nop,timestamp 6834687 6834279>
      IP A > B: . ack 106496 win 2783 <nop,nop,timestamp 6834280 6834687>
      IP A > B: P 102401:106497(4096) ack 106496 win 2783 <nop,nop,timestamp 6834280 6834687>
      IP B > A: . ack 106497 win 2768 <nop,nop,timestamp 6834688 6834280>
      IP B > A: P 106496:110592(4096) ack 106497 win 2768 <nop,nop,timestamp 6834688 6834280>
      
      Before :
      
      lpq83:~# nstat >/dev/null;perf stat ./super_netperf 200 -t TCP_RR -H lpq84 -l 20 -- -r 4K,4K
      280774
      
       Performance counter stats for './super_netperf 200 -t TCP_RR -H lpq84 -l 20 -- -r 4K,4K':
      
           205719.049006 task-clock                #    9.278 CPUs utilized
               8,449,968 context-switches          #    0.041 M/sec
               1,935,997 CPU-migrations            #    0.009 M/sec
                 160,541 page-faults               #    0.780 K/sec
         548,478,722,290 cycles                    #    2.666 GHz                     [83.20%]
         455,240,670,857 stalled-cycles-frontend   #   83.00% frontend cycles idle    [83.48%]
         272,881,454,275 stalled-cycles-backend    #   49.75% backend  cycles idle    [66.73%]
         166,091,460,030 instructions              #    0.30  insns per cycle
                                                   #    2.74  stalled cycles per insn [83.39%]
          29,150,229,399 branches                  #  141.699 M/sec                   [83.30%]
           1,943,814,026 branch-misses             #    6.67% of all branches         [83.32%]
      
            22.173517844 seconds time elapsed
      
      lpq83:~# nstat | egrep "IpOutRequests|IpExtOutOctets"
      IpOutRequests                   16851063           0.0
      IpExtOutOctets                  23878580777        0.0
      
      After patch :
      
      lpq83:~# nstat >/dev/null;perf stat ./super_netperf 200 -t TCP_RR -H lpq84 -l 20 -- -r 4K,4K
      280877
      
       Performance counter stats for './super_netperf 200 -t TCP_RR -H lpq84 -l 20 -- -r 4K,4K':
      
           107496.071918 task-clock                #    4.847 CPUs utilized
               5,635,458 context-switches          #    0.052 M/sec
               1,374,707 CPU-migrations            #    0.013 M/sec
                 160,920 page-faults               #    0.001 M/sec
         281,500,010,924 cycles                    #    2.619 GHz                     [83.28%]
         228,865,069,307 stalled-cycles-frontend   #   81.30% frontend cycles idle    [83.38%]
         142,462,742,658 stalled-cycles-backend    #   50.61% backend  cycles idle    [66.81%]
          95,227,712,566 instructions              #    0.34  insns per cycle
                                                   #    2.40  stalled cycles per insn [83.43%]
          16,209,868,171 branches                  #  150.795 M/sec                   [83.20%]
             874,252,952 branch-misses             #    5.39% of all branches         [83.37%]
      
            22.175821286 seconds time elapsed
      
      lpq83:~# nstat | egrep "IpOutRequests|IpExtOutOctets"
      IpOutRequests                   11239428           0.0
      IpExtOutOctets                  23595191035        0.0
      
      Indeed, the occupancy of tx skbs (IpExtOutOctets/IpOutRequests) is higher :
      2099 instead of 1417, thus helping GRO to be more efficient when using FQ packet
      scheduler.
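The occupancy figures quoted above follow directly from the two nstat counters. A small sketch of the division, using integer arithmetic so the result is truncated as in the commit message; `avg_bytes_per_packet` is a name chosen for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Average bytes per transmitted IP packet, truncated to an integer. */
uint64_t avg_bytes_per_packet(uint64_t out_octets, uint64_t out_requests)
{
    assert(out_requests != 0);
    return out_octets / out_requests;
}
```

Feeding in the "Before" counters (23878580777 octets over 16851063 requests) gives 1417, and the "After patch" counters (23595191035 over 11239428) give 2099.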
      
      Many thanks to Neal for review and ideas.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Nandita Dukkipati <nanditad@google.com>
      Cc: Van Jacobson <vanj@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Tested-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>