1. 02 9月, 2017 10 次提交
  2. 01 9月, 2017 6 次提交
  3. 31 8月, 2017 18 次提交
    • S
      xfrm: Fix return value check of copy_sec_ctx. · 8598112d
      Steffen Klassert 提交于
      A recent commit added an output_mark. When copying
      this output_mark, the return value of copy_sec_ctx
      is overwitten without a check. Fix this by copying
      the output_mark before the security context.
      
      Fixes: 077fbac4 ("net: xfrm: support setting an output mark.")
      Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>
      8598112d
    • Y
      xfrm: Add support for network devices capable of removing the ESP trailer · 47ebcc0b
      Yossi Kuperman 提交于
      In conjunction with crypto offload [1], removing the ESP trailer by
      hardware can potentially improve the performance by avoiding (1) a
      cache miss incurred by reading the nexthdr field and (2) the necessity
      to calculate the csum value of the trailer in order to keep skb->csum
      valid.
      
      This patch introduces the changes to the xfrm stack and merely serves
      as an infrastructure. Subsequent patch to mlx5 driver will put this to
      a good use.
      
      [1] https://www.mail-archive.com/netdev@vger.kernel.org/msg175733.htmlSigned-off-by: NYossi Kuperman <yossiku@mellanox.com>
      Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>
      47ebcc0b
    • D
      devlink: Maintain consistency in mac field name · 12bdc5e1
      David Ahern 提交于
      IPv4 name uses "destination ip" as does the IPv6 patch set.
      Make the mac field consistent.
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      12bdc5e1
    • E
      kcm: do not attach PF_KCM sockets to avoid deadlock · 351050ec
      Eric Dumazet 提交于
      syzkaller had no problem to trigger a deadlock, attaching a KCM socket
      to another one (or itself). (original syzkaller report was a very
      confusing lockdep splat during a sendmsg())
      
      It seems KCM claims to only support TCP, but no enforcement is done,
      so we might need to add additional checks.
      
      Fixes: ab7ac4eb ("kcm: Kernel Connection Multiplexor module")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Acked-by: NTom Herbert <tom@quantonium.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      351050ec
    • N
      sch_tbf: fix two null pointer dereferences on init failure · c2d6511e
      Nikolay Aleksandrov 提交于
      sch_tbf calls qdisc_watchdog_cancel() in both its ->reset and ->destroy
      callbacks but it may fail before the timer is initialized due to missing
      options (either not supplied by user-space or set as a default qdisc),
      also q->qdisc is used by ->reset and ->destroy so we need it initialized.
      
      Reproduce:
      $ sysctl net.core.default_qdisc=tbf
      $ ip l set ethX up
      
      Crash log:
      [  959.160172] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
      [  959.160323] IP: qdisc_reset+0xa/0x5c
      [  959.160400] PGD 59cdb067
      [  959.160401] P4D 59cdb067
      [  959.160466] PUD 59ccb067
      [  959.160532] PMD 0
      [  959.160597]
      [  959.160706] Oops: 0000 [#1] SMP
      [  959.160778] Modules linked in: sch_tbf sch_sfb sch_prio sch_netem
      [  959.160891] CPU: 2 PID: 1562 Comm: ip Not tainted 4.13.0-rc6+ #62
      [  959.160998] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
      [  959.161157] task: ffff880059c9a700 task.stack: ffff8800376d0000
      [  959.161263] RIP: 0010:qdisc_reset+0xa/0x5c
      [  959.161347] RSP: 0018:ffff8800376d3610 EFLAGS: 00010286
      [  959.161531] RAX: ffffffffa001b1dd RBX: ffff8800373a2800 RCX: 0000000000000000
      [  959.161733] RDX: ffffffff8215f160 RSI: ffffffff8215f160 RDI: 0000000000000000
      [  959.161939] RBP: ffff8800376d3618 R08: 00000000014080c0 R09: 00000000ffffffff
      [  959.162141] R10: ffff8800376d3578 R11: 0000000000000020 R12: ffffffffa001d2c0
      [  959.162343] R13: ffff880037538000 R14: 00000000ffffffff R15: 0000000000000001
      [  959.162546] FS:  00007fcc5126b740(0000) GS:ffff88005d900000(0000) knlGS:0000000000000000
      [  959.162844] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  959.163030] CR2: 0000000000000018 CR3: 000000005abc4000 CR4: 00000000000406e0
      [  959.163233] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  959.163436] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  959.163638] Call Trace:
      [  959.163788]  tbf_reset+0x19/0x64 [sch_tbf]
      [  959.163957]  qdisc_destroy+0x8b/0xe5
      [  959.164119]  qdisc_create_dflt+0x86/0x94
      [  959.164284]  ? dev_activate+0x129/0x129
      [  959.164449]  attach_one_default_qdisc+0x36/0x63
      [  959.164623]  netdev_for_each_tx_queue+0x3d/0x48
      [  959.164795]  dev_activate+0x4b/0x129
      [  959.164957]  __dev_open+0xe7/0x104
      [  959.165118]  __dev_change_flags+0xc6/0x15c
      [  959.165287]  dev_change_flags+0x25/0x59
      [  959.165451]  do_setlink+0x30c/0xb3f
      [  959.165613]  ? check_chain_key+0xb0/0xfd
      [  959.165782]  rtnl_newlink+0x3a4/0x729
      [  959.165947]  ? rtnl_newlink+0x117/0x729
      [  959.166121]  ? ns_capable_common+0xd/0xb1
      [  959.166288]  ? ns_capable+0x13/0x15
      [  959.166450]  rtnetlink_rcv_msg+0x188/0x197
      [  959.166617]  ? rcu_read_unlock+0x3e/0x5f
      [  959.166783]  ? rtnl_newlink+0x729/0x729
      [  959.166948]  netlink_rcv_skb+0x6c/0xce
      [  959.167113]  rtnetlink_rcv+0x23/0x2a
      [  959.167273]  netlink_unicast+0x103/0x181
      [  959.167439]  netlink_sendmsg+0x326/0x337
      [  959.167607]  sock_sendmsg_nosec+0x14/0x3f
      [  959.167772]  sock_sendmsg+0x29/0x2e
      [  959.167932]  ___sys_sendmsg+0x209/0x28b
      [  959.168098]  ? do_raw_spin_unlock+0xcd/0xf8
      [  959.168267]  ? _raw_spin_unlock+0x27/0x31
      [  959.168432]  ? __handle_mm_fault+0x651/0xdb1
      [  959.168602]  ? check_chain_key+0xb0/0xfd
      [  959.168773]  __sys_sendmsg+0x45/0x63
      [  959.168934]  ? __sys_sendmsg+0x45/0x63
      [  959.169100]  SyS_sendmsg+0x19/0x1b
      [  959.169260]  entry_SYSCALL_64_fastpath+0x23/0xc2
      [  959.169432] RIP: 0033:0x7fcc5097e690
      [  959.169592] RSP: 002b:00007ffd0d5c7b48 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      [  959.169887] RAX: ffffffffffffffda RBX: ffffffff810d278c RCX: 00007fcc5097e690
      [  959.170089] RDX: 0000000000000000 RSI: 00007ffd0d5c7b90 RDI: 0000000000000003
      [  959.170292] RBP: ffff8800376d3f98 R08: 0000000000000001 R09: 0000000000000003
      [  959.170494] R10: 00007ffd0d5c7910 R11: 0000000000000246 R12: 0000000000000006
      [  959.170697] R13: 000000000066f1a0 R14: 00007ffd0d5cfc40 R15: 0000000000000000
      [  959.170900]  ? trace_hardirqs_off_caller+0xa7/0xcf
      [  959.171076] Code: 00 41 c7 84 24 14 01 00 00 00 00 00 00 41 c7 84 24
      98 00 00 00 00 00 00 00 41 5c 41 5d 41 5e 5d c3 66 66 66 66 90 55 48 89
      e5 53 <48> 8b 47 18 48 89 fb 48 8b 40 48 48 85 c0 74 02 ff d0 48 8b bb
      [  959.171637] RIP: qdisc_reset+0xa/0x5c RSP: ffff8800376d3610
      [  959.171821] CR2: 0000000000000018
      
      Fixes: 87b60cfa ("net_sched: fix error recovery at qdisc creation")
      Fixes: 0fbbeb1b ("[PKT_SCHED]: Fix missing qdisc_destroy() in qdisc_create_dflt()")
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c2d6511e
    • N
      sch_sfq: fix null pointer dereference on init failure · e2326576
      Nikolay Aleksandrov 提交于
      Currently only a memory allocation failure can lead to this, so let's
      initialize the timer first.
      
      Fixes: 6529eaba ("net: sched: introduce tcf block infractructure")
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e2326576
    • N
      sch_netem: avoid null pointer deref on init failure · 634576a1
      Nikolay Aleksandrov 提交于
      netem can fail in ->init due to missing options (either not supplied by
      user-space or used as a default qdisc) causing a timer->base null
      pointer deref in its ->destroy() and ->reset() callbacks.
      
      Reproduce:
      $ sysctl net.core.default_qdisc=netem
      $ ip l set ethX up
      
      Crash log:
      [ 1814.846943] BUG: unable to handle kernel NULL pointer dereference at (null)
      [ 1814.847181] IP: hrtimer_active+0x17/0x8a
      [ 1814.847270] PGD 59c34067
      [ 1814.847271] P4D 59c34067
      [ 1814.847337] PUD 37374067
      [ 1814.847403] PMD 0
      [ 1814.847468]
      [ 1814.847582] Oops: 0000 [#1] SMP
      [ 1814.847655] Modules linked in: sch_netem(O) sch_fq_codel(O)
      [ 1814.847761] CPU: 3 PID: 1573 Comm: ip Tainted: G           O 4.13.0-rc6+ #62
      [ 1814.847884] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
      [ 1814.848043] task: ffff88003723a700 task.stack: ffff88005adc8000
      [ 1814.848235] RIP: 0010:hrtimer_active+0x17/0x8a
      [ 1814.848407] RSP: 0018:ffff88005adcb590 EFLAGS: 00010246
      [ 1814.848590] RAX: 0000000000000000 RBX: ffff880058e359d8 RCX: 0000000000000000
      [ 1814.848793] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880058e359d8
      [ 1814.848998] RBP: ffff88005adcb5b0 R08: 00000000014080c0 R09: 00000000ffffffff
      [ 1814.849204] R10: ffff88005adcb660 R11: 0000000000000020 R12: 0000000000000000
      [ 1814.849410] R13: ffff880058e359d8 R14: 00000000ffffffff R15: 0000000000000001
      [ 1814.849616] FS:  00007f733bbca740(0000) GS:ffff88005d980000(0000) knlGS:0000000000000000
      [ 1814.849919] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1814.850107] CR2: 0000000000000000 CR3: 0000000059f0d000 CR4: 00000000000406e0
      [ 1814.850313] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 1814.850518] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 1814.850723] Call Trace:
      [ 1814.850875]  hrtimer_try_to_cancel+0x1a/0x93
      [ 1814.851047]  hrtimer_cancel+0x15/0x20
      [ 1814.851211]  qdisc_watchdog_cancel+0x12/0x14
      [ 1814.851383]  netem_reset+0xe6/0xed [sch_netem]
      [ 1814.851561]  qdisc_destroy+0x8b/0xe5
      [ 1814.851723]  qdisc_create_dflt+0x86/0x94
      [ 1814.851890]  ? dev_activate+0x129/0x129
      [ 1814.852057]  attach_one_default_qdisc+0x36/0x63
      [ 1814.852232]  netdev_for_each_tx_queue+0x3d/0x48
      [ 1814.852406]  dev_activate+0x4b/0x129
      [ 1814.852569]  __dev_open+0xe7/0x104
      [ 1814.852730]  __dev_change_flags+0xc6/0x15c
      [ 1814.852899]  dev_change_flags+0x25/0x59
      [ 1814.853064]  do_setlink+0x30c/0xb3f
      [ 1814.853228]  ? check_chain_key+0xb0/0xfd
      [ 1814.853396]  ? check_chain_key+0xb0/0xfd
      [ 1814.853565]  rtnl_newlink+0x3a4/0x729
      [ 1814.853728]  ? rtnl_newlink+0x117/0x729
      [ 1814.853905]  ? ns_capable_common+0xd/0xb1
      [ 1814.854072]  ? ns_capable+0x13/0x15
      [ 1814.854234]  rtnetlink_rcv_msg+0x188/0x197
      [ 1814.854404]  ? rcu_read_unlock+0x3e/0x5f
      [ 1814.854572]  ? rtnl_newlink+0x729/0x729
      [ 1814.854737]  netlink_rcv_skb+0x6c/0xce
      [ 1814.854902]  rtnetlink_rcv+0x23/0x2a
      [ 1814.855064]  netlink_unicast+0x103/0x181
      [ 1814.855230]  netlink_sendmsg+0x326/0x337
      [ 1814.855398]  sock_sendmsg_nosec+0x14/0x3f
      [ 1814.855584]  sock_sendmsg+0x29/0x2e
      [ 1814.855747]  ___sys_sendmsg+0x209/0x28b
      [ 1814.855912]  ? do_raw_spin_unlock+0xcd/0xf8
      [ 1814.856082]  ? _raw_spin_unlock+0x27/0x31
      [ 1814.856251]  ? __handle_mm_fault+0x651/0xdb1
      [ 1814.856421]  ? check_chain_key+0xb0/0xfd
      [ 1814.856592]  __sys_sendmsg+0x45/0x63
      [ 1814.856755]  ? __sys_sendmsg+0x45/0x63
      [ 1814.856923]  SyS_sendmsg+0x19/0x1b
      [ 1814.857083]  entry_SYSCALL_64_fastpath+0x23/0xc2
      [ 1814.857256] RIP: 0033:0x7f733b2dd690
      [ 1814.857419] RSP: 002b:00007ffe1d3387d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      [ 1814.858238] RAX: ffffffffffffffda RBX: ffffffff810d278c RCX: 00007f733b2dd690
      [ 1814.858445] RDX: 0000000000000000 RSI: 00007ffe1d338820 RDI: 0000000000000003
      [ 1814.858651] RBP: ffff88005adcbf98 R08: 0000000000000001 R09: 0000000000000003
      [ 1814.858856] R10: 00007ffe1d3385a0 R11: 0000000000000246 R12: 0000000000000002
      [ 1814.859060] R13: 000000000066f1a0 R14: 00007ffe1d3408d0 R15: 0000000000000000
      [ 1814.859267]  ? trace_hardirqs_off_caller+0xa7/0xcf
      [ 1814.859446] Code: 10 55 48 89 c7 48 89 e5 e8 45 a1 fb ff 31 c0 5d c3
      31 c0 c3 66 66 66 66 90 55 48 89 e5 41 56 41 55 41 54 53 49 89 fd 49 8b
      45 30 <4c> 8b 20 41 8b 5c 24 38 31 c9 31 d2 48 c7 c7 50 8e 1d 82 41 89
      [ 1814.860022] RIP: hrtimer_active+0x17/0x8a RSP: ffff88005adcb590
      [ 1814.860214] CR2: 0000000000000000
      
      Fixes: 87b60cfa ("net_sched: fix error recovery at qdisc creation")
      Fixes: 0fbbeb1b ("[PKT_SCHED]: Fix missing qdisc_destroy() in qdisc_create_dflt()")
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      634576a1
    • N
      sch_fq_codel: avoid double free on init failure · 30c31d74
      Nikolay Aleksandrov 提交于
      It is very unlikely to happen but the backlogs memory allocation
      could fail and will free q->flows, but then ->destroy() will free
      q->flows too. For correctness remove the first free and let ->destroy
      clean up.
      
      Fixes: 87b60cfa ("net_sched: fix error recovery at qdisc creation")
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      30c31d74
    • N
      sch_cbq: fix null pointer dereferences on init failure · 3501d059
      Nikolay Aleksandrov 提交于
      CBQ can fail on ->init by wrong nl attributes or simply for missing any,
      f.e. if it's set as a default qdisc then TCA_OPTIONS (opt) will be NULL
      when it is activated. The first thing init does is parse opt but it will
      dereference a null pointer if used as a default qdisc, also since init
      failure at default qdisc invokes ->reset() which cancels all timers then
      we'll also dereference two more null pointers (timer->base) as they were
      never initialized.
      
      To reproduce:
      $ sysctl net.core.default_qdisc=cbq
      $ ip l set ethX up
      
      Crash log of the first null ptr deref:
      [44727.907454] BUG: unable to handle kernel NULL pointer dereference at (null)
      [44727.907600] IP: cbq_init+0x27/0x205
      [44727.907676] PGD 59ff4067
      [44727.907677] P4D 59ff4067
      [44727.907742] PUD 59c70067
      [44727.907807] PMD 0
      [44727.907873]
      [44727.907982] Oops: 0000 [#1] SMP
      [44727.908054] Modules linked in:
      [44727.908126] CPU: 1 PID: 21312 Comm: ip Not tainted 4.13.0-rc6+ #60
      [44727.908235] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
      [44727.908477] task: ffff88005ad42700 task.stack: ffff880037214000
      [44727.908672] RIP: 0010:cbq_init+0x27/0x205
      [44727.908838] RSP: 0018:ffff8800372175f0 EFLAGS: 00010286
      [44727.909018] RAX: ffffffff816c3852 RBX: ffff880058c53800 RCX: 0000000000000000
      [44727.909222] RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffff8800372175f8
      [44727.909427] RBP: ffff880037217650 R08: ffffffff81b0f380 R09: 0000000000000000
      [44727.909631] R10: ffff880037217660 R11: 0000000000000020 R12: ffffffff822a44c0
      [44727.909835] R13: ffff880058b92000 R14: 00000000ffffffff R15: 0000000000000001
      [44727.910040] FS:  00007ff8bc583740(0000) GS:ffff88005d880000(0000) knlGS:0000000000000000
      [44727.910339] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [44727.910525] CR2: 0000000000000000 CR3: 00000000371e5000 CR4: 00000000000406e0
      [44727.910731] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [44727.910936] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [44727.911141] Call Trace:
      [44727.911291]  ? lockdep_init_map+0xb6/0x1ba
      [44727.911461]  ? qdisc_alloc+0x14e/0x187
      [44727.911626]  qdisc_create_dflt+0x7a/0x94
      [44727.911794]  ? dev_activate+0x129/0x129
      [44727.911959]  attach_one_default_qdisc+0x36/0x63
      [44727.912132]  netdev_for_each_tx_queue+0x3d/0x48
      [44727.912305]  dev_activate+0x4b/0x129
      [44727.912468]  __dev_open+0xe7/0x104
      [44727.912631]  __dev_change_flags+0xc6/0x15c
      [44727.912799]  dev_change_flags+0x25/0x59
      [44727.912966]  do_setlink+0x30c/0xb3f
      [44727.913129]  ? check_chain_key+0xb0/0xfd
      [44727.913294]  ? check_chain_key+0xb0/0xfd
      [44727.913463]  rtnl_newlink+0x3a4/0x729
      [44727.913626]  ? rtnl_newlink+0x117/0x729
      [44727.913801]  ? ns_capable_common+0xd/0xb1
      [44727.913968]  ? ns_capable+0x13/0x15
      [44727.914131]  rtnetlink_rcv_msg+0x188/0x197
      [44727.914300]  ? rcu_read_unlock+0x3e/0x5f
      [44727.914465]  ? rtnl_newlink+0x729/0x729
      [44727.914630]  netlink_rcv_skb+0x6c/0xce
      [44727.914796]  rtnetlink_rcv+0x23/0x2a
      [44727.914956]  netlink_unicast+0x103/0x181
      [44727.915122]  netlink_sendmsg+0x326/0x337
      [44727.915291]  sock_sendmsg_nosec+0x14/0x3f
      [44727.915459]  sock_sendmsg+0x29/0x2e
      [44727.915619]  ___sys_sendmsg+0x209/0x28b
      [44727.915784]  ? do_raw_spin_unlock+0xcd/0xf8
      [44727.915954]  ? _raw_spin_unlock+0x27/0x31
      [44727.916121]  ? __handle_mm_fault+0x651/0xdb1
      [44727.916290]  ? check_chain_key+0xb0/0xfd
      [44727.916461]  __sys_sendmsg+0x45/0x63
      [44727.916626]  ? __sys_sendmsg+0x45/0x63
      [44727.916792]  SyS_sendmsg+0x19/0x1b
      [44727.916950]  entry_SYSCALL_64_fastpath+0x23/0xc2
      [44727.917125] RIP: 0033:0x7ff8bbc96690
      [44727.917286] RSP: 002b:00007ffc360991e8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      [44727.917579] RAX: ffffffffffffffda RBX: ffffffff810d278c RCX: 00007ff8bbc96690
      [44727.917783] RDX: 0000000000000000 RSI: 00007ffc36099230 RDI: 0000000000000003
      [44727.917987] RBP: ffff880037217f98 R08: 0000000000000001 R09: 0000000000000003
      [44727.918190] R10: 00007ffc36098fb0 R11: 0000000000000246 R12: 0000000000000006
      [44727.918393] R13: 000000000066f1a0 R14: 00007ffc360a12e0 R15: 0000000000000000
      [44727.918597]  ? trace_hardirqs_off_caller+0xa7/0xcf
      [44727.918774] Code: 41 5f 5d c3 66 66 66 66 90 55 48 8d 56 04 45 31 c9
      49 c7 c0 80 f3 b0 81 48 89 e5 41 55 41 54 53 48 89 fb 48 8d 7d a8 48 83
      ec 48 <0f> b7 0e be 07 00 00 00 83 e9 04 e8 e6 f7 d8 ff 85 c0 0f 88 bb
      [44727.919332] RIP: cbq_init+0x27/0x205 RSP: ffff8800372175f0
      [44727.919516] CR2: 0000000000000000
      
      Fixes: 0fbbeb1b ("[PKT_SCHED]: Fix missing qdisc_destroy() in qdisc_create_dflt()")
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3501d059
    • N
      sch_hfsc: fix null pointer deref and double free on init failure · 3bdac362
      Nikolay Aleksandrov 提交于
      Depending on where ->init fails we can get a null pointer deref due to
      uninitialized hires timer (watchdog) or a double free of the qdisc hash
      because it is already freed by ->destroy().
      
      Fixes: 8d553738 ("net/sched/hfsc: allocate tcf block for hfsc root class")
      Fixes: 87b60cfa ("net_sched: fix error recovery at qdisc creation")
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3bdac362
    • N
      sch_hhf: fix null pointer dereference on init failure · 32db864d
      Nikolay Aleksandrov 提交于
      If sch_hhf fails in its ->init() function (either due to wrong
      user-space arguments as below or memory alloc failure of hh_flows) it
      will do a null pointer deref of q->hh_flows in its ->destroy() function.
      
      To reproduce the crash:
      $ tc qdisc add dev eth0 root hhf quantum 2000000 non_hh_weight 10000000
      
      Crash log:
      [  690.654882] BUG: unable to handle kernel NULL pointer dereference at (null)
      [  690.655565] IP: hhf_destroy+0x48/0xbc
      [  690.655944] PGD 37345067
      [  690.655948] P4D 37345067
      [  690.656252] PUD 58402067
      [  690.656554] PMD 0
      [  690.656857]
      [  690.657362] Oops: 0000 [#1] SMP
      [  690.657696] Modules linked in:
      [  690.658032] CPU: 3 PID: 920 Comm: tc Not tainted 4.13.0-rc6+ #57
      [  690.658525] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
      [  690.659255] task: ffff880058578000 task.stack: ffff88005acbc000
      [  690.659747] RIP: 0010:hhf_destroy+0x48/0xbc
      [  690.660146] RSP: 0018:ffff88005acbf9e0 EFLAGS: 00010246
      [  690.660601] RAX: 0000000000000000 RBX: 0000000000000020 RCX: 0000000000000000
      [  690.661155] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff821f63f0
      [  690.661710] RBP: ffff88005acbfa08 R08: ffffffff81b10a90 R09: 0000000000000000
      [  690.662267] R10: 00000000f42b7019 R11: ffff880058578000 R12: 00000000ffffffea
      [  690.662820] R13: ffff8800372f6400 R14: 0000000000000000 R15: 0000000000000000
      [  690.663769] FS:  00007f8ae5e8b740(0000) GS:ffff88005d980000(0000) knlGS:0000000000000000
      [  690.667069] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  690.667965] CR2: 0000000000000000 CR3: 0000000058523000 CR4: 00000000000406e0
      [  690.668918] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  690.669945] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  690.671003] Call Trace:
      [  690.671743]  qdisc_create+0x377/0x3fd
      [  690.672534]  tc_modify_qdisc+0x4d2/0x4fd
      [  690.673324]  rtnetlink_rcv_msg+0x188/0x197
      [  690.674204]  ? rcu_read_unlock+0x3e/0x5f
      [  690.675091]  ? rtnl_newlink+0x729/0x729
      [  690.675877]  netlink_rcv_skb+0x6c/0xce
      [  690.676648]  rtnetlink_rcv+0x23/0x2a
      [  690.677405]  netlink_unicast+0x103/0x181
      [  690.678179]  netlink_sendmsg+0x326/0x337
      [  690.678958]  sock_sendmsg_nosec+0x14/0x3f
      [  690.679743]  sock_sendmsg+0x29/0x2e
      [  690.680506]  ___sys_sendmsg+0x209/0x28b
      [  690.681283]  ? __handle_mm_fault+0xc7d/0xdb1
      [  690.681915]  ? check_chain_key+0xb0/0xfd
      [  690.682449]  __sys_sendmsg+0x45/0x63
      [  690.682954]  ? __sys_sendmsg+0x45/0x63
      [  690.683471]  SyS_sendmsg+0x19/0x1b
      [  690.683974]  entry_SYSCALL_64_fastpath+0x23/0xc2
      [  690.684516] RIP: 0033:0x7f8ae529d690
      [  690.685016] RSP: 002b:00007fff26d2d6b8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      [  690.685931] RAX: ffffffffffffffda RBX: ffffffff810d278c RCX: 00007f8ae529d690
      [  690.686573] RDX: 0000000000000000 RSI: 00007fff26d2d700 RDI: 0000000000000003
      [  690.687047] RBP: ffff88005acbff98 R08: 0000000000000001 R09: 0000000000000000
      [  690.687519] R10: 00007fff26d2d480 R11: 0000000000000246 R12: 0000000000000002
      [  690.687996] R13: 0000000001258070 R14: 0000000000000001 R15: 0000000000000000
      [  690.688475]  ? trace_hardirqs_off_caller+0xa7/0xcf
      [  690.688887] Code: 00 00 e8 2a 02 ae ff 49 8b bc 1d 60 02 00 00 48 83
      c3 08 e8 19 02 ae ff 48 83 fb 20 75 dc 45 31 f6 4d 89 f7 4d 03 bd 20 02
      00 00 <49> 8b 07 49 39 c7 75 24 49 83 c6 10 49 81 fe 00 40 00 00 75 e1
      [  690.690200] RIP: hhf_destroy+0x48/0xbc RSP: ffff88005acbf9e0
      [  690.690636] CR2: 0000000000000000
      
      Fixes: 87b60cfa ("net_sched: fix error recovery at qdisc creation")
      Fixes: 10239edf ("net-qdisc-hhf: Heavy-Hitter Filter (HHF) qdisc")
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      32db864d
    • N
      sch_multiq: fix double free on init failure · e89d469e
      Nikolay Aleksandrov 提交于
      The below commit added a call to ->destroy() on init failure, but multiq
      still frees ->queues on error in init, but ->queues is also freed by
      ->destroy() thus we get double free and corrupted memory.
      
      Very easy to reproduce (eth0 not multiqueue):
      $ tc qdisc add dev eth0 root multiq
      RTNETLINK answers: Operation not supported
      $ ip l add dumdum type dummy
      (crash)
      
      Trace log:
      [ 3929.467747] general protection fault: 0000 [#1] SMP
      [ 3929.468083] Modules linked in:
      [ 3929.468302] CPU: 3 PID: 967 Comm: ip Not tainted 4.13.0-rc6+ #56
      [ 3929.468625] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
      [ 3929.469124] task: ffff88003716a700 task.stack: ffff88005872c000
      [ 3929.469449] RIP: 0010:__kmalloc_track_caller+0x117/0x1be
      [ 3929.469746] RSP: 0018:ffff88005872f6a0 EFLAGS: 00010246
      [ 3929.470042] RAX: 00000000000002de RBX: 0000000058a59000 RCX: 00000000000002df
      [ 3929.470406] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff821f7020
      [ 3929.470770] RBP: ffff88005872f6e8 R08: 000000000001f010 R09: 0000000000000000
      [ 3929.471133] R10: ffff88005872f730 R11: 0000000000008cdd R12: ff006d75646d7564
      [ 3929.471496] R13: 00000000014000c0 R14: ffff88005b403c00 R15: ffff88005b403c00
      [ 3929.471869] FS:  00007f0b70480740(0000) GS:ffff88005d980000(0000) knlGS:0000000000000000
      [ 3929.472286] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 3929.472677] CR2: 00007ffcee4f3000 CR3: 0000000059d45000 CR4: 00000000000406e0
      [ 3929.473209] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 3929.474109] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 3929.474873] Call Trace:
      [ 3929.475337]  ? kstrdup_const+0x23/0x25
      [ 3929.475863]  kstrdup+0x2e/0x4b
      [ 3929.476338]  kstrdup_const+0x23/0x25
      [ 3929.478084]  __kernfs_new_node+0x28/0xbc
      [ 3929.478478]  kernfs_new_node+0x35/0x55
      [ 3929.478929]  kernfs_create_link+0x23/0x76
      [ 3929.479478]  sysfs_do_create_link_sd.isra.2+0x85/0xd7
      [ 3929.480096]  sysfs_create_link+0x33/0x35
      [ 3929.480649]  device_add+0x200/0x589
      [ 3929.481184]  netdev_register_kobject+0x7c/0x12f
      [ 3929.481711]  register_netdevice+0x373/0x471
      [ 3929.482174]  rtnl_newlink+0x614/0x729
      [ 3929.482610]  ? rtnl_newlink+0x17f/0x729
      [ 3929.483080]  rtnetlink_rcv_msg+0x188/0x197
      [ 3929.483533]  ? rcu_read_unlock+0x3e/0x5f
      [ 3929.483984]  ? rtnl_newlink+0x729/0x729
      [ 3929.484420]  netlink_rcv_skb+0x6c/0xce
      [ 3929.484858]  rtnetlink_rcv+0x23/0x2a
      [ 3929.485291]  netlink_unicast+0x103/0x181
      [ 3929.485735]  netlink_sendmsg+0x326/0x337
      [ 3929.486181]  sock_sendmsg_nosec+0x14/0x3f
      [ 3929.486614]  sock_sendmsg+0x29/0x2e
      [ 3929.486973]  ___sys_sendmsg+0x209/0x28b
      [ 3929.487340]  ? do_raw_spin_unlock+0xcd/0xf8
      [ 3929.487719]  ? _raw_spin_unlock+0x27/0x31
      [ 3929.488092]  ? __handle_mm_fault+0x651/0xdb1
      [ 3929.488471]  ? check_chain_key+0xb0/0xfd
      [ 3929.488847]  __sys_sendmsg+0x45/0x63
      [ 3929.489206]  ? __sys_sendmsg+0x45/0x63
      [ 3929.489576]  SyS_sendmsg+0x19/0x1b
      [ 3929.489901]  entry_SYSCALL_64_fastpath+0x23/0xc2
      [ 3929.490172] RIP: 0033:0x7f0b6fb93690
      [ 3929.490423] RSP: 002b:00007ffcee4ed588 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      [ 3929.490881] RAX: ffffffffffffffda RBX: ffffffff810d278c RCX: 00007f0b6fb93690
      [ 3929.491198] RDX: 0000000000000000 RSI: 00007ffcee4ed5d0 RDI: 0000000000000003
      [ 3929.491521] RBP: ffff88005872ff98 R08: 0000000000000001 R09: 0000000000000000
      [ 3929.491801] R10: 00007ffcee4ed350 R11: 0000000000000246 R12: 0000000000000002
      [ 3929.492075] R13: 000000000066f1a0 R14: 00007ffcee4f5680 R15: 0000000000000000
      [ 3929.492352]  ? trace_hardirqs_off_caller+0xa7/0xcf
      [ 3929.492590] Code: 8b 45 c0 48 8b 45 b8 74 17 48 8b 4d c8 83 ca ff 44
      89 ee 4c 89 f7 e8 83 ca ff ff 49 89 c4 eb 49 49 63 56 20 48 8d 48 01 4d
      8b 06 <49> 8b 1c 14 48 89 c2 4c 89 e0 65 49 0f c7 08 0f 94 c0 83 f0 01
      [ 3929.493335] RIP: __kmalloc_track_caller+0x117/0x1be RSP: ffff88005872f6a0
      
      Fixes: 87b60cfa ("net_sched: fix error recovery at qdisc creation")
      Fixes: f07d1501 ("multiq: Further multiqueue cleanup")
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e89d469e
    • N
      sch_htb: fix crash on init failure · 88c2ace6
      Nikolay Aleksandrov 提交于
      The commit below added a call to the ->destroy() callback for all qdiscs
      which failed in their ->init(), but some were not prepared for such
      change and can't handle partially initialized qdisc. HTB is one of them
      and if any error occurs before the qdisc watchdog timer and qdisc work are
      initialized then we can hit either a null ptr deref (timer->base) when
      canceling in ->destroy or lockdep error info about trying to register
      a non-static key and a stack dump. So to fix these two move the watchdog
      timer and workqueue init before anything that can err out.
      To reproduce userspace needs to send broken htb qdisc create request,
      tested with a modified tc (q_htb.c).
      
      Trace log:
      [ 2710.897602] BUG: unable to handle kernel NULL pointer dereference at (null)
      [ 2710.897977] IP: hrtimer_active+0x17/0x8a
      [ 2710.898174] PGD 58fab067
      [ 2710.898175] P4D 58fab067
      [ 2710.898353] PUD 586c0067
      [ 2710.898531] PMD 0
      [ 2710.898710]
      [ 2710.899045] Oops: 0000 [#1] SMP
      [ 2710.899232] Modules linked in:
      [ 2710.899419] CPU: 1 PID: 950 Comm: tc Not tainted 4.13.0-rc6+ #54
      [ 2710.899646] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
      [ 2710.900035] task: ffff880059ed2700 task.stack: ffff88005ad4c000
      [ 2710.900262] RIP: 0010:hrtimer_active+0x17/0x8a
      [ 2710.900467] RSP: 0018:ffff88005ad4f960 EFLAGS: 00010246
      [ 2710.900684] RAX: 0000000000000000 RBX: ffff88003701e298 RCX: 0000000000000000
      [ 2710.900933] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88003701e298
      [ 2710.901177] RBP: ffff88005ad4f980 R08: 0000000000000001 R09: 0000000000000001
      [ 2710.901419] R10: ffff88005ad4f800 R11: 0000000000000400 R12: 0000000000000000
      [ 2710.901663] R13: ffff88003701e298 R14: ffffffff822a4540 R15: ffff88005ad4fac0
      [ 2710.901907] FS:  00007f2f5e90f740(0000) GS:ffff88005d880000(0000) knlGS:0000000000000000
      [ 2710.902277] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 2710.902500] CR2: 0000000000000000 CR3: 0000000058ca3000 CR4: 00000000000406e0
      [ 2710.902744] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 2710.902977] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 2710.903180] Call Trace:
      [ 2710.903332]  hrtimer_try_to_cancel+0x1a/0x93
      [ 2710.903504]  hrtimer_cancel+0x15/0x20
      [ 2710.903667]  qdisc_watchdog_cancel+0x12/0x14
      [ 2710.903866]  htb_destroy+0x2e/0xf7
      [ 2710.904097]  qdisc_create+0x377/0x3fd
      [ 2710.904330]  tc_modify_qdisc+0x4d2/0x4fd
      [ 2710.904511]  rtnetlink_rcv_msg+0x188/0x197
      [ 2710.904682]  ? rcu_read_unlock+0x3e/0x5f
      [ 2710.904849]  ? rtnl_newlink+0x729/0x729
      [ 2710.905017]  netlink_rcv_skb+0x6c/0xce
      [ 2710.905183]  rtnetlink_rcv+0x23/0x2a
      [ 2710.905345]  netlink_unicast+0x103/0x181
      [ 2710.905511]  netlink_sendmsg+0x326/0x337
      [ 2710.905679]  sock_sendmsg_nosec+0x14/0x3f
      [ 2710.905847]  sock_sendmsg+0x29/0x2e
      [ 2710.906010]  ___sys_sendmsg+0x209/0x28b
      [ 2710.906176]  ? do_raw_spin_unlock+0xcd/0xf8
      [ 2710.906346]  ? _raw_spin_unlock+0x27/0x31
      [ 2710.906514]  ? __handle_mm_fault+0x651/0xdb1
      [ 2710.906685]  ? check_chain_key+0xb0/0xfd
      [ 2710.906855]  __sys_sendmsg+0x45/0x63
      [ 2710.907018]  ? __sys_sendmsg+0x45/0x63
      [ 2710.907185]  SyS_sendmsg+0x19/0x1b
      [ 2710.907344]  entry_SYSCALL_64_fastpath+0x23/0xc2
      
      Note that probably this bug goes further back because the default qdisc
      handling always calls ->destroy on init failure too.
      
      Fixes: 87b60cfa ("net_sched: fix error recovery at qdisc creation")
      Fixes: 0fbbeb1b ("[PKT_SCHED]: Fix missing qdisc_destroy() in qdisc_create_dflt()")
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      88c2ace6
    • A
      ipv6: sr: fix get_srh() to comply with IPv6 standard "RFC 8200" · 5829d70b
      Ahmed Abdelsalam 提交于
      IPv6 packet may carry more than one extension header, and IPv6 nodes must
      accept and attempt to process extension headers in any order and occurring
      any number of times in the same packet. Hence, there should be no
      assumption that Segment Routing extension header is to appear immediately
      after the IPv6 header.
      
      Moreover, section 4.1 of RFC 8200 gives a recommendation on the order of
      appearance of those extension headers within an IPv6 packet. According to
      this recommendation, Segment Routing extension header should appear after
      Hop-by-Hop and Destination Options headers (if they present).
      
      This patch fixes the get_srh(), so it gets the segment routing header
      regardless of its position in the chain of the extension headers in IPv6
      packet, and makes sure that the IPv6 routing extension header is of Type 4.
      Signed-off-by: NAhmed Abdelsalam <amsalam20@gmail.com>
      Acked-by: NDavid Lebrun <david.lebrun@uclouvain.be>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5829d70b
    • C
      net/sched: Change act_api and act_xxx modules to use IDR · 65a206c0
      Chris Mi 提交于
      Typically, each TC filter has its own action. All the actions of the
      same type are saved in its hash table. But the hash buckets are too
      small that it degrades to a list. And the performance is greatly
      affected. For example, it takes about 0m11.914s to insert 64K rules.
      If we convert the hash table to IDR, it only takes about 0m1.500s.
      The improvement is huge.
      
      But please note that the test result is based on previous patch that
      cls_flower uses IDR.
      Signed-off-by: NChris Mi <chrism@mellanox.com>
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Acked-by: NJamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      65a206c0
    • C
      net/sched: Change cls_flower to use IDR · c15ab236
      Chris Mi 提交于
      Currently, all filters with the same priority are linked in a doubly
      linked list. Every filter should have a unique handle. To make the
      handle unique, we need to iterate the list every time to see if the
      handle exists or not when inserting a new filter. It is time-consuming.
      For example, it takes about 5m3.169s to insert 64K rules.
      
      This patch changes cls_flower to use IDR. With this patch, it
      takes about 0m1.127s to insert 64K rules. The improvement is huge.
      
      But please note that in this testing, all filters share the same action.
      If every filter has a unique action, that is another bottleneck.
      Follow-up patch in this patchset addresses that.
      Signed-off-by: NChris Mi <chrism@mellanox.com>
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Acked-by: NJamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c15ab236
    • F
      tcp: Revert "tcp: remove header prediction" · 31770e34
      Florian Westphal 提交于
      This reverts commit 45f119bf.
      
      Eric Dumazet says:
        We found at Google a significant regression caused by
        45f119bf tcp: remove header prediction
      
        In typical RPC  (TCP_RR), when a TCP socket receives data, we now call
        tcp_ack() while we used to not call it.
      
        This touches enough cache lines to cause a slowdown.
      
      so problem does not seem to be HP removal itself but the tcp_ack()
      call.  Therefore, it might be possible to remove HP after all, provided
      one finds a way to elide tcp_ack for most cases.
      Reported-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      31770e34
    • F
      tcp: Revert "tcp: remove CA_ACK_SLOWPATH" · c1d2b4c3
      Florian Westphal 提交于
      This change was a followup to the header prediction removal,
      so first revert this as a prerequisite to back out hp removal.
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c1d2b4c3
  4. 30 8月, 2017 6 次提交