1. 13 Nov, 2019 40 commits
    • T
      bonding: fix unexpected IFF_BONDING bit unset · 0d0ca85a
      Committed by Taehee Yoo
      [ Upstream commit 65de65d9033750d2cf1b336c9d6e9da3a8b5cc6e ]
      
      IFF_BONDING marks a device as a bonding master or a bonding slave.
      ->ndo_add_slave() sets the IFF_BONDING flag and ->ndo_del_slave() clears
      it.
      
      bond0<--bond1
      
      Both bond0 and bond1 are bonding devices and should keep the
      IFF_BONDING flag until they are removed.
      But bond1 would lose IFF_BONDING at ->ndo_del_slave() because that routine
      does not check whether the slave device is itself a bonding device.
      This patch adds an interface type check before removing the
      IFF_BONDING flag.
      
      Test commands:
          ip link add bond0 type bond
          ip link add bond1 type bond
          ip link set bond1 master bond0
          ip link set bond1 nomaster
          ip link del bond1 type bond
          ip link add bond1 type bond
      
      Splat looks like:
      [  226.665555] proc_dir_entry 'bonding/bond1' already registered
      [  226.666440] WARNING: CPU: 0 PID: 737 at fs/proc/generic.c:361 proc_register+0x2a9/0x3e0
      [  226.667571] Modules linked in: bonding af_packet sch_fq_codel ip_tables x_tables unix
      [  226.668662] CPU: 0 PID: 737 Comm: ip Not tainted 5.4.0-rc3+ #96
      [  226.669508] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [  226.670652] RIP: 0010:proc_register+0x2a9/0x3e0
      [  226.671612] Code: 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 39 01 00 00 48 8b 04 24 48 89 ea 48 c7 c7 a0 0b 14 9f 48 8b b0 e0 00 00 00 e8 07 e7 88 ff <0f> 0b 48 c7 c7 40 2d a5 9f e8 59 d6 23 01 48 8b 4c 24 10 48 b8 00
      [  226.675007] RSP: 0018:ffff888050e17078 EFLAGS: 00010282
      [  226.675761] RAX: dffffc0000000008 RBX: ffff88805fdd0f10 RCX: ffffffff9dd344e2
      [  226.676757] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff88806c9f6b8c
      [  226.677751] RBP: ffff8880507160f3 R08: ffffed100d940019 R09: ffffed100d940019
      [  226.678761] R10: 0000000000000001 R11: ffffed100d940018 R12: ffff888050716008
      [  226.679757] R13: ffff8880507160f2 R14: dffffc0000000000 R15: ffffed100a0e2c1e
      [  226.680758] FS:  00007fdc217cc0c0(0000) GS:ffff88806c800000(0000) knlGS:0000000000000000
      [  226.681886] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  226.682719] CR2: 00007f49313424d0 CR3: 0000000050e46001 CR4: 00000000000606f0
      [  226.683727] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  226.684725] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  226.685681] Call Trace:
      [  226.687089]  proc_create_seq_private+0xb3/0xf0
      [  226.687778]  bond_create_proc_entry+0x1b3/0x3f0 [bonding]
      [  226.691458]  bond_netdev_event+0x433/0x970 [bonding]
      [  226.692139]  ? __module_text_address+0x13/0x140
      [  226.692779]  notifier_call_chain+0x90/0x160
      [  226.693401]  register_netdevice+0x9b3/0xd80
      [  226.694010]  ? alloc_netdev_mqs+0x854/0xc10
      [  226.694629]  ? netdev_change_features+0xa0/0xa0
      [  226.695278]  ? rtnl_create_link+0x2ed/0xad0
      [  226.695849]  bond_newlink+0x2a/0x60 [bonding]
      [  226.696422]  __rtnl_newlink+0xb9f/0x11b0
      [  226.696968]  ? rtnl_link_unregister+0x220/0x220
      [ ... ]
      
      Fixes: 0b680e75 ("[PATCH] bonding: Add priv_flag to avoid event mishandling")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      0d0ca85a
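The type check described in the bonding commit above can be sketched in user space. This is a minimal model, not the driver code: `struct net_device_model`, its `is_bond` field, the flag value and both function names are assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical flag bit and device model, for illustration only. */
#define IFF_BONDING 0x1u

struct net_device_model {
    unsigned int priv_flags;
    bool is_bond; /* true if the device is itself a bonding device */
};

/* Release path without the check: the flag is always cleared. */
static void release_slave_unchecked(struct net_device_model *slave)
{
    slave->priv_flags &= ~IFF_BONDING;
}

/* Release path with the check: a nested bond keeps its flag. */
static void release_slave_checked(struct net_device_model *slave)
{
    if (!slave->is_bond)
        slave->priv_flags &= ~IFF_BONDING;
}
```

With the unchecked variant, a nested bond such as bond1 loses the flag when it is released from bond0; with the checked variant only ordinary slaves have it cleared.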
    • E
      ipvs: move old_secure_tcp into struct netns_ipvs · 50e31318
      Committed by Eric Dumazet
      [ Upstream commit c24b75e0f9239e78105f81c5f03a751641eb07ef ]
      
      syzbot reported the following issue:
      
      BUG: KCSAN: data-race in update_defense_level / update_defense_level
      
      read to 0xffffffff861a6260 of 4 bytes by task 3006 on cpu 1:
       update_defense_level+0x621/0xb30 net/netfilter/ipvs/ip_vs_ctl.c:177
       defense_work_handler+0x3d/0xd0 net/netfilter/ipvs/ip_vs_ctl.c:225
       process_one_work+0x3d4/0x890 kernel/workqueue.c:2269
       worker_thread+0xa0/0x800 kernel/workqueue.c:2415
       kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352
      
      write to 0xffffffff861a6260 of 4 bytes by task 7333 on cpu 0:
       update_defense_level+0xa62/0xb30 net/netfilter/ipvs/ip_vs_ctl.c:205
       defense_work_handler+0x3d/0xd0 net/netfilter/ipvs/ip_vs_ctl.c:225
       process_one_work+0x3d4/0x890 kernel/workqueue.c:2269
       worker_thread+0xa0/0x800 kernel/workqueue.c:2415
       kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 0 PID: 7333 Comm: kworker/0:5 Not tainted 5.4.0-rc3+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: events defense_work_handler
      
      Indeed, old_secure_tcp is currently a static variable, while it
      needs to be a per-netns variable.
      
      Fixes: a0840e2e ("IPVS: netns, ip_vs_ctl local vars moved to ipvs struct.")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Simon Horman <horms@verge.net.au>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      50e31318
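The difference between a function-level static and a per-namespace field, as described in the ipvs commit above, can be illustrated with a single-threaded model. It does not reproduce the data race itself; it only shows how one shared "old" value mixes up state that belongs to different namespaces. The struct, its fields other than `old_secure_tcp`, and both function names are assumptions.

```c
/* Hypothetical, simplified per-namespace state. */
struct netns_ipvs_model {
    int sysctl_secure_tcp; /* mode currently requested for this namespace */
    int old_secure_tcp;    /* last mode seen for this namespace */
    int transitions;       /* number of mode changes observed */
};

/* Shared variant: one "old" value serves every namespace. */
static void update_shared(struct netns_ipvs_model *ipvs)
{
    static int old_secure_tcp; /* shared by all callers */

    if (ipvs->sysctl_secure_tcp != old_secure_tcp)
        ipvs->transitions++;
    old_secure_tcp = ipvs->sysctl_secure_tcp;
}

/* Per-namespace variant: each namespace remembers its own value. */
static void update_per_netns(struct netns_ipvs_model *ipvs)
{
    if (ipvs->sysctl_secure_tcp != ipvs->old_secure_tcp)
        ipvs->transitions++;
    ipvs->old_secure_tcp = ipvs->sysctl_secure_tcp;
}
```

Calling the shared variant alternately for two namespaces with different settings reports a transition on nearly every call, while the per-namespace variant reports only real changes.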
    • D
      ipvs: don't ignore errors in case refcounting ip_vs module fails · 102f4078
      Committed by Davide Caratti
      [ Upstream commit 62931f59ce9cbabb934a431f48f2f1f441c605ac ]
      
      If the IPVS module is removed while the sync daemon is starting, there is
      a small gap where try_module_get() might fail getting the refcount inside
      ip_vs_use_count_inc(). Then, the refcounts of IPVS module are unbalanced,
      and the subsequent call to stop_sync_thread() causes the following splat:
      
       WARNING: CPU: 0 PID: 4013 at kernel/module.c:1146 module_put.part.44+0x15b/0x290
        Modules linked in: ip_vs(-) nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 veth ip6table_filter ip6_tables iptable_filter binfmt_misc intel_rapl_msr intel_rapl_common crct10dif_pclmul crc32_pclmul ext4 mbcache jbd2 ghash_clmulni_intel snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_nhlt snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd cryptd glue_helper joydev pcspkr snd_timer virtio_balloon snd soundcore i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi virtio_net net_failover virtio_blk failover virtio_console qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ata_piix ttm crc32c_intel serio_raw drm virtio_pci libata virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: nf_defrag_ipv6]
        CPU: 0 PID: 4013 Comm: modprobe Tainted: G        W         5.4.0-rc1.upstream+ #741
        Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
        RIP: 0010:module_put.part.44+0x15b/0x290
        Code: 04 25 28 00 00 00 0f 85 18 01 00 00 48 83 c4 68 5b 5d 41 5c 41 5d 41 5e 41 5f c3 89 44 24 28 83 e8 01 89 c5 0f 89 57 ff ff ff <0f> 0b e9 78 ff ff ff 65 8b 1d 67 83 26 4a 89 db be 08 00 00 00 48
        RSP: 0018:ffff888050607c78 EFLAGS: 00010297
        RAX: 0000000000000003 RBX: ffffffffc1420590 RCX: ffffffffb5db0ef9
        RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffffc1420590
        RBP: 00000000ffffffff R08: fffffbfff82840b3 R09: fffffbfff82840b3
        R10: 0000000000000001 R11: fffffbfff82840b2 R12: 1ffff1100a0c0f90
        R13: ffffffffc1420200 R14: ffff88804f533300 R15: ffff88804f533ca0
        FS:  00007f8ea9720740(0000) GS:ffff888053800000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00007f3245abe000 CR3: 000000004c28a006 CR4: 00000000001606f0
        Call Trace:
         stop_sync_thread+0x3a3/0x7c0 [ip_vs]
         ip_vs_sync_net_cleanup+0x13/0x50 [ip_vs]
         ops_exit_list.isra.5+0x94/0x140
         unregister_pernet_operations+0x29d/0x460
         unregister_pernet_device+0x26/0x60
         ip_vs_cleanup+0x11/0x38 [ip_vs]
         __x64_sys_delete_module+0x2d5/0x400
         do_syscall_64+0xa5/0x4e0
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
        RIP: 0033:0x7f8ea8bf0db7
        Code: 73 01 c3 48 8b 0d b9 80 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 89 80 2c 00 f7 d8 64 89 01 48
        RSP: 002b:00007ffcd38d2fe8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
        RAX: ffffffffffffffda RBX: 0000000002436240 RCX: 00007f8ea8bf0db7
        RDX: 0000000000000000 RSI: 0000000000000800 RDI: 00000000024362a8
        RBP: 0000000000000000 R08: 00007f8ea8eba060 R09: 00007f8ea8c658a0
        R10: 00007ffcd38d2a60 R11: 0000000000000206 R12: 0000000000000000
        R13: 0000000000000001 R14: 00000000024362a8 R15: 0000000000000000
        irq event stamp: 4538
        hardirqs last  enabled at (4537): [<ffffffffb6193dde>] quarantine_put+0x9e/0x170
        hardirqs last disabled at (4538): [<ffffffffb5a0556a>] trace_hardirqs_off_thunk+0x1a/0x20
        softirqs last  enabled at (4522): [<ffffffffb6f8ebe9>] sk_common_release+0x169/0x2d0
        softirqs last disabled at (4520): [<ffffffffb6f8eb3e>] sk_common_release+0xbe/0x2d0
      
      Check the return value of ip_vs_use_count_inc() and let its caller return
      proper error. Inside do_ip_vs_set_ctl() the module is already refcounted,
      we don't need refcount/derefcount there. Finally, in register_ip_vs_app()
      and start_sync_thread(), take the module refcount earlier and ensure it's
      released in the error path.
      
      Change since v1:
       - better return values in case of failure of ip_vs_use_count_inc(),
         thanks to Julian Anastasov
       - no need to increase/decrease the module refcount in ip_vs_set_ctl(),
         thanks to Julian Anastasov
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Simon Horman <horms@verge.net.au>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      102f4078
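The pattern the ipvs refcount commit above describes (check the result of taking the reference, take it early, release it on the error path) might look roughly like this in a user-space model. `struct module_model`, the helper names and the specific error codes are assumptions, not the kernel's.

```c
#include <stdbool.h>
#include <errno.h>

/* Hypothetical stand-in for a module with a reference count. */
struct module_model {
    int refcnt;
    bool going_away; /* true while the module is being removed */
};

/* Models a reference grab that can fail, like try_module_get(). */
static bool use_count_inc(struct module_model *m)
{
    if (m->going_away)
        return false;
    m->refcnt++;
    return true;
}

static void use_count_dec(struct module_model *m)
{
    m->refcnt--;
}

/* Takes the reference first and drops it again if setup fails. */
static int start_thread_checked(struct module_model *m, bool setup_ok)
{
    if (!use_count_inc(m))
        return -ENOPROTOOPT; /* error code chosen for illustration */

    if (!setup_ok) {
        use_count_dec(m);    /* keep the count balanced on failure */
        return -ENOMEM;
    }
    return 0;
}
```

In every outcome the count ends up balanced: unchanged on either failure, incremented exactly once on success.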
    • P
      netfilter: nf_flow_table: set timeout before insertion into hashes · 81de0b50
      Committed by Pablo Neira Ayuso
      [ Upstream commit daf61b026f4686250e6afa619e6d7b49edc61df7 ]
      
      Another garbage collector might remove an entry that is not fully set up yet.
      
      [570953.958293] RIP: 0010:memcmp+0x9/0x50
      [...]
      [570953.958567]  flow_offload_hash_cmp+0x1e/0x30 [nf_flow_table]
      [570953.958585]  flow_offload_lookup+0x8c/0x110 [nf_flow_table]
      [570953.958606]  nf_flow_offload_ip_hook+0x135/0xb30 [nf_flow_table]
      [570953.958624]  nf_flow_offload_inet_hook+0x35/0x37 [nf_flow_table_inet]
      [570953.958646]  nf_hook_slow+0x3c/0xb0
      [570953.958664]  __netif_receive_skb_core+0x90f/0xb10
      [570953.958678]  ? ip_rcv_finish+0x82/0xa0
      [570953.958692]  __netif_receive_skb_one_core+0x3b/0x80
      [570953.958711]  __netif_receive_skb+0x18/0x60
      [570953.958727]  netif_receive_skb_internal+0x45/0xf0
      [570953.958741]  napi_gro_receive+0xcd/0xf0
      [570953.958764]  ixgbe_clean_rx_irq+0x432/0xe00 [ixgbe]
      [570953.958782]  ixgbe_poll+0x27b/0x700 [ixgbe]
      [570953.958796]  net_rx_action+0x284/0x3c0
      [570953.958817]  __do_softirq+0xcc/0x27c
      [570953.959464]  irq_exit+0xe8/0x100
      [570953.960097]  do_IRQ+0x59/0xe0
      [570953.960734]  common_interrupt+0xf/0xf
      
      Fixes: 43c8f131184f ("netfilter: nf_flow_table: fix missing error check for rhashtable_insert_fast")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      81de0b50
    • H
      scsi: qla2xxx: Initialized mailbox to prevent driver load failure · d45fc2ed
      Committed by Himanshu Madhani
      [ Upstream commit c2ff2a36eff60efb5e123c940115216d6bf65684 ]
      
      This patch fixes an issue with Gen7 adapters in a blade environment where
      one of the ports is not detected by the driver. Firmware expects mailbox 11
      to be set or cleared by the driver for newer ISPs.
      
      The following message is seen in the log file:
      
      [   18.810892] qla2xxx [0000:d8:00.0]-1820:1: **** Failed=102 mb[0]=4005 mb[1]=37 mb[2]=20 mb[3]=8
      [   18.819596]  cmd=2 ****
      
      [mkp: typos]
      
      Link: https://lore.kernel.org/r/20191022193643.7076-2-hmadhani@marvell.com
      Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      d45fc2ed
    • D
      scsi: lpfc: Honor module parameter lpfc_use_adisc · b6612a3d
      Committed by Daniel Wagner
      [ Upstream commit 0fd103ccfe6a06e40e2d9d8c91d96332cc9e1239 ]
      
      The initial lpfc_desc_set_adisc implementation in commit
      dea3101e ("lpfc: add Emulex FC driver version 8.0.28") enabled ADISC if
      
      	cfg_use_adisc && RSCN_MODE && FCP_2_DEVICE
      
      In commit 92d7f7b0 ("[SCSI] lpfc: NPIV: add NPIV support on top of
      SLI-3") this changed to
      
      	(cfg_use_adisc && RSCN_MODE) || FCP_2_DEVICE
      
      and later in commit ffc95493 ("[SCSI] lpfc 8.3.13: FC Discovery Fixes
      and enhancements.") to
      
      	(cfg_use_adisc && RSCN_MODE) || (FCP_2_DEVICE && FCP_TARGET)
      
      A customer reports that after a devloss, an ADISC failure is logged. It
      turns out the ADISC flag is set even if the user explicitly set
      lpfc_use_adisc = 0.
      
      [Sat Dec 22 22:55:58 2018] lpfc 0000:82:00.0: 2:(0):0203 Devloss timeout on WWPN 50:01:43:80:12:8e:40:20 NPort x05df00 Data: x82000000 x8 xa
      [Sat Dec 22 23:08:20 2018] lpfc 0000:82:00.0: 2:(0):2755 ADISC failure DID:05DF00 Status:x9/x70000
      
      [mkp: fixed Hannes' email]
      
      Fixes: 92d7f7b0 ("[SCSI] lpfc: NPIV: add NPIV support on top of SLI-3")
      Cc: Dick Kennedy <dick.kennedy@broadcom.com>
      Cc: James Smart <james.smart@broadcom.com>
      Link: https://lore.kernel.org/r/20191022072112.132268-1-dwagner@suse.de
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Daniel Wagner <dwagner@suse.de>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      b6612a3d
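The three conditions quoted in the lpfc commit above can be evaluated side by side to see why a module parameter of 0 stopped being honored. This sketch only encodes the expressions as written in the message, with plain booleans standing in for the driver's flags; the function names are assumptions.

```c
#include <stdbool.h>

/* Original condition: every term must hold, so cfg == false disables ADISC. */
static bool adisc_v1(bool cfg, bool rscn, bool fcp2)
{
    return cfg && rscn && fcp2;
}

/* After the NPIV change: an FCP-2 device enables ADISC regardless of cfg. */
static bool adisc_v2(bool cfg, bool rscn, bool fcp2)
{
    return (cfg && rscn) || fcp2;
}

/* After the discovery fixes: still reachable with cfg == false. */
static bool adisc_v3(bool cfg, bool rscn, bool fcp2, bool target)
{
    return (cfg && rscn) || (fcp2 && target);
}
```

With `cfg` false and an FCP-2 target present, the first form yields false while the second and third yield true, which matches the behavior the customer report describes.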
    • H
      net: openvswitch: free vport unless register_netdevice() succeeds · 4e80e561
      Committed by Hillf Danton
      [ Upstream commit 9464cc37f3671ee69cb1c00662b5e1f113a96b23 ]
      
      syzbot found the following crash on:
      
      HEAD commit:    1e78030e Merge tag 'mmc-v5.3-rc1' of git://git.kernel.org/..
      git tree:       upstream
      console output: https://syzkaller.appspot.com/x/log.txt?x=148d3d1a600000
      kernel config:  https://syzkaller.appspot.com/x/.config?x=30cef20daf3e9977
      dashboard link: https://syzkaller.appspot.com/bug?extid=13210896153522fe1ee5
      compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
      syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=136aa8c4600000
      C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=109ba792600000
      
      =====================================================================
      BUG: memory leak
      unreferenced object 0xffff8881207e4100 (size 128):
         comm "syz-executor032", pid 7014, jiffies 4294944027 (age 13.830s)
         hex dump (first 32 bytes):
           00 70 16 18 81 88 ff ff 80 af 8c 22 81 88 ff ff  .p........."....
           00 b6 23 17 81 88 ff ff 00 00 00 00 00 00 00 00  ..#.............
         backtrace:
           [<000000000eb78212>] kmemleak_alloc_recursive  include/linux/kmemleak.h:43 [inline]
           [<000000000eb78212>] slab_post_alloc_hook mm/slab.h:522 [inline]
           [<000000000eb78212>] slab_alloc mm/slab.c:3319 [inline]
           [<000000000eb78212>] kmem_cache_alloc_trace+0x145/0x2c0 mm/slab.c:3548
           [<00000000006ea6c6>] kmalloc include/linux/slab.h:552 [inline]
           [<00000000006ea6c6>] kzalloc include/linux/slab.h:748 [inline]
           [<00000000006ea6c6>] ovs_vport_alloc+0x37/0xf0  net/openvswitch/vport.c:130
           [<00000000f9a04a7d>] internal_dev_create+0x24/0x1d0  net/openvswitch/vport-internal_dev.c:164
           [<0000000056ee7c13>] ovs_vport_add+0x81/0x190  net/openvswitch/vport.c:199
           [<000000005434efc7>] new_vport+0x19/0x80 net/openvswitch/datapath.c:194
           [<00000000b7b253f1>] ovs_dp_cmd_new+0x22f/0x410  net/openvswitch/datapath.c:1614
           [<00000000e0988518>] genl_family_rcv_msg+0x2ab/0x5b0  net/netlink/genetlink.c:629
           [<00000000d0cc9347>] genl_rcv_msg+0x54/0x9c net/netlink/genetlink.c:654
           [<000000006694b647>] netlink_rcv_skb+0x61/0x170  net/netlink/af_netlink.c:2477
           [<0000000088381f37>] genl_rcv+0x29/0x40 net/netlink/genetlink.c:665
           [<00000000dad42a47>] netlink_unicast_kernel  net/netlink/af_netlink.c:1302 [inline]
           [<00000000dad42a47>] netlink_unicast+0x1ec/0x2d0  net/netlink/af_netlink.c:1328
           [<0000000067e6b079>] netlink_sendmsg+0x270/0x480  net/netlink/af_netlink.c:1917
           [<00000000aab08a47>] sock_sendmsg_nosec net/socket.c:637 [inline]
           [<00000000aab08a47>] sock_sendmsg+0x54/0x70 net/socket.c:657
           [<000000004cb7c11d>] ___sys_sendmsg+0x393/0x3c0 net/socket.c:2311
           [<00000000c4901c63>] __sys_sendmsg+0x80/0xf0 net/socket.c:2356
           [<00000000c10abb2d>] __do_sys_sendmsg net/socket.c:2365 [inline]
           [<00000000c10abb2d>] __se_sys_sendmsg net/socket.c:2363 [inline]
           [<00000000c10abb2d>] __x64_sys_sendmsg+0x23/0x30 net/socket.c:2363
      
      BUG: memory leak
      unreferenced object 0xffff88811723b600 (size 64):
         comm "syz-executor032", pid 7014, jiffies 4294944027 (age 13.830s)
         hex dump (first 32 bytes):
           01 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00  ................
           00 00 00 00 00 00 00 00 02 00 00 00 05 35 82 c1  .............5..
         backtrace:
           [<00000000352f46d8>] kmemleak_alloc_recursive  include/linux/kmemleak.h:43 [inline]
           [<00000000352f46d8>] slab_post_alloc_hook mm/slab.h:522 [inline]
           [<00000000352f46d8>] slab_alloc mm/slab.c:3319 [inline]
           [<00000000352f46d8>] __do_kmalloc mm/slab.c:3653 [inline]
           [<00000000352f46d8>] __kmalloc+0x169/0x300 mm/slab.c:3664
           [<000000008e48f3d1>] kmalloc include/linux/slab.h:557 [inline]
           [<000000008e48f3d1>] ovs_vport_set_upcall_portids+0x54/0xd0  net/openvswitch/vport.c:343
           [<00000000541e4f4a>] ovs_vport_alloc+0x7f/0xf0  net/openvswitch/vport.c:139
           [<00000000f9a04a7d>] internal_dev_create+0x24/0x1d0  net/openvswitch/vport-internal_dev.c:164
           [<0000000056ee7c13>] ovs_vport_add+0x81/0x190  net/openvswitch/vport.c:199
           [<000000005434efc7>] new_vport+0x19/0x80 net/openvswitch/datapath.c:194
           [<00000000b7b253f1>] ovs_dp_cmd_new+0x22f/0x410  net/openvswitch/datapath.c:1614
           [<00000000e0988518>] genl_family_rcv_msg+0x2ab/0x5b0  net/netlink/genetlink.c:629
           [<00000000d0cc9347>] genl_rcv_msg+0x54/0x9c net/netlink/genetlink.c:654
           [<000000006694b647>] netlink_rcv_skb+0x61/0x170  net/netlink/af_netlink.c:2477
           [<0000000088381f37>] genl_rcv+0x29/0x40 net/netlink/genetlink.c:665
           [<00000000dad42a47>] netlink_unicast_kernel  net/netlink/af_netlink.c:1302 [inline]
           [<00000000dad42a47>] netlink_unicast+0x1ec/0x2d0  net/netlink/af_netlink.c:1328
           [<0000000067e6b079>] netlink_sendmsg+0x270/0x480  net/netlink/af_netlink.c:1917
           [<00000000aab08a47>] sock_sendmsg_nosec net/socket.c:637 [inline]
           [<00000000aab08a47>] sock_sendmsg+0x54/0x70 net/socket.c:657
           [<000000004cb7c11d>] ___sys_sendmsg+0x393/0x3c0 net/socket.c:2311
           [<00000000c4901c63>] __sys_sendmsg+0x80/0xf0 net/socket.c:2356
      
      BUG: memory leak
      unreferenced object 0xffff8881228ca500 (size 128):
         comm "syz-executor032", pid 7015, jiffies 4294944622 (age 7.880s)
         hex dump (first 32 bytes):
           00 f0 27 18 81 88 ff ff 80 ac 8c 22 81 88 ff ff  ..'........"....
           40 b7 23 17 81 88 ff ff 00 00 00 00 00 00 00 00  @.#.............
         backtrace:
           [<000000000eb78212>] kmemleak_alloc_recursive  include/linux/kmemleak.h:43 [inline]
           [<000000000eb78212>] slab_post_alloc_hook mm/slab.h:522 [inline]
           [<000000000eb78212>] slab_alloc mm/slab.c:3319 [inline]
           [<000000000eb78212>] kmem_cache_alloc_trace+0x145/0x2c0 mm/slab.c:3548
           [<00000000006ea6c6>] kmalloc include/linux/slab.h:552 [inline]
           [<00000000006ea6c6>] kzalloc include/linux/slab.h:748 [inline]
           [<00000000006ea6c6>] ovs_vport_alloc+0x37/0xf0  net/openvswitch/vport.c:130
           [<00000000f9a04a7d>] internal_dev_create+0x24/0x1d0  net/openvswitch/vport-internal_dev.c:164
           [<0000000056ee7c13>] ovs_vport_add+0x81/0x190  net/openvswitch/vport.c:199
           [<000000005434efc7>] new_vport+0x19/0x80 net/openvswitch/datapath.c:194
           [<00000000b7b253f1>] ovs_dp_cmd_new+0x22f/0x410  net/openvswitch/datapath.c:1614
           [<00000000e0988518>] genl_family_rcv_msg+0x2ab/0x5b0  net/netlink/genetlink.c:629
           [<00000000d0cc9347>] genl_rcv_msg+0x54/0x9c net/netlink/genetlink.c:654
           [<000000006694b647>] netlink_rcv_skb+0x61/0x170  net/netlink/af_netlink.c:2477
           [<0000000088381f37>] genl_rcv+0x29/0x40 net/netlink/genetlink.c:665
           [<00000000dad42a47>] netlink_unicast_kernel  net/netlink/af_netlink.c:1302 [inline]
           [<00000000dad42a47>] netlink_unicast+0x1ec/0x2d0  net/netlink/af_netlink.c:1328
           [<0000000067e6b079>] netlink_sendmsg+0x270/0x480  net/netlink/af_netlink.c:1917
           [<00000000aab08a47>] sock_sendmsg_nosec net/socket.c:637 [inline]
           [<00000000aab08a47>] sock_sendmsg+0x54/0x70 net/socket.c:657
           [<000000004cb7c11d>] ___sys_sendmsg+0x393/0x3c0 net/socket.c:2311
           [<00000000c4901c63>] __sys_sendmsg+0x80/0xf0 net/socket.c:2356
           [<00000000c10abb2d>] __do_sys_sendmsg net/socket.c:2365 [inline]
           [<00000000c10abb2d>] __se_sys_sendmsg net/socket.c:2363 [inline]
           [<00000000c10abb2d>] __x64_sys_sendmsg+0x23/0x30 net/socket.c:2363
      =====================================================================
      
      The function in net core, register_netdevice(), may fail with vport's
      destruction callback either invoked or not. After commit 309b66970ee2
      ("net: openvswitch: do not free vport if register_netdevice() is failed."),
      the duty to destroy vport is offloaded from the driver OTOH, which ends
      up in the memory leak reported.
      
      It is fixed by releasing the vport unless the device is registered
      successfully. To do that, the callback assignment is deferred until the
      device is registered.
      
      Reported-by: syzbot+13210896153522fe1ee5@syzkaller.appspotmail.com
      Fixes: 309b66970ee2 ("net: openvswitch: do not free vport if register_netdevice() is failed.")
      Cc: Taehee Yoo <ap420073@gmail.com>
      Cc: Greg Rose <gvrose8192@gmail.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Cc: Ying Xue <ying.xue@windriver.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Signed-off-by: Hillf Danton <hdanton@sina.com>
      Acked-by: Pravin B Shelar <pshelar@ovn.org>
      [sbrivio: this was sent to dev@openvswitch.org and never made its way
       to netdev -- resending original patch]
      Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
      Reviewed-by: Greg Rose <gvrose8192@gmail.com>
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      4e80e561
    • D
      RDMA/uverbs: Prevent potential underflow · 02725331
      Committed by Dan Carpenter
      [ Upstream commit a9018adfde809d44e71189b984fa61cc89682b5e ]
      
      The issue is in drivers/infiniband/core/uverbs_std_types_cq.c in the
      UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE) function.  We check that:
      
              if (attr.comp_vector >= attrs->ufile->device->num_comp_vectors) {
      
      But we don't check if "attr.comp_vector" is negative.  It could
      potentially lead to an array underflow.  My concern would be where
      cq->vector is used in the create_cq() function from the cxgb4 driver.
      
      And really "attr.comp_vector" appears as a u32 to user space, so that's
      the right type to use.
      
      Fixes: 9ee79fce ("IB/core: Add completion queue (cq) object actions")
      Link: https://lore.kernel.org/r/20191011133419.GA22905@mwanda
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      02725331
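Why an upper-bound check alone misses negative values, and why switching to an unsigned type closes the gap, can be shown with two small predicates. The function names are assumptions; only the comparison mirrors the one quoted in the uverbs commit above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Signed index: only the upper bound is checked, so -1 slips through. */
static bool vector_ok_signed(int comp_vector, int num_comp_vectors)
{
    return !(comp_vector >= num_comp_vectors);
}

/* Unsigned index: a value that was "negative" becomes huge and is rejected. */
static bool vector_ok_unsigned(uint32_t comp_vector, uint32_t num_comp_vectors)
{
    return !(comp_vector >= num_comp_vectors);
}
```

With eight vectors, the signed check accepts -1, whereas the same bit pattern passed as a `uint32_t` is 4294967295 and fails the check.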
    • H
      scsi: qla2xxx: fixup incorrect usage of host_byte · d582769a
      Committed by Hannes Reinecke
      [ Upstream commit 66cf50e65b183c863825f5c28a818e3f47a72e40 ]
      
      DRIVER_ERROR is a driver byte setting, not a host byte.  The qla2xxx
      driver should rather return DID_ERROR here to be in line with the other
      drivers.
      
      Link: https://lore.kernel.org/r/20191018140458.108278-1-hare@suse.de
      Signed-off-by: Hannes Reinecke <hare@suse.com>
      Acked-by: Himanshu Madhani <hmadhani@marvell.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      d582769a
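The distinction between the host byte and the driver byte in the qla2xxx commit above can be illustrated with the result-word layout used by the SCSI midlayer of that era (host byte in bits 16-23, driver byte in bits 24-31). The numeric values below are recalled from the kernel headers of that period and should be treated as assumptions, as should the helper names.

```c
#include <stdint.h>

/* Values as recalled from the SCSI headers of that era (assumptions). */
#define DID_BAD_TARGET 0x04 /* host byte code */
#define DID_ERROR      0x07 /* host byte code */
#define DRIVER_ERROR   0x04 /* driver byte code */

/* Place a code into the host byte position of a result word. */
static uint32_t result_with_host_byte(uint8_t code)
{
    return (uint32_t)code << 16;
}

static uint8_t get_host_byte(uint32_t result)
{
    return (uint8_t)((result >> 16) & 0xff);
}

static uint8_t get_driver_byte(uint32_t result)
{
    return (uint8_t)((result >> 24) & 0xff);
}
```

Under these assumed values, shifting `DRIVER_ERROR` into the host byte position produces a result that decodes as `DID_BAD_TARGET` rather than as a generic error, which is one way to see why `DID_ERROR` is the appropriate code there.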
    • N
      net/mlx5: prevent memory leak in mlx5_fpga_conn_create_cq · 42de3a90
      Committed by Navid Emamdoost
      [ Upstream commit c8c2a057fdc7de1cd16f4baa51425b932a42eb39 ]
      
      In mlx5_fpga_conn_create_cq if mlx5_vector2eqn fails the allocated
      memory should be released.
      
      Fixes: 537a5057 ("net/mlx5: FPGA, Add high-speed connection routines")
      Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      42de3a90
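The error-path pattern behind the mlx5 leak fix above (free what was allocated when a later lookup fails) is commonly written with a cleanup label. The following is a generic user-space sketch: `lookup_eqn`, `create_cq_model` and the `live_allocs` counter are hypothetical names, not the driver's functions.

```c
#include <stdlib.h>

static int live_allocs; /* buffers not yet freed, tracked for the demo only */

/* Hypothetical stand-in for a lookup that can fail. */
static int lookup_eqn(int vector, int *eqn)
{
    if (vector < 0)
        return -1;
    *eqn = vector + 100;
    return 0;
}

/* Allocates a buffer, then performs a lookup that may fail. */
static int create_cq_model(int vector, void **out)
{
    int eqn;
    int err;
    void *buf = malloc(64);

    if (!buf)
        return -1;
    live_allocs++;

    err = lookup_eqn(vector, &eqn);
    if (err)
        goto err_free; /* release the buffer instead of returning directly */

    (void)eqn;         /* a real implementation would use the result here */
    *out = buf;
    return 0;

err_free:
    free(buf);
    live_allocs--;
    return err;
}
```

On the failing path the buffer is released before returning, so the allocation count stays at zero; on success ownership passes to the caller.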
    • T
      net/mlx5e: TX, Fix consumer index of error cqe dump · 7dfdcd94
      Committed by Tariq Toukan
      [ Upstream commit 61ea02d2c13106116c6e4916ac5d9dd41151c959 ]
      
      The completion queue consumer index increments upon a call to
      mlx5_cqwq_pop().
      When dumping an error CQE, the index is already incremented.
      Decrease one for the print command.
      
      Fixes: 16cc14d8 ("net/mlx5e: Dump xmit error completions")
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      7dfdcd94
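The off-by-one described in the mlx5e commit above can be modeled with a free-running consumer counter and a power-of-two ring. The struct and helper names are assumptions loosely patterned on the commit text; the real driver structures differ.

```c
#include <stdint.h>

/* Hypothetical completion-queue model with a power-of-two size. */
struct cqwq_model {
    uint32_t cc;    /* free-running consumer counter */
    uint32_t sz_m1; /* ring size minus one, used as an index mask */
};

/* Consuming an entry advances the counter. */
static void cqwq_pop(struct cqwq_model *wq)
{
    wq->cc++;
}

/* Map a counter value to a ring index. */
static uint32_t cqwq_ctr2ix(const struct cqwq_model *wq, uint32_t ctr)
{
    return ctr & wq->sz_m1;
}

/* Index of the entry that was just popped: one behind the counter. */
static uint32_t last_popped_index(const struct cqwq_model *wq)
{
    return cqwq_ctr2ix(wq, wq->cc - 1);
}
```

Because the subtraction is done on the unsigned counter before masking, the result also wraps correctly at the ring boundary.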
    • K
      RDMA/qedr: Fix reported firmware version · 48dd7128
      Committed by Kamal Heib
      [ Upstream commit b806c94ee44e53233b8ce6c92d9078d9781786a5 ]
      
      Remove spaces from the reported firmware version string.
      Actual value:
      $ cat /sys/class/infiniband/qedr0/fw_ver
      8. 37. 7. 0
      
      Expected value:
      $ cat /sys/class/infiniband/qedr0/fw_ver
      8.37.7.0
      
      Fixes: ec72fce4 ("qedr: Add support for RoCE HW init")
      Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
      Acked-by: Michal Kalderon <michal.kalderon@marvell.com>
      Link: https://lore.kernel.org/r/20191007210730.7173-1-kamalheib1@gmail.com
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      48dd7128
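The two outputs shown in the qedr commit above differ only in the format string. A small sketch makes that concrete, assuming the version is packed one byte per component in a 32-bit word; that packing and the function names are assumptions for illustration.

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/* Format with a space after each dot, as in the "actual value" above. */
static int fw_ver_spaced(char *buf, size_t len, uint32_t v)
{
    return snprintf(buf, len, "%d. %d. %d. %d",
                    (int)((v >> 24) & 0xff), (int)((v >> 16) & 0xff),
                    (int)((v >> 8) & 0xff), (int)(v & 0xff));
}

/* Format without spaces, as in the "expected value" above. */
static int fw_ver_compact(char *buf, size_t len, uint32_t v)
{
    return snprintf(buf, len, "%d.%d.%d.%d",
                    (int)((v >> 24) & 0xff), (int)((v >> 16) & 0xff),
                    (int)((v >> 8) & 0xff), (int)(v & 0xff));
}
```

For the packed value `0x08250700`, the first function yields the spaced string and the second yields `8.37.7.0`.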
    • P
      iw_cxgb4: fix ECN check on the passive accept · 6208c2bf
      Committed by Potnuri Bharat Teja
      [ Upstream commit 612e0486ad0845c41ac10492e78144f99e326375 ]
      
      pass_accept_req() uses the same skb for handling the accept request and
      sending the accept reply to HW. Here the req and rpl structures point to
      the same skb->data, which is overwritten by INIT_TP_WR(); this leads to
      accessing corrupt req fields in accept_cr() while checking for ECN flags.
      Reorder the code in accept_cr() to fetch the correct req fields.
      
      Fixes: 92e7ae71 ("iw_cxgb4: Choose appropriate hw mtu index and ISS for iWARP connections")
      Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
      Link: https://lore.kernel.org/r/20191003104353.11590-1-bharat@chelsio.com
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      6208c2bf
    • R
      RDMA/mlx5: Clear old rate limit when closing QP · 89aa9e26
      Committed by Rafi Wiener
      [ Upstream commit c8973df2da677f375f8b12b6eefca2f44c8884d5 ]
      
      Before a QP is closed it moves to the ERROR state. When this happened,
      the QP was left with the old rate limit, which had already been removed
      from the table.
      
      Fixes: 7d29f349 ("IB/mlx5: Properly adjust rate limit on QP state transitions")
      Signed-off-by: Rafi Wiener <rafiw@mellanox.com>
      Signed-off-by: Oleg Kuporosov <olegk@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Link: https://lore.kernel.org/r/20191002120243.16971-1-leon@kernel.org
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      89aa9e26
    • Z
      HID: intel-ish-hid: fix wrong error handling in ishtp_cl_alloc_tx_ring() · d6706b2e
      Committed by Zhang Lixu
      [ Upstream commit 16ff7bf6dbcc6f77d2eec1ac9120edf44213c2f1 ]
      
      When allocating tx ring buffers fails, the tx buffers should be freed, not the rx buffers.
      Signed-off-by: Zhang Lixu <lixu.zhang@intel.com>
      Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      d6706b2e
    • B
      dmaengine: sprd: Fix the possible memory leak issue · 113a154e
      Committed by Baolin Wang
      [ Upstream commit ec1ac309596a7bdf206743b092748205f6cd5720 ]
      
      If we terminate the channel to free all descriptors associated with this
      channel, we leak the memory of the current descriptor if it has not
      completed, since it has been deleted from the desc_issued list and has
      not been added to the desc_completed list.
      
      Thus, when freeing the descriptors associated with a channel, we should
      check whether the current descriptor has completed and, if not, free it
      to avoid this issue.
      
      Fixes: 9b3b8171 ("dmaengine: sprd: Add Spreadtrum DMA driver")
      Reported-by: Zhenfang Wang <zhenfang.wang@unisoc.com>
      Tested-by: Zhenfang Wang <zhenfang.wang@unisoc.com>
      Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
      Link: https://lore.kernel.org/r/170dbbc6d5366b6fa974ce2d366652e23a334251.1570609788.git.baolin.wang@linaro.org
      Signed-off-by: Vinod Koul <vkoul@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      113a154e
    • R
      dmaengine: xilinx_dma: Fix control reg update in vdma_channel_set_config · 6040f96d
      Committed by Radhey Shyam Pandey
      [ Upstream commit 6c6de1ddb1be3840f2ed5cc9d009a622720940c9 ]
      
      In vdma_channel_set_config, clear the delay, frame count and master mask
      fields before writing their new values. This avoids programming an
      incorrect state when the input parameters differ from the defaults.
      Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
      Acked-by: Appana Durga Kedareswara rao <appana.durga.rao@xilinx.com>
      Signed-off-by: Michal Simek <michal.simek@xilinx.com>
      Link: https://lore.kernel.org/r/1569495060-18117-3-git-send-email-radhey.shyam.pandey@xilinx.com
      Signed-off-by: Vinod Koul <vkoul@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      6040f96d
    • N
      HID: google: add magnemite/masterball USB ids · 78e7e024
      Nicolas Boichat 提交于
      [ Upstream commit 9e4dbc4646a84b2562ea7c64a542740687ff7daf ]
      
      Add 2 additional hammer-like devices.
      Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • V
      PCI: tegra: Enable Relaxed Ordering only for Tegra20 & Tegra30 · 8181146c
      Vidya Sagar committed
      commit 7be142caabc4780b13a522c485abc806de5c4114 upstream.
      
      The PCI Tegra controller conversion to a device tree configurable
      driver in commit d1523b52 ("PCI: tegra: Move PCIe driver
      to drivers/pci/host") implied that code for the driver can be
      compiled in for a kernel supporting multiple platforms.
      
      Unfortunately, a blind move of the code did not check that some of the
      quirks that were applied in arch/arm (e.g. enabling Relaxed Ordering on
      all PCI devices - since the quirk hook erroneously matches PCI_ANY_ID
      for both Vendor-ID and Device-ID) are now applied in all kernels that
      compile the PCI Tegra controller driver, DT and ACPI alike.
      
      This is completely wrong, in that enablement of Relaxed Ordering is only
      required by default in Tegra20 platforms as described in the Tegra20
      Technical Reference Manual (available at
      https://developer.nvidia.com/embedded/downloads#?search=tegra%202 in
      Section 34.1, where it is mentioned that Relaxed Ordering bit needs to
      be enabled in its root ports to avoid deadlock in hardware) and in the
      Tegra30 platforms for the same reasons (unfortunately not documented
      in the TRM).
      
      There is no other strict requirement on PCI devices Relaxed Ordering
      enablement on any other Tegra platforms or PCI host bridge driver.
      
      Fix this quite upsetting situation by limiting the vendor and device IDs
      to which the Relaxed Ordering quirk applies to the root ports in
      question, reported above.
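
The difference between a wildcard match and an ID-limited match can be sketched as follows. This is an illustration only: `ANY_ID` merely models the idea of PCI_ANY_ID, and the two device IDs in the table are hypothetical placeholders, not the real Tegra20/Tegra30 root port IDs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ANY_ID        0xFFFFu   /* wildcard, modelled on the idea of PCI_ANY_ID */
#define VENDOR_NVIDIA 0x10DEu

struct quirk_match {
    uint16_t vendor;
    uint16_t device;
};

/* Old behaviour: one wildcard entry that matches every PCI device. */
const struct quirk_match match_everything[] = {
    { ANY_ID, ANY_ID },
};

/* New behaviour: only specific root ports (device IDs are hypothetical). */
const struct quirk_match relaxed_ordering_ports[] = {
    { VENDOR_NVIDIA, 0x0001 },
    { VENDOR_NVIDIA, 0x0002 },
};

static bool id_matches(uint16_t want, uint16_t have)
{
    return want == ANY_ID || want == have;
}

/* Return true if any table entry matches the given vendor/device pair. */
bool quirk_applies(const struct quirk_match *tbl, size_t n,
                   uint16_t vendor, uint16_t device)
{
    for (size_t i = 0; i < n; i++) {
        if (id_matches(tbl[i].vendor, vendor) &&
            id_matches(tbl[i].device, device))
            return true;
    }
    return false;
}
```

With the wildcard table, a device from any vendor gets the quirk; with the limited table, only the listed root ports do.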
      Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
      [lorenzo.pieralisi@arm.com: completely rewrote the commit log/fixes tag]
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Acked-by: Thierry Reding <treding@nvidia.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • S
      usbip: Implement SG support to vhci-hcd and stub driver · e2dd254b
      Suwan Kim committed
      commit ea44d190764b4422af4d1c29eaeb9e69e353b406 upstream.
      
      There are bugs in vhci with USB 3.0 storage devices. In USB, the
      buffer of each SG list entry should be divisible by the bulk max
      packet size. With native SG support this does not matter, because
      the SG buffer is treated as one contiguous buffer. Without native SG
      support, however, the USB storage driver breaks the SG list into
      several URBs, and an error occurs when the buffer size of a URB is
      not divisible by the bulk max packet size. The error situation is
      as follows.
      
      Suppose the USB storage driver requests 31.5 KB of data and, for
      some reason, has an SG list with one 3584-byte buffer followed by
      seven 4096-byte buffers. The USB storage driver splits this SG list
      into several URBs because VHCI doesn't support SG, and sends them
      separately, so the first URB's buffer size is 3584 bytes. When
      sending data, a USB 3.0 device uses 1024-byte packets because the
      max packet size of the BULK pipe is 1024 bytes, so the device sends
      4096 bytes. But the first URB has only a 3584-byte buffer, so the
      host controller terminates the transfer even though there is more
      data to receive. vhci therefore needs to support SG transfers to
      prevent this error.
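
The arithmetic in this example can be checked with a short sketch. The helper names are hypothetical; the sizes are the ones quoted above (one 3584-byte entry, seven 4096-byte entries, 1024-byte bulk max packet size).

```c
#include <stddef.h>

/* Round len up to the next multiple of maxp (maxp must be non-zero). */
size_t round_up_to_packet(size_t len, size_t maxp)
{
    return (len + maxp - 1) / maxp * maxp;
}

/* Sum of all SG entry lengths. */
size_t sg_total(const size_t *lens, size_t n)
{
    size_t total = 0;

    for (size_t i = 0; i < n; i++)
        total += lens[i];
    return total;
}

/* Index of the first entry whose length is not a multiple of maxp, or -1. */
int first_unaligned_entry(const size_t *lens, size_t n, size_t maxp)
{
    for (size_t i = 0; i < n; i++) {
        if (lens[i] % maxp != 0)
            return (int)i;
    }
    return -1;
}
```

For the list above, the total is 32256 bytes (31.5 KB), the first entry is the unaligned one, and 3584 bytes rounds up to 4096 bytes of packets, which is more than the first URB's buffer can hold.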
      
      In this patch, vhci supports SG regardless of whether the server's
      host controller supports SG or not, because stub driver splits SG
      list into several URBs if the server's host controller doesn't
      support SG.
      
      To support SG, vhci sets the URB_DMA_MAP_SG flag in urb->transfer_flags
      if the URB has an SG list, and this flag tells the stub driver to use
      the SG list. After receiving the urb from the stub driver, vhci clears
      the URB_DMA_MAP_SG flag to avoid unnecessary DMA unmapping in the HCD.
      
      vhci sends each SG list entry to stub driver. Then, stub driver sees
      the total length of the buffer and allocates SG table and pages
      according to the total buffer length calling sgl_alloc(). After stub
      driver receives completed URB, it again sends each SG list entry to
      vhci.
      
      If the server's host controller doesn't support SG, stub driver
      breaks a single SG request into several URBs and submits them to
      the server's host controller. When all the split URBs are completed,
      stub driver reassembles the URBs into a single return command and
      sends it to vhci.
      
      Moreover, in the situation where vhci supports SG, but stub driver
      does not, or vice versa, usbip works normally. Because there is no
      protocol modification, there is no problem in communication between
      server and client even if the one has a kernel without SG support.
      
      In the case where vhci supports SG and the stub driver doesn't, vhci
      sends only the total length of the buffer to the stub driver, as it
      did before this patch, so the stub driver only needs to allocate a
      buffer of the required length with kmalloc(), regardless of whether
      vhci supports SG. However, the stub driver has to kmalloc() a buffer
      as large as the total length of the SG buffer, which is quite large
      when vhci sends an SG request, so buffer allocation has some overhead
      in this situation.
      
      If the stub driver needs to send a data buffer to vhci because of an
      IN pipe, the stub driver likewise sends only the total buffer length
      as metadata and then sends the actual data, as vhci does. vhci then
      receives the data from the stub driver and stores it in the
      corresponding buffer of each SG list entry.
      
      In the case where the stub driver supports SG and vhci doesn't, the
      USB storage driver sees that vhci doesn't support SG and sends the
      request to the stub driver by splitting the SG list into multiple
      URBs, so the stub driver allocates a buffer for each URB with
      kmalloc(), as it did before this patch.
      
      * Test environment
      
      The test uses two different machines and two different kernel versions
      to create a mismatch between the client and the server, where vhci
      supports SG but the stub driver does not, or vice versa. All tests
      are conducted both with full SG support (both vhci and stub support
      SG) and with half SG support (the mismatch situation). The test kernel
      version is 5.3-rc6 with commit "usb: add a HCD_DMA flag instead of
      guestimating DMA capabilities" to avoid unnecessary DMA mapping and
      unmapping.
      
       - Test kernel version
          - 5.3-rc6 with SG support
          - 5.1.20-200.fc29.x86_64 without SG support
      
      * SG support test
      
       - Test devices
          - Super-speed storage device - SanDisk Ultra USB 3.0
          - High-speed storage device - SMI corporation USB 2.0 flash drive
      
       - Test description
      
      Test read and write operations on a mass storage device that uses
      BULK transfers. In the test, the client reads and writes files larger
      than 1G, and it works normally.
      
      * Regression test
      
       - Test devices
          - Super-speed device - Logitech Brio webcam
          - High-speed device  - Logitech C920 HD Pro webcam
          - Full-speed device  - Logitech bluetooth mouse
                               - Britz BR-Orion speaker
          - Low-speed device   - Logitech wired mouse
      
       - Test description
      
      Move and click tests for the mice. To test the webcams, use
      gnome-cheese. To test the speaker, play music and video on the
      client. All work normally.
      
      * VUDC compatibility test
      
      VUDC also works well with this patch. Tests are done with two USB
      gadgets created by CONFIGFS USB gadget. Both use the BULK pipe.
      
              1. Serial gadget
              2. Mass storage gadget
      
       - Serial gadget test
      
      Serial gadget on the host sends and receives data using cat command
      on the /dev/ttyGS<N>. The client uses minicom to communicate with
      the serial gadget.
      
       - Mass storage gadget test
      
      After connecting the gadget with vhci, use "dd" to test read and
      write operation on the client side.
      
      Read  - dd if=/dev/sd<N> iflag=direct of=/dev/null bs=1G count=1
      Write - dd if=<my file path> iflag=direct of=/dev/sd<N> bs=1G count=1
      Signed-off-by: Suwan Kim <suwan.kim027@gmail.com>
      Acked-by: Shuah khan <skhan@linuxfoundation.org>
      Link: https://lore.kernel.org/r/20190828032741.12234-1-suwan.kim027@gmail.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • S
      usbip: Fix vhci_urb_enqueue() URB null transfer buffer error path · f865ae47
      Shuah Khan committed
      commit 2c904963b1dd2acd4bc785b6c72e10a6283c2081 upstream.
      
      Fix vhci_urb_enqueue() to print a debug message and return an error
      instead of failing with BUG_ON().
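
The general idea of replacing a hard assertion with an error return can be sketched in userspace. This is an assumption-laden illustration: `struct fake_urb` and `check_urb()` are hypothetical, the exact condition used in vhci_urb_enqueue() is not reproduced here, and -EINVAL is chosen for the sketch rather than taken from the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the few URB fields this check looks at. */
struct fake_urb {
    void *transfer_buffer;
    unsigned int transfer_buffer_length;
};

/*
 * Reject a request that claims a non-zero length but has no buffer.
 * Logging and returning an error lets the caller recover, whereas an
 * assertion would stop the whole program (or, in the kernel, the machine).
 */
int check_urb(const struct fake_urb *urb)
{
    if (!urb->transfer_buffer && urb->transfer_buffer_length) {
        fprintf(stderr, "null transfer buffer\n");
        return -EINVAL;
    }
    return 0;
}
```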
      Signed-off-by: Shuah Khan <shuah@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Q
      sched/fair: Fix -Wunused-but-set-variable warnings · e9c0fc4a
      Qian Cai committed
      commit 763a9ec06c409dcde2a761aac4bb83ff3938e0b3 upstream.
      
      Commit:
      
         de53fd7aedb1 ("sched/fair: Fix low cpu usage with high throttling by removing expiration of cpu-local slices")
      
      introduced a few compilation warnings:
      
        kernel/sched/fair.c: In function '__refill_cfs_bandwidth_runtime':
        kernel/sched/fair.c:4365:6: warning: variable 'now' set but not used [-Wunused-but-set-variable]
        kernel/sched/fair.c: In function 'start_cfs_bandwidth':
        kernel/sched/fair.c:4992:6: warning: variable 'overrun' set but not used [-Wunused-but-set-variable]
      
      Also, __refill_cfs_bandwidth_runtime() no longer updates the
      expiration time, so fix the comments accordingly.
      Signed-off-by: Qian Cai <cai@lca.pw>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Ben Segall <bsegall@google.com>
      Reviewed-by: Dave Chiluk <chiluk+linux@indeed.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: pauld@redhat.com
      Fixes: de53fd7aedb1 ("sched/fair: Fix low cpu usage with high throttling by removing expiration of cpu-local slices")
      Link: https://lkml.kernel.org/r/1566326455-8038-1-git-send-email-cai@lca.pw
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • D
      sched/fair: Fix low cpu usage with high throttling by removing expiration of cpu-local slices · 502bd151
      Dave Chiluk committed
      commit de53fd7aedb100f03e5d2231cfce0e4993282425 upstream.
      
      It has been observed that highly-threaded, non-cpu-bound applications
      running under cpu.cfs_quota_us constraints can hit a high percentage of
      periods throttled while simultaneously not consuming the allocated
      amount of quota. This use case is typical of user-interactive non-cpu
      bound applications, such as those running in kubernetes or mesos when
      run on multiple cpu cores.
      
      This has been root-caused to cpu-local run queues being allocated per-cpu
      bandwidth slices and then not fully using those slices within the period,
      at which point the slice and quota expire. This expiration of unused
      slices results in applications not being able to utilize the quota
      they have been allocated.
      
      The non-expiration of per-cpu slices was recently fixed by
      'commit 512ac999 ("sched/fair: Fix bandwidth timer clock drift
      condition")'. Prior to that it appears that this had been broken since
      at least 'commit 51f2176d ("sched/fair: Fix unlocked reads of some
      cfs_b->quota/period")' which was introduced in v3.16-rc1 in 2014. That
      added the following conditional which resulted in slices never being
      expired.
      
      if (cfs_rq->runtime_expires != cfs_b->runtime_expires) {
      	/* extend local deadline, drift is bounded above by 2 ticks */
      	cfs_rq->runtime_expires += TICK_NSEC;
      
      Because this was broken for nearly 5 years, and has recently been fixed
      and is now being noticed by many users running kubernetes
      (https://github.com/kubernetes/kubernetes/issues/67577) it is my opinion
      that the mechanisms around expiring runtime should be removed
      altogether.
      
      This allows quota already allocated to per-cpu run-queues to live longer
      than the period boundary. This allows threads on runqueues that do not
      use much CPU to continue to use their remaining slice over a longer
      period of time than cpu.cfs_period_us. However, this helps prevent the
      above condition of hitting throttling while also not fully utilizing
      your cpu quota.
      
      This theoretically allows a machine to use slightly more than its
      allotted quota in some periods. This overflow would be bounded by the
      remaining quota left on each per-cpu runqueue. This is typically no
      more than min_cfs_rq_runtime=1ms per cpu. For CPU bound tasks this will
      change nothing, as they should theoretically fully utilize all of their
      quota in each period. For user-interactive tasks as described above this
      provides a much better user/application experience as their cpu
      utilization will more closely match the amount they requested when they
      hit throttling. This means that cpu limits no longer strictly apply per
      period for non-cpu bound applications, but that they are still accurate
      over longer timeframes.
      
      This greatly improves performance of high-thread-count, non-cpu bound
      applications with low cfs_quota_us allocation on high-core-count
      machines. In the case of an artificial testcase (10ms/100ms of quota on
      80 CPU machine), this commit resulted in almost 30x performance
      improvement, while still maintaining correct cpu quota restrictions.
      That testcase is available at https://github.com/indeedeng/fibtest.
      
      Fixes: 512ac999 ("sched/fair: Fix bandwidth timer clock drift condition")
      Signed-off-by: Dave Chiluk <chiluk+linux@indeed.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Phil Auld <pauld@redhat.com>
      Reviewed-by: Ben Segall <bsegall@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: John Hammond <jhammond@indeed.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kyle Anderson <kwa@yelp.com>
      Cc: Gabriel Munos <gmunoz@netflix.com>
      Cc: Peter Oskolkov <posk@posk.io>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Brendan Gregg <bgregg@netflix.com>
      Link: https://lkml.kernel.org/r/1563900266-19734-2-git-send-email-chiluk+linux@indeed.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • T
      ALSA: usb-audio: Fix copy&paste error in the validator · 4ebee487
      Takashi Iwai committed
      commit ba8bf0967a154796be15c4983603aad0b05c3138 upstream.
      
      The recently introduced USB-audio descriptor validator had a stupid
      copy&paste error that may lead to an unexpected overlook of too short
      descriptors for processing and extension units.  It's likely the cause
      of the report triggered by syzkaller fuzzer.  Let's fix it.
      
      Fixes: 57f8770620e9 ("ALSA: usb-audio: More validations of descriptor units")
      Reported-by: syzbot+0620f79a1978b1133fd7@syzkaller.appspotmail.com
      Link: https://lore.kernel.org/r/s5hsgnkdbsl.wl-tiwai@suse.de
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • D
      ALSA: usb-audio: remove some dead code · e0051889
      Dan Carpenter committed
      commit b39e077fcb283dd96dd251a3abeba585402c61fe upstream.
      
      We recently cleaned up the error handling in commit 52c3e317a857 ("ALSA:
      usb-audio: Unify the release of usb_mixer_elem_info objects") but
      accidentally left this stray return.
      
      Fixes: 52c3e317a857 ("ALSA: usb-audio: Unify the release of usb_mixer_elem_info objects")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • T
      ALSA: usb-audio: Fix possible NULL dereference at create_yamaha_midi_quirk() · 4f6c5200
      Takashi Iwai committed
      commit 60849562a5db4a1eee2160167e4dce4590d3eafe upstream.
      
      The previous addition of descriptor validation may lead to a NULL
      dereference at create_yamaha_midi_quirk() when either injd or outjd is
      NULL.  Add proper non-NULL checks.
      
      Fixes: 57f8770620e9 ("ALSA: usb-audio: More validations of descriptor units")
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • T
      ALSA: usb-audio: Clean up check_input_term() · 3a0cdf21
      Takashi Iwai committed
      commit e0ccdef92653f8867e2d1667facfd3c23699f540 upstream.
      
      The primary changes in this patch are cleanups of __check_input_term()
      and a move to a non-nested switch-case block by evaluating the pair of
      UAC version and unit type, as we've done for parse_audio_unit().
      Also, each parser is split into its own function for readability.
      
      Now, a slight behavior change by this cleanup is the handling of
      processing and extension units.  Formerly we've dealt with them
      differently between UAC1/2 and UAC3; the latter returns an error if no
      input sources are available, while the former continues to parse.
      
      In this patch, unify the behavior in all cases: when input sources are
      available, it parses recursively, then overrides the type and the id,
      as well as the channel information if not provided yet.
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • T
      ALSA: usb-audio: Remove superfluous bLength checks · 9feeaa50
      Takashi Iwai committed
      commit b8e4f1fdfa422398c2d6c47bfb7d1feb3046d70a upstream.
      
      Now that we got the more comprehensive validation code for USB-audio
      descriptors, the check of overflow in each descriptor unit parser
      became superfluous.  Drop some of the obvious cases.
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • T
      ALSA: usb-audio: Unify the release of usb_mixer_elem_info objects · f0e164f6
      Takashi Iwai committed
      commit 52c3e317a857091fd746e15179a637f32be4d337 upstream.
      
      Instead of the direct kfree() calls, introduce a new local helper to
      release the usb_mixer_elem_info object.  This will be extended to do
      more than a single kfree() in the later patches.
      
      Also, use the standard goto instead of multiple calls in
      parse_audio_selector_unit() error paths.
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • T
      ALSA: usb-audio: Simplify parse_audio_unit() · dae4d839
      Takashi Iwai committed
      commit 68e9fde245591d18200f8a9054cac22339437adb upstream.
      
      Minor code refactoring by combining the UAC version and the type in
      the switch-case flow, so that we reduce the indentation and
      redundancy.  One good bonus is that the duplicated definition of the
      same type value (e.g. UAC2_EFFECT_UNIT) can be handled more cleanly.
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • T
      ALSA: usb-audio: More validations of descriptor units · 17821e2f
      Takashi Iwai committed
      commit 57f8770620e9b51c61089751f0b5ad3dbe376ff2 upstream.
      
      Introduce a new helper to validate each audio descriptor unit, and
      check the unit before actually accessing it.  This should harden
      against the OOB access cases with malformed descriptors that fuzzers
      have frequently reported recently.
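
A length check of this kind can be sketched generically. This is not the helper the patch adds: `desc_len_ok()` and its parameters are hypothetical, and the only layout fact relied on is that a USB descriptor starts with bLength followed by bDescriptorType.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * buf points at a descriptor, remaining is the number of bytes left in the
 * buffer, and min_len is the smallest valid size for this descriptor type.
 */
bool desc_len_ok(const uint8_t *buf, size_t remaining, size_t min_len)
{
    if (remaining < 2)              /* need bLength and bDescriptorType */
        return false;
    if (buf[0] < 2 || buf[0] < min_len)
        return false;               /* too short for its claimed type */
    return buf[0] <= remaining;     /* must not run past the buffer */
}
```

A parser that calls such a check before touching any type-specific field cannot read past a truncated or malformed descriptor.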
      
      The existing descriptor checks are still kept although they become
      superfluous after this patch.  They'll be cleaned up later.
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • A
      configfs: fix a deadlock in configfs_symlink() · 5e36cf8e
      Al Viro committed
      commit 351e5d869e5ac10cb40c78b5f2d7dfc816ad4587 upstream.
      
      Configfs abuses symlink(2).  Unlike the normal filesystems, it
      wants the target resolved at symlink(2) time, like link(2) would've
      done.  The problem is that ->symlink() is called with the parent
      directory locked exclusive, so resolving the target inside the
      ->symlink() is easily deadlocked.
      
      Short of really ugly games in sys_symlink() itself, all we can
      do is to unlock the parent before resolving the target and
      relock it after.  However, that invalidates the checks done
      by the caller of ->symlink(), so we have to
        * check that dentry is still where it used to be
          (it couldn't have been moved, but it could've been unhashed)
        * recheck that it's still negative (somebody else
          might've successfully created a symlink with the same name
          while we were looking the target up)
        * recheck the permissions on the parent directory.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • A
      configfs: provide exclusion between IO and removals · 0dfc45be
      Al Viro committed
      commit b0841eefd9693827afb9888235e26ddd098f9cef upstream.
      
      Make sure that attribute methods are not called after the item
      has been removed from the tree.  To do so, we
        * at the point of no return in removals, grab ->frag_sem
          exclusive and mark the fragment dead.
        * call the methods of attributes with ->frag_sem taken
          shared and only after having verified that the fragment is still
          alive.
      
      The main benefit is for method instances - they are
      guaranteed that the objects they are accessing *and* all ancestors
      are still there.  Another win is that we don't need to bother
      with extra refcount on config_item when opening a file -
      the item will be alive for as long as it stays in the tree, and
      we won't touch it/attributes/any associated data after it's
      been removed from the tree.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • A
      configfs: new object reprsenting tree fragments · 25c118d8
      Al Viro committed
      commit 47320fbe11a6059ae502c9c16b668022fdb4cf76 upstream.
      
      Refcounted, hangs off configfs_dirent, created by operations that add
      fragments to the configfs tree (mkdir and configfs_register_{subsystem,group}).
      Will be used in the next commit to provide exclusion between fragment
      removal and ->show/->store calls.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • A
      configfs_register_group() shouldn't be (and isn't) called in rmdirable parts · 65524d64
      Al Viro committed
      commit f19e4ed1e1edbfa3c9ccb9fed17759b7d6db24c6 upstream.
      
      Revert cc57c073 ("configfs: fix registered group removal").
      It was an attempt to handle something that fundamentally doesn't
      work - configfs_register_group() should never be done in a part
      of the tree that can be rmdir'ed.  And in mainline it never had been,
      so let's not borrow trouble; the fix was racy anyway, it would take
      a lot more to make that work, and the desired semantics are not clear.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • A
      configfs: stash the data we need into configfs_buffer at open time · 2bd63490
      Al Viro committed
      commit ff4dd081977da56566a848f071aed8fa92d604a1 upstream.
      
      This simplifies the ->read()/->write()/->release() instances nicely.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • J
      can: peak_usb: fix slab info leak · a7be2deb
      Johan Hovold committed
      commit f7a1337f0d29b98733c8824e165fca3371d7d4fd upstream.
      
      Fix a small slab info leak due to a failure to clear the command buffer
      at allocation.
      
      The first 16 bytes of the command buffer are always sent to the device
      in pcan_usb_send_cmd() even though only the first two may have been
      initialised in case no argument payload is provided (e.g. when waiting
      for a response).
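
The leak pattern can be shown with a userspace analogy, where `calloc()` plays the role that a zeroing allocator plays in the kernel. All names below are hypothetical, and whether the upstream fix uses exactly this approach is an assumption; the 16-byte length and the two initialised bytes come from the description above.

```c
#include <stdlib.h>

#define CMD_LEN 16   /* bytes always sent to the device, per the text above */

/* Allocate a zeroed command buffer so no stale heap data can be sent. */
unsigned char *cmd_buf_alloc(void)
{
    return calloc(1, CMD_LEN);
}

/* Fill in only the two header bytes; the rest stays as allocated. */
void cmd_set_header(unsigned char *buf, unsigned char f, unsigned char n)
{
    buf[0] = f;
    buf[1] = n;
}

/* Return 1 if every byte after the two-byte header is zero. */
int cmd_tail_is_zero(const unsigned char *buf)
{
    for (int i = 2; i < CMD_LEN; i++) {
        if (buf[i] != 0)
            return 0;
    }
    return 1;
}
```

With a non-zeroing allocator, the 14 bytes after the header would hold whatever the previous user of that memory left behind, and all of it would go out on the wire.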
      
      Fixes: bb478555 ("can: usb: PEAK-System Technik USB adapters driver core")
      Cc: stable <stable@vger.kernel.org>     # 3.4
      Reported-by: syzbot+863724e7128e14b26732@syzkaller.appspotmail.com
      Signed-off-by: Johan Hovold <johan@kernel.org>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • J
      can: mcba_usb: fix use-after-free on disconnect · ce9b94da
      Johan Hovold committed
      commit 4d6636498c41891d0482a914dd570343a838ad79 upstream.
      
      The driver was accessing its driver data after having freed it.
      
      Fixes: 51f3baad ("can: mcba_usb: Add support for Microchip CAN BUS Analyzer")
      Cc: stable <stable@vger.kernel.org>     # 4.12
      Cc: Remigiusz Kołłątaj <remigiusz.kollataj@mobica.com>
      Reported-by: syzbot+e29b17e5042bbc56fae9@syzkaller.appspotmail.com
      Signed-off-by: Johan Hovold <johan@kernel.org>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • W
      can: dev: add missing of_node_put() after calling of_get_child_by_name() · 5a9e37f2
      Wen Yang committed
      commit db9ee384f6f71f7c5296ce85b7c1a2a2527e7c72 upstream.
      
      of_node_put() needs to be called once the device node obtained from
      of_get_child_by_name() is no longer in use.
      
      Fixes: 2290aefa ("can: dev: Add support for limiting configured bitrate")
      Cc: Franklin S Cooper Jr <fcooper@ti.com>
      Signed-off-by: Wen Yang <wenyang@linux.alibaba.com>
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • N
      can: gs_usb: gs_can_open(): prevent memory leak · 9289226f
      Navid Emamdoost committed
      commit fb5be6a7b4863ecc44963bb80ca614584b6c7817 upstream.
      
      In gs_can_open(), if usb_submit_urb() fails, the allocated urb should
      be released.
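
The error-path rule can be sketched with a toy model. Nothing here is the USB core API: `struct fake_urb`, the allocation helpers, the two submit stubs and the `live_urbs` counter are all hypothetical and exist only to make the missing release visible.

```c
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the USB core's types or calls. */
struct fake_urb {
    int unused;
};

int live_urbs;   /* demo-only count of allocated, not yet freed URBs */

struct fake_urb *fake_alloc_urb(void)
{
    struct fake_urb *urb = calloc(1, sizeof(*urb));

    if (urb)
        live_urbs++;
    return urb;
}

void fake_free_urb(struct fake_urb *urb)
{
    if (!urb)
        return;
    free(urb);
    live_urbs--;
}

/* Two submit stubs so both outcomes can be exercised. */
int submit_ok(struct fake_urb *urb)   { (void)urb; return 0; }
int submit_fail(struct fake_urb *urb) { (void)urb; return -1; }

int open_one(int (*submit)(struct fake_urb *))
{
    struct fake_urb *urb = fake_alloc_urb();
    int rc;

    if (!urb)
        return -1;

    rc = submit(urb);
    if (rc) {
        fake_free_urb(urb);   /* the release that must not be skipped */
        return rc;
    }

    /* Success: in this toy model the caller's reference is dropped here. */
    fake_free_urb(urb);
    return 0;
}
```

If the `fake_free_urb()` call in the failure branch is removed, `live_urbs` stays at 1 after a failed submit, which is the leak in miniature.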
      
      Fixes: d08e973a ("can: gs_usb: Added support for the GS_USB CAN devices")
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>