1. 05 Oct, 2019 40 commits
    • Y
      block: fix null pointer dereference in blk_mq_rq_timed_out() · 82652c06
      Yufen Yu committed
      commit 8d6996630c03d7ceeabe2611378fea5ca1c3f1b3 upstream.
      
      We got a NULL pointer dereference BUG_ON in blk_mq_rq_timed_out(),
      as follows:
      
      [  108.825472] BUG: kernel NULL pointer dereference, address: 0000000000000040
      [  108.827059] PGD 0 P4D 0
      [  108.827313] Oops: 0000 [#1] SMP PTI
      [  108.827657] CPU: 6 PID: 198 Comm: kworker/6:1H Not tainted 5.3.0-rc8+ #431
      [  108.829503] Workqueue: kblockd blk_mq_timeout_work
      [  108.829913] RIP: 0010:blk_mq_check_expired+0x258/0x330
      [  108.838191] Call Trace:
      [  108.838406]  bt_iter+0x74/0x80
      [  108.838665]  blk_mq_queue_tag_busy_iter+0x204/0x450
      [  108.839074]  ? __switch_to_asm+0x34/0x70
      [  108.839405]  ? blk_mq_stop_hw_queue+0x40/0x40
      [  108.839823]  ? blk_mq_stop_hw_queue+0x40/0x40
      [  108.840273]  ? syscall_return_via_sysret+0xf/0x7f
      [  108.840732]  blk_mq_timeout_work+0x74/0x200
      [  108.841151]  process_one_work+0x297/0x680
      [  108.841550]  worker_thread+0x29c/0x6f0
      [  108.841926]  ? rescuer_thread+0x580/0x580
      [  108.842344]  kthread+0x16a/0x1a0
      [  108.842666]  ? kthread_flush_work+0x170/0x170
      [  108.843100]  ret_from_fork+0x35/0x40
      
      The bug is caused by a race between the timeout handler and the
      completion of a flush request.
      
      When the timeout handler blk_mq_rq_timed_out() tries to read
      'req->q->mq_ops', 'req' may already have completed and been reinitialized
      by the next flush request, which calls blk_rq_init() to zero 'req'.
      
      After commit 12f5b931 ("blk-mq: Remove generation seqeunce"),
      the lifetime of normal requests is protected by a refcount. A request
      can only be freed once 'rq->ref' drops to zero. Thus, these requests
      cannot be reused before the timeout handler finishes.
      
      However, a flush request defines .end_io, and rq->end_io() is still
      called even if 'rq->ref' has not dropped to zero. After that, 'flush_rq'
      can be reused by the next flush request, resulting in the NULL
      pointer dereference.
      
      We fix this problem by covering the flush request with 'rq->ref'.
      If the refcount is not zero, flush_end_io() returns and waits for the
      last holder to call it again. To record the request status, we add a new
      entry 'rq_status', which is used in flush_end_io().
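
The refcount-gated completion described above can be sketched in userspace C. This is a minimal illustration, not the kernel code: every `sketch_*` name is hypothetical, a C11 atomic stands in for 'rq->ref', and the spinlock that protects 'fq->rq_status' in the real patch is omitted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for struct blk_flush_queue. */
struct sketch_flush_queue {
    int rq_status;    /* status recorded by an early end_io call */
    int completions;  /* number of times the completion really ran */
    int last_error;   /* error reported by the real completion */
};

/* Hypothetical stand-in for the flush request; 'ref' plays 'rq->ref'. */
struct sketch_request {
    atomic_int ref;
};

/* Drop one reference; returns true only for the last holder. */
static bool sketch_put_ref(struct sketch_request *rq)
{
    assert(atomic_load(&rq->ref) > 0);
    return atomic_fetch_sub(&rq->ref, 1) == 1;
}

/*
 * If other holders remain, only record the status and return.
 * The last holder calls this again and performs the completion,
 * so the request cannot be reused while someone still looks at it.
 */
static void sketch_flush_end_io(struct sketch_request *rq,
                                struct sketch_flush_queue *fq, int error)
{
    if (!sketch_put_ref(rq)) {
        fq->rq_status = error;
        return;
    }
    if (fq->rq_status != 0) {
        error = fq->rq_status;
        fq->rq_status = 0;
    }
    fq->completions++;
    fq->last_error = error;
}
```

With two holders (for example the completion path and the timeout handler), the first call only stores the status, and the second call completes the request with that stored status.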
      
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Keith Busch <keith.busch@intel.com>
      Cc: Bart Van Assche <bvanassche@acm.org>
      Cc: stable@vger.kernel.org # v4.18+
      Reviewed-by: Ming Lei <ming.lei@redhat.com>
      Reviewed-by: Bob Liu <bob.liu@oracle.com>
      Signed-off-by: Yufen Yu <yuyufen@huawei.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      -------
      v2:
       - move rq_status from struct request to struct blk_flush_queue
      v3:
       - remove unnecessary '{}' pair.
      v4:
       - let the spinlock protect 'fq->rq_status'
      v5:
       - move rq_status after flush_running_idx member of struct blk_flush_queue
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      82652c06
    • S
      i40e: check __I40E_VF_DISABLE bit in i40e_sync_filters_subtask · db5b2fe4
      Stefan Assmann committed
      commit a7542b87607560d0b89e7ff81d870bd6ff8835cb upstream.
      
      While testing VF spawn/destroy the following panic occurred.
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000029
      [...]
      Workqueue: i40e i40e_service_task [i40e]
      RIP: 0010:i40e_sync_vsi_filters+0x6fd/0xc60 [i40e]
      [...]
      Call Trace:
       ? __switch_to_asm+0x35/0x70
       ? __switch_to_asm+0x41/0x70
       ? __switch_to_asm+0x35/0x70
       ? _cond_resched+0x15/0x30
       i40e_sync_filters_subtask+0x56/0x70 [i40e]
       i40e_service_task+0x382/0x11b0 [i40e]
       ? __switch_to_asm+0x41/0x70
       ? __switch_to_asm+0x41/0x70
       process_one_work+0x1a7/0x3b0
       worker_thread+0x30/0x390
       ? create_worker+0x1a0/0x1a0
       kthread+0x112/0x130
       ? kthread_bind+0x30/0x30
       ret_from_fork+0x35/0x40
      
      Investigation revealed a race where pf->vf[vsi->vf_id].trusted may get
      accessed by the watchdog via i40e_sync_filters_subtask() although
      i40e_free_vfs() already free'd pf->vf.
      To avoid this the call to i40e_sync_vsi_filters() in
      i40e_sync_filters_subtask() needs to be guarded by __I40E_VF_DISABLE,
      which is also used by i40e_free_vfs().
      
      Note: put the __I40E_VF_DISABLE check after the
      __I40E_MACVLAN_SYNC_PENDING check as the latter is more likely to
      trigger.
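
The guard ordering described above can be sketched with C11 atomics. This is an illustration under stated assumptions, not the driver code: all `SKETCH_*` and `sketch_*` names are hypothetical, and re-arming the pending bit when the VF-disable bit is already held is an assumption of this sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state bits mirroring the two flags named in the text. */
#define SKETCH_MACVLAN_SYNC_PENDING 0x1u
#define SKETCH_VF_DISABLE           0x2u

struct sketch_pf {
    atomic_uint state;
    int syncs_run;  /* counts how often the filter sync really ran */
};

/* Atomically set a bit and report whether it was already set. */
static bool sketch_test_and_set(atomic_uint *s, unsigned int bit)
{
    return (atomic_fetch_or(s, bit) & bit) != 0;
}

/* Atomically clear a bit and report whether it was set before. */
static bool sketch_test_and_clear(atomic_uint *s, unsigned int bit)
{
    return (atomic_fetch_and(s, ~bit) & bit) != 0;
}

static void sketch_sync_filters_subtask(struct sketch_pf *pf)
{
    assert(pf != NULL);

    /* Cheap, more likely check first: nothing pending, nothing to do. */
    if (!sketch_test_and_clear(&pf->state, SKETCH_MACVLAN_SYNC_PENDING))
        return;

    /* VF teardown in progress: back off and retry on a later pass. */
    if (sketch_test_and_set(&pf->state, SKETCH_VF_DISABLE)) {
        atomic_fetch_or(&pf->state, SKETCH_MACVLAN_SYNC_PENDING);
        return;
    }

    pf->syncs_run++;  /* stands in for the per-VSI filter sync */

    atomic_fetch_and(&pf->state, ~SKETCH_VF_DISABLE);
}
```

Because the teardown path would hold the same bit while freeing the VF array, the sync body never runs concurrently with it in this model.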
      
      CC: stable@vger.kernel.org
      Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      db5b2fe4
    • M
      memcg, kmem: do not fail __GFP_NOFAIL charges · b4a734a5
      Michal Hocko committed
      commit e55d9d9bfb69405bd7615c0f8d229d8fafb3e9b8 upstream.
      
      Thomas has noticed the following NULL ptr dereference when using cgroup
      v1 kmem limit:
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
      PGD 0
      P4D 0
      Oops: 0000 [#1] PREEMPT SMP PTI
      CPU: 3 PID: 16923 Comm: gtk-update-icon Not tainted 4.19.51 #42
      Hardware name: Gigabyte Technology Co., Ltd. Z97X-Gaming G1/Z97X-Gaming G1, BIOS F9 07/31/2015
      RIP: 0010:create_empty_buffers+0x24/0x100
      Code: cd 0f 1f 44 00 00 0f 1f 44 00 00 41 54 49 89 d4 ba 01 00 00 00 55 53 48 89 fb e8 97 fe ff ff 48 89 c5 48 89 c2 eb 03 48 89 ca <48> 8b 4a 08 4c 09 22 48 85 c9 75 f1 48 89 6a 08 48 8b 43 18 48 8d
      RSP: 0018:ffff927ac1b37bf8 EFLAGS: 00010286
      RAX: 0000000000000000 RBX: fffff2d4429fd740 RCX: 0000000100097149
      RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffff9075a99fbe00
      RBP: 0000000000000000 R08: fffff2d440949cc8 R09: 00000000000960c0
      R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000000
      R13: ffff907601f18360 R14: 0000000000002000 R15: 0000000000001000
      FS:  00007fb55b288bc0(0000) GS:ffff90761f8c0000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000008 CR3: 000000007aebc002 CR4: 00000000001606e0
      Call Trace:
       create_page_buffers+0x4d/0x60
       __block_write_begin_int+0x8e/0x5a0
       ? ext4_inode_attach_jinode.part.82+0xb0/0xb0
       ? jbd2__journal_start+0xd7/0x1f0
       ext4_da_write_begin+0x112/0x3d0
       generic_perform_write+0xf1/0x1b0
       ? file_update_time+0x70/0x140
       __generic_file_write_iter+0x141/0x1a0
       ext4_file_write_iter+0xef/0x3b0
       __vfs_write+0x17e/0x1e0
       vfs_write+0xa5/0x1a0
       ksys_write+0x57/0xd0
       do_syscall_64+0x55/0x160
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Tetsuo then noticed that this is because __memcg_kmem_charge_memcg()
      fails a __GFP_NOFAIL charge when the kmem limit is reached.  This is wrong
      behavior because nofail allocations are not allowed to fail.  The normal
      charge path simply forces the charge even if that means crossing the
      limit.  Kmem accounting should do the same.
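
A minimal userspace sketch of the behavior described above, assuming a simple usage/limit counter. The `sketch_*` names and the `SKETCH_GFP_NOFAIL` flag value are hypothetical and do not match the kernel's page counter API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag standing in for __GFP_NOFAIL. */
#define SKETCH_GFP_NOFAIL 0x1u

/* Hypothetical stand-in for the kmem counter of a memory cgroup. */
struct sketch_counter {
    unsigned long usage;
    unsigned long limit;
};

/* Charge only if the limit is respected. */
static bool sketch_try_charge(struct sketch_counter *c, unsigned long nr)
{
    if (c->usage + nr > c->limit)
        return false;
    c->usage += nr;
    return true;
}

/*
 * Charge nr pages. A nofail request is forced through even if it
 * crosses the limit, because its caller has no failure handling.
 */
static int sketch_kmem_charge(struct sketch_counter *c, unsigned int gfp,
                              unsigned long nr)
{
    assert(c != NULL);

    if (sketch_try_charge(c, nr))
        return 0;
    if (gfp & SKETCH_GFP_NOFAIL) {
        c->usage += nr;  /* forced charge, limit exceeded on purpose */
        return 0;
    }
    return -ENOMEM;
}
```

An ordinary request at the limit fails with -ENOMEM and leaves the usage untouched, while the same request with the nofail flag succeeds and pushes the usage above the limit.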
      
      Link: http://lkml.kernel.org/r/20190906125608.32129-1-mhocko@kernel.org
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Reported-by: Thomas Lindroth <thomas.lindroth@gmail.com>
      Debugged-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Thomas Lindroth <thomas.lindroth@gmail.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b4a734a5
    • T
      memcg, oom: don't require __GFP_FS when invoking memcg OOM killer · d40b3eaf
      Tetsuo Handa committed
      commit f9c645621a28e37813a1de96d9cbd89cde94a1e4 upstream.
      
      Masoud Sharbiani noticed that commit 29ef680a ("memcg, oom: move
      out_of_memory back to the charge path") broke memcg OOM called from
      __xfs_filemap_fault() path.  It turned out that try_charge() is retrying
      forever without making forward progress because mem_cgroup_oom(GFP_NOFS)
      cannot invoke the OOM killer due to commit 3da88fb3 ("mm, oom:
      move GFP_NOFS check to out_of_memory").
      
      Allowing a forced charge because the memcg OOM killer cannot be invoked
      would lead to a global OOM situation.  Also, just returning -ENOMEM would
      be risky because the OOM path is lost and some paths (e.g.
      get_user_pages()) will leak -ENOMEM.  Therefore, invoking the memcg OOM
      killer (despite GFP_NOFS) is the only choice we have for now.
      
      Until 29ef680a, we were able to invoke the memcg OOM killer when
      GFP_KERNEL reclaim failed [1].  But since 29ef680a, we need to
      invoke the memcg OOM killer when GFP_NOFS reclaim failed [2].  Although in
      the past we did invoke the memcg OOM killer for GFP_NOFS [3], we might get
      premature memcg OOM reports due to this patch.
      
      [1]
      
       leaker invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
       CPU: 0 PID: 2746 Comm: leaker Not tainted 4.18.0+ #19
       Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018
       Call Trace:
        dump_stack+0x63/0x88
        dump_header+0x67/0x27a
        ? mem_cgroup_scan_tasks+0x91/0xf0
        oom_kill_process+0x210/0x410
        out_of_memory+0x10a/0x2c0
        mem_cgroup_out_of_memory+0x46/0x80
        mem_cgroup_oom_synchronize+0x2e4/0x310
        ? high_work_func+0x20/0x20
        pagefault_out_of_memory+0x31/0x76
        mm_fault_error+0x55/0x115
        ? handle_mm_fault+0xfd/0x220
        __do_page_fault+0x433/0x4e0
        do_page_fault+0x22/0x30
        ? page_fault+0x8/0x30
        page_fault+0x1e/0x30
       RIP: 0033:0x4009f0
       Code: 03 00 00 00 e8 71 fd ff ff 48 83 f8 ff 49 89 c6 74 74 48 89 c6 bf c0 0c 40 00 31 c0 e8 69 fd ff ff 45 85 ff 7e 21 31 c9 66 90 <41> 0f be 14 0e 01 d3 f7 c1 ff 0f 00 00 75 05 41 c6 04 0e 2a 48 83
       RSP: 002b:00007ffe29ae96f0 EFLAGS: 00010206
       RAX: 000000000000001b RBX: 0000000000000000 RCX: 0000000001ce1000
       RDX: 0000000000000000 RSI: 000000007fffffe5 RDI: 0000000000000000
       RBP: 000000000000000c R08: 0000000000000000 R09: 00007f94be09220d
       R10: 0000000000000002 R11: 0000000000000246 R12: 00000000000186a0
       R13: 0000000000000003 R14: 00007f949d845000 R15: 0000000002800000
       Task in /leaker killed as a result of limit of /leaker
       memory: usage 524288kB, limit 524288kB, failcnt 158965
       memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0
       kmem: usage 2016kB, limit 9007199254740988kB, failcnt 0
       Memory cgroup stats for /leaker: cache:844KB rss:521136KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:132KB writeback:0KB inactive_anon:0KB active_anon:521224KB inactive_file:1012KB active_file:8KB unevictable:0KB
       Memory cgroup out of memory: Kill process 2746 (leaker) score 998 or sacrifice child
       Killed process 2746 (leaker) total-vm:536704kB, anon-rss:521176kB, file-rss:1208kB, shmem-rss:0kB
       oom_reaper: reaped process 2746 (leaker), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
      
      [2]
      
       leaker invoked oom-killer: gfp_mask=0x600040(GFP_NOFS), nodemask=(null), order=0, oom_score_adj=0
       CPU: 1 PID: 2746 Comm: leaker Not tainted 4.18.0+ #20
       Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018
       Call Trace:
        dump_stack+0x63/0x88
        dump_header+0x67/0x27a
        ? mem_cgroup_scan_tasks+0x91/0xf0
        oom_kill_process+0x210/0x410
        out_of_memory+0x109/0x2d0
        mem_cgroup_out_of_memory+0x46/0x80
        try_charge+0x58d/0x650
        ? __radix_tree_replace+0x81/0x100
        mem_cgroup_try_charge+0x7a/0x100
        __add_to_page_cache_locked+0x92/0x180
        add_to_page_cache_lru+0x4d/0xf0
        iomap_readpages_actor+0xde/0x1b0
        ? iomap_zero_range_actor+0x1d0/0x1d0
        iomap_apply+0xaf/0x130
        iomap_readpages+0x9f/0x150
        ? iomap_zero_range_actor+0x1d0/0x1d0
        xfs_vm_readpages+0x18/0x20 [xfs]
        read_pages+0x60/0x140
        __do_page_cache_readahead+0x193/0x1b0
        ondemand_readahead+0x16d/0x2c0
        page_cache_async_readahead+0x9a/0xd0
        filemap_fault+0x403/0x620
        ? alloc_set_pte+0x12c/0x540
        ? _cond_resched+0x14/0x30
        __xfs_filemap_fault+0x66/0x180 [xfs]
        xfs_filemap_fault+0x27/0x30 [xfs]
        __do_fault+0x19/0x40
        __handle_mm_fault+0x8e8/0xb60
        handle_mm_fault+0xfd/0x220
        __do_page_fault+0x238/0x4e0
        do_page_fault+0x22/0x30
        ? page_fault+0x8/0x30
        page_fault+0x1e/0x30
       RIP: 0033:0x4009f0
       Code: 03 00 00 00 e8 71 fd ff ff 48 83 f8 ff 49 89 c6 74 74 48 89 c6 bf c0 0c 40 00 31 c0 e8 69 fd ff ff 45 85 ff 7e 21 31 c9 66 90 <41> 0f be 14 0e 01 d3 f7 c1 ff 0f 00 00 75 05 41 c6 04 0e 2a 48 83
       RSP: 002b:00007ffda45c9290 EFLAGS: 00010206
       RAX: 000000000000001b RBX: 0000000000000000 RCX: 0000000001a1e000
       RDX: 0000000000000000 RSI: 000000007fffffe5 RDI: 0000000000000000
       RBP: 000000000000000c R08: 0000000000000000 R09: 00007f6d061ff20d
       R10: 0000000000000002 R11: 0000000000000246 R12: 00000000000186a0
       R13: 0000000000000003 R14: 00007f6ce59b2000 R15: 0000000002800000
       Task in /leaker killed as a result of limit of /leaker
       memory: usage 524288kB, limit 524288kB, failcnt 7221
       memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0
       kmem: usage 1944kB, limit 9007199254740988kB, failcnt 0
       Memory cgroup stats for /leaker: cache:3632KB rss:518232KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB inactive_anon:0KB active_anon:518408KB inactive_file:3908KB active_file:12KB unevictable:0KB
       Memory cgroup out of memory: Kill process 2746 (leaker) score 992 or sacrifice child
       Killed process 2746 (leaker) total-vm:536704kB, anon-rss:518264kB, file-rss:1188kB, shmem-rss:0kB
       oom_reaper: reaped process 2746 (leaker), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
      
      [3]
      
       leaker invoked oom-killer: gfp_mask=0x50, order=0, oom_score_adj=0
       leaker cpuset=/ mems_allowed=0
       CPU: 1 PID: 3206 Comm: leaker Not tainted 3.10.0-957.27.2.el7.x86_64 #1
       Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018
       Call Trace:
        [<ffffffffaf364147>] dump_stack+0x19/0x1b
        [<ffffffffaf35eb6a>] dump_header+0x90/0x229
        [<ffffffffaedbb456>] ? find_lock_task_mm+0x56/0xc0
        [<ffffffffaee32a38>] ? try_get_mem_cgroup_from_mm+0x28/0x60
        [<ffffffffaedbb904>] oom_kill_process+0x254/0x3d0
        [<ffffffffaee36c36>] mem_cgroup_oom_synchronize+0x546/0x570
        [<ffffffffaee360b0>] ? mem_cgroup_charge_common+0xc0/0xc0
        [<ffffffffaedbc194>] pagefault_out_of_memory+0x14/0x90
        [<ffffffffaf35d072>] mm_fault_error+0x6a/0x157
        [<ffffffffaf3717c8>] __do_page_fault+0x3c8/0x4f0
        [<ffffffffaf371925>] do_page_fault+0x35/0x90
        [<ffffffffaf36d768>] page_fault+0x28/0x30
       Task in /leaker killed as a result of limit of /leaker
       memory: usage 524288kB, limit 524288kB, failcnt 20628
       memory+swap: usage 524288kB, limit 9007199254740988kB, failcnt 0
       kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
       Memory cgroup stats for /leaker: cache:840KB rss:523448KB rss_huge:0KB mapped_file:0KB swap:0KB inactive_anon:0KB active_anon:523448KB inactive_file:464KB active_file:376KB unevictable:0KB
       Memory cgroup out of memory: Kill process 3206 (leaker) score 970 or sacrifice child
       Killed process 3206 (leaker) total-vm:536692kB, anon-rss:523304kB, file-rss:412kB, shmem-rss:0kB
      
      Bisected by Masoud Sharbiani.
      
      Link: http://lkml.kernel.org/r/cbe54ed1-b6ba-a056-8899-2dc42526371d@i-love.sakura.ne.jp
      Fixes: 3da88fb3 ("mm, oom: move GFP_NOFS check to out_of_memory") [necessary after 29ef680a]
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Reported-by: Masoud Sharbiani <msharbiani@apple.com>
      Tested-by: Masoud Sharbiani <msharbiani@apple.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: <stable@vger.kernel.org>	[4.19+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d40b3eaf
    • B
      gfs2: clear buf_in_tr when ending a transaction in sweep_bh_for_rgrps · e0c1e6e5
      Bob Peterson committed
      commit f0b444b349e33ae0d3dd93e25ca365482a5d17d4 upstream.
      
      In function sweep_bh_for_rgrps, which is a helper for punch_hole,
      it uses variable buf_in_tr to keep track of when it needs to commit
      pending block frees on a partial delete that overflows the
      transaction created for the delete. The problem is that the
      variable was initialized at the start of function sweep_bh_for_rgrps
      but it was never cleared, even when starting a new transaction.
      
      This patch reinitializes the variable when the transaction is
      ended, so the next transaction starts out with it cleared.
      
      Fixes: d552a2b9 ("GFS2: Non-recursive delete")
      Cc: stable@vger.kernel.org # v4.12+
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e0c1e6e5
    • H
      efifb: BGRT: Improve efifb_bgrt_sanity_check · 3620b06b
      Hans de Goede committed
      commit 51677dfcc17f88ed754143df670ff064eae67f84 upstream.
      
      For various reasons, at least with x86 EFI firmwares, the xoffset and
      yoffset in the BGRT info are not always reliable.
      
      Extensive testing has shown that when the info is correct, the
      BGRT image is always exactly centered horizontally (the yoffset variable
      is more variable and not always predictable).
      
      This commit simplifies / improves the bgrt_sanity_check to simply
      check that the BGRT image is exactly centered horizontally and skips
      (re)drawing it when it is not.
      
      This fixes the BGRT image sometimes being drawn in the wrong place.
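
The centering test described above can be sketched as a small predicate. This is an assumption-laden illustration: the function name and parameters are hypothetical, and integer division by two is assumed for the expected offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when an image of image_width pixels, drawn at xoffset,
 * is exactly centered horizontally on a screen of screen_width pixels.
 * A caller would skip (re)drawing the image when this returns false.
 */
static bool sketch_bgrt_is_centered(uint32_t screen_width,
                                    uint32_t image_width,
                                    uint32_t xoffset)
{
    assert(screen_width > 0);

    /* An image wider than the screen can never be centered on it. */
    if (image_width > screen_width)
        return false;

    return xoffset == (screen_width - image_width) / 2;
}
```

For a 1920-pixel-wide screen and a 640-pixel-wide image, only an xoffset of 640 passes the check.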
      
      Cc: stable@vger.kernel.org
      Fixes: 88fe4ceb ("efifb: BGRT: Do not copy the boot graphics for non native resolutions")
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
      Cc: Peter Jones <pjones@redhat.com>,
      Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190721131918.10115-1-hdegoede@redhat.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3620b06b
    • M
      regulator: Defer init completion for a while after late_initcall · c4f65c2f
      Mark Brown committed
      commit 55576cf1853798e86f620766e23b604c9224c19c upstream.
      
      The kernel has no way of knowing when we have finished instantiating
      drivers. Between deferred probe and systems that build key drivers as
      modules, we might be doing this long after userspace has booted. This has
      always been a bit of an issue with regulator_init_complete since it can
      power off hardware that has not had its driver loaded, which can result in
      user-visible effects; the main case is powering off displays. Practically
      speaking it has not been an issue in real systems, since most systems that
      use the regulator API are embedded and build in key drivers anyway, but
      with Arm laptops coming on the market it is becoming more of an issue, so
      let's do something about it.
      
      In the absence of any better idea just defer the powering off for 30s
      after late_initcall(), this is obviously a hack but it should mask the
      issue for now and it's no more arbitrary than late_initcall() itself.
      Ideally we'd have some heuristics to detect if we're on an affected
      system and tune or skip the delay appropriately, and there may be some
      need for a command line option to be added.
      
      Link: https://lore.kernel.org/r/20190904124250.25844-1-broonie@kernel.org
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Tested-by: Lee Jones <lee.jones@linaro.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      c4f65c2f
    • T
      alarmtimer: Use EOPNOTSUPP instead of ENOTSUPP · 3784576f
      Thadeu Lima de Souza Cascardo committed
      commit f18ddc13af981ce3c7b7f26925f099e7c6929aba upstream.
      
      ENOTSUPP is not supposed to be returned to userspace. This was found on an
      OpenPower machine, where the RTC does not support set_alarm.
      
      On that system, a clock_nanosleep(CLOCK_REALTIME_ALARM, ...) results in
      "524 Unknown error 524"
      
      Replace it with EOPNOTSUPP which results in the expected "95 Operation not
      supported" error.
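
The difference can be illustrated in userspace C. This is a sketch under stated assumptions: the `sketch_*` names are hypothetical, 524 is the value quoted above for the kernel-internal code, and the exact strerror() text depends on the C library in use.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Kernel-internal value quoted in the text; userspace errno.h lacks it. */
#define SKETCH_KERNEL_ENOTSUPP 524

/*
 * Hypothetical alarm setup: when the RTC cannot set alarms, return an
 * error code that userspace knows, rather than the internal 524.
 */
static int sketch_alarm_timer_set(bool rtc_has_alarm)
{
    if (!rtc_has_alarm)
        return -EOPNOTSUPP;
    return 0;
}

/* Print a negative return value the way a simple tool might report it. */
static void sketch_report(int ret)
{
    assert(ret <= 0);
    if (ret < 0)
        printf("%d %s\n", -ret, strerror(-ret));
}
```

Passing the result of sketch_alarm_timer_set(false) to sketch_report() prints the libc description of EOPNOTSUPP, whereas -524 has no userspace name and typically yields an "unknown error" style message.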
      
      Fixes: 1c6b39ad (alarmtimers: Return -ENOTSUPP if no RTC device is present)
      Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20190903171802.28314-1-cascardo@canonical.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3784576f
    • S
      arm64: dts: rockchip: limit clock rate of MMC controllers for RK3328 · 174bbcc5
      Shawn Lin committed
      commit 03e61929c0d227ed3e1c322fc3804216ea298b7e upstream.
      
      150MHz is a fundamental limitation of the RK3328 SoC. Without this
      limit, eMMC, for instance, will run at a 200MHz clock rate in HS200 mode,
      which makes RK3328 boards not always boot properly. Adding it in
      rk3328.dtsi also obviates the worry of missing it when adding new
      boards.
      
      Fixes: 52e02d37 ("arm64: dts: rockchip: add core dtsi file for RK3328 SoCs")
      Cc: stable@vger.kernel.org
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Liang Chen <cl@rock-chips.com>
      Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
      Signed-off-by: Heiko Stuebner <heiko@sntech.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      174bbcc5
    • W
      arm64: tlb: Ensure we execute an ISB following walk cache invalidation · 8cfe3b8a
      Will Deacon committed
      commit 51696d346c49c6cf4f29e9b20d6e15832a2e3408 upstream.
      
      05f2d2f8 ("arm64: tlbflush: Introduce __flush_tlb_kernel_pgtable")
      added a new TLB invalidation helper which is used when freeing
      intermediate levels of page table used for kernel mappings, but is
      missing the required ISB instruction after completion of the TLBI
      instruction.
      
      Add the missing barrier.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 05f2d2f8 ("arm64: tlbflush: Introduce __flush_tlb_kernel_pgtable")
      Reviewed-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      8cfe3b8a
    • W
      Revert "arm64: Remove unnecessary ISBs from set_{pte,pmd,pud}" · fc7d6bfd
      Will Deacon committed
      commit d0b7a302d58abe24ed0f32a0672dd4c356bb73db upstream.
      
      This reverts commit 24fe1b0e.
      
      Commit 24fe1b0e ("arm64: Remove unnecessary ISBs from
      set_{pte,pmd,pud}") removed ISB instructions immediately following updates
      to the page table, on the grounds that they are not required by the
      architecture and a DSB alone is sufficient to ensure that subsequent data
      accesses use the new translation:
      
        DDI0487E_a, B2-128:
      
        | ... no instruction that appears in program order after the DSB
        | instruction can alter any state of the system or perform any part of
        | its functionality until the DSB completes other than:
        |
        | * Being fetched from memory and decoded
        | * Reading the general-purpose, SIMD and floating-point,
        |   Special-purpose, or System registers that are directly or indirectly
        |   read without causing side-effects.
      
      However, the same document also states the following:
      
        DDI0487E_a, B2-125:
      
        | DMB and DSB instructions affect reads and writes to the memory system
        | generated by Load/Store instructions and data or unified cache
        | maintenance instructions being executed by the PE. Instruction fetches
        | or accesses caused by a hardware translation table access are not
        | explicit accesses.
      
      which appears to claim that the DSB alone is insufficient.  Unfortunately,
      some CPU designers have followed the second clause above, whereas in Linux
      we've been relying on the first. This means that our mapping sequence:
      
      	MOV	X0, <valid pte>
      	STR	X0, [Xptep]	// Store new PTE to page table
      	DSB	ISHST
      	LDR	X1, [X2]	// Translates using the new PTE
      
      can actually raise a translation fault on the load instruction because the
      translation can be performed speculatively before the page table update and
      then marked as "faulting" by the CPU. For user PTEs, this is ok because we
      can handle the spurious fault, but for kernel PTEs and intermediate table
      entries this results in a panic().
      
      Revert the offending commit to reintroduce the missing barriers.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 24fe1b0e ("arm64: Remove unnecessary ISBs from set_{pte,pmd,pud}")
      Reviewed-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      fc7d6bfd
    • L
      ARM: zynq: Use memcpy_toio instead of memcpy on smp bring-up · 881edc16
      Luis Araneda committed
      commit b7005d4ef4f3aa2dc24019ffba03a322557ac43d upstream.
      
      This fixes a kernel panic on memcpy when
      FORTIFY_SOURCE is enabled.
      
      The initial smp implementation on commit aa7eb2bb
      ("arm: zynq: Add smp support")
      used memcpy, which worked fine until commit ee333554
      ("ARM: 8749/1: Kconfig: Add ARCH_HAS_FORTIFY_SOURCE")
      enabled overflow checks at runtime, producing a read
      overflow panic.
      
      The computed sizes of the memcpy arguments are:
      - p_size (dst): 4294967295 = (size_t) -1
      - q_size (src): 1
      - size (len): 8
      
      Additionally, the memory is marked as __iomem, so one of
      the memcpy_* functions should be used for read/write.
      
      Fixes: aa7eb2bb ("arm: zynq: Add smp support")
      Signed-off-by: Luis Araneda <luaraneda@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Michal Simek <michal.simek@xilinx.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      881edc16
    • L
      ARM: samsung: Fix system restart on S3C6410 · 22092794
      Lihua Yao committed
      commit 16986074035cc0205472882a00d404ed9d213313 upstream.
      
      S3C6410 system restart is triggered by watchdog reset.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 9f55342c ("ARM: dts: s3c64xx: Fix infinite interrupt in soft mode")
      Signed-off-by: Lihua Yao <ylhuajnu@outlook.com>
      Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      22092794
    • A
      ASoC: Intel: Fix use of potentially uninitialized variable · ad884155
      Amadeusz Sławiński committed
      commit 810f3b860850148788fc1ed8a6f5f807199fed65 upstream.
      
      If ipc->ops.reply_msg_match is NULL, we may end up using uninitialized
      mask value.
      
      reported by smatch:
      sound/soc/intel/common/sst-ipc.c:266 sst_ipc_reply_find_msg() error: uninitialized symbol 'mask'.
      
      Signed-off-by: Amadeusz Sławiński <amadeuszx.slawinski@intel.com>
      Link: https://lore.kernel.org/r/20190827141712.21015-3-amadeuszx.slawinski@linux.intel.com
      Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ad884155
    • A
      ASoC: Intel: Skylake: Use correct function to access iomem space · 7bdab364
      Amadeusz Sławiński committed
      commit 17d29ff98fd4b70e9ccdac5e95e18a087e2737ef upstream.
      
      For copying from __iomem, we should use __ioread32_copy.
      
      reported by sparse:
      sound/soc/intel/skylake/skl-debug.c:437:34: warning: incorrect type in argument 1 (different address spaces)
      sound/soc/intel/skylake/skl-debug.c:437:34:    expected void [noderef] <asn:2> *to
      sound/soc/intel/skylake/skl-debug.c:437:34:    got unsigned char *
      sound/soc/intel/skylake/skl-debug.c:437:51: warning: incorrect type in argument 2 (different address spaces)
      sound/soc/intel/skylake/skl-debug.c:437:51:    expected void const *from
      sound/soc/intel/skylake/skl-debug.c:437:51:    got void [noderef] <asn:2> *[assigned] fw_reg_addr
      
      Signed-off-by: Amadeusz Sławiński <amadeuszx.slawinski@intel.com>
      Link: https://lore.kernel.org/r/20190827141712.21015-2-amadeuszx.slawinski@linux.intel.com
      Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      7bdab364
    • A
      ASoC: Intel: NHLT: Fix debug print format · 3c54f463
      Amadeusz Sławiński committed
      commit 855a06da37a773fd073d51023ac9d07988c87da8 upstream.
      
      oem_table_id is 8 chars long, so we need to limit it, otherwise it
      may print some unprintable characters into dmesg.
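
Limiting the printed length of a fixed-size field that may lack a NUL terminator can be done with a precision in the format string. The sketch below uses standard snprintf(); the struct and function names are hypothetical and only model an 8-byte identifier field.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical table header with an 8-byte, not NUL-terminated ID. */
struct sketch_nhlt_header {
    char oem_table_id[8];
};

/*
 * Format the ID into 'out'. The "%.8s" precision makes the formatter
 * read at most 8 bytes, so it never runs past the end of the field
 * even when all 8 bytes are used and no terminator is present.
 */
static int sketch_format_oem_table_id(const struct sketch_nhlt_header *h,
                                      char *out, size_t out_len)
{
    assert(h != NULL && out != NULL);
    return snprintf(out, out_len, "%.8s", h->oem_table_id);
}
```

A plain "%s" on the same field would keep reading until it found a zero byte somewhere in adjacent memory, which is how unprintable characters can end up in the log.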
      
      Signed-off-by: Amadeusz Sławiński <amadeuszx.slawinski@intel.com>
      Link: https://lore.kernel.org/r/20190827141712.21015-7-amadeuszx.slawinski@linux.intel.com
      Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3c54f463
    • K
      binfmt_elf: Do not move brk for INTERP-less ET_EXEC · 29ecf8ca
      Kees Cook committed
      commit 7be3cb019db1cbd5fd5ffe6d64a23fefa4b6f229 upstream.
      
      When brk was moved for binaries without an interpreter, it should have
      been limited to ET_DYN only. In other words, the special case was an
      ET_DYN that lacks an INTERP, not just an executable that lacks INTERP.
      The bug manifested for giant static executables, where the brk would end
      up in the middle of the text area on 32-bit architectures.
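
The corrected condition can be written as a small predicate. This is a sketch, not the loader code: the function name is hypothetical, and the two constants carry the ELF specification values for the e_type field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ELF e_type values as defined by the ELF specification. */
#define SKETCH_ET_EXEC 2
#define SKETCH_ET_DYN  3

/*
 * The brk is only relocated for the special case named in the text:
 * an ET_DYN image that has no INTERP segment. A static ET_EXEC binary
 * also lacks INTERP but must keep its brk where it is.
 */
static bool sketch_should_move_brk(uint16_t e_type, bool has_interp)
{
    assert(e_type == SKETCH_ET_EXEC || e_type == SKETCH_ET_DYN);
    return e_type == SKETCH_ET_DYN && !has_interp;
}
```

Checking only for a missing INTERP, as before the fix, would return true for the static ET_EXEC case as well.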
      
      Reported-and-tested-by: Richard Kojedzinszky <richard@kojedz.in>
      Fixes: bbdc6076d2e5 ("binfmt_elf: move brk out of mmap when doing direct loader exec")
      Cc: stable@vger.kernel.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      29ecf8ca
    • A
      media: don't drop front-end reference count for ->detach · 02ef5c29
      Arnd Bergmann committed
      commit 14e3cdbb00a885eedc95c0cf8eda8fe28d26d6b4 upstream.
      
      A bugfix introduced a link failure in configurations without CONFIG_MODULES:
      
      In file included from drivers/media/usb/dvb-usb/pctv452e.c:20:0:
      drivers/media/usb/dvb-usb/pctv452e.c: In function 'pctv452e_frontend_attach':
      drivers/media/dvb-frontends/stb0899_drv.h:151:36: error: weak declaration of 'stb0899_attach' being applied to a already existing, static definition
      
      The problem is that the !IS_REACHABLE() declaration of stb0899_attach()
      is a 'static inline' definition that clashes with the weak definition.
      
      I further observed that the bugfix was only done for one of the five users
      of stb0899_attach(), the other four still have the problem.  This reverts
      the bugfix and instead addresses the problem by not dropping the reference
      count when calling '->detach()', instead we call this function directly
      in dvb_frontend_put() before dropping the kref on the front-end.
      
      I first submitted this in early 2018, and after some discussion it
      was apparently discarded.  While there is a long-term plan in place,
      that plan is obviously not nearing completion yet, and the current
      kernel is still broken unless this patch is applied.
      
      Link: https://patchwork.kernel.org/patch/10140175/
      Link: https://patchwork.linuxtv.org/patch/54831/
      
      Cc: Max Kellermann <max.kellermann@gmail.com>
      Cc: Wolfgang Rohdewald <wolfgang@rohdewald.de>
      Cc: stable@vger.kernel.org
      Fixes: f686c143 ("[media] stb0899: move code to "detach" callback")
      Fixes: 6cdeaed3 ("media: dvb_usb_pctv452e: module refcount changes were unbalanced")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Sean Young <sean@mess.org>
      Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      02ef5c29
    • H
      media: sn9c20x: Add MSI MS-1039 laptop to flip_dmi_table · 589ca8ec
      Hans de Goede committed
      commit 7e0bb5828311f811309bed5749528ca04992af2f upstream.
      
      Like a bunch of other MSI laptops the MS-1039 uses a 0c45:627b
      SN9C201 + OV7660 webcam which is mounted upside down.
      
      Add it to the sn9c20x flip_dmi_table to deal with this.
      
      Cc: stable@vger.kernel.org
      Reported-by: Rui Salvaterra <rsalvaterra@gmail.com>
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
      Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
      Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      589ca8ec
    • S
      KVM: x86: Manually calculate reserved bits when loading PDPTRS · 496cf984
      Sean Christopherson committed
      commit 16cfacc8085782dab8e365979356ce1ca87fd6cc upstream.
      
      Manually generate the PDPTR reserved bit mask when explicitly loading
      PDPTRs.  The reserved bits that are being tracked by the MMU reflect the
      current paging mode, which is unlikely to be PAE paging in the vast
      majority of flows that use load_pdptrs(), e.g. CR0 and CR4 emulation,
      __set_sregs(), etc...  This can cause KVM to incorrectly signal a bad
      PDPTR, or more likely, miss a reserved bit check and subsequently fail
      a VM-Enter due to a bad VMCS.GUEST_PDPTR.
      
      Add a one off helper to generate the reserved bits instead of sharing
      code across the MMU's calculations and the PDPTR emulation.  The PDPTR
      reserved bits are basically set in stone, and pushing a helper into
      the MMU's calculation adds unnecessary complexity without improving
      readability.
      
      Opportunistically fix/update the comment for load_pdptrs().
      
      Note, the buggy commit also introduced a deliberate functional change,
      "Also remove bit 5-6 from rsvd_bits_mask per latest SDM.", which was
      effectively (and correctly) reverted by commit cd9ae5fe ("KVM: x86:
      Fix page-tables reserved bits").  A bit of SDM archaeology shows that
      the SDM from late 2008 had a bug (likely a copy+paste error) where it
      listed bits 6:5 as AVL and A for PDPTEs used for 4k entries but reserved
      for 2mb entries.  I.e. the SDM contradicted itself, and bits 6:5 are and
      always have been reserved.
      
      Fixes: 20c466b5 ("KVM: Use rsvd_bits_mask in load_pdptrs()")
      Cc: stable@vger.kernel.org
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Reported-by: Doug Reiland <doug.reiland@intel.com>
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Reviewed-by: Peter Xu <peterx@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      496cf984
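The "one off helper" described in "KVM: x86: Manually calculate reserved bits when loading PDPTRS" can be sketched as a standalone mask computation. This is a user-space illustration, not the kernel's code: the function names are hypothetical, and the bit ranges (2:1, 8:5, and everything at or above the physical address width) are an assumption based on the PAE PDPTE layout, consistent with the commit's statement that bits 6:5 are reserved.

```c
#include <stdint.h>

/* Mask with bits s..e (inclusive) set; assumes 0 <= s <= e <= 63. */
static uint64_t demo_rsvd_bits(int s, int e)
{
    int width = e - s + 1;
    uint64_t ones = (width == 64) ? ~UINT64_C(0)
                                  : ((UINT64_C(1) << width) - 1);
    return ones << s;
}

/* Reserved-bit mask for a PAE PDPTE, derived only from the guest's
 * physical address width (maxphyaddr, assumed to be below 64), not
 * from whatever paging mode the MMU currently tracks. */
static uint64_t demo_pdpte_rsvd_bits(int maxphyaddr)
{
    return demo_rsvd_bits(maxphyaddr, 63) |
           demo_rsvd_bits(5, 8) |
           demo_rsvd_bits(1, 2);
}

/* A non-present entry (bit 0 clear) is not checked; a present entry
 * is valid only if none of its reserved bits are set. */
static int demo_pdpte_is_valid(uint64_t pdpte, int maxphyaddr)
{
    if (!(pdpte & 1))
        return 1;
    return (pdpte & demo_pdpte_rsvd_bits(maxphyaddr)) == 0;
}
```

Because the mask depends only on the address width, it gives the same answer whether the check runs during CR0/CR4 emulation or __set_sregs(), which is the point the commit message makes.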
    • J
      KVM: x86: set ctxt->have_exception in x86_decode_insn() · 933e3e2b
      Jan Dakinevich committed
      commit c8848cee74ff05638e913582a476bde879c968ad upstream.
      
      x86_emulate_instruction() takes into account ctxt->have_exception flag
      during instruction decoding, but in practice this flag is never set in
      x86_decode_insn().
      
      Fixes: 6ea6e843 ("KVM: x86: inject exceptions produced by x86_decode_insn")
      Cc: stable@vger.kernel.org
      Cc: Denis Lunev <den@virtuozzo.com>
      Cc: Roman Kagan <rkagan@virtuozzo.com>
      Cc: Denis Plotnikov <dplotnikov@virtuozzo.com>
      Signed-off-by: Jan Dakinevich <jan.dakinevich@virtuozzo.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      933e3e2b
    • J
      KVM: x86: always stop emulation on page fault · 9723e445
      Jan Dakinevich committed
      commit 8530a79c5a9f4e29e6ffb35ec1a79d81f4968ec8 upstream.
      
      inject_emulated_exception() returns true if and only if a nested page
      fault happens. However, a page fault can come from a guest page table
      walk, either nested or not nested. In both cases we should stop the
      attempt to read under RIP and let the guest run its own page fault
      handler.
      
      This is also visible when an emulated instruction causes a #GP fault
      and the VMware backdoor is enabled.  To handle the VMware backdoor,
      KVM intercepts #GP faults; with only the next patch applied,
      x86_emulate_instruction() injects a #GP but returns EMULATE_FAIL
      instead of EMULATE_DONE.   EMULATE_FAIL causes handle_exception_nmi()
      (or gp_interception() for SVM) to re-inject the original #GP because it
      thinks emulation failed due to a non-VMware opcode.  This patch prevents
      the issue as x86_emulate_instruction() will return EMULATE_DONE after
      injecting the #GP.
      
      Fixes: 6ea6e843 ("KVM: x86: inject exceptions produced by x86_decode_insn")
      Cc: stable@vger.kernel.org
      Cc: Denis Lunev <den@virtuozzo.com>
      Cc: Roman Kagan <rkagan@virtuozzo.com>
      Cc: Denis Plotnikov <dplotnikov@virtuozzo.com>
      Signed-off-by: Jan Dakinevich <jan.dakinevich@virtuozzo.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      9723e445
    • H
      parisc: Disable HP HSC-PCI Cards to prevent kernel crash · 8225db4a
      Helge Deller committed
      commit 5fa1659105fac63e0f3c199b476025c2e04111ce upstream.
      
      The HP Dino PCI controller chip can be used in two variants: as on-board
      controller (e.g. in B160L), or on an Add-On card ("Card-Mode") to bridge
      PCI components to systems without a PCI bus, e.g. to a HSC/GSC bus.  One
      such Add-On card is the HP HSC-PCI Card which has one or more DEC Tulip
      PCI NIC chips connected to the on-card Dino PCI controller.
      
      Dino in Card-Mode has a big disadvantage: All PCI memory accesses need
      to go through the DINO_MEM_DATA register, so Linux drivers will not be
      able to use the ioremap() function. Without ioremap() many drivers will
      not work, one example is the tulip driver which then simply crashes the
      kernel if it tries to access the ports on the HP HSC card.
      
      This patch disables the HP HSC card if it finds one, and as such
      fixes the kernel crash on a HP D350/2 machine.
      Signed-off-by: Helge Deller <deller@gmx.de>
      Noticed-by: Phil Scarr <phil.scarr@pm.me>
      Cc: stable@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      8225db4a
    • V
      fuse: fix missing unlock_page in fuse_writepage() · ad411629
      Vasily Averin committed
      commit d5880c7a8620290a6c90ced7a0e8bd0ad9419601 upstream.
      
      unlock_page() was missing in case of an already in-flight write against the
      same page.
      Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
      Fixes: ff17be08 ("fuse: writepage: skip already in flight")
      Cc: <stable@vger.kernel.org> # v3.13
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ad411629
    • M
      powerpc/imc: Dont create debugfs files for cpu-less nodes · ecfe4b5f
      Madhavan Srinivasan committed
      commit 41ba17f20ea835c489e77bd54e2da73184e22060 upstream.
      
      Commit 684d9840 ('powerpc/powernv: Add debugfs interface for
      imc-mode and imc') added a debugfs interface for the nest imc pmu
      devices to support switching between different ucode modes, primarily
      as a debugging aid. But the code did not consider the case of
      cpu-less nodes, so reading the _cmd_ or _mode_ file of a cpu-less
      node causes this crash.
      
        Faulting instruction address: 0xc0000000000d0d58
        Oops: Kernel access of bad area, sig: 11 [#1]
        ...
        CPU: 67 PID: 5301 Comm: cat Not tainted 5.2.0-rc6-next-20190627+ #19
        NIP:  c0000000000d0d58 LR: c00000000049aa18 CTR:c0000000000d0d50
        REGS: c00020194548f9e0 TRAP: 0300   Not tainted  (5.2.0-rc6-next-20190627+)
        MSR:  9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR:28022822  XER: 00000000
        CFAR: c00000000049aa14 DAR: 000000000003fc08 DSISR:40000000 IRQMASK: 0
        ...
        NIP imc_mem_get+0x8/0x20
        LR  simple_attr_read+0x118/0x170
        Call Trace:
          simple_attr_read+0x70/0x170 (unreliable)
          debugfs_attr_read+0x6c/0xb0
          __vfs_read+0x3c/0x70
           vfs_read+0xbc/0x1a0
          ksys_read+0x7c/0x140
          system_call+0x5c/0x70
      
      Patch fixes the issue with a more robust check for vbase to NULL.
      
      Before patch, ls output for the debugfs imc directory
      
        # ls /sys/kernel/debug/powerpc/imc/
        imc_cmd_0    imc_cmd_251  imc_cmd_253  imc_cmd_255  imc_mode_0    imc_mode_251  imc_mode_253  imc_mode_255
        imc_cmd_250  imc_cmd_252  imc_cmd_254  imc_cmd_8    imc_mode_250  imc_mode_252  imc_mode_254  imc_mode_8
      
      After patch, ls output for the debugfs imc directory
      
        # ls /sys/kernel/debug/powerpc/imc/
        imc_cmd_0  imc_cmd_8  imc_mode_0  imc_mode_8
      
      Actual bug here is that, we have two loops with potentially different
      loop counts. That is, in imc_get_mem_addr_nest(), loop count is
      obtained from the dt entries. But in case of export_imc_mode_and_cmd(),
      loop was based on for_each_nid() count. Patch fixes the loop count in
      latter based on the struct mem_info. Ideally it would be better to
      have array size in struct imc_pmu.
      
      Fixes: 684d9840 ('powerpc/powernv: Add debugfs interface for imc-mode and imc')
      Reported-by: Qian Cai <cai@lca.pw>
      Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20190827101635.6942-1-maddy@linux.vnet.ibm.com
      Cc: Jan Stancek <jstancek@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ecfe4b5f
    • M
      scsi: implement .cleanup_rq callback · e94443fc
      Ming Lei committed
      [ Upstream commit b7e9e1fb7a9227be34ad4a5e778022c3164494cf ]
      
      Implement the .cleanup_rq() callback to free the driver-private part
      of the request. This avoids leaking that part when the request isn't
      completed by SCSI and is eventually freed by blk-mq or an upper layer
      (such as dm-rq).
      
      Cc: Ewan D. Milne <emilne@redhat.com>
      Cc: Bart Van Assche <bvanassche@acm.org>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: dm-devel@redhat.com
      Cc: <stable@vger.kernel.org>
      Fixes: 396eaf21 ("blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback")
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      e94443fc
    • M
      blk-mq: add callback of .cleanup_rq · 4ec3ca27
      Ming Lei committed
      [ Upstream commit 226b4fc75c78f9c497c5182d939101b260cfb9f3 ]
      
      SCSI maintains its own driver private data hooked off of each SCSI
      request, and the private data won't be freed after scsi_queue_rq()
      returns BLK_STS_RESOURCE or BLK_STS_DEV_RESOURCE. An upper layer driver
      (e.g. dm-rq) may need to retry these SCSI requests, before SCSI has
      fully dispatched them, due to a lower level SCSI driver's resource
      limitation identified in scsi_queue_rq(). Currently SCSI's per-request
      private data is leaked when the upper layer driver (dm-rq) frees and
      then retries these requests in response to BLK_STS_RESOURCE or
      BLK_STS_DEV_RESOURCE returns from scsi_queue_rq().
      
      This usecase is so specialized that it doesn't warrant training an
      existing blk-mq interface (e.g. blk_mq_free_request) to allow SCSI to
      account for freeing its driver private data -- doing so would add an
      extra branch for handling a special case that all other consumers of
      SCSI (and blk-mq) won't ever need to worry about.
      
      So the most pragmatic way forward is to delegate freeing SCSI driver
      private data to the upper layer driver (dm-rq).  Do so by adding
      new .cleanup_rq callback and calling a new blk_mq_cleanup_rq() method
      from dm-rq.  A following commit will implement the .cleanup_rq() hook
      in scsi_mq_ops.
      
      Cc: Ewan D. Milne <emilne@redhat.com>
      Cc: Bart Van Assche <bvanassche@acm.org>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: dm-devel@redhat.com
      Cc: <stable@vger.kernel.org>
      Fixes: 396eaf21 ("blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback")
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      4ec3ca27
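The optional-callback pattern described in "blk-mq: add callback of .cleanup_rq" can be modelled in user space roughly as follows. All demo_* types and functions are hypothetical stand-ins, not the kernel's structures; the sketch only shows how an upper layer can ask the owning driver to release its private data without the generic free path knowing about it.

```c
#include <stddef.h>
#include <stdlib.h>

struct demo_request;

/* Hypothetical per-driver operations table; cleanup_rq is optional. */
struct demo_mq_ops {
    void (*cleanup_rq)(struct demo_request *rq);
};

struct demo_request {
    const struct demo_mq_ops *ops;
    void *driver_priv; /* data owned by the lower-level driver */
};

/* Called by the upper layer before it frees and retries a request.
 * Drivers without private data simply leave the hook NULL. */
static void demo_blk_mq_cleanup_rq(struct demo_request *rq)
{
    if (rq->ops->cleanup_rq)
        rq->ops->cleanup_rq(rq);
}

/* A driver that does keep private data releases it in the hook. */
static void demo_scsi_cleanup_rq(struct demo_request *rq)
{
    free(rq->driver_priv);
    rq->driver_priv = NULL;
}

static const struct demo_mq_ops demo_scsi_ops = {
    .cleanup_rq = demo_scsi_cleanup_rq,
};

static const struct demo_mq_ops demo_plain_ops = {
    .cleanup_rq = NULL,
};
```

Keeping the hook optional means only the one consumer that needs it pays for it, which matches the commit's argument against adding a branch to the common free path.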
    • J
      ALSA: hda/realtek - PCI quirk for Medion E4254 · 4848fb93
      Jan-Marek Glogowski committed
      [ Upstream commit bd9c10bc663dd2eaac8fe39dad0f18cd21527446 ]
      
      The laptop has a combined jack to attach headsets on the right.
      The BIOS encodes them as two different colored jacks at the front,
      but otherwise it seems to be configured ok. However, adapting the
      pin config on its own doesn't make jack detection work in Linux,
      while it works correctly in Windows.
      
      This is somehow fixed by chaining ALC256_FIXUP_ASUS_HEADSET_MODE,
      which seems to register the microphone jack as a headset part and
      also results in fixing jack sensing, visible in dmesg as:
      
      -snd_hda_codec_realtek hdaudioC0D0:      Mic=0x19
      +snd_hda_codec_realtek hdaudioC0D0:      Headset Mic=0x19
      
      [ Actually the essential change is the location of the jack; the
        driver created "Front Mic Jack" without the matching volume / mute
        control element due to its jack location, which confused PA.
        -- tiwai ]
      Signed-off-by: Jan-Marek Glogowski <glogow@fbihome.de>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/8f4f9b20-0aeb-f8f1-c02f-fd53c09679f1@fbihome.de
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      4848fb93
    • Y
      ceph: use ceph_evict_inode to cleanup inode's resource · e9bcaf82
      Yan, Zheng committed
      [ Upstream commit 87bc5b895d94a0f40fe170d4cf5771c8e8f85d15 ]
      
      remove_session_caps() relies on __wait_on_freeing_inode(), to wait for
      freeing inode to remove its caps. But VFS wakes freeing inode waiters
      before calling destroy_inode().
      
      [ jlayton: mainline moved to ->free_inode before the original patch was
      	   merged. This backport reinstates ceph_destroy_inode and just
      	   has it do the call_rcu call. ]
      
      Cc: stable@vger.kernel.org
      Link: https://tracker.ceph.com/issues/40102
      Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      e9bcaf82
    • S
      Revert "ceph: use ceph_evict_inode to cleanup inode's resource" · 72f0fff3
      Sasha Levin committed
      This reverts commit 81281039.
      
      The backport was incorrect and was causing kernel panics. Revert and
      re-apply a correct backport from Jeff Layton.
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      72f0fff3
    • J
      randstruct: Check member structs in is_pure_ops_struct() · 98dc6d95
      Joonwon Kang committed
      commit 60f2c82ed20bde57c362e66f796cf9e0e38a6dbb upstream.
      
      While no uses in the kernel triggered this case, it was possible to have
      a false negative where a struct contains other structs which contain only
      function pointers because of unreachable code in is_pure_ops_struct().
      Signed-off-by: Joonwon Kang <kjw1627@gmail.com>
      Link: https://lore.kernel.org/r/20190727155841.GA13586@host
      Fixes: 313dd1b6 ("gcc-plugins: Add the randstruct plugin")
      Cc: stable@vger.kernel.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      98dc6d95
    • I
      IB/hfi1: Define variables as unsigned long to fix KASAN warning · ad6819cd
      Ira Weiny committed
      commit f8659d68e2bee5b86a1beaf7be42d942e1fc81f4 upstream.
      
      Define the working variables to be unsigned long to be compatible with
      for_each_set_bit and change types as needed.
      
      While we are at it remove unused variables from a couple of functions.
      
      This was found because of the following KASAN warning:
       ==================================================================
         BUG: KASAN: stack-out-of-bounds in find_first_bit+0x19/0x70
         Read of size 8 at addr ffff888362d778d0 by task kworker/u308:2/1889
      
         CPU: 21 PID: 1889 Comm: kworker/u308:2 Tainted: G W         5.3.0-rc2-mm1+ #2
         Hardware name: Intel Corporation W2600CR/W2600CR, BIOS SE5C600.86B.02.04.0003.102320141138 10/23/2014
         Workqueue: ib-comp-unb-wq ib_cq_poll_work [ib_core]
         Call Trace:
          dump_stack+0x9a/0xf0
          ? find_first_bit+0x19/0x70
          print_address_description+0x6c/0x332
          ? find_first_bit+0x19/0x70
          ? find_first_bit+0x19/0x70
          __kasan_report.cold.6+0x1a/0x3b
          ? find_first_bit+0x19/0x70
          kasan_report+0xe/0x12
          find_first_bit+0x19/0x70
          pma_get_opa_portstatus+0x5cc/0xa80 [hfi1]
          ? ret_from_fork+0x3a/0x50
          ? pma_get_opa_port_ectrs+0x200/0x200 [hfi1]
          ? stack_trace_consume_entry+0x80/0x80
          hfi1_process_mad+0x39b/0x26c0 [hfi1]
          ? __lock_acquire+0x65e/0x21b0
          ? clear_linkup_counters+0xb0/0xb0 [hfi1]
          ? check_chain_key+0x1d7/0x2e0
          ? lock_downgrade+0x3a0/0x3a0
          ? match_held_lock+0x2e/0x250
          ib_mad_recv_done+0x698/0x15e0 [ib_core]
          ? clear_linkup_counters+0xb0/0xb0 [hfi1]
          ? ib_mad_send_done+0xc80/0xc80 [ib_core]
          ? mark_held_locks+0x79/0xa0
          ? _raw_spin_unlock_irqrestore+0x44/0x60
          ? rvt_poll_cq+0x1e1/0x340 [rdmavt]
          __ib_process_cq+0x97/0x100 [ib_core]
          ib_cq_poll_work+0x31/0xb0 [ib_core]
          process_one_work+0x4ee/0xa00
          ? pwq_dec_nr_in_flight+0x110/0x110
          ? do_raw_spin_lock+0x113/0x1d0
          worker_thread+0x57/0x5a0
          ? process_one_work+0xa00/0xa00
          kthread+0x1bb/0x1e0
          ? kthread_create_on_node+0xc0/0xc0
          ret_from_fork+0x3a/0x50
      
         The buggy address belongs to the page:
         page:ffffea000d8b5dc0 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0
         flags: 0x17ffffc0000000()
         raw: 0017ffffc0000000 0000000000000000 ffffea000d8b5dc8 0000000000000000
         raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
         page dumped because: kasan: bad access detected
      
         addr ffff888362d778d0 is located in stack of task kworker/u308:2/1889 at offset 32 in frame:
          pma_get_opa_portstatus+0x0/0xa80 [hfi1]
      
         this frame has 1 object:
          [32, 36) 'vl_select_mask'
      
         Memory state around the buggy address:
          ffff888362d77780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
          ffff888362d77800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
         >ffff888362d77880: 00 00 00 00 00 00 f1 f1 f1 f1 04 f2 f2 f2 00 00
                                                          ^
          ffff888362d77900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
          ffff888362d77980: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 04 f2 f2 f2
      
       ==================================================================
      
      Cc: <stable@vger.kernel.org>
      Fixes: 77241056 ("IB/hfi1: add driver files")
      Link: https://lore.kernel.org/r/20190911113053.126040.47327.stgit@awfm-01.aw.intel.com
      Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Ira Weiny <ira.weiny@intel.com>
      Signed-off-by: Kaike Wan <kaike.wan@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ad6819cd
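The KASAN report in "IB/hfi1: Define variables as unsigned long to fix KASAN warning" shows an 8-byte read of the 4-byte stack object 'vl_select_mask'. A user-space sketch of the remedy, widening the value into an unsigned long before any word-based bit iteration touches it, might look like this. The helper names are hypothetical, and the loop is a simplified stand-in for for_each_set_bit, not its implementation.

```c
#include <stdint.h>

/* Simplified stand-in for a bitmap walk: it dereferences a full
 * unsigned long, so the caller must really own that many bytes.
 * Assumes nbits does not exceed the width of unsigned long. */
static unsigned int demo_count_set_bits(const unsigned long *word,
                                        unsigned int nbits)
{
    unsigned int n = 0;

    for (unsigned int bit = 0; bit < nbits; bit++)
        if (*word & (1UL << bit))
            n++;
    return n;
}

/* Copy the 32-bit mask into an unsigned long working variable first,
 * instead of casting the address of a 32-bit object, which would let
 * the walk read past the end of it on 64-bit systems. */
static unsigned int demo_count_selected_vls(uint32_t vl_select_mask_in)
{
    unsigned long vl_select_mask = vl_select_mask_in;

    return demo_count_set_bits(&vl_select_mask, 32);
}
```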
    • D
      IB/mlx5: Free mpi in mp_slave mode · a924850c
      Danit Goldberg committed
      commit 5d44adebbb7e785939df3db36ac360f5e8b73e44 upstream.
      
      ib_add_slave_port() allocates a multiport struct but never frees it.
      Don't leak memory, free the allocated mpi struct during driver unload.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 32f69e4b ("{net, IB}/mlx5: Manage port association for multiport RoCE")
      Link: https://lore.kernel.org/r/20190916064818.19823-3-leon@kernel.org
      Signed-off-by: Danit Goldberg <danitg@mellanox.com>
      Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a924850c
    • V
      printk: Do not lose last line in kmsg buffer dump · 40b07199
      Vincent Whitchurch committed
      commit c9dccacfccc72c32692eedff4a27a4b0833a2afd upstream.
      
      kmsg_dump_get_buffer() is supposed to select all the youngest log
      messages which fit into the provided buffer.  It determines the correct
      start index by using msg_print_text() with a NULL buffer to calculate
      the size of each entry.  However, when performing the actual writes,
      msg_print_text() only writes the entry to the buffer if the written len
      is less than the size of the buffer.  So if the lengths of the
      selected youngest log messages happen to precisely fill up the provided
      buffer, the last log message is not included.
      
      We don't want to modify msg_print_text() to fill up the buffer and start
      returning a length which is equal to the size of the buffer, since
      callers of its other users, such as kmsg_dump_get_line(), depend upon
      the current behaviour.
      
      Instead, fix kmsg_dump_get_buffer() to compensate for this.
      
      For example, with the following two final prints:
      
      [    6.427502] AAAAAAAAAAAAA
      [    6.427769] BBBBBBBB12345
      
      A dump of a 64-byte buffer filled by kmsg_dump_get_buffer(), before this
      patch:
      
       00000000: 3c 30 3e 5b 20 20 20 20 36 2e 35 32 32 31 39 37  <0>[    6.522197
       00000010: 5d 20 41 41 41 41 41 41 41 41 41 41 41 41 41 0a  ] AAAAAAAAAAAAA.
       00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
       00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      
      After this patch:
      
       00000000: 3c 30 3e 5b 20 20 20 20 36 2e 34 35 36 36 37 38  <0>[    6.456678
       00000010: 5d 20 42 42 42 42 42 42 42 42 31 32 33 34 35 0a  ] BBBBBBBB12345.
       00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
       00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      
      Link: http://lkml.kernel.org/r/20190711142937.4083-1-vincent.whitchurch@axis.com
      Fixes: e2ae715d ("kmsg - kmsg_dump() use iterator to receive log buffer content")
      To: rostedt@goodmis.org
      Cc: linux-kernel@vger.kernel.org
      Cc: <stable@vger.kernel.org> # v3.5+
      Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
      Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: Petr Mladek <pmladek@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      40b07199
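The mismatch described in "printk: Do not lose last line in kmsg buffer dump" can be reproduced with a small user-space model. It is a sketch under assumptions: entries are plain strings, the writer skips any entry that would make the total reach the buffer size, and the start-index selection is switched between a "fits if total <= size" rule (modelling the old behaviour) and a "fits if total < size" rule (modelling the fix). None of the demo_* names exist in the kernel.

```c
#include <stddef.h>
#include <string.h>

/* Writer model: an entry is copied only while the running total stays
 * strictly below the buffer size; otherwise that entry is skipped. */
static size_t demo_write_entries(const char *const *entries, size_t count,
                                 size_t start, char *buf, size_t size)
{
    size_t len = 0;

    for (size_t i = start; i < count; i++) {
        size_t n = strlen(entries[i]);

        if (len + n >= size)
            continue;
        memcpy(buf + len, entries[i], n);
        len += n;
    }
    return len;
}

/* Selection model: drop the oldest entries until the rest "fit".
 * strict == 0: total <= size counts as fitting (old behaviour).
 * strict != 0: total <  size is required, matching the writer. */
static size_t demo_pick_start(const char *const *entries, size_t count,
                              size_t size, int strict)
{
    size_t total = 0;
    size_t start = 0;

    for (size_t i = 0; i < count; i++)
        total += strlen(entries[i]);

    while (start < count && (strict ? total >= size : total > size)) {
        total -= strlen(entries[start]);
        start++;
    }
    return start;
}
```

With two 8-byte entries and a 16-byte buffer, the non-strict selection starts at the older entry and the writer then drops the newer one, while the strict selection starts at the newer entry. That mirrors the before and after dumps in the commit message.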
    • Q
      scsi: qla2xxx: Fix Relogin to prevent modifying scan_state flag · 28f142b9
      Quinn Tran committed
      commit 8b5292bcfcacf15182a77a973a98d310e76fd58b upstream.
      
      Relogin fails to move forward because the scan_state flag indicates
      the device is not there. Before the relogin process, the session
      delete process accidentally modified the scan_state flag.
      
      [mkp: typos plus corrected Fixes: sha as reported by sfr]
      
      Fixes: 2dee5521 ("scsi: qla2xxx: Fix login state machine freeze")
      Cc: stable@vger.kernel.org
      Signed-off-by: Quinn Tran <qutran@marvell.com>
      Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      28f142b9
    • M
      scsi: scsi_dh_rdac: zero cdb in send_mode_select() · 03b75e65
      Martin Wilck committed
      commit 57adf5d4cfd3198aa480e7c94a101fc8c4e6109d upstream.
      
      cdb in send_mode_select() is not zeroed and is only partially filled in
      rdac_failover_get(), which leads to some random data getting to the
      device. Users have reported storage responding to such commands with
      INVALID FIELD IN CDB. Code before commit 32782557 was not affected, as
      it called blk_rq_set_block_pc().
      
      Fix this by zeroing out the cdb first.
      
      Identified & fix proposed by HPE.
      
      Fixes: 32782557 ("scsi_dh_rdac: switch to scsi_execute_req_flags()")
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20190904155205.1666-1-martin.wilck@suse.com
      Signed-off-by: Martin Wilck <mwilck@suse.com>
      Acked-by: Ales Novak <alnovak@suse.cz>
      Reviewed-by: Shane Seymour <shane.seymour@hpe.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      03b75e65
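The fix in "scsi: scsi_dh_rdac: zero cdb in send_mode_select()" amounts to clearing the command block before filling in only some of its bytes. A minimal user-space sketch, assuming a 16-byte CDB buffer and a MODE SELECT(10) layout (opcode 0x55, parameter list length in bytes 7 and 8), is shown below. The constant and function names are hypothetical and this is not the driver's code.

```c
#include <string.h>

#define DEMO_CDB_LEN        16    /* assumed buffer size */
#define DEMO_MODE_SELECT_10 0x55  /* MODE SELECT(10) opcode */

/* Build a partially filled CDB. Zeroing first guarantees that every
 * byte not set explicitly below reaches the device as 0 rather than
 * as leftover stack contents. */
static void demo_build_mode_select(unsigned char cdb[DEMO_CDB_LEN],
                                   unsigned int data_size)
{
    memset(cdb, 0, DEMO_CDB_LEN);
    cdb[0] = DEMO_MODE_SELECT_10;
    cdb[7] = (data_size >> 8) & 0xff; /* parameter list length, MSB */
    cdb[8] = data_size & 0xff;        /* parameter list length, LSB */
}
```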
    • T
      ALSA: firewire-tascam: check intermediate state of clock status and retry · 2e21e5b2
      Takashi Sakamoto committed
      commit e1a00b5b253a4f97216b9a33199a863987075162 upstream.
      
      2 bytes in MSB of register for clock status is zero during intermediate
      state after changing status of sampling clock in models of TASCAM FireWire
      series. The duration of this state differs depending on cases. During the
      state, it's better to retry reading the register for current status of
      the clock.
      
      In current implementation, the intermediate state is checked only when
      getting current sampling transmission frequency, then retry reading.
      This care is required for the other operations to read the register.
      
      This commit moves the check-and-retry code into a helper function
      shared by all operations that read the register.
      
      Fixes: e453df44 ("ALSA: firewire-tascam: add PCM functionality")
      Cc: <stable@vger.kernel.org> # v4.4+
      Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
      Link: https://lore.kernel.org/r/20190910135152.29800-3-o-takashi@sakamocchi.jp
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      2e21e5b2
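The check-and-retry helper described in "ALSA: firewire-tascam: check intermediate state of clock status and retry" can be sketched in user space as follows. The sketch assumes that "2 bytes in MSB" corresponds to the mask 0xffff0000, that the caller picks the retry budget, and that a register read is supplied as a callback; the real driver would also wait between attempts. All demo_* names are hypothetical.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical register reader: returns 0 on success, <0 on error. */
typedef int (*demo_read_reg_fn)(void *ctx, uint32_t *value);

/* Read the clock status, retrying while the upper 16 bits are zero
 * (the intermediate state). Gives up with -EAGAIN after max_tries. */
static int demo_get_clock_status(demo_read_reg_fn read_reg, void *ctx,
                                 unsigned int max_tries, uint32_t *status)
{
    for (unsigned int i = 0; i < max_tries; i++) {
        uint32_t value;
        int err = read_reg(ctx, &value);

        if (err < 0)
            return err;
        if (value & 0xffff0000u) {
            *status = value;
            return 0;
        }
        /* Still in the intermediate state: try again. */
    }
    return -EAGAIN;
}

/* Fake device for experimenting: reports the intermediate state for
 * the first ready_after reads, then the final value. */
struct demo_fake_reg {
    unsigned int reads;
    unsigned int ready_after;
    uint32_t final_value;
};

static int demo_fake_read(void *ctx, uint32_t *value)
{
    struct demo_fake_reg *reg = ctx;

    reg->reads++;
    *value = (reg->reads > reg->ready_after) ? reg->final_value
                                             : 0x00000002u;
    return 0;
}
```

Putting the retry in one helper means every operation that reads the register gets the same handling of the intermediate state, instead of only the sampling-rate query.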
    • T
      ALSA: firewire-tascam: handle error code when getting current source of clock · f5779e44
      Takashi Sakamoto committed
      commit 2617120f4de6d0423384e0e86b14c78b9de84d5a upstream.
      
      The return value of snd_tscm_stream_get_clock() is ignored. This commit
      checks the value and handles the error.
      
      Fixes: e453df44 ("ALSA: firewire-tascam: add PCM functionality")
      Cc: <stable@vger.kernel.org> # v4.4+
      Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
      Link: https://lore.kernel.org/r/20190910135152.29800-2-o-takashi@sakamocchi.jp
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      f5779e44
    • L
      iwlwifi: fw: don't send GEO_TX_POWER_LIMIT command to FW version 36 · fdd131ea
      Luca Coelho committed
      commit fddbfeece9c7882cc47754c7da460fe427e3e85b upstream.
      
      The intention was to have the GEO_TX_POWER_LIMIT command in FW version
      36 as well, but not all 8000 family got this feature enabled.  The
      8000 family is the only one using version 36, so skip this version
      entirely.  If we try to send this command to the firmwares that do not
      support it, we get a BAD_COMMAND response from the firmware.
      
      This fixes https://bugzilla.kernel.org/show_bug.cgi?id=204151.
      
      Cc: stable@vger.kernel.org # 4.19+
      Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
      Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      fdd131ea
    • M
      PM / devfreq: passive: fix compiler warning · 6437ec27
      MyungJoo Ham committed
      [ Upstream commit 0465814831a926ce2f83e8f606d067d86745234e ]
      
      The recent commit
      "PM / devfreq: passive: Use non-devm notifiers"
      introduced a compiler warning, "unused variable 'dev'".
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      6437ec27