1. 23 Jan, 2019 40 commits
    • J
      blockdev: Fix livelocks on loop device · 1e11b1d6
      Jan Kara committed
      commit 04906b2f542c23626b0ef6219b808406f8dddbe9 upstream.
      
      bd_set_size() also updates the block device's block size. This is somewhat
      unexpected given its name, and at this point only blkdev_open() uses this
      functionality. Furthermore, this can result in changing the block size under
      a filesystem mounted on a loop device, which leads to livelocks inside
      __getblk_gfp() like:
      
      Sending NMI from CPU 0 to CPUs 1:
      NMI backtrace for cpu 1
      CPU: 1 PID: 10863 Comm: syz-executor0 Not tainted 4.18.0-rc5+ #151
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google
      01/01/2011
      RIP: 0010:__sanitizer_cov_trace_pc+0x3f/0x50 kernel/kcov.c:106
      ...
      Call Trace:
       init_page_buffers+0x3e2/0x530 fs/buffer.c:904
       grow_dev_page fs/buffer.c:947 [inline]
       grow_buffers fs/buffer.c:1009 [inline]
       __getblk_slow fs/buffer.c:1036 [inline]
       __getblk_gfp+0x906/0xb10 fs/buffer.c:1313
       __bread_gfp+0x2d/0x310 fs/buffer.c:1347
       sb_bread include/linux/buffer_head.h:307 [inline]
       fat12_ent_bread+0x14e/0x3d0 fs/fat/fatent.c:75
       fat_ent_read_block fs/fat/fatent.c:441 [inline]
       fat_alloc_clusters+0x8ce/0x16e0 fs/fat/fatent.c:489
       fat_add_cluster+0x7a/0x150 fs/fat/inode.c:101
       __fat_get_block fs/fat/inode.c:148 [inline]
      ...
      
      Trivial reproducer for the problem looks like:
      
      truncate -s 1G /tmp/image
      losetup /dev/loop0 /tmp/image
      mkfs.ext4 -b 1024 /dev/loop0
      mount -t ext4 /dev/loop0 /mnt
      losetup -c /dev/loop0
      ls /mnt
      
      Fix the problem by moving initialization of a block device's block size
      into a separate function and calling it when needed.
      
      Thanks to Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> for help with
      debugging the problem.
      
      Reported-by: syzbot+9933e4476f365f5d5a1b@syzkaller.appspotmail.com
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • S
      selinux: fix GPF on invalid policy · 5a79e71e
      Stephen Smalley committed
      commit 5b0e7310a2a33c06edc7eb81ffc521af9b2c5610 upstream.
      
      levdatum->level can be NULL if we encounter an error while loading
      the policy during sens_read prior to initializing it.  Make sure
      sens_destroy handles that case correctly.
      
      Reported-by: syzbot+6664500f0f18f07a5c0e@syzkaller.appspotmail.com
      Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
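The NULL guard described in the selinux fix above can be sketched in user-space C. This is only an illustration under stated assumptions: `level_sketch`, `levdatum_sketch` and `sens_destroy_sketch` are hypothetical stand-ins rather than the kernel's definitions, and `free()` stands in for the kernel's deallocation helpers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the policy structures; not the kernel types. */
struct level_sketch {
    int *cat;                     /* stands in for the category bitmap */
};

struct levdatum_sketch {
    struct level_sketch *level;   /* may still be NULL if loading failed early */
};

/*
 * Destroy a possibly half-initialized datum.
 * Returns 1 if a level was present and released, 0 otherwise.
 */
static int sens_destroy_sketch(struct levdatum_sketch *datum)
{
    int had_level = 0;

    if (datum) {
        if (datum->level) {       /* guard: level may never have been set */
            free(datum->level->cat);
            had_level = 1;
        }
        free(datum->level);       /* free(NULL) is a no-op */
        free(datum);
    }
    return had_level;
}
```

Calling `sens_destroy_sketch()` on a datum whose `level` is still NULL simply releases the datum itself instead of dereferencing the missing level.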
    • Y
      block: use rcu_work instead of call_rcu to avoid sleep in softirq · 4cc66cc4
      Yufen Yu committed
      commit 94a2c3a32b62e868dc1e3d854326745a7f1b8c7a upstream.
      
      We recently got a stack trace from syzkaller like this:
      
      BUG: sleeping function called from invalid context at mm/slab.h:361
      in_atomic(): 1, irqs_disabled(): 0, pid: 6644, name: blkid
      INFO: lockdep is turned off.
      CPU: 1 PID: 6644 Comm: blkid Not tainted 4.4.163-514.55.6.9.x86_64+ #76
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
       0000000000000000 5ba6a6b879e50c00 ffff8801f6b07b10 ffffffff81cb2194
       0000000041b58ab3 ffffffff833c7745 ffffffff81cb2080 5ba6a6b879e50c00
       0000000000000000 0000000000000001 0000000000000004 0000000000000000
      Call Trace:
       <IRQ>  [<ffffffff81cb2194>] __dump_stack lib/dump_stack.c:15 [inline]
       <IRQ>  [<ffffffff81cb2194>] dump_stack+0x114/0x1a0 lib/dump_stack.c:51
       [<ffffffff8129a981>] ___might_sleep+0x291/0x490 kernel/sched/core.c:7675
       [<ffffffff8129ac33>] __might_sleep+0xb3/0x270 kernel/sched/core.c:7637
       [<ffffffff81794c13>] slab_pre_alloc_hook mm/slab.h:361 [inline]
       [<ffffffff81794c13>] slab_alloc_node mm/slub.c:2610 [inline]
       [<ffffffff81794c13>] slab_alloc mm/slub.c:2692 [inline]
       [<ffffffff81794c13>] kmem_cache_alloc_trace+0x2c3/0x5c0 mm/slub.c:2709
       [<ffffffff81cbe9a7>] kmalloc include/linux/slab.h:479 [inline]
       [<ffffffff81cbe9a7>] kzalloc include/linux/slab.h:623 [inline]
       [<ffffffff81cbe9a7>] kobject_uevent_env+0x2c7/0x1150 lib/kobject_uevent.c:227
       [<ffffffff81cbf84f>] kobject_uevent+0x1f/0x30 lib/kobject_uevent.c:374
       [<ffffffff81cbb5b9>] kobject_cleanup lib/kobject.c:633 [inline]
       [<ffffffff81cbb5b9>] kobject_release+0x229/0x440 lib/kobject.c:675
       [<ffffffff81cbb0a2>] kref_sub include/linux/kref.h:73 [inline]
       [<ffffffff81cbb0a2>] kref_put include/linux/kref.h:98 [inline]
       [<ffffffff81cbb0a2>] kobject_put+0x72/0xd0 lib/kobject.c:692
       [<ffffffff8216f095>] put_device+0x25/0x30 drivers/base/core.c:1237
       [<ffffffff81c4cc34>] delete_partition_rcu_cb+0x1d4/0x2f0 block/partition-generic.c:232
       [<ffffffff813c08bc>] __rcu_reclaim kernel/rcu/rcu.h:118 [inline]
       [<ffffffff813c08bc>] rcu_do_batch kernel/rcu/tree.c:2705 [inline]
       [<ffffffff813c08bc>] invoke_rcu_callbacks kernel/rcu/tree.c:2973 [inline]
       [<ffffffff813c08bc>] __rcu_process_callbacks kernel/rcu/tree.c:2940 [inline]
       [<ffffffff813c08bc>] rcu_process_callbacks+0x59c/0x1c70 kernel/rcu/tree.c:2957
       [<ffffffff8120f509>] __do_softirq+0x299/0xe20 kernel/softirq.c:273
       [<ffffffff81210496>] invoke_softirq kernel/softirq.c:350 [inline]
       [<ffffffff81210496>] irq_exit+0x216/0x2c0 kernel/softirq.c:391
       [<ffffffff82c2cd7b>] exiting_irq arch/x86/include/asm/apic.h:652 [inline]
       [<ffffffff82c2cd7b>] smp_apic_timer_interrupt+0x8b/0xc0 arch/x86/kernel/apic/apic.c:926
       [<ffffffff82c2bc25>] apic_timer_interrupt+0xa5/0xb0 arch/x86/entry/entry_64.S:746
       <EOI>  [<ffffffff814cbf40>] ? audit_kill_trees+0x180/0x180
       [<ffffffff8187d2f7>] fd_install+0x57/0x80 fs/file.c:626
       [<ffffffff8180989e>] do_sys_open+0x45e/0x550 fs/open.c:1043
       [<ffffffff818099c2>] SYSC_open fs/open.c:1055 [inline]
       [<ffffffff818099c2>] SyS_open+0x32/0x40 fs/open.c:1050
       [<ffffffff82c299e1>] entry_SYSCALL_64_fastpath+0x1e/0x9a
      
      In softirq context, we call the RCU callback function delete_partition_rcu_cb(),
      which may allocate memory by kzalloc with the GFP_KERNEL flag. If the
      allocation cannot be satisfied immediately, it may sleep. However, that is not
      allowed in softirq context.
      
      Although we found this problem on linux 4.4, the latest kernel version
      seems to have this problem as well. And it is very similar to the
      previous one:
      	https://lkml.org/lkml/2018/7/9/391
      
      Fix it by using an RCU workqueue, which is allowed to sleep.
      
      Reviewed-by: Paul E. McKenney <paulmck@linux.ibm.com>
      Signed-off-by: Yufen Yu <yuyufen@huawei.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • S
      netfilter: ebtables: account ebt_table_info to kmemcg · 2663bcba
      Shakeel Butt committed
      commit e2c8d550a973bb34fc28bc8d0ec996f84562fb8a upstream.
      
      The [ip,ip6,arp]_tables use x_tables_info internally and the underlying
      memory is already accounted to kmemcg. Do the same for ebtables. The
      syzbot, by using setsockopt(EBT_SO_SET_ENTRIES), was able to OOM the
      whole system from a restricted memcg, a potential DoS.
      
      By accounting the ebt_table_info, the memory used for ebt_table_info can
      be contained within the memcg of the allocating process. However the
      lifetime of ebt_table_info is independent of the allocating process and
      is tied to the network namespace. So, the oom-killer will not be able to
      relieve the memory pressure due to ebt_table_info memory. The memory for
      ebt_table_info is allocated through vmalloc. Currently vmalloc does not
      handle the oom-killed allocating process correctly and one large
      allocation can bypass memcg limit enforcement. So, with this patch,
      at least the small allocations will be contained. For large allocations,
      we need to fix vmalloc.
      
      Reported-by: syzbot+7713f3aa67be76b1552c@syzkaller.appspotmail.com
      Signed-off-by: Shakeel Butt <shakeelb@google.com>
      Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • J
      sunrpc: handle ENOMEM in rpcb_getport_async · 61b29bed
      J. Bruce Fields committed
      commit 81c88b18de1f11f70c97f28ced8d642c00bb3955 upstream.
      
      If we ignore the error we'll hit a null dereference a little later.
      
      Reported-by: syzbot+4b98281f2401ab849f4b@syzkaller.appspotmail.com
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • H
      media: vb2: vb2_mmap: move lock up · c4f39cba
      Hans Verkuil committed
      commit cd26d1c4d1bc947b56ae404998ae2276df7b39b7 upstream.
      
      If a filehandle is dup()ped, then it is possible to close it from one fd
      and call mmap from the other. This creates a race condition in vb2_mmap
      where it is using queue data that __vb2_queue_free (called from close())
      is in the process of releasing.
      
      By moving up the mutex_lock(mmap_lock) in vb2_mmap this race is avoided
      since __vb2_queue_free is called with the same mutex locked. So vb2_mmap
      now reads consistent buffer data.
      
      Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
      Reported-by: syzbot+be93025dd45dccd8923c@syzkaller.appspotmail.com
      Signed-off-by: Hans Verkuil <hansverk@cisco.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • J
      LSM: Check for NULL cred-security on free · a19aedf1
      James Morris committed
      commit a5795fd38ee8194451ba3f281f075301a3696ce2 upstream.
      
      From: Casey Schaufler <casey@schaufler-ca.com>
      
      Check that the cred security blob has been set before trying
      to clean it up. There is a case during credential initialization
      that could result in this.
      
      Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
      Acked-by: John Johansen <john.johansen@canonical.com>
      Signed-off-by: James Morris <james.morris@microsoft.com>
      Reported-by: syzbot+69ca07954461f189e808@syzkaller.appspotmail.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • E
      ipv6: make icmp6_send() robust against null skb->dev · a72e572f
      Eric Dumazet committed
      commit 8d933670452107e41165bea70a30dffbd281bef1 upstream.
      
      syzbot was able to crash one host with the following stack trace:
      
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] PREEMPT SMP KASAN
      CPU: 0 PID: 8625 Comm: syz-executor4 Not tainted 4.20.0+ #8
      RIP: 0010:dev_net include/linux/netdevice.h:2169 [inline]
      RIP: 0010:icmp6_send+0x116/0x2d30 net/ipv6/icmp.c:426
       icmpv6_send
       smack_socket_sock_rcv_skb
       security_sock_rcv_skb
       sk_filter_trim_cap
       __sk_receive_skb
       dccp_v6_do_rcv
       release_sock
      
      This is because an RX packet found the socket owned by the user and
      was stored in the socket backlog. Before leaving the RCU-protected section,
      skb->dev was cleared in __sk_receive_skb(). When the socket backlog
      was finally handled at release_sock() time, the skb was fed to
      smack_socket_sock_rcv_skb() and then icmp6_send().
      
      We could fix the bug in smack_socket_sock_rcv_skb(), or simply
      make icmp6_send() more robust against such a possibility.
      
      In the future we might provide the net pointer to icmp6_send()
      instead of inferring it.
      
      Fixes: d66a8acb ("Smack: Inform peer that IPv6 traffic has been blocked")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Piotr Sawicki <p.sawicki2@partner.samsung.com>
      Cc: Casey Schaufler <casey@schaufler-ca.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Acked-by: Casey Schaufler <casey@schaufler-ca.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
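The defensive check described in the ipv6 fix above can be sketched in user-space C as an early return when the device pointer is missing. The types and the function `icmp6_send_sketch` are hypothetical simplifications for illustration; they are not the kernel's `struct sk_buff` or its `icmp6_send()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct net_sketch {
    int id;
};

struct net_device_sketch {
    struct net_sketch *net;
};

struct sk_buff_sketch {
    struct net_device_sketch *dev;   /* may have been cleared to NULL */
};

/*
 * Returns true if an error message would be generated, false if the
 * function bails out early because no device is attached to the skb.
 */
static bool icmp6_send_sketch(const struct sk_buff_sketch *skb)
{
    const struct net_sketch *net;

    if (!skb->dev)                   /* guard before inferring the netns */
        return false;

    net = skb->dev->net;             /* safe: dev is known to be non-NULL */
    return net != NULL;
}
```

The point of the sketch is the ordering: the NULL test comes before any attempt to derive the network namespace from the device.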
    • W
      bpf: in __bpf_redirect_no_mac pull mac only if present · 341906cb
      Willem de Bruijn committed
      commit e7c87bd6cc4ec7b0ac1ed0a88a58f8206c577488 upstream.
      
      Syzkaller was able to construct a packet of negative length by
      redirecting from bpf_prog_test_run_skb with BPF_PROG_TYPE_LWT_XMIT:
      
          BUG: KASAN: slab-out-of-bounds in memcpy include/linux/string.h:345 [inline]
          BUG: KASAN: slab-out-of-bounds in skb_copy_from_linear_data include/linux/skbuff.h:3421 [inline]
          BUG: KASAN: slab-out-of-bounds in __pskb_copy_fclone+0x2dd/0xeb0 net/core/skbuff.c:1395
          Read of size 4294967282 at addr ffff8801d798009c by task syz-executor2/12942
      
          kasan_report.cold.9+0x242/0x309 mm/kasan/report.c:412
          check_memory_region_inline mm/kasan/kasan.c:260 [inline]
          check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267
          memcpy+0x23/0x50 mm/kasan/kasan.c:302
          memcpy include/linux/string.h:345 [inline]
          skb_copy_from_linear_data include/linux/skbuff.h:3421 [inline]
          __pskb_copy_fclone+0x2dd/0xeb0 net/core/skbuff.c:1395
          __pskb_copy include/linux/skbuff.h:1053 [inline]
          pskb_copy include/linux/skbuff.h:2904 [inline]
          skb_realloc_headroom+0xe7/0x120 net/core/skbuff.c:1539
          ipip6_tunnel_xmit net/ipv6/sit.c:965 [inline]
          sit_tunnel_xmit+0xe1b/0x30d0 net/ipv6/sit.c:1029
          __netdev_start_xmit include/linux/netdevice.h:4325 [inline]
          netdev_start_xmit include/linux/netdevice.h:4334 [inline]
          xmit_one net/core/dev.c:3219 [inline]
          dev_hard_start_xmit+0x295/0xc90 net/core/dev.c:3235
          __dev_queue_xmit+0x2f0d/0x3950 net/core/dev.c:3805
          dev_queue_xmit+0x17/0x20 net/core/dev.c:3838
          __bpf_tx_skb net/core/filter.c:2016 [inline]
          __bpf_redirect_common net/core/filter.c:2054 [inline]
          __bpf_redirect+0x5cf/0xb20 net/core/filter.c:2061
          ____bpf_clone_redirect net/core/filter.c:2094 [inline]
          bpf_clone_redirect+0x2f6/0x490 net/core/filter.c:2066
          bpf_prog_41f2bcae09cd4ac3+0xb25/0x1000
      
      The generated test constructs a packet with mac header, network
      header, skb->data pointing to network header and skb->len 0.
      
      Redirecting to a sit0 through __bpf_redirect_no_mac pulls the
      mac length, even though skb->data already is at skb->network_header.
      bpf_prog_test_run_skb has already pulled it as LWT_XMIT !is_l2.
      
      Update the offset calculation to pull only if skb->data differs
      from skb->network_header, which is not true in this case.
      
      The test itself can be run only from commit 1cf1cae9 ("bpf:
      introduce BPF_PROG_TEST_RUN command"), but the same type of packets
      with skb at network header could already be built from lwt xmit hooks,
      so this fix is more relevant to that commit.
      
      Also set the mac header on redirect from LWT_XMIT, as even after this
      change to __bpf_redirect_no_mac that field is expected to be set, but
      is not yet in ip_finish_output2.
      
      Fixes: 3a0af8fd ("bpf: BPF for lightweight tunnel infrastructure")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
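The "negative length" in the bpf report above is unsigned wraparound: the reported read size 4294967282 equals 2^32 - 14. The sketch below reproduces that arithmetic in user-space C, assuming a 14-byte Ethernet mac header (an assumption made for illustration, as the commit message does not state the header size). `skb_sketch` and the helper names are hypothetical and only model offsets, not the real `struct sk_buff`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical simplified skb: all offsets are measured from the buffer head. */
struct skb_sketch {
    uint32_t len;             /* bytes available starting at data */
    uint32_t data;            /* offset of the current data pointer */
    uint32_t mac_header;      /* offset of the mac header */
    uint32_t network_header;  /* offset of the network header */
};

/* Old calculation: always pull the full mac header length. */
static uint32_t mlen_old(const struct skb_sketch *skb)
{
    return skb->network_header - skb->mac_header;
}

/* Updated calculation: pull only the distance from data to the network header. */
static uint32_t mlen_new(const struct skb_sketch *skb)
{
    return skb->network_header - skb->data;
}

/* Length left after pulling mlen bytes; wraps around if mlen exceeds len. */
static uint32_t len_after_pull(const struct skb_sketch *skb, uint32_t mlen)
{
    return skb->len - mlen;
}
```

With `data` already at the network header and `len` equal to 0, the old calculation pulls 14 bytes and the length wraps to 4294967282, while the updated calculation pulls nothing.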
    • H
      media: vivid: set min width/height to a value > 0 · d9c249a3
      Hans Verkuil committed
      commit 9729d6d282a6d7ce88e64c9119cecdf79edf4e88 upstream.
      
      The capture DV timings capabilities allowed for a minimum width and
      height of 0. So passing a timings struct with 0 values is allowed
      and will later cause a division by zero.
      
      Ensure that the width and height must be >= 16 to avoid this.
      
      Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
      Reported-by: syzbot+57c3d83d71187054d56f@syzkaller.appspotmail.com
      Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
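The vivid fix above rejects timings whose width or height is below 16, so that later divisions cannot see a zero. A minimal user-space sketch of that idea follows; `timings_sketch`, `set_timings_sketch` and the aspect-ratio computation are hypothetical and chosen only to show a division that the range check protects.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MIN_DIM_SKETCH 16u    /* minimum width and height, as in the commit text */

/* Hypothetical simplified timings structure. */
struct timings_sketch {
    uint32_t width;
    uint32_t height;
};

static bool timings_in_range_sketch(const struct timings_sketch *t)
{
    return t->width >= MIN_DIM_SKETCH && t->height >= MIN_DIM_SKETCH;
}

/*
 * Validates the timings before using them as a divisor.
 * On success stores width/height scaled by 1000 and returns 0.
 */
static int set_timings_sketch(const struct timings_sketch *t,
                              uint32_t *aspect_x1000)
{
    if (!timings_in_range_sketch(t))
        return -EINVAL;       /* rejects 0 before it can reach the division */

    *aspect_x1000 = (uint32_t)(((uint64_t)t->width * 1000u) / t->height);
    return 0;
}
```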
    • H
      media: vivid: fix error handling of kthread_run · 4497ce43
      Hans Verkuil committed
      commit 701f49bc028edb19ffccd101997dd84f0d71e279 upstream.
      
      kthread_run returns an error pointer, but elsewhere in the code
      dev->kthread_vid_cap/out is checked against NULL.
      
      If kthread_run returns an error, then set the pointer to NULL.
      
      I chose this method over changing all kthread_vid_cap/out tests
      elsewhere since this is more robust.
      
      Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
      Reported-by: syzbot+53d5b2df0d9744411e2e@syzkaller.appspotmail.com
      Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
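The error-pointer pattern from the kthread_run fix above can be emulated in user space. In this sketch the helpers `err_ptr_sketch`, `is_err_sketch`, `ptr_err_sketch` and `fake_kthread_run` are hypothetical stand-ins that imitate the kernel's error-pointer convention (small negative errno values encoded in the pointer); they are not the kernel macros themselves.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO_SKETCH 4095

/* Encode a negative errno value in a pointer (emulation of the convention). */
static void *err_ptr_sketch(long err)
{
    return (void *)(intptr_t)err;
}

/* True if the pointer falls in the top MAX_ERRNO_SKETCH addresses. */
static int is_err_sketch(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)(-MAX_ERRNO_SKETCH);
}

static long ptr_err_sketch(const void *p)
{
    return (long)(intptr_t)p;
}

struct dev_sketch {
    void *kthread_vid_cap;    /* other code tests this field against NULL */
};

/* Hypothetical thread starter: returns an error pointer on failure. */
static void *fake_kthread_run(int fail)
{
    static int task;

    return fail ? err_ptr_sketch(-ENOMEM) : (void *)&task;
}

static int start_thread_sketch(struct dev_sketch *dev, int fail)
{
    dev->kthread_vid_cap = fake_kthread_run(fail);
    if (is_err_sketch(dev->kthread_vid_cap)) {
        int err = (int)ptr_err_sketch(dev->kthread_vid_cap);

        dev->kthread_vid_cap = NULL;   /* keep the NULL tests elsewhere valid */
        return err;
    }
    return 0;
}
```

Resetting the field to NULL in one place keeps every existing NULL test correct, which matches the robustness argument in the commit message.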
    • V
      omap2fb: Fix stack memory disclosure · 4190c5fd
      Vlad Tsyrklevich committed
      commit a01421e4484327fe44f8e126793ed5a48a221e24 upstream.
      
      Using [1] for static analysis I found that the OMAPFB_QUERY_PLANE,
      OMAPFB_GET_COLOR_KEY, OMAPFB_GET_DISPLAY_INFO, and OMAPFB_GET_VRAM_INFO
      cases could all leak uninitialized stack memory--either due to
      uninitialized padding or 'reserved' fields.
      
      Fix them by clearing the shared union used to store copied out data.
      
      [1] https://github.com/vlad902/kernel-uninitialized-memory-checker
      
      Signed-off-by: Vlad Tsyrklevich <vlad@tsyrklevich.net>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Fixes: b39a982d ("OMAP: DSS2: omapfb driver")
      Cc: security@kernel.org
      [b.zolnierkie: prefix patch subject with "omap2fb: "]
      Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
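The omap2fb fix above clears the shared union before it is filled and copied out. A user-space sketch of the idea follows; `plane_info_sketch`, `ioctl_data_sketch` and `fill_plane_info_sketch` are hypothetical and do not reproduce the driver's real structures.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reply structure with a 'reserved' tail. */
struct plane_info_sketch {
    uint32_t enabled;
    uint8_t channel;
    uint8_t reserved[27];
};

/* Hypothetical shared union used to stage data that is copied out later. */
union ioctl_data_sketch {
    struct plane_info_sketch plane;
    uint8_t raw[64];
};

static void fill_plane_info_sketch(union ioctl_data_sketch *p)
{
    /*
     * Clear the whole union first, so reserved fields and any bytes the
     * handler never writes cannot carry stale stack contents.
     */
    memset(p, 0, sizeof(*p));

    p->plane.enabled = 1;
    p->plane.channel = 2;
}
```

Clearing once at the top covers every case that shares the union, rather than relying on each handler to initialize all of its fields.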
    • F
      fix int_sqrt64() for very large numbers · 328f3de2
      Florian La Roche committed
      commit fbfaf851902cd9293f392f3a1735e0543016d530 upstream.
      
      If an input number x for int_sqrt64() has the highest bit set, then
      fls64(x) is 64.  (1UL << 64) is an overflow and breaks the algorithm.
      
      Subtracting 1 is a better guess for the initial value of m anyway, and
      that is also what int_sqrt() does implicitly [*].
      
      [*] Note how int_sqrt() uses __fls() with two underscores, which already
          returns the proper raw bit number.
      
          In contrast, int_sqrt64() used fls64(), and that returns bit numbers
          illogically starting at 1, because of error handling for the "no
          bits set" case. Will points out that the bug is probably due to a
          copy-and-paste error from the regular int_sqrt() case.
      
      Signed-off-by: Florian La Roche <Florian.LaRoche@googlemail.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
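The effect of subtracting 1 from the fls64() result, as described in the int_sqrt64() fix above, can be tried in user space. The sketch below is a plain C rendering of the digit-by-digit square root; `fls64_sketch` is a hypothetical portable stand-in for the kernel's fls64(), and the function is not claimed to match the kernel source line for line.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the kernel's fls64(): returns the 1-based index
 * of the highest set bit, or 0 when no bit is set.
 */
static int fls64_sketch(uint64_t x)
{
    int n = 0;

    while (x) {
        n++;
        x >>= 1;
    }
    return n;
}

/* Integer square root (rounded down) of a 64-bit value. */
static uint32_t int_sqrt64_sketch(uint64_t x)
{
    uint64_t b, m, y = 0;

    if (x <= 1)
        return (uint32_t)x;

    /*
     * Start at the highest even bit position at or below the top set bit.
     * Without the "- 1", an input with bit 63 set would request a shift
     * by 64, which is undefined for a 64-bit type.
     */
    m = 1ULL << ((fls64_sketch(x) - 1) & ~1);

    while (m != 0) {
        b = y + m;
        y >>= 1;
        if (x >= b) {
            x -= b;
            y += m;
        }
        m >>= 2;
    }
    return (uint32_t)y;
}
```

For an input with the highest bit set, `fls64_sketch()` returns 64, so the starting value becomes `1ULL << 62` rather than an out-of-range shift.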
    • Y
      Disable MSI also when pcie-octeon.pcie_disable on · 89a9f049
      YunQiang Su committed
      commit a214720cbf50cd8c3f76bbb9c3f5c283910e9d33 upstream.
      
      Octeon has a boot-time option to disable PCIe.
      
      Since MSI depends on PCIe, we should also disable MSI when
      this option is on, in order to avoid inadvertently accessing PCIe
      registers.
      
      Signed-off-by: YunQiang Su <ysu@wavecomp.com>
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: pburton@wavecomp.com
      Cc: linux-mips@vger.kernel.org
      Cc: aaro.koskinen@iki.fi
      Cc: stable@vger.kernel.org # v3.3+
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • H
      arm64: dts: marvell: armada-ap806: reserve PSCI area · 3832c115
      Heinrich Schuchardt committed
      commit 132ac39cffbcfed80ada38ef0fc6d34d95da7be6 upstream.
      
      The memory area [0x4000000-0x4200000[ is occupied by the PSCI firmware. Any
      attempt to access it from Linux leads to an immediate crash.
      
      So let's make the same memory reservation as the vendor kernel.
      
      [gregory: added as comment that this region matches the mainline U-boot]
      Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
      Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • A
      arm64: kaslr: ensure randomized quantities are clean to the PoC · ca8080c3
      Ard Biesheuvel committed
      commit 1598ecda7b239e9232dda032bfddeed9d89fab6c upstream.
      
      kaslr_early_init() is called with the kernel mapped at its
      link time offset, and if it returns with a non-zero offset,
      the kernel is unmapped and remapped again at the randomized
      offset.
      
      During its execution, kaslr_early_init() also randomizes the
      base of the module region and of the linear mapping of DRAM,
      and sets two variables accordingly. However, since these
      variables are assigned with the caches on, they may get lost
      during the cache maintenance that occurs when unmapping and
      remapping the kernel, so ensure that these values are cleaned
      to the PoC.
      
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Fixes: f80fb3a3 ("arm64: add support for kernel ASLR")
      Cc: <stable@vger.kernel.org> # v4.6+
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • K
      pstore/ram: Avoid allocation and leak of platform data · 483ac8e6
      Kees Cook committed
      commit 5631e8576a3caf606cdc375f97425a67983b420c upstream.
      
      Yue Hu noticed that when parsing device tree the allocated platform data
      was never freed. Since it's not used beyond the function scope, this
      switches to using a stack variable instead.
      
      Reported-by: Yue Hu <huyue2@yulong.com>
      Fixes: 35da6094 ("pstore/ram: add Device Tree bindings")
      Cc: stable@vger.kernel.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • J
      net: dsa: realtek-smi: fix OF child-node lookup · a10cabbf
      Johan Hovold committed
      commit 3f1bb6abdf19cfa89860b3bc9e7f31b44b6a0ba1 upstream.
      
      Use the new of_get_compatible_child() helper to look up child nodes to
      avoid ever matching non-child nodes elsewhere in the tree.
      
      Also fix up the related struct device_node leaks.
      
      Fixes: d8652956 ("net: dsa: realtek-smi: Add Realtek SMI driver")
      Cc: stable <stable@vger.kernel.org>     # 4.19: 36156f92
      Cc: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: Johan Hovold <johan@kernel.org>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • P
      kbuild: Disable LD_DEAD_CODE_DATA_ELIMINATION with ftrace & GCC <= 4.7 · 0098f2e7
      Paul Burton committed
      commit 16fd20aa98080c2fa666dc384036ec08c80af710 upstream.
      
      When building using GCC 4.7 or older, -ffunction-sections & the -pg flag
      used by ftrace are incompatible. This causes warnings or build failures
      (where -Werror applies) such as the following:
      
        arch/mips/generic/init.c:
          error: -ffunction-sections disabled; it makes profiling impossible
      
      This used to be taken into account by the ordering of calls to cc-option
      from within the top-level Makefile, which was introduced by commit
      90ad4052 ("kbuild: avoid conflict between -ffunction-sections and
      -pg on gcc-4.7"). Unfortunately this was broken when the
      CONFIG_LD_DEAD_CODE_DATA_ELIMINATION cc-option check was moved to
      Kconfig in commit e85d1d65 ("kbuild: test dead code/data elimination
      support in Kconfig"), because the flags used by this check no longer
      include -pg.
      
      Fix this by not allowing CONFIG_LD_DEAD_CODE_DATA_ELIMINATION to be
      enabled at the same time as ftrace/CONFIG_FUNCTION_TRACER when building
      using GCC 4.7 or older.
      
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Fixes: e85d1d65 ("kbuild: test dead code/data elimination support in Kconfig")
      Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: stable@vger.kernel.org # v4.19+
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • A
      RDMA/vmw_pvrdma: Return the correct opcode when creating WR · ec485378
      Adit Ranadive committed
      commit 6325e01b6cdf4636b721cf7259c1616e3cf28ce2 upstream.
      
      Since the IB_WR_REG_MR opcode value changed, let's set the PVRDMA device
      opcodes explicitly.
      
      Reported-by: Ruishuang Wang <ruishuangw@vmware.com>
      Fixes: 9a59739bd01f ("IB/rxe: Revise the ib_wr_opcode enum")
      Cc: stable@vger.kernel.org
      Reviewed-by: Bryan Tan <bryantan@vmware.com>
      Reviewed-by: Ruishuang Wang <ruishuangw@vmware.com>
      Reviewed-by: Vishnu Dasa <vdasa@vmware.com>
      Signed-off-by: Adit Ranadive <aditr@vmware.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • L
      RDMA/nldev: Don't expose unsafe global rkey to regular user · 836edf22
      Leon Romanovsky committed
      commit a9666c1cae8dbcd1a9aacd08a778bf2a28eea300 upstream.
      
      The unsafe global rkey is considered dangerous because it exposes
      all memory in the system. Only users with a QP on the same
      PD can use the rkey, and generally those QPs will already know the
      value. However, out of caution, do not expose the value to unprivileged
      users on the local system. Require CAP_NET_ADMIN instead.
      
      Cc: <stable@vger.kernel.org> # 4.16
      Fixes: 29cf1351 ("RDMA/nldev: provide detailed PD information")
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • S
      media: v4l: ioctl: Validate num_planes for debug messages · 8f4a0e7d
      Sakari Ailus committed
      commit 7fe9f01c04c2673bd6662c35b664f0f91888b96f upstream.
      
      The num_planes field in struct v4l2_pix_format_mplane is used in a loop
      before validating it. As the use is printing a debug message in this case,
      just cap the value to the maximum allowed.
      
      Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
      Cc: stable@vger.kernel.org
      Reviewed-by: Thierry Reding <treding@nvidia.com>
      Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
      Cc: <stable@vger.kernel.org>      # for v4.12 and up
      Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
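Capping an unvalidated count before looping over a fixed-size array, as the v4l fix above does for its debug output, can be sketched as follows. The structure and helper names are hypothetical; the limit of 8 mirrors V4L2's `VIDEO_MAX_PLANES`, but the sketch defines its own constant rather than including the real header.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PLANES_SKETCH 8u   /* local constant; mirrors VIDEO_MAX_PLANES */

/* Hypothetical simplified multi-planar format description. */
struct plane_sketch {
    uint32_t sizeimage;
};

struct pix_mp_sketch {
    uint8_t num_planes;        /* user-controlled, not yet validated */
    struct plane_sketch plane_fmt[MAX_PLANES_SKETCH];
};

/* Number of planes that is safe to iterate over. */
static unsigned int planes_to_log_sketch(const struct pix_mp_sketch *mp)
{
    return mp->num_planes < MAX_PLANES_SKETCH ? mp->num_planes
                                               : MAX_PLANES_SKETCH;
}

/* Walks only the capped number of planes, as a debug printer would. */
static uint64_t total_sizeimage_sketch(const struct pix_mp_sketch *mp)
{
    uint64_t total = 0;
    unsigned int i, n = planes_to_log_sketch(mp);

    for (i = 0; i < n; i++)
        total += mp->plane_fmt[i].sizeimage;
    return total;
}
```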
    • J
      mfd: tps6586x: Handle interrupts on suspend · d846f48c
      Jonathan Hunter committed
      commit ac4ca4b9f4623ba5e1ea7a582f286567c611e027 upstream.
      
      The tps6586x driver creates an irqchip that is used by its various child
      devices for managing interrupts. The tps6586x-rtc device is one of its
      children that uses the tps6586x irqchip. When using the tps6586x-rtc as
      a wake-up device from suspend, the following is seen:
      
       PM: Syncing filesystems ... done.
       Freezing user space processes ... (elapsed 0.001 seconds) done.
       OOM killer disabled.
       Freezing remaining freezable tasks ... (elapsed 0.000 seconds) done.
       Disabling non-boot CPUs ...
       Entering suspend state LP1
       Enabling non-boot CPUs ...
       CPU1 is up
       tps6586x 3-0034: failed to read interrupt status
       tps6586x 3-0034: failed to read interrupt status
      
      The reason why the tps6586x interrupt status cannot be read is because
      the tps6586x interrupt is not masked during suspend and when the
      tps6586x-rtc interrupt occurs, to wake-up the device, the interrupt is
      seen before the i2c controller has been resumed in order to read the
      tps6586x interrupt status.
      
      The tps6586x-rtc driver sets its interrupt as a wake-up source during
      suspend, which gets propagated to the parent tps6586x interrupt.
      However, the tps6586x-rtc driver cannot disable its interrupt during
      suspend, otherwise we would never be woken up, and so the tps6586x must
      disable its interrupt instead.
      
      Prevent the tps6586x interrupt handler from executing on exiting suspend
      before the i2c controller has been resumed by disabling the tps6586x
      interrupt on entering suspend and re-enabling it on resuming from
      suspend.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
      Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
      Tested-by: Dmitry Osipenko <digetx@gmail.com>
      Acked-by: Thierry Reding <treding@nvidia.com>
      Signed-off-by: Lee Jones <lee.jones@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • J
      OF: properties: add missing of_node_put · a752c6d6
      Julia Lawall committed
      commit 28b170e88bc0c7509e6724717c15cb4b5686026e upstream.
      
      Add an of_node_put when the result of of_graph_get_remote_port_parent is
      not available.
      
      The semantic match that finds this problem is as follows
      (http://coccinelle.lip6.fr):
      
      // <smpl>
      @r exists@
      local idexpression e;
      expression x;
      @@
      e = of_graph_get_remote_port_parent(...);
      ... when != x = e
          when != true e == NULL
          when != of_node_put(e)
          when != of_fwnode_handle(e)
      (
      return e;
      |
      *return ...;
      )
      // </smpl>
      
      Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
      Cc: stable@vger.kernel.org
      Signed-off-by: Rob Herring <robh@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
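The reference leak that the semantic patch above looks for can be modeled with a plain counter. In this user-space sketch, `node_sketch`, `node_get`, `node_put` and `get_remote_node_sketch` are hypothetical; they imitate the get/put discipline of device tree nodes and do not reproduce the kernel function that was patched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reference-counted node. */
struct node_sketch {
    int refcount;
    bool available;
};

static struct node_sketch *node_get(struct node_sketch *n)
{
    if (n)
        n->refcount++;
    return n;
}

static void node_put(struct node_sketch *n)
{
    if (n)
        n->refcount--;
}

/*
 * Looks up a remote node and returns it with a reference held.
 * On every path that returns NULL, no reference may be left behind.
 */
static struct node_sketch *get_remote_node_sketch(struct node_sketch *candidate)
{
    struct node_sketch *remote = node_get(candidate);

    if (!remote)
        return NULL;

    if (!remote->available) {
        node_put(remote);   /* drop the reference on the "not available" path */
        return NULL;
    }
    return remote;          /* caller now owns the reference */
}
```

The rule being illustrated is that a function which took a reference must release it on each early return where the node is not handed to the caller.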
    • Z
      drm/i915/gvt: Fix mmap range check · ac8b9e8e
      Zhenyu Wang committed
      commit 51b00d8509dc69c98740da2ad07308b630d3eb7d upstream.
      
      This fixes a missing mmap range check on the vGPU BAR2 region
      and only allows mapping the GMADDR range allocated to the vGPU, which means
      user space should support sparse mmap to get the proper offset for
      mapping the vGPU aperture. It also takes the actual pgoff of the mmap
      request into account, as the original code always mapped from the beginning
      of the vGPU aperture.
      
      Fixes: 659643f7 ("drm/i915/gvt/kvmgt: add vfio/mdev support to KVMGT")
      Cc: "Monroy, Rodrigo Axel" <rodrigo.axel.monroy@intel.com>
      Cc: "Orrala Contreras, Alfredo" <alfredo.orrala.contreras@intel.com>
      Cc: stable@vger.kernel.org # v4.10+
      Reviewed-by: Hang Yuan <hang.yuan@intel.com>
      Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • H
      MIPS: lantiq: Fix IPI interrupt handling · 434b1b91
      Hauke Mehrtens committed
      commit 2b4dba55b04b212a7fd1f0395b41d79ee3a9801b upstream.
      
      This makes SMP on the vrx200 work again, by removing all the MIPS CPU
      interrupt specific code and making it fully use the generic MIPS CPU
      interrupt controller.
      
      The mti,cpu-interrupt-controller from irq-mips-cpu.c now handles the CPU
      interrupts and also the IPI interrupts which are used for communication
      between the CPUs in an SMP system. The generic interrupt code was
      already used before but the interrupt vectors were overwritten again
      when we called set_vi_handler() in the lantiq interrupt driver and we
      also provided our own plat_irq_dispatch() function which overwrote the
      weak generic implementation. Now the code uses the generic handler for
      the MIPS CPU interrupts including the IPI interrupts and registers a
      handler for the CPU interrupts which are handled by the lantiq ICU with
      irq_set_chained_handler() which was already called before.
      
      Calling the set_c0_status() function is also not needed any more because
      the generic MIPS CPU interrupt already activates the needed bits.
      
      Fixes: 1eed4004 ("MIPS: smp-mt: Use CPU interrupt controller IPI IRQ domain support")
      Cc: stable@kernel.org # v4.12
      Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: jhogan@kernel.org
      Cc: ralf@linux-mips.org
      Cc: john@phrozen.org
      Cc: linux-mips@linux-mips.org
      Cc: linux-mips@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • R
      MIPS: BCM47XX: Setup struct device for the SoC · 19f41f32
      Rafał Miłecki committed
      commit 321c46b91550adc03054125fa7a1639390608e1a upstream.
      
      So far we never had any device registered for the SoC. This resulted in
      some small issues that we kept ignoring like:
      1) Not working GPIOLIB_IRQCHIP (gpiochip_irqchip_add_key() failing)
      2) Lack of proper tree in the /sys/devices/
      3) mips_dma_alloc_coherent() silently handling empty coherent_dma_mask
      
      Kernel 4.19 came with a lot of DMA changes and caused a regression on
      bcm47xx. Starting with the commit f8c55dc6 ("MIPS: use generic dma
      noncoherent ops for simple noncoherent platforms") DMA coherent
      allocations just fail. Example:
      [    1.114914] bgmac_bcma bcma0:2: Allocation of TX ring 0x200 failed
      [    1.121215] bgmac_bcma bcma0:2: Unable to alloc memory for DMA
      [    1.127626] bgmac_bcma: probe of bcma0:2 failed with error -12
      [    1.133838] bgmac_bcma: Broadcom 47xx GBit MAC driver loaded
      
      The bgmac driver also triggers a WARNING:
      [    0.959486] ------------[ cut here ]------------
      [    0.964387] WARNING: CPU: 0 PID: 1 at ./include/linux/dma-mapping.h:516 bgmac_enet_probe+0x1b4/0x5c4
      [    0.973751] Modules linked in:
      [    0.976913] CPU: 0 PID: 1 Comm: swapper Not tainted 4.19.9 #0
      [    0.982750] Stack : 804a0000 804597c4 00000000 00000000 80458fd8 8381bc2c 838282d4 80481a47
      [    0.991367]         8042e3ec 00000001 804d38f0 00000204 83980000 00000065 8381bbe0 6f55b24f
      [    0.999975]         00000000 00000000 80520000 00002018 00000000 00000075 00000007 00000000
      [    1.008583]         00000000 80480000 000ee811 00000000 00000000 00000000 80432c00 80248db8
      [    1.017196]         00000009 00000204 83980000 803ad7b0 00000000 801feeec 00000000 804d0000
      [    1.025804]         ...
      [    1.028325] Call Trace:
      [    1.030875] [<8000aef8>] show_stack+0x58/0x100
      [    1.035513] [<8001f8b4>] __warn+0xe4/0x118
      [    1.039708] [<8001f9a4>] warn_slowpath_null+0x48/0x64
      [    1.044935] [<80248db8>] bgmac_enet_probe+0x1b4/0x5c4
      [    1.050101] [<802498e0>] bgmac_probe+0x558/0x590
      [    1.054906] [<80252fd0>] bcma_device_probe+0x38/0x70
      [    1.060017] [<8020e1e8>] really_probe+0x170/0x2e8
      [    1.064891] [<8020e714>] __driver_attach+0xa4/0xec
      [    1.069784] [<8020c1e0>] bus_for_each_dev+0x58/0xb0
      [    1.074833] [<8020d590>] bus_add_driver+0xf8/0x218
      [    1.079731] [<8020ef24>] driver_register+0xcc/0x11c
      [    1.084804] [<804b54cc>] bgmac_init+0x1c/0x44
      [    1.089258] [<8000121c>] do_one_initcall+0x7c/0x1a0
      [    1.094343] [<804a1d34>] kernel_init_freeable+0x150/0x218
      [    1.099886] [<803a082c>] kernel_init+0x10/0x104
      [    1.104583] [<80005878>] ret_from_kernel_thread+0x14/0x1c
      [    1.110107] ---[ end trace f441c0d873d1fb5b ]---
      
      This patch sets up a "struct device" (and passes it to the bcma) which
      allows fixing all the mentioned problems. It'll also require a tiny bcma
      patch which will follow through the wireless tree & its maintainer.
      
      Fixes: f8c55dc6 ("MIPS: use generic dma noncoherent ops for simple noncoherent platforms")
      Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Linus Walleij <linus.walleij@linaro.org>
      Cc: linux-wireless@vger.kernel.org
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Cc: stable@vger.kernel.org # v4.19+
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • A
      mips: fix n32 compat_ipc_parse_version · 8f469dc0
      Arnd Bergmann committed
      commit 5a9372f751b5350e0ce3d2ee91832f1feae2c2e5 upstream.
      
      While reading through the sysvipc implementation, I noticed that the n32
      semctl/shmctl/msgctl system calls behave differently based on whether
      o32 support is enabled or not: Without o32, the IPC_64 flag passed by
      user space is rejected but calls without that flag get IPC_64 behavior.
      
      As far as I can tell, this was inadvertently changed by a cleanup patch
      but never noticed by anyone; possibly nobody has tried using sysvipc
      on n32 after linux-3.19.
      
      Change it back to the old behavior now.
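The flag handling at stake can be sketched as below. This is a simplified illustration, not the kernel helper itself: `demo_parse_version()` and the `DEMO_` constants are hypothetical names, with `DEMO_IPC_64` chosen to mirror the value of the Linux `IPC_64` flag (0x0100).

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_IPC_OLD 0x0000 /* caller did not ask for the 64-bit layout */
#define DEMO_IPC_64  0x0100 /* caller asked for the 64-bit layout */

/*
 * Split a command word into the layout version and the bare command.
 * The version flag is stripped from *cmd so that later code only sees
 * the plain command number and does not reject it as unknown.
 */
static int demo_parse_version(int *cmd)
{
    int version;

    assert(cmd != NULL);
    version = *cmd & DEMO_IPC_64;
    *cmd &= ~DEMO_IPC_64;
    return version;
}
```

With this shape, a call that passes the flag gets the 64-bit behavior and a call without it gets the old behavior, which is the distinction the commit message says was lost on n32 when o32 support is disabled.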
      
      Fixes: 78aaf956 ("MIPS: Compat: Fix build error if CONFIG_MIPS32_COMPAT but no compat ABI.")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: stable@vger.kernel.org # 3.19+
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • I
      scsi: sd: Fix cache_type_store() · 9d37f4a0
      Ivan Mironov committed
      commit 44759979a49bfd2d20d789add7fa81a21eb1a4ab upstream.
      
      Changing the caching mode via /sys/devices/.../scsi_disk/.../cache_type
      may fail if the device responds to the MODE SENSE command with the DPOFUA
      flag set and then requires this flag to be clear in the MODE SELECT command.
      
      In this scenario, when trying to change cache_type, the write always fails:
      
      	# echo "none" >cache_type
      	bash: echo: write error: Invalid argument
      
      And the following appears in dmesg:
      
      	[13007.865745] sd 1:0:1:0: [sda] Sense Key : Illegal Request [current]
      	[13007.865753] sd 1:0:1:0: [sda] Add. Sense: Invalid field in parameter list
      
      From SBC-4 r15, 6.5.1 "Mode pages overview", description of DEVICE-SPECIFIC
      PARAMETER field in the mode parameter header:
      	...
      	The write protect (WP) bit for mode data sent with a MODE SELECT
      	command shall be ignored by the device server.
      	...
      	The DPOFUA bit is reserved for mode data sent with a MODE SELECT
      	command.
      	...
      
      The remaining bits in the DEVICE-SPECIFIC PARAMETER byte are also reserved
      and shall be set to zero.
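The consequence for a driver can be sketched as follows: data read back with MODE SENSE should not be echoed unchanged into MODE SELECT. This is a minimal illustration, not the sd driver's code. `demo_prepare_for_mode_select()` is a hypothetical helper, and the byte offset assumes the 8-byte MODE SENSE(10) parameter header, where the DEVICE-SPECIFIC PARAMETER is byte 3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed offset of DEVICE-SPECIFIC PARAMETER in the 10-byte-CDB header. */
#define DEMO_DEVSPEC_OFFSET 3

/*
 * Prepare a mode parameter header obtained from MODE SENSE for reuse
 * in MODE SELECT: WP is ignored and DPOFUA and the remaining bits are
 * reserved there, so the whole byte is cleared.
 */
static void demo_prepare_for_mode_select(uint8_t *hdr, size_t hdr_len)
{
    assert(hdr != NULL && hdr_len > DEMO_DEVSPEC_OFFSET);
    hdr[DEMO_DEVSPEC_OFFSET] = 0;
}
```

Only the device-specific byte is touched here; any other header adjustments a real driver makes are outside the scope of this sketch.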
      
      [mkp: shuffled commentary to commit description]
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Ivan Mironov <mironov.ivan@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • S
      scsi: core: Synchronize request queue PM status only on successful resume · d368f540
      Stanley Chu committed
      commit 3f7e62bba0003f9c68f599f5997c4647ef5b4f4e upstream.
      
      Commit 356fd266 ("scsi: Set request queue runtime PM status back to
      active on resume") fixed the inconsistent RPM status between the request
      queue and the device. However, the request queue RPM status must be
      changed only on a successful resume; otherwise the status may still be
      inconsistent, as shown below:
      
      Request queue: RPM_ACTIVE
      Device: RPM_SUSPENDED
      
      This ends in a soft lockup because requests can be submitted to underlying
      devices while those devices and their required resources are not resumed.
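The rule "update the queue status only on a successful resume" can be sketched like this. It is a simplified userspace model, not the SCSI core code: `struct demo_dev`, `demo_bus_resume()` and the two resume callbacks are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum demo_rpm_status { DEMO_RPM_SUSPENDED, DEMO_RPM_ACTIVE };

/* Hypothetical device with separate device and request queue status. */
struct demo_dev {
    enum demo_rpm_status dev_status;
    enum demo_rpm_status queue_status;
    int (*resume)(struct demo_dev *dev); /* returns 0 on success */
};

/* A resume callback that succeeds and marks the device active. */
static int demo_resume_ok(struct demo_dev *dev)
{
    dev->dev_status = DEMO_RPM_ACTIVE;
    return 0;
}

/* A resume callback that fails and leaves the device suspended. */
static int demo_resume_fail(struct demo_dev *dev)
{
    (void)dev;
    return -5; /* arbitrary negative error code */
}

/*
 * Resume the device and synchronize the queue status with it.
 * The queue is marked active only when the device resume succeeded,
 * so no request can be dispatched to a device that is still suspended.
 */
static int demo_bus_resume(struct demo_dev *dev)
{
    int err;

    assert(dev != NULL && dev->resume != NULL);
    err = dev->resume(dev);
    if (err == 0)
        dev->queue_status = DEMO_RPM_ACTIVE;
    return err;
}
```

After a failed resume both fields stay at `DEMO_RPM_SUSPENDED`, which avoids the "queue active, device suspended" combination listed above.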
      
      For example, once the status is inconsistent as above, an IO request can
      be submitted to the UFS device driver while a required resource (such as
      a clock) is not yet resumed, which leads to the warning in the call stack
      below:
      
      WARN_ON(hba->clk_gating.state != CLKS_ON);
      ufshcd_queuecommand
      scsi_dispatch_cmd
      scsi_request_fn
      __blk_run_queue
      cfq_insert_request
      __elv_add_request
      blk_flush_plug_list
      blk_finish_plug
      jbd2_journal_commit_transaction
      kjournald2
      
      All subsequent IO requests may then hang because the storage host or
      device does not respond, and a soft lockup follows. In the end, the
      system may crash in many ways.
      
      Fixes: 356fd266 (scsi: Set request queue runtime PM status back to active on resume)
      Cc: stable@vger.kernel.org
      Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • K
      Yama: Check for pid death before checking ancestry · b955a2c7
      Kees Cook committed
      commit 9474f4e7cd71a633fa1ef93b7daefd44bbdfd482 upstream.
      
      It's possible that a pid has died before we take the rcu lock, in which
      case we can't walk the ancestry list as it may be detached. Instead, check
      for death first before doing the walk.
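The ordering described above can be sketched as follows. This is a simplified model without RCU or real task structures: `struct demo_task` and `demo_is_descendant()` are hypothetical, and the `alive` flag stands in for a liveness check on the pid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical task with a liveness flag and a link to its parent. */
struct demo_task {
    bool alive;
    struct demo_task *parent;
};

/*
 * Return true if ancestor appears on child's parent chain.
 * A dead child is rejected before the walk starts, because its
 * parent links may already be detached and must not be followed.
 */
static bool demo_is_descendant(const struct demo_task *ancestor,
                               const struct demo_task *child)
{
    const struct demo_task *walker;

    assert(ancestor != NULL && child != NULL);
    if (!child->alive)
        return false; /* check for death first, then walk */

    for (walker = child->parent; walker; walker = walker->parent) {
        if (walker == ancestor)
            return true;
    }
    return false;
}
```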
      
      Reported-by: syzbot+a9ac39bf55329e206219@syzkaller.appspotmail.com
      Fixes: 2d514487 ("security: Yama LSM")
      Cc: stable@vger.kernel.org
      Suggested-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: James Morris <james.morris@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • J
      btrfs: wait on ordered extents on abort cleanup · 01634ac5
      Josef Bacik committed
      commit 74d5d229b1bf60f93bff244b2dfc0eb21ec32a07 upstream.
      
      If we flip read-only before we initiate writeback on all dirty pages for
      ordered extents we've created then we'll have ordered extents left over
      on umount, which results in all sorts of bad things happening.  Fix this
      by making sure we wait on ordered extents if we have to do the aborted
      transaction cleanup stuff.
      
      generic/475 can produce this warning:
      
       [ 8531.177332] WARNING: CPU: 2 PID: 11997 at fs/btrfs/disk-io.c:3856 btrfs_free_fs_root+0x95/0xa0 [btrfs]
       [ 8531.183282] CPU: 2 PID: 11997 Comm: umount Tainted: G        W 5.0.0-rc1-default+ #394
       [ 8531.185164] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),BIOS rel-1.11.2-0-gf9626cc-prebuilt.qemu-project.org 04/01/2014
       [ 8531.187851] RIP: 0010:btrfs_free_fs_root+0x95/0xa0 [btrfs]
       [ 8531.193082] RSP: 0018:ffffb1ab86163d98 EFLAGS: 00010286
       [ 8531.194198] RAX: ffff9f3449494d18 RBX: ffff9f34a2695000 RCX:0000000000000000
       [ 8531.195629] RDX: 0000000000000002 RSI: 0000000000000001 RDI:0000000000000000
       [ 8531.197315] RBP: ffff9f344e930000 R08: 0000000000000001 R09:0000000000000000
       [ 8531.199095] R10: 0000000000000000 R11: ffff9f34494d4ff8 R12:ffffb1ab86163dc0
       [ 8531.200870] R13: ffff9f344e9300b0 R14: ffffb1ab86163db8 R15:0000000000000000
       [ 8531.202707] FS:  00007fc68e949fc0(0000) GS:ffff9f34bd800000(0000)knlGS:0000000000000000
       [ 8531.204851] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       [ 8531.205942] CR2: 00007ffde8114dd8 CR3: 000000002dfbd000 CR4:00000000000006e0
       [ 8531.207516] Call Trace:
       [ 8531.208175]  btrfs_free_fs_roots+0xdb/0x170 [btrfs]
       [ 8531.210209]  ? wait_for_completion+0x5b/0x190
       [ 8531.211303]  close_ctree+0x157/0x350 [btrfs]
       [ 8531.212412]  generic_shutdown_super+0x64/0x100
       [ 8531.213485]  kill_anon_super+0x14/0x30
       [ 8531.214430]  btrfs_kill_super+0x12/0xa0 [btrfs]
       [ 8531.215539]  deactivate_locked_super+0x29/0x60
       [ 8531.216633]  cleanup_mnt+0x3b/0x70
       [ 8531.217497]  task_work_run+0x98/0xc0
       [ 8531.218397]  exit_to_usermode_loop+0x83/0x90
       [ 8531.219324]  do_syscall_64+0x15b/0x180
       [ 8531.220192]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
       [ 8531.221286] RIP: 0033:0x7fc68e5e4d07
       [ 8531.225621] RSP: 002b:00007ffde8116608 EFLAGS: 00000246 ORIG_RAX:00000000000000a6
       [ 8531.227512] RAX: 0000000000000000 RBX: 00005580c2175970 RCX:00007fc68e5e4d07
       [ 8531.229098] RDX: 0000000000000001 RSI: 0000000000000000 RDI:00005580c2175b80
       [ 8531.230730] RBP: 0000000000000000 R08: 00005580c2175ba0 R09:00007ffde8114e80
       [ 8531.232269] R10: 0000000000000000 R11: 0000000000000246 R12:00005580c2175b80
       [ 8531.233839] R13: 00007fc68eac61c4 R14: 00005580c2175a68 R15:0000000000000000
      
      Leaving a tree in the rb-tree:
      
      3853 void btrfs_free_fs_root(struct btrfs_root *root)
      3854 {
      3855         iput(root->ino_cache_inode);
      3856         WARN_ON(!RB_EMPTY_ROOT(&root->inode_tree));
      
      CC: stable@vger.kernel.org
      Reviewed-by: Nikolay Borisov <nborisov@suse.com>
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      [ add stacktrace ]
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • D
      Revert "btrfs: balance dirty metadata pages in btrfs_finish_ordered_io" · 4675f90e
      David Sterba committed
      commit 77b7aad195099e7c6da11e94b7fa6ef5e6fb0025 upstream.
      
      This reverts commit e73e81b6.
      
      This patch causes a few problems:
      
      - adds latency to btrfs_finish_ordered_io
      - as btrfs_finish_ordered_io is used for free space cache, generating
        more work from btrfs_btree_balance_dirty_nodelay could end up in the
        same workqueue, effectively deadlocking
      
      12260 kworker/u96:16+btrfs-freespace-write D
      [<0>] balance_dirty_pages+0x6e6/0x7ad
      [<0>] balance_dirty_pages_ratelimited+0x6bb/0xa90
      [<0>] btrfs_finish_ordered_io+0x3da/0x770
      [<0>] normal_work_helper+0x1c5/0x5a0
      [<0>] process_one_work+0x1ee/0x5a0
      [<0>] worker_thread+0x46/0x3d0
      [<0>] kthread+0xf5/0x130
      [<0>] ret_from_fork+0x24/0x30
      [<0>] 0xffffffffffffffff
      
      Transaction commit will wait on the freespace cache:
      
      838 btrfs-transacti D
      [<0>] btrfs_start_ordered_extent+0x154/0x1e0
      [<0>] btrfs_wait_ordered_range+0xbd/0x110
      [<0>] __btrfs_wait_cache_io+0x49/0x1a0
      [<0>] btrfs_write_dirty_block_groups+0x10b/0x3b0
      [<0>] commit_cowonly_roots+0x215/0x2b0
      [<0>] btrfs_commit_transaction+0x37e/0x910
      [<0>] transaction_kthread+0x14d/0x180
      [<0>] kthread+0xf5/0x130
      [<0>] ret_from_fork+0x24/0x30
      [<0>] 0xffffffffffffffff
      
      And then writepages ends up waiting on transaction commit:
      
      9520 kworker/u96:13+flush-btrfs-1 D
      [<0>] wait_current_trans+0xac/0xe0
      [<0>] start_transaction+0x21b/0x4b0
      [<0>] cow_file_range_inline+0x10b/0x6b0
      [<0>] cow_file_range.isra.69+0x329/0x4a0
      [<0>] run_delalloc_range+0x105/0x3c0
      [<0>] writepage_delalloc+0x119/0x180
      [<0>] __extent_writepage+0x10c/0x390
      [<0>] extent_write_cache_pages+0x26f/0x3d0
      [<0>] extent_writepages+0x4f/0x80
      [<0>] do_writepages+0x17/0x60
      [<0>] __writeback_single_inode+0x59/0x690
      [<0>] writeback_sb_inodes+0x291/0x4e0
      [<0>] __writeback_inodes_wb+0x87/0xb0
      [<0>] wb_writeback+0x3bb/0x500
      [<0>] wb_workfn+0x40d/0x610
      [<0>] process_one_work+0x1ee/0x5a0
      [<0>] worker_thread+0x1e0/0x3d0
      [<0>] kthread+0xf5/0x130
      [<0>] ret_from_fork+0x24/0x30
      [<0>] 0xffffffffffffffff
      
      Eventually, we have every process in the system waiting on
      balance_dirty_pages(), and nobody is able to make progress on page
      writeback.
      
      The original patch tried to fix an OOM condition that happened on 4.4, but
      attempts to reproduce it on later kernels (4.19 and 4.20) were unsuccessful.
      This is more likely a problem in OOM itself.
      
      Link: https://lore.kernel.org/linux-btrfs/20180528054821.9092-1-ethanlien@synology.com/
      Reported-by: Chris Mason <clm@fb.com>
      CC: stable@vger.kernel.org # 4.18+
      CC: ethanlien <ethanlien@synology.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • J
      xen: Fix x86 sched_clock() interface for xen · 4432362a
      Juergen Gross committed
      commit 867cefb4cb1012f42cada1c7d1f35ac8dd276071 upstream.
      
      Commit f94c8d11 ("sched/clock, x86/tsc: Rework the x86 'unstable'
      sched_clock() interface") broke Xen guest time handling across
      migration:
      
      [  187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done.
      [  187.251137] OOM killer disabled.
      [  187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
      [  187.252299] suspending xenstore...
      [  187.266987] xen:grant_table: Grant tables using version 1 layout
      [18446743811.706476] OOM killer enabled.
      [18446743811.706478] Restarting tasks ... done.
      [18446743811.720505] Setting capacity to 16777216
      
      Fix that by setting xen_sched_clock_offset at resume time to ensure a
      monotonic clock value.
      
      [boris: replaced pr_info() with pr_info_once() in xen_callback_vector()
       to avoid printing with incorrect timestamp during resume (as we
       haven't re-adjusted the clock yet)]
      
      Fixes: f94c8d11 ("sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface")
      Cc: <stable@vger.kernel.org> # 4.11
      Reported-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Tested-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com>
      Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • C
      crypto: talitos - fix ablkcipher for CONFIG_VMAP_STACK · 64e98644
      Christophe Leroy committed
      commit 1bea445b0a022ee126ca328b3705cd4df18ebc14 upstream.
      
      [    2.364486] WARNING: CPU: 0 PID: 60 at ./arch/powerpc/include/asm/io.h:837 dma_nommu_map_page+0x44/0xd4
      [    2.373579] CPU: 0 PID: 60 Comm: cryptomgr_test Tainted: G        W         4.20.0-rc5-00560-g6bfb52e23a00-dirty #531
      [    2.384740] NIP:  c000c540 LR: c000c584 CTR: 00000000
      [    2.389743] REGS: c95abab0 TRAP: 0700   Tainted: G        W          (4.20.0-rc5-00560-g6bfb52e23a00-dirty)
      [    2.400042] MSR:  00029032 <EE,ME,IR,DR,RI>  CR: 24042204  XER: 00000000
      [    2.406669]
      [    2.406669] GPR00: c02f2244 c95abb60 c6262990 c95abd80 0000256a 00000001 00000001 00000001
      [    2.406669] GPR08: 00000000 00002000 00000010 00000010 24042202 00000000 00000100 c95abd88
      [    2.406669] GPR16: 00000000 c05569d4 00000001 00000010 c95abc88 c0615664 00000004 00000000
      [    2.406669] GPR24: 00000010 c95abc88 c95abc88 00000000 c61ae210 c7ff6d40 c61ae210 00003d68
      [    2.441559] NIP [c000c540] dma_nommu_map_page+0x44/0xd4
      [    2.446720] LR [c000c584] dma_nommu_map_page+0x88/0xd4
      [    2.451762] Call Trace:
      [    2.454195] [c95abb60] [82000808] 0x82000808 (unreliable)
      [    2.459572] [c95abb80] [c02f2244] talitos_edesc_alloc+0xbc/0x3c8
      [    2.465493] [c95abbb0] [c02f2600] ablkcipher_edesc_alloc+0x4c/0x5c
      [    2.471606] [c95abbd0] [c02f4ed0] ablkcipher_encrypt+0x20/0x64
      [    2.477389] [c95abbe0] [c02023b0] __test_skcipher+0x4bc/0xa08
      [    2.483049] [c95abe00] [c0204b60] test_skcipher+0x2c/0xcc
      [    2.488385] [c95abe20] [c0204c48] alg_test_skcipher+0x48/0xbc
      [    2.494064] [c95abe40] [c0205cec] alg_test+0x164/0x2e8
      [    2.499142] [c95abf00] [c0200dec] cryptomgr_test+0x48/0x50
      [    2.504558] [c95abf10] [c0039ff4] kthread+0xe4/0x110
      [    2.509471] [c95abf40] [c000e1d0] ret_from_kernel_thread+0x14/0x1c
      [    2.515532] Instruction dump:
      [    2.518468] 7c7e1b78 7c9d2378 7cbf2b78 41820054 3d20c076 8089c200 3d20c076 7c84e850
      [    2.526127] 8129c204 7c842e70 7f844840 419c0008 <0fe00000> 2f9e0000 54847022 7c84fa14
      [    2.533960] ---[ end trace bf78d94af73fe3b8 ]---
      [    2.539123] talitos ff020000.crypto: master data transfer error
      [    2.544775] talitos ff020000.crypto: TEA error: ISR 0x20000000_00000040
      [    2.551625] alg: skcipher: encryption failed on test 1 for ecb-aes-talitos: ret=22
      
      IV cannot be on stack when CONFIG_VMAP_STACK is selected because the stack
      cannot be DMA mapped anymore.
      
      This patch copies the IV into the extended descriptor.
      
      Fixes: 4de9d0b5 ("crypto: talitos - Add ablkcipher algorithms")
      Cc: stable@vger.kernel.org
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • C
      crypto: talitos - reorder code in talitos_edesc_alloc() · c6578f50
      Christophe Leroy committed
      commit c56c2e173773097a248fd3bace91ac8f6fc5386d upstream.
      
      This patch moves the mapping of IV after the kmalloc(). This
      avoids having to unmap in case kmalloc() fails.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • E
      crypto: authenc - fix parsing key with misaligned rta_len · 44c67402
      Eric Biggers committed
      commit 8f9c469348487844328e162db57112f7d347c49f upstream.
      
      Keys for "authenc" AEADs are formatted as an rtattr containing a 4-byte
      'enckeylen', followed by an authentication key and an encryption key.
      crypto_authenc_extractkeys() parses the key to find the inner keys.
      
      However, it fails to consider the case where the rtattr's payload is
      longer than 4 bytes but not 4-byte aligned, and where the key ends
      before the next 4-byte aligned boundary.  In this case, 'keylen -=
      RTA_ALIGN(rta->rta_len);' underflows to a value near UINT_MAX.  This
      causes a buffer overread and crash during crypto_ahash_setkey().
      
      Fix it by restricting the rtattr payload to the expected size.
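The arithmetic behind the underflow, and the effect of restricting the attribute length, can be sketched in userspace. This is an illustration only, not the kernel function: the `DEMO_` macros and both helpers are hypothetical stand-ins, with the alignment and header size chosen to mirror a 4-byte aligned rtattr with a 4-byte header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_RTA_ALIGNTO 4u
#define DEMO_RTA_ALIGN(len) \
    (((len) + DEMO_RTA_ALIGNTO - 1) & ~(DEMO_RTA_ALIGNTO - 1))
#define DEMO_RTA_HDRLEN 4u /* two 16-bit fields: length and type */
#define DEMO_PARAM_LEN  4u /* the 4-byte enckeylen payload */

/*
 * Unchecked variant: skip the attribute by its aligned length.
 * With keylen = 9 and rta_len = 9 the aligned length is 12, so the
 * unsigned subtraction wraps around to a value near UINT_MAX.
 */
static unsigned int demo_remaining_unchecked(unsigned int keylen,
                                             unsigned short rta_len)
{
    return keylen - DEMO_RTA_ALIGN(rta_len);
}

/*
 * Checked variant: accept only an attribute of exactly the expected
 * size, and only if the key is long enough to contain it.
 */
static bool demo_remaining_checked(unsigned int keylen,
                                   unsigned short rta_len,
                                   unsigned int *remaining)
{
    assert(remaining != NULL);
    if (rta_len != DEMO_RTA_HDRLEN + DEMO_PARAM_LEN)
        return false; /* payload is not the expected 4 bytes */
    if (keylen < DEMO_RTA_ALIGN(rta_len))
        return false; /* key is too short to hold the attribute */
    *remaining = keylen - DEMO_RTA_ALIGN(rta_len);
    return true;
}
```

Because the expected attribute length (8) is already a multiple of 4, the checked variant never subtracts more than the bytes that are actually present.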
      
      Reproducer using AF_ALG:
      
      	#include <linux/if_alg.h>
      	#include <linux/rtnetlink.h>
      	#include <sys/socket.h>
      
      	int main()
      	{
      		int fd;
      		struct sockaddr_alg addr = {
      			.salg_type = "aead",
      			.salg_name = "authenc(hmac(sha256),cbc(aes))",
      		};
      		struct {
      			struct rtattr attr;
      			__be32 enckeylen;
      			char keys[1];
      		} __attribute__((packed)) key = {
      			.attr.rta_len = sizeof(key),
      			.attr.rta_type = 1 /* CRYPTO_AUTHENC_KEYA_PARAM */,
      		};
      
      		fd = socket(AF_ALG, SOCK_SEQPACKET, 0);
      		bind(fd, (void *)&addr, sizeof(addr));
      		setsockopt(fd, SOL_ALG, ALG_SET_KEY, &key, sizeof(key));
      	}
      
      It caused:
      
      	BUG: unable to handle kernel paging request at ffff88007ffdc000
      	PGD 2e01067 P4D 2e01067 PUD 2e04067 PMD 2e05067 PTE 0
      	Oops: 0000 [#1] SMP
      	CPU: 0 PID: 883 Comm: authenc Not tainted 4.20.0-rc1-00108-g00c9fe37a7f27 #13
      	Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-20181126_142135-anatol 04/01/2014
      	RIP: 0010:sha256_ni_transform+0xb3/0x330 arch/x86/crypto/sha256_ni_asm.S:155
      	[...]
      	Call Trace:
      	 sha256_ni_finup+0x10/0x20 arch/x86/crypto/sha256_ssse3_glue.c:321
      	 crypto_shash_finup+0x1a/0x30 crypto/shash.c:178
      	 shash_digest_unaligned+0x45/0x60 crypto/shash.c:186
      	 crypto_shash_digest+0x24/0x40 crypto/shash.c:202
      	 hmac_setkey+0x135/0x1e0 crypto/hmac.c:66
      	 crypto_shash_setkey+0x2b/0xb0 crypto/shash.c:66
      	 shash_async_setkey+0x10/0x20 crypto/shash.c:223
      	 crypto_ahash_setkey+0x2d/0xa0 crypto/ahash.c:202
      	 crypto_authenc_setkey+0x68/0x100 crypto/authenc.c:96
      	 crypto_aead_setkey+0x2a/0xc0 crypto/aead.c:62
      	 aead_setkey+0xc/0x10 crypto/algif_aead.c:526
      	 alg_setkey crypto/af_alg.c:223 [inline]
      	 alg_setsockopt+0xfe/0x130 crypto/af_alg.c:256
      	 __sys_setsockopt+0x6d/0xd0 net/socket.c:1902
      	 __do_sys_setsockopt net/socket.c:1913 [inline]
      	 __se_sys_setsockopt net/socket.c:1910 [inline]
      	 __x64_sys_setsockopt+0x1f/0x30 net/socket.c:1910
      	 do_syscall_64+0x4a/0x180 arch/x86/entry/common.c:290
      	 entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Fixes: e236d4a8 ("[CRYPTO] authenc: Move enckeylen into key itself")
      Cc: <stable@vger.kernel.org> # v2.6.25+
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • E
      crypto: bcm - convert to use crypto_authenc_extractkeys() · 97a6662b
      Eric Biggers committed
      commit ab57b33525c3221afaebd391458fa0cbcd56903d upstream.
      
      Convert the bcm crypto driver to use crypto_authenc_extractkeys() so
      that it picks up the fix for broken validation of rtattr::rta_len.
      
      This also fixes the DES weak key check to actually be done on the right
      key. (It was checking the authentication key, not the encryption key...)
      
      Fixes: 9d12ba86 ("crypto: brcm - Add Broadcom SPU driver")
      Cc: <stable@vger.kernel.org> # v4.11+
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • E
      crypto: ccree - convert to use crypto_authenc_extractkeys() · 93242fa0
      Eric Biggers committed
      commit dc95b5350a8f07d73d6bde3a79ef87289698451d upstream.
      
      Convert the ccree crypto driver to use crypto_authenc_extractkeys() so
      that it picks up the fix for broken validation of rtattr::rta_len.
      
      Fixes: ff27e85a ("crypto: ccree - add AEAD support")
      Cc: <stable@vger.kernel.org> # v4.17+
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • H
      crypto: authencesn - Avoid twice completion call in decrypt path · 65908037
      Harsh Jain committed
      commit a7773363624b034ab198c738661253d20a8055c2 upstream.
      
      The authencesn template unconditionally calls aead_request_complete in the
      decrypt path after ahash_verify, which leads to the following kernel panic
      after decryption.
      
      [  338.539800] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
      [  338.548372] PGD 0 P4D 0
      [  338.551157] Oops: 0000 [#1] SMP PTI
      [  338.554919] CPU: 0 PID: 0 Comm: swapper/0 Kdump: loaded Tainted: G        W I       4.19.7+ #13
      [  338.564431] Hardware name: Supermicro X8ST3/X8ST3, BIOS 2.0        07/29/10
      [  338.572212] RIP: 0010:esp_input_done2+0x350/0x410 [esp4]
      [  338.578030] Code: ff 0f b6 68 10 48 8b 83 c8 00 00 00 e9 8e fe ff ff 8b 04 25 04 00 00 00 83 e8 01 48 98 48 8b 3c c5 10 00 00 00 e9 f7 fd ff ff <8b> 04 25 04 00 00 00 83 e8 01 48 98 4c 8b 24 c5 10 00 00 00 e9 3b
      [  338.598547] RSP: 0018:ffff911c97803c00 EFLAGS: 00010246
      [  338.604268] RAX: 0000000000000002 RBX: ffff911c4469ee00 RCX: 0000000000000000
      [  338.612090] RDX: 0000000000000000 RSI: 0000000000000130 RDI: ffff911b87c20400
      [  338.619874] RBP: 0000000000000000 R08: ffff911b87c20498 R09: 000000000000000a
      [  338.627610] R10: 0000000000000001 R11: 0000000000000004 R12: 0000000000000000
      [  338.635402] R13: ffff911c89590000 R14: ffff911c91730000 R15: 0000000000000000
      [  338.643234] FS:  0000000000000000(0000) GS:ffff911c97800000(0000) knlGS:0000000000000000
      [  338.652047] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  338.658299] CR2: 0000000000000004 CR3: 00000001ec20a000 CR4: 00000000000006f0
      [  338.666382] Call Trace:
      [  338.669051]  <IRQ>
      [  338.671254]  esp_input_done+0x12/0x20 [esp4]
      [  338.675922]  chcr_handle_resp+0x3b5/0x790 [chcr]
      [  338.680949]  cpl_fw6_pld_handler+0x37/0x60 [chcr]
      [  338.686080]  chcr_uld_rx_handler+0x22/0x50 [chcr]
      [  338.691233]  uldrx_handler+0x8c/0xc0 [cxgb4]
      [  338.695923]  process_responses+0x2f0/0x5d0 [cxgb4]
      [  338.701177]  ? bitmap_find_next_zero_area_off+0x3a/0x90
      [  338.706882]  ? matrix_alloc_area.constprop.7+0x60/0x90
      [  338.712517]  ? apic_update_irq_cfg+0x82/0xf0
      [  338.717177]  napi_rx_handler+0x14/0xe0 [cxgb4]
      [  338.722015]  net_rx_action+0x2aa/0x3e0
      [  338.726136]  __do_softirq+0xcb/0x280
      [  338.730054]  irq_exit+0xde/0xf0
      [  338.733504]  do_IRQ+0x54/0xd0
      [  338.736745]  common_interrupt+0xf/0xf
      
      Fixes: 104880a6 ("crypto: authencesn - Convert to new AEAD...")
      Signed-off-by: Harsh Jain <harsh@chelsio.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>