1. 26 Jan, 2019 40 commits
    • dm kcopyd: Fix bug causing workqueue stalls · cbd257f3
      Nikos Tsironis committed
      [ Upstream commit d7e6b8dfc7bcb3f4f3a18313581f67486a725b52 ]
      
      When using kcopyd to run callbacks through dm_kcopyd_do_callback() or
      submitting copy jobs with a source size of 0, the jobs are pushed
      directly to the complete_jobs list, which could be under processing by
      the kcopyd thread. As a result, the kcopyd thread can continue running
      completed jobs indefinitely, without releasing the CPU, as long as
      someone keeps submitting new completed jobs through the aforementioned
      paths. Processing of work items, queued for execution on the same CPU as
      the currently running kcopyd thread, is thus stalled for excessive
      amounts of time, hurting performance.
      
      Running the following test, from the device mapper test suite [1],
      
        dmtest run --suite snapshot -n parallel_io_to_many_snaps_N
      
      with 8 active snapshots, we get messages like the following in dmesg:
      
      [68899.948523] BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 95s!
      [68899.949282] Showing busy workqueues and worker pools:
      [68899.949288] workqueue events: flags=0x0
      [68899.949295]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=2/256
      [68899.949306]     pending: vmstat_shepherd, cache_reap
      [68899.949331] workqueue mm_percpu_wq: flags=0x8
      [68899.949337]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
      [68899.949345]     pending: vmstat_update
      [68899.949387] workqueue dm_bufio_cache: flags=0x8
      [68899.949392]   pwq 4: cpus=2 node=0 flags=0x0 nice=0 active=1/256
      [68899.949400]     pending: work_fn [dm_bufio]
      [68899.949423] workqueue kcopyd: flags=0x8
      [68899.949429]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
      [68899.949437]     pending: do_work [dm_mod]
      [68899.949452] workqueue kcopyd: flags=0x8
      [68899.949458]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=2/256
      [68899.949466]     in-flight: 13:do_work [dm_mod]
      [68899.949474]     pending: do_work [dm_mod]
      [68899.949487] workqueue kcopyd: flags=0x8
      [68899.949493]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
      [68899.949501]     pending: do_work [dm_mod]
      [68899.949515] workqueue kcopyd: flags=0x8
      [68899.949521]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
      [68899.949529]     pending: do_work [dm_mod]
      [68899.949541] workqueue kcopyd: flags=0x8
      [68899.949547]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
      [68899.949555]     pending: do_work [dm_mod]
      [68899.949568] pool 0: cpus=0 node=0 flags=0x0 nice=0 hung=95s workers=4 idle: 27130 27223 1084
      
      Fix this by splitting the complete_jobs list into two parts: a
      user-facing part, named callback_jobs, and one used internally by kcopyd,
      retaining the name complete_jobs. dm_kcopyd_do_callback() and
      dispatch_job() now push their jobs to the callback_jobs list, which is
      spliced to the complete_jobs list once, every time the kcopyd thread
      wakes up. This prevents kcopyd from hogging the CPU indefinitely and
      causing workqueue stalls.
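
      The effect of the split can be sketched with a toy user-space model in
      Python. This is not kernel code: the class and method names (KcopydModel,
      submit_completed, do_work) are invented for illustration, and only the
      list names callback_jobs and complete_jobs come from the description
      above.

```python
from collections import deque


class KcopydModel:
    """Toy model of the two-list scheme (hypothetical names, not the kernel API)."""

    def __init__(self):
        self.callback_jobs = deque()  # user-facing list: new completed jobs land here
        self.complete_jobs = deque()  # internal list: drained by the worker

    def submit_completed(self, job):
        # Stands in for dm_kcopyd_do_callback() / zero-size copy submissions.
        self.callback_jobs.append(job)

    def do_work(self, on_complete):
        # Splice the user-facing list into the internal one exactly once per wake-up.
        self.complete_jobs.extend(self.callback_jobs)
        self.callback_jobs.clear()

        processed = 0
        while self.complete_jobs:
            job = self.complete_jobs.popleft()
            on_complete(self, job)  # may submit new jobs; they wait for the next wake-up
            processed += 1
        return processed


model = KcopydModel()
for i in range(3):
    model.submit_completed(i)

# Each callback submits another completed job, as a busy user might.
processed = model.do_work(lambda m, job: m.submit_completed(job + 100))
print(processed, len(model.callback_jobs))
```

      In this model one wake-up handles only the jobs that were present when it
      started, so the amount of work per wake-up is bounded even if callbacks
      keep submitting new jobs.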
      
      Re-running the aforementioned test:
      
        * Workqueue stalls are eliminated
        * The maximum writing time among all targets is reduced from 09m37.10s
          to 06m04.85s and the total run time of the test is reduced from
          10m43.591s to 7m19.199s
      
      [1] https://github.com/jthornber/device-mapper-test-suite
      Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com>
      Signed-off-by: Ilias Tsitsimpis <iliastsi@arrikto.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • dm crypt: use u64 instead of sector_t to store iv_offset · 4e26ee31
      AliOS system security committed
      [ Upstream commit 8d683dcd65c037efc9fb38c696ec9b65b306e573 ]
      
      The iv_offset in the mapping table of the crypt target is a 64-bit number
      when the IV algorithm is plain64, plain64be, essiv or benbi. It is assigned
      to iv_offset of struct crypt_config, cc_sector of struct convert_context and
      iv_sector of struct dm_crypt_request. These structure members are defined
      as sector_t, but sector_t is 32 bits wide when CONFIG_LBDAF is not set in a
      32-bit kernel. In this situation sector_t is not big enough to store the
      64-bit iv_offset.
      
      Here is a reproducer.
      Prepare test image and device (loop is automatically allocated by cryptsetup):
      
        # dd if=/dev/zero of=tst.img bs=1M count=1
        # echo "tst"|cryptsetup open --type plain -c aes-xts-plain64 \
        --skip 500000000000000000 tst.img test
      
      On a 32-bit system with CONFIG_LBDAF off, using an IV offset value that
      needs more than 32 bits, the table shows a truncated offset and the device
      checksum is wrong:
      
        # dmsetup table test --showkeys
        0 2048 crypt aes-xts-plain64 dfa7cfe3c481f2239155739c42e539ae8f2d38f304dcc89d20b26f69daaf0933 3551657984 7:0 0
      
        # sha256sum /dev/mapper/test
        533e25c09176632b3794f35303488c4a8f3f965dffffa6ec2df347c168cb6c19 /dev/mapper/test
      
      On a 64-bit system (and on a 32-bit system with the patch), the table and checksum are correct:
      
        # dmsetup table test --showkeys
        0 2048 crypt aes-xts-plain64 dfa7cfe3c481f2239155739c42e539ae8f2d38f304dcc89d20b26f69daaf0933 500000000000000000 7:0 0
      
        # sha256sum /dev/mapper/test
        5d16160f9d5f8c33d8051e65fdb4f003cc31cd652b5abb08f03aa6fce0df75fc /dev/mapper/test
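
      The wrong offset in the 32-bit table is consistent with the --skip value
      being cut down to its low 32 bits. A small Python check, where the helper
      name store() is made up for illustration:

```python
IV_OFFSET = 500000000000000000  # the --skip value used in the reproducer


def store(value: int, bits: int) -> int:
    """Keep only the low `bits` bits, as an unsigned integer of that width would."""
    return value & ((1 << bits) - 1)


truncated = store(IV_OFFSET, 32)  # 32-bit sector_t (CONFIG_LBDAF not set)
preserved = store(IV_OFFSET, 64)  # u64, as used after the patch

print(truncated)  # 3551657984, the offset dmsetup reports on the unpatched 32-bit system
print(preserved)  # 500000000000000000
```
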
      Signed-off-by: AliOS system security <alios_sys_security@linux.alibaba.com>
      Tested-and-Reviewed-by: Milan Broz <gmazyland@gmail.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • x86/topology: Use total_cpus for max logical packages calculation · a4772e8b
      Hui Wang committed
      [ Upstream commit aa02ef099cff042c2a9109782ec2bf1bffc955d4 ]
      
      nr_cpu_ids can be limited on the command line via nr_cpus=. This can break the
      logical package management because it results in a smaller number of packages
      in the kdump kernel.

      Consider the following case:
      A two-socket system where each socket has 8 cores, i.e. 16 logical CPUs
      per socket with HT turned on.
      
       0  1  2  3  4  5  6  7     |    16 17 18 19 20 21 22 23
       cores on socket 0               threads on socket 0
       8  9 10 11 12 13 14 15     |    24 25 26 27 28 29 30 31
       cores on socket 1               threads on socket 1
      
      When the kdump kernel is started with the command line option nr_cpus=16
      after a panic was triggered on one of the CPUs 24-31, e.g. 26, the online
      CPUs will be 1-15 and 26 (CPU 0 is disabled in kdump), ncpus will be 16 and
      __max_logical_packages will be 1, although two packages were actually booted.

      This issue can be reproduced by setting the kdump option nr_cpus=<real physical
      core numbers> and then triggering a panic on a thread of the last socket, for
      example:
      
      taskset -c 26 echo c > /proc/sysrq-trigger
      
      Use total_cpus which will not be limited by nr_cpus command line to calculate
      the value of __max_logical_packages.
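
      The numbers in the example can be reproduced with a short Python
      calculation. It assumes, as the figures above suggest, that the package
      count is the CPU count divided by the logical CPUs per package, rounded
      up; the helper div_round_up() and the constant names are illustrative.

```python
def div_round_up(n: int, d: int) -> int:
    """Integer division rounding up."""
    return (n + d - 1) // d


CORES_PER_PACKAGE = 8
THREADS_PER_CORE = 2
ncpus = CORES_PER_PACKAGE * THREADS_PER_CORE  # 16 logical CPUs per package

nr_cpu_ids = 16  # limited by nr_cpus=16 in the kdump kernel
total_cpus = 32  # not affected by nr_cpus=

packages_before = div_round_up(nr_cpu_ids, ncpus)
packages_after = div_round_up(total_cpus, ncpus)

print(packages_before, packages_after)  # 1 2
```
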
      Signed-off-by: Hui Wang <john.wanghui@huawei.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: <guijianfeng@huawei.com>
      Cc: <wencongyang2@huawei.com>
      Cc: <douliyang1@huawei.com>
      Cc: <qiaonuohan@huawei.com>
      Link: https://lkml.kernel.org/r/20181107023643.22174-1-john.wanghui@huawei.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • netfilter: ipt_CLUSTERIP: fix deadlock in netns exit routine · 9d51378a
      Taehee Yoo committed
      [ Upstream commit 5a86d68bcf02f2d1e9a5897dd482079fd5f75e7f ]
      
      When a network namespace is destroyed, cleanup_net() is called.
      cleanup_net() holds pernet_ops_rwsem and then calls each ->exit callback,
      so clusterip_tg_destroy() is called by cleanup_net(), and
      clusterip_tg_destroy() calls unregister_netdevice_notifier().

      But both cleanup_net() and clusterip_tg_destroy() (via
      unregister_netdevice_notifier()) take the same lock (pernet_ops_rwsem),
      hence a deadlock occurs.

      After this patch, only one notifier is registered when the module is inserted,
      and all configs are added to a per-net list.
      
      test commands:
         %ip netns add vm1
         %ip netns exec vm1 bash
         %ip link set lo up
         %iptables -A INPUT -p tcp -i lo -d 192.168.0.5 --dport 80 \
      	-j CLUSTERIP --new --hashmode sourceip \
      	--clustermac 01:00:5e:00:00:20 --total-nodes 2 --local-node 1
         %exit
         %ip netns del vm1
      
      splat looks like:
      [  341.809674] ============================================
      [  341.809674] WARNING: possible recursive locking detected
      [  341.809674] 4.19.0-rc5+ #16 Tainted: G        W
      [  341.809674] --------------------------------------------
      [  341.809674] kworker/u4:2/87 is trying to acquire lock:
      [  341.809674] 000000005da2d519 (pernet_ops_rwsem){++++}, at: unregister_netdevice_notifier+0x8c/0x460
      [  341.809674]
      [  341.809674] but task is already holding lock:
      [  341.809674] 000000005da2d519 (pernet_ops_rwsem){++++}, at: cleanup_net+0x119/0x900
      [  341.809674]
      [  341.809674] other info that might help us debug this:
      [  341.809674]  Possible unsafe locking scenario:
      [  341.809674]
      [  341.809674]        CPU0
      [  341.809674]        ----
      [  341.809674]   lock(pernet_ops_rwsem);
      [  341.809674]   lock(pernet_ops_rwsem);
      [  341.809674]
      [  341.809674]  *** DEADLOCK ***
      [  341.809674]
      [  341.809674]  May be due to missing lock nesting notation
      [  341.809674]
      [  341.809674] 3 locks held by kworker/u4:2/87:
      [  341.809674]  #0: 00000000d9df6c92 ((wq_completion)"%s""netns"){+.+.}, at: process_one_work+0xafe/0x1de0
      [  341.809674]  #1: 00000000c2cbcee2 (net_cleanup_work){+.+.}, at: process_one_work+0xb60/0x1de0
      [  341.809674]  #2: 000000005da2d519 (pernet_ops_rwsem){++++}, at: cleanup_net+0x119/0x900
      [  341.809674]
      [  341.809674] stack backtrace:
      [  341.809674] CPU: 1 PID: 87 Comm: kworker/u4:2 Tainted: G        W         4.19.0-rc5+ #16
      [  341.809674] Workqueue: netns cleanup_net
      [  341.809674] Call Trace:
      [ ... ]
      [  342.070196]  down_write+0x93/0x160
      [  342.070196]  ? unregister_netdevice_notifier+0x8c/0x460
      [  342.070196]  ? down_read+0x1e0/0x1e0
      [  342.070196]  ? sched_clock_cpu+0x126/0x170
      [  342.070196]  ? find_held_lock+0x39/0x1c0
      [  342.070196]  unregister_netdevice_notifier+0x8c/0x460
      [  342.070196]  ? register_netdevice_notifier+0x790/0x790
      [  342.070196]  ? __local_bh_enable_ip+0xe9/0x1b0
      [  342.070196]  ? __local_bh_enable_ip+0xe9/0x1b0
      [  342.070196]  ? clusterip_tg_destroy+0x372/0x650 [ipt_CLUSTERIP]
      [  342.070196]  ? trace_hardirqs_on+0x93/0x210
      [  342.070196]  ? __bpf_trace_preemptirq_template+0x10/0x10
      [  342.070196]  ? clusterip_tg_destroy+0x372/0x650 [ipt_CLUSTERIP]
      [  342.123094]  clusterip_tg_destroy+0x3ad/0x650 [ipt_CLUSTERIP]
      [  342.123094]  ? clusterip_net_init+0x3d0/0x3d0 [ipt_CLUSTERIP]
      [  342.123094]  ? cleanup_match+0x17d/0x200 [ip_tables]
      [  342.123094]  ? xt_unregister_table+0x215/0x300 [x_tables]
      [  342.123094]  ? kfree+0xe2/0x2a0
      [  342.123094]  cleanup_entry+0x1d5/0x2f0 [ip_tables]
      [  342.123094]  ? cleanup_match+0x200/0x200 [ip_tables]
      [  342.123094]  __ipt_unregister_table+0x9b/0x1a0 [ip_tables]
      [  342.123094]  iptable_filter_net_exit+0x43/0x80 [iptable_filter]
      [  342.123094]  ops_exit_list.isra.10+0x94/0x140
      [  342.123094]  cleanup_net+0x45b/0x900
      [ ... ]
      
      Fixes: 202f59af ("netfilter: ipt_CLUSTERIP: do not hold dev")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • netfilter: ipt_CLUSTERIP: remove wrong WARN_ON_ONCE in netns exit routine · bb7b6c49
      Taehee Yoo committed
      [ Upstream commit b12f7bad5ad3724d19754390a3e80928525c0769 ]
      
      When a network namespace is destroyed, both clusterip_tg_destroy() and
      clusterip_net_exit() are called, and clusterip_net_exit() is called
      before clusterip_tg_destroy().
      Hence the cleanup check code in clusterip_net_exit() doesn't make sense.
      
      test commands:
         %ip netns add vm1
         %ip netns exec vm1 bash
         %ip link set lo up
         %iptables -A INPUT -p tcp -i lo -d 192.168.0.5 --dport 80 \
      	-j CLUSTERIP --new --hashmode sourceip \
      	--clustermac 01:00:5e:00:00:20 --total-nodes 2 --local-node 1
         %exit
         %ip netns del vm1
      
      splat looks like:
      [  341.184508] WARNING: CPU: 1 PID: 87 at net/ipv4/netfilter/ipt_CLUSTERIP.c:840 clusterip_net_exit+0x319/0x380 [ipt_CLUSTERIP]
      [  341.184850] Modules linked in: ipt_CLUSTERIP nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_tcpudp iptable_filter bpfilter ip_tables x_tables
      [  341.184850] CPU: 1 PID: 87 Comm: kworker/u4:2 Not tainted 4.19.0-rc5+ #16
      [  341.227509] Workqueue: netns cleanup_net
      [  341.227509] RIP: 0010:clusterip_net_exit+0x319/0x380 [ipt_CLUSTERIP]
      [  341.227509] Code: 0f 85 7f fe ff ff 48 c7 c2 80 64 2c c0 be a8 02 00 00 48 c7 c7 a0 63 2c c0 c6 05 18 6e 00 00 01 e8 bc 38 ff f5 e9 5b fe ff ff <0f> 0b e9 33 ff ff ff e8 4b 90 50 f6 e9 2d fe ff ff 48 89 df e8 de
      [  341.227509] RSP: 0018:ffff88011086f408 EFLAGS: 00010202
      [  341.227509] RAX: dffffc0000000000 RBX: 1ffff1002210de85 RCX: 0000000000000000
      [  341.227509] RDX: 1ffff1002210de85 RSI: ffff880110813be8 RDI: ffffed002210de58
      [  341.227509] RBP: ffff88011086f4d0 R08: 0000000000000000 R09: 0000000000000000
      [  341.227509] R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff1002210de81
      [  341.227509] R13: ffff880110625a48 R14: ffff880114cec8c8 R15: 0000000000000014
      [  341.227509] FS:  0000000000000000(0000) GS:ffff880116600000(0000) knlGS:0000000000000000
      [  341.227509] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  341.227509] CR2: 00007f11fd38e000 CR3: 000000013ca16000 CR4: 00000000001006e0
      [  341.227509] Call Trace:
      [  341.227509]  ? __clusterip_config_find+0x460/0x460 [ipt_CLUSTERIP]
      [  341.227509]  ? default_device_exit+0x1ca/0x270
      [  341.227509]  ? remove_proc_entry+0x1cd/0x390
      [  341.227509]  ? dev_change_net_namespace+0xd00/0xd00
      [  341.227509]  ? __init_waitqueue_head+0x130/0x130
      [  341.227509]  ops_exit_list.isra.10+0x94/0x140
      [  341.227509]  cleanup_net+0x45b/0x900
      [ ... ]
      
      Fixes: 613d0776 ("netfilter: exit_net cleanup check added")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • netfilter: ipt_CLUSTERIP: check MAC address when duplicate config is set · 744383c8
      Taehee Yoo committed
      [ Upstream commit 06aa151ad1fc74a49b45336672515774a678d78d ]
      
      If a config for the same destination IP address already exists, that config
      is simply reused. The MAC address should also be the same, but there is no
      MAC address check. This patch adds one.
      
      test commands:
         %iptables -A INPUT -p tcp -i lo -d 192.168.0.5 --dport 80 \
      	   -j CLUSTERIP --new --hashmode sourceip \
      	   --clustermac 01:00:5e:00:00:20 --total-nodes 2 --local-node 1
         %iptables -A INPUT -p tcp -i lo -d 192.168.0.5 --dport 80 \
      	   -j CLUSTERIP --new --hashmode sourceip \
      	   --clustermac 01:00:5e:00:00:21 --total-nodes 2 --local-node 1
      
      After this patch, the above commands are disallowed.
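
      The rule being enforced can be modelled in a few lines of Python. This is
      only an illustration of the check described above: the class name, the
      dictionary keyed by cluster IP, and the use of ValueError are all
      assumptions, not the module's actual data structures or error codes.

```python
class ClusterIPConfigs:
    """Hypothetical registry of CLUSTERIP configs keyed by destination IP."""

    def __init__(self):
        self._mac_by_ip = {}

    def add(self, cluster_ip: str, cluster_mac: str) -> str:
        mac = cluster_mac.lower()
        existing = self._mac_by_ip.get(cluster_ip)
        if existing is None:
            # First rule for this IP: create the config.
            self._mac_by_ip[cluster_ip] = mac
            return mac
        if existing != mac:
            # Same IP but a different MAC: refuse instead of silently reusing.
            raise ValueError(f"{cluster_ip} already uses MAC {existing}")
        # Same IP and same MAC: reuse the existing config.
        return existing


configs = ClusterIPConfigs()
configs.add("192.168.0.5", "01:00:5e:00:00:20")

try:
    configs.add("192.168.0.5", "01:00:5e:00:00:21")
    rejected = False
except ValueError:
    rejected = True

print(rejected)
```
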
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • perf vendor events intel: Fix Load_Miss_Real_Latency on SKL/SKX · bd1040e6
      Andi Kleen committed
      [ Upstream commit 91b2b97025097ce7ca7536bc87eba2bf14760fb4 ]
      
      Fix incorrect event names for the Load_Miss_Real_Latency metric for
      Skylake and Skylake Server.
      
      Fixes https://github.com/andikleen/pmu-tools/issues/158
      
      Before:
      
        % perf stat -M Load_Miss_Real_Latency true
        event syntax error: '..ss.pending,mem_load_retired.l1_miss_ps,mem_load_retired.fb_hit_ps}:W'
                                          \___ parser error
      
         Usage: perf stat [<options>] [<command>]
      
            -M, --metrics <metric/metric group list>
                                  monitor specified metrics or metric groups (separated by ,)
      
      After:
      
        % perf stat -M Load_Miss_Real_Latency true
      
         Performance counter stats for 'true':
      
                   279,204      l1d_pend_miss.pending     #     14.0 Load_Miss_Real_Latency
                     4,784      mem_load_uops_retired.l1_miss
                    15,188      mem_load_uops_retired.hit_lfb
      
               0.000899640 seconds time elapsed
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Link: http://lkml.kernel.org/r/20181120050635.4215-1-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • perf parse-events: Fix unchecked usage of strncpy() · 58c67a0b
      Arnaldo Carvalho de Melo committed
      [ Upstream commit bd8d57fb7e25e9fcf67a9eef5fa13aabe2016e07 ]
      
      The strncpy() function may leave the destination string buffer
      unterminated. Better to use strlcpy(), for which we have a __weak fallback
      implementation for systems without it.
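
      The difference between the two calls can be shown with a Python model of
      their documented semantics. The functions below are illustrative
      re-implementations operating on a bytearray, not bindings to the C
      library; the buffer size of 100 matches the bound in the warning below.

```python
def strncpy(dst: bytearray, src: bytes, n: int) -> None:
    """Model of C strncpy(): copy at most n bytes, pad with NULs, never force a terminator."""
    chunk = src[:n]
    dst[:n] = chunk + b"\0" * (n - len(chunk))


def strlcpy(dst: bytearray, src: bytes, size: int) -> int:
    """Model of strlcpy(): copy at most size - 1 bytes and always NUL-terminate."""
    if size > 0:
        chunk = src[: size - 1]
        dst[: len(chunk) + 1] = chunk + b"\0"
    return len(src)  # length of the source, so callers can detect truncation


MAX_NAME_LEN = 100
symbol = b"a" * MAX_NAME_LEN  # a source string as long as the destination buffer

name_strncpy = bytearray(b"\xff" * MAX_NAME_LEN)
strncpy(name_strncpy, symbol, MAX_NAME_LEN)

name_strlcpy = bytearray(b"\xff" * MAX_NAME_LEN)
wanted = strlcpy(name_strlcpy, symbol, MAX_NAME_LEN)

print(0 in name_strncpy)   # False: no terminating NUL anywhere in the buffer
print(name_strlcpy[-1])    # 0: the last byte is the terminator
print(wanted)              # 100: larger than size - 1, so the copy was truncated
```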
      
      This fixes this warning on an Alpine Linux Edge system with gcc 8.2:
      
        util/parse-events.c: In function 'print_symbol_events':
        util/parse-events.c:2465:4: error: 'strncpy' specified bound 100 equals destination size [-Werror=stringop-truncation]
            strncpy(name, syms->symbol, MAX_NAME_LEN);
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        In function 'print_symbol_events.constprop',
            inlined from 'print_events' at util/parse-events.c:2508:2:
        util/parse-events.c:2465:4: error: 'strncpy' specified bound 100 equals destination size [-Werror=stringop-truncation]
            strncpy(name, syms->symbol, MAX_NAME_LEN);
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        In function 'print_symbol_events.constprop',
            inlined from 'print_events' at util/parse-events.c:2511:2:
        util/parse-events.c:2465:4: error: 'strncpy' specified bound 100 equals destination size [-Werror=stringop-truncation]
            strncpy(name, syms->symbol, MAX_NAME_LEN);
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        cc1: all warnings being treated as errors
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Fixes: 947b4ad1 ("perf list: Fix max event string size")
      Link: https://lkml.kernel.org/n/tip-b663e33bm6x8hrkie4uxh7u2@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • perf svghelper: Fix unchecked usage of strncpy() · b332b4cd
      Arnaldo Carvalho de Melo committed
      [ Upstream commit 2f5302533f306d5ee87bd375aef9ca35b91762cb ]
      
      The strncpy() function may leave the destination string buffer
      unterminated. Better to use strlcpy(), for which we have a __weak fallback
      implementation for systems without it.

      In this specific case this would only happen if fgets() was buggy. Its
      man page states that it reads at most one less byte than the size of
      the destination buffer, so that it can put the nul byte at the end. It
      would therefore never copy 255 non-nul chars, as fgets() reads at most
      254 non-nul chars into the orig buffer and terminates it. But let's just
      switch to strlcpy() to keep the original intent and silence the gcc 8.2
      warning.
      
      This fixes this warning on an Alpine Linux Edge system with gcc 8.2:
      
        In function 'cpu_model',
            inlined from 'svg_cpu_box' at util/svghelper.c:378:2:
        util/svghelper.c:337:5: error: 'strncpy' output may be truncated copying 255 bytes from a string of length 255 [-Werror=stringop-truncation]
             strncpy(cpu_m, &buf[13], 255);
             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Fixes: f48d55ce ("perf: Add a SVG helper library file")
      Link: https://lkml.kernel.org/n/tip-xzkoo0gyr56gej39ltivuh9g@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • perf tests ARM: Disable breakpoint tests 32-bit · f54fc4c2
      Florian Fainelli committed
      [ Upstream commit 24f967337f6d6bce931425769c0f5ff5cf2d212e ]
      
      The breakpoint tests on the ARM 32-bit kernel are broken in several
      ways.
      
      The breakpoint length requested does not necessarily match whether the
      function address has the Thumb bit (bit 0) set or not, and this does
      matter to the ARM kernel hw_breakpoint infrastructure. See [1] for
      background.
      
      [1]: https://lkml.org/lkml/2018/11/15/205
      
      As Will indicated, the overflow handling would require single-stepping
      which is not supported at the moment. Just disable those tests for the
      ARM 32-bit platforms and update the comment above to explain these
      limitations.
      Co-developed-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20181203191138.2419-1-f.fainelli@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • perf intel-pt: Fix error with config term "pt=0" · c3e8c335
      Adrian Hunter committed
      [ Upstream commit 1c6f709b9f96366cc47af23c05ecec9b8c0c392d ]
      
      Users should never use 'pt=0', but if they do it may give a meaningless
      error:
      
      	$ perf record -e intel_pt/pt=0/u uname
      	Error:
      	The sys_perf_event_open() syscall returned with 22 (Invalid argument) for
      	event (intel_pt/pt=0/u).
      
      Fix that by forcing 'pt=1'.
      
      Committer testing:
      
        # perf record -e intel_pt/pt=0/u uname
        Error:
        The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (intel_pt/pt=0/u).
        /bin/dmesg | grep -i perf may provide additional information.
      
        # perf record -e intel_pt/pt=0/u uname
        pt=0 doesn't make sense, forcing pt=1
        Linux
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.020 MB perf.data ]
        #
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/b7c5b4e5-9497-10e5-fd43-5f3e4a0fe51d@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • tty/serial: do not free trasnmit buffer page under port lock · f74fc96e
      Sergey Senozhatsky committed
      [ Upstream commit d72402145ace0697a6a9e8e75a3de5bf3375f78d ]
      
      LKP has hit yet another circular locking dependency between uart
      console drivers and debugobjects [1]:
      
           CPU0                                    CPU1
      
                                                  rhltable_init()
                                                   __init_work()
                                                    debug_object_init
           uart_shutdown()                          /* db->lock */
            /* uart_port->lock */                    debug_print_object()
             free_page()                              printk()
                                                       call_console_drivers()
              debug_check_no_obj_freed()                /* uart_port->lock */
               /* db->lock */
                debug_print_object()
      
      So there are two dependency chains:
      	uart_port->lock -> db->lock
      And
      	db->lock -> uart_port->lock
      
      This particular circular locking dependency can be addressed in several
      ways:
      
      a) One way would be to move debug_print_object() out of db->lock scope
         and, thus, break the db->lock -> uart_port->lock chain.
      b) Another one would be to free() the transmit buffer page outside of
         uart_port->lock in the UART code, which is what this patch does.
      
      It makes sense to apply a) and b) independently: there are too many things
      going on behind free(), none of which depend on uart_port->lock.
      
      The patch fixes transmit buffer page free() in uart_shutdown() and,
      additionally, in uart_port_startup() (as was suggested by Dmitry Safonov).
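
      The general pattern, detaching the buffer while holding the lock and
      freeing it only after the lock is dropped, can be sketched in Python. The
      Port class, the shutdown() helper and the free_page callback are
      hypothetical stand-ins for illustration, not the serial core API.

```python
import threading


class Port:
    """Hypothetical stand-in for a UART port with a lock and a transmit buffer."""

    def __init__(self):
        self.lock = threading.Lock()
        self.xmit_buf = bytearray(4096)


def shutdown(port: Port, free_page) -> None:
    # Under the lock, only detach the buffer from the port.
    with port.lock:
        buf = port.xmit_buf
        port.xmit_buf = None

    # The lock is released here, so whatever free_page() does internally
    # (debug checks, printing) cannot nest inside the port lock.
    if buf is not None:
        free_page(port, buf)


lock_held_during_free = []


def free_page(port: Port, buf: bytearray) -> None:
    lock_held_during_free.append(port.lock.locked())


port = Port()
shutdown(port, free_page)
print(lock_held_during_free)  # [False]
```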
      
      [1] https://lore.kernel.org/lkml/20181211091154.GL23332@shao2-debian/T/#u
      Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Reviewed-by: Petr Mladek <pmladek@suse.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Jiri Slaby <jslaby@suse.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Waiman Long <longman@redhat.com>
      Cc: Dmitry Safonov <dima@arista.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • btrfs: improve error handling of btrfs_add_link · 310f8296
      Johannes Thumshirn committed
      [ Upstream commit 1690dd41e0cb1dade80850ed8a3eb0121b96d22f ]
      
      In the error handling block, err holds the return value of either
      btrfs_del_root_ref() or btrfs_del_inode_ref(), but it hasn't been checked
      since its introduction with commit fe66a05a (Btrfs: improve error
      handling for btrfs_insert_dir_item callers) in 2012.

      If the cleanup inside the error handling block itself fails, there's not
      much left to do, and the abort either happened earlier in the callees or
      is necessary here.
      
      So if one of btrfs_del_root_ref() or btrfs_del_inode_ref() failed, abort
      the transaction, but still return the original code of the failure
      stored in 'ret' as this will be reported to the user.
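
      The control flow described here can be modelled in Python. The function
      and parameter names are placeholders chosen for illustration; only the
      roles of 'ret' (the original failure) and 'err' (the result of the undo
      step) follow the text above.

```python
def add_link(insert_dir_item, undo_ref, abort_transaction) -> int:
    """Model of the error path: abort if the undo fails, but report the first error."""
    ret = insert_dir_item()
    if ret == 0:
        return 0

    # Error path: try to undo the reference that was added earlier.
    err = undo_ref()
    if err:
        # The cleanup itself failed, so the transaction cannot be trusted.
        abort_transaction(err)

    # The caller always sees the original failure, not the cleanup result.
    return ret


aborted_with = []
result = add_link(
    insert_dir_item=lambda: -17,  # original failure (value chosen arbitrarily)
    undo_ref=lambda: -5,          # the undo step fails as well
    abort_transaction=aborted_with.append,
)
print(result, aborted_with)  # -17 [-5]
```
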
      Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • btrfs: fix use-after-free due to race between replace start and cancel · 38b17eee
      Anand Jain committed
      [ Upstream commit d189dd70e2556181732598956d808ea53cc8774e ]
      
      The device replace cancel thread can race with the replace start thread,
      and if fs_info::scrubs_running is not yet set, btrfs_scrub_cancel() will
      fail to stop the scrub thread.

      The scrub thread then continues with the scrub for replace and tries to
      write to the target device, which has already been freed by the cancel
      thread.

      scrub_setup_ctx() warns because tgtdev is NULL.
      
        struct scrub_ctx *scrub_setup_ctx(struct btrfs_device *dev, int is_dev_replace)
        {
        ...
      	  if (is_dev_replace) {
      		  WARN_ON(!fs_info->dev_replace.tgtdev);  <===
      		  sctx->pages_per_wr_bio = SCRUB_PAGES_PER_WR_BIO;
      		  sctx->wr_tgtdev = fs_info->dev_replace.tgtdev;
      		  sctx->flush_all_writes = false;
      	  }
      
        [ 6724.497655] BTRFS info (device sdb): dev_replace from /dev/sdb (devid 1) to /dev/sdc started
        [ 6753.945017] BTRFS info (device sdb): dev_replace from /dev/sdb (devid 1) to /dev/sdc canceled
        [ 6852.426700] WARNING: CPU: 0 PID: 4494 at fs/btrfs/scrub.c:622 scrub_setup_ctx.isra.19+0x220/0x230 [btrfs]
        ...
        [ 6852.428928] RIP: 0010:scrub_setup_ctx.isra.19+0x220/0x230 [btrfs]
        ...
        [ 6852.432970] Call Trace:
        [ 6852.433202]  btrfs_scrub_dev+0x19b/0x5c0 [btrfs]
        [ 6852.433471]  btrfs_dev_replace_start+0x48c/0x6a0 [btrfs]
        [ 6852.433800]  btrfs_dev_replace_by_ioctl+0x3a/0x60 [btrfs]
        [ 6852.434097]  btrfs_ioctl+0x2476/0x2d20 [btrfs]
        [ 6852.434365]  ? do_sigaction+0x7d/0x1e0
        [ 6852.434623]  do_vfs_ioctl+0xa9/0x6c0
        [ 6852.434865]  ? syscall_trace_enter+0x1c8/0x310
        [ 6852.435124]  ? syscall_trace_enter+0x1c8/0x310
        [ 6852.435387]  ksys_ioctl+0x60/0x90
        [ 6852.435663]  __x64_sys_ioctl+0x16/0x20
        [ 6852.435907]  do_syscall_64+0x50/0x180
        [ 6852.436150]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Further, as the replace thread enters scrub_write_page_to_dev_replace()
      without the target device it panics:
      
        static int scrub_add_page_to_wr_bio(struct scrub_ctx *sctx,
      				      struct scrub_page *spage)
        {
        ...
      	bio_set_dev(bio, sbio->dev->bdev); <======
      
        [ 6929.715145] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a0
        ..
        [ 6929.717106] Workqueue: btrfs-scrub btrfs_scrub_helper [btrfs]
        [ 6929.717420] RIP: 0010:scrub_write_page_to_dev_replace+0xb4/0x260 [btrfs]
        ..
        [ 6929.721430] Call Trace:
        [ 6929.721663]  scrub_write_block_to_dev_replace+0x3f/0x60 [btrfs]
        [ 6929.721975]  scrub_bio_end_io_worker+0x1af/0x490 [btrfs]
        [ 6929.722277]  normal_work_helper+0xf0/0x4c0 [btrfs]
        [ 6929.722552]  process_one_work+0x1f4/0x520
        [ 6929.722805]  ? process_one_work+0x16e/0x520
        [ 6929.723063]  worker_thread+0x46/0x3d0
        [ 6929.723313]  kthread+0xf8/0x130
        [ 6929.723544]  ? process_one_work+0x520/0x520
        [ 6929.723800]  ? kthread_delayed_work_timer_fn+0x80/0x80
        [ 6929.724081]  ret_from_fork+0x3a/0x50
      
      Fix this by letting btrfs_dev_replace_finishing() do the cleanup after
      the cancel, including freeing the target device.
      btrfs_dev_replace_finishing() is called when btrfs_scrub_dev() returns,
      along with the scrub return status.
      Signed-off-by: Anand Jain <anand.jain@oracle.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • H
      btrfs: alloc_chunk: fix more DUP stripe size handling · 720b86a5
      Authored by Hans van Kranenburg
      [ Upstream commit baf92114c7e6dd6124aa3d506e4bc4b694da3bc3 ]
      
      Commit 92e222df "btrfs: alloc_chunk: fix DUP stripe size handling"
      fixed calculating the stripe_size for a new DUP chunk.
      
      However, the same calculation reappears a bit later, and that one was
      not changed yet. The resulting bug that is exposed is that the newly
      allocated device extents ('stripes') can have a few MiB overlap with the
      next thing stored after them, which is another device extent or the end
      of the disk.
      
      The scenario in which this can happen is:
      * The block device for the filesystem is less than 10GiB in size.
      * The amount of contiguous free unallocated disk space chosen to use for
        chunk allocation is 20% of the total device size, or a few MiB more or
        less.
      
      An example:
      - The filesystem device is 7880MiB (max_chunk_size gets set to 788MiB)
      - There's 1578MiB unallocated raw disk space left in one contiguous
        piece.
      
      In this case stripe_size is first calculated as 789MiB, (half of
      1578MiB).
      
      Since 789MiB (stripe_size * data_stripes) > 788MiB (max_chunk_size), we
      enter the if block. Now stripe_size value is immediately overwritten
      while calculating an adjusted value based on max_chunk_size, which ends
      up as 788MiB.
      
      Next, the value is rounded up to a 16MiB boundary, 800MiB, which is
      actually more than the value we had before. However, the last comparison
      fails to detect this, because it's comparing the value with the total
      amount of free space, which is about twice the size of stripe_size.
      
      In the example above, this means that the resulting raw disk space being
      allocated is 1600MiB, while only a gap of 1578MiB has been found. The
      second device extent object for this DUP chunk will overlap for 22MiB
      with whatever comes next.
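
      The arithmetic of the example above can be replayed with a small
      model. This is only a sketch based on the description in this commit
      message, not the kernel code; the function name and the MIB constant
      are assumptions made for illustration.

```python
MIB = 1024 * 1024


def round_up(value, boundary):
    """Round value up to the next multiple of boundary."""
    return -(-value // boundary) * boundary


def buggy_dup_stripe_size(max_avail, max_chunk_size, dev_stripes=2, data_stripes=1):
    """Simplified model of the stripe_size steps described in the text."""
    # Half of the free gap: the length of one device extent for DUP.
    stripe_size = max_avail // dev_stripes
    if stripe_size * data_stripes > max_chunk_size:
        # The previous value is overwritten here.
        stripe_size = max_chunk_size // data_stripes
        # Bump the answer up to a 16MiB boundary.
        stripe_size = round_up(stripe_size, 16 * MIB)
        # Compared against the whole gap, not against half of it.
        stripe_size = min(max_avail, stripe_size)
    return stripe_size


max_avail = 1578 * MIB          # contiguous unallocated space
max_chunk_size = 788 * MIB      # 10% of a 7880MiB device
stripe_size = buggy_dup_stripe_size(max_avail, max_chunk_size)
overlap = stripe_size * 2 - max_avail
print(stripe_size // MIB, overlap // MIB)
```

      With the numbers from the example, the model ends up with an 800MiB
      stripe_size and a 22MiB overlap for the second device extent.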
      
      The underlying problem here is that the stripe_size is reused all the
      time for different things. So, when entering the code in the if block,
      stripe_size is immediately overwritten with something else. If later we
      decide we want to have the previous value back, then the logic to
      compute it was copy pasted in again.
      
      With this change, the value in stripe_size is not unnecessarily
      destroyed, so the duplicated calculation is not needed any more.
      Signed-off-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      720b86a5
    • Q
      btrfs: volumes: Make sure there is no overlap of dev extents at mount time · bb5717a4
      Authored by Qu Wenruo
      [ Upstream commit 5eb193812a42dc49331f25137a38dfef9612d3e4 ]
      
      Enhance btrfs_verify_dev_extents() to remember the previously checked
      dev extent, so that it can verify that no dev extents overlap.
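
      The idea of remembering the previous dev extent while walking them in
      order can be sketched as follows. This is a hypothetical model, not
      the kernel implementation: the tuple layout (devid, physical start,
      length) and the function name are assumptions for illustration.

```python
MIB = 1024 * 1024


def find_overlap(dev_extents):
    """Return the first pair of overlapping extents on one device, or None.

    dev_extents holds (devid, physical_start, length) tuples. They are
    visited in (devid, physical_start) order, so only the previously
    visited extent has to be remembered.
    """
    prev = None
    for devid, start, length in sorted(dev_extents):
        if prev is not None:
            prev_devid, prev_start, prev_length = prev
            # Same device and the previous extent ends past our start.
            if prev_devid == devid and prev_start + prev_length > start:
                return prev, (devid, start, length)
        prev = (devid, start, length)
    return None


# Two 800MiB DUP stripes placed in a gap that ends at 1578MiB.
extents = [
    (1, 0 * MIB, 800 * MIB),
    (1, 800 * MIB, 800 * MIB),
    (1, 1578 * MIB, 256 * MIB),
]
print(find_overlap(extents))
```

      In this sample layout the second stripe ends at 1600MiB, which is past
      the start of the next extent, so the pair is reported.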
      
      Analysis from Hans:
      
      "Imagine allocating a DATA|DUP chunk.
      
       In the chunk allocator, we first set...
         max_stripe_size = SZ_1G;
         max_chunk_size = BTRFS_MAX_DATA_CHUNK_SIZE
       ... which is 10GiB.
      
       Then...
         /* we don't want a chunk larger than 10% of writeable space */
         max_chunk_size = min(div_factor(fs_devices->total_rw_bytes, 1),
             		 max_chunk_size);
      
       Imagine we only have one 7880MiB block device in this filesystem. Now
       max_chunk_size is down to 788MiB.
      
       The next step in the code is to search for max_stripe_size * dev_stripes
       amount of free space on the device, which is in our example 1GiB * 2 =
       2GiB. Imagine the device has exactly 1578MiB free in one contiguous
       piece. This amount of bytes will be put in devices_info[ndevs - 1].max_avail
      
       Next we recalculate the stripe_size (which is actually the device extent
       length), based on the actual maximum amount of available raw disk space:
         stripe_size = div_u64(devices_info[ndevs - 1].max_avail, dev_stripes);
      
       stripe_size is now 789MiB
      
       Next we do...
         data_stripes = num_stripes / ncopies
       ...where data_stripes ends up as 1, because num_stripes is 2 (the amount
       of device extents we're going to have), and DUP has ncopies 2.
      
       Next there's a check...
         if (stripe_size * data_stripes > max_chunk_size)
       ...which matches because 789MiB * 1 > 788MiB.
      
       We go into the if code, and next is...
         stripe_size = div_u64(max_chunk_size, data_stripes);
       ...which resets stripe_size to max_chunk_size: 788MiB
      
       Next is a fun one...
         /* bump the answer up to a 16MB boundary */
         stripe_size = round_up(stripe_size, SZ_16M);
       ...which changes stripe_size from 788MiB to 800MiB.
      
       We're not done changing stripe_size yet...
         /* But don't go higher than the limits we found while searching
          * for free extents
          */
         stripe_size = min(devices_info[ndevs - 1].max_avail,
             	      stripe_size);
      
       This is bad. max_avail is twice the stripe_size (we need to fit 2 device
       extents on the same device for DUP).
      
       The result here is that 800MiB < 1578MiB, so it's unchanged. However,
       the resulting DUP chunk will need 1600MiB disk space, which isn't there,
       and the second dev_extent might extend into the next thing (next
       dev_extent? end of device?) for 22MiB.
      
       The last shown line of code relies on a situation where there's twice
       the value of stripe_size present as value for the variable stripe_size
       when it's DUP. This was actually the case before commit 92e222df
       "btrfs: alloc_chunk: fix DUP stripe size handling", from which I quote:
         "[...] in the meantime there's a check to see if the stripe_size does
       not exceed max_chunk_size. Since during this check stripe_size is twice
       the amount as intended, the check will reduce the stripe_size to
       max_chunk_size if the actual correct to be used stripe_size is more than
       half the amount of max_chunk_size."
      
       In the previous version of the code, the 16MiB alignment (why is this
       done, by the way?) would result in a 50% chance that it would actually
       do an 8MiB alignment for the individual dev_extents, since it was
       operating on double the size. Does this matter?
      
       Does it matter that stripe_size can be set to anything which is not
       16MiB aligned because of the amount of remaining available disk space
       which is just taken?
      
       What is the main purpose of this round_up?
      
       The most straightforward thing to do seems something like...
         stripe_size = min(
             div_u64(devices_info[ndevs - 1].max_avail, dev_stripes),
             stripe_size
         )
       ..just putting half of the max_avail into stripe_size."
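
      The clamp suggested at the end of the quoted analysis can be tried
      with the same numbers. This is a sketch of the suggestion only, under
      the assumption that sizes are plain integers; the function name is
      hypothetical and this is not necessarily the fix that was merged.

```python
MIB = 1024 * 1024


def clamp_stripe_size(stripe_size, max_avail, dev_stripes):
    """Limit stripe_size to the share of max_avail one device extent may use."""
    return min(max_avail // dev_stripes, stripe_size)


# 800MiB after the 16MiB round-up, 1578MiB free, 2 device extents for DUP.
clamped = clamp_stripe_size(800 * MIB, 1578 * MIB, 2)
print(clamped // MIB)
```

      With this clamp both device extents fit into the free gap, because
      twice the clamped value does not exceed max_avail.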
      
      Link: https://lore.kernel.org/linux-btrfs/b3461a38-e5f8-f41d-c67c-2efac8129054@mendix.com/
      Reported-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com>
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      [ add analysis from report ]
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      bb5717a4
    • J
      mmc: atmel-mci: do not assume idle after atmci_request_end · c21991ed
      Authored by Jonas Danielsson
      [ Upstream commit ae460c115b7aa50c9a36cf78fced07b27962c9d0 ]
      
      On our AT91SAM9260 board we use the same sdio bus for wifi and for the
      sd card slot. This caused the atmel-mci to give the following splat on
      the serial console:
      
        ------------[ cut here ]------------
        WARNING: CPU: 0 PID: 538 at drivers/mmc/host/atmel-mci.c:859 atmci_send_command+0x24/0x44
        Modules linked in:
        CPU: 0 PID: 538 Comm: mmcqd/0 Not tainted 4.14.76 #14
        Hardware name: Atmel AT91SAM9
        [<c000fccc>] (unwind_backtrace) from [<c000d3dc>] (show_stack+0x10/0x14)
        [<c000d3dc>] (show_stack) from [<c0017644>] (__warn+0xd8/0xf4)
        [<c0017644>] (__warn) from [<c0017704>] (warn_slowpath_null+0x1c/0x24)
        [<c0017704>] (warn_slowpath_null) from [<c033bb9c>] (atmci_send_command+0x24/0x44)
        [<c033bb9c>] (atmci_send_command) from [<c033e984>] (atmci_start_request+0x1f4/0x2dc)
        [<c033e984>] (atmci_start_request) from [<c033f3b4>] (atmci_request+0xf0/0x164)
        [<c033f3b4>] (atmci_request) from [<c0327108>] (mmc_start_request+0x280/0x2d0)
        [<c0327108>] (mmc_start_request) from [<c032800c>] (mmc_start_areq+0x230/0x330)
        [<c032800c>] (mmc_start_areq) from [<c03366f8>] (mmc_blk_issue_rw_rq+0xc4/0x310)
        [<c03366f8>] (mmc_blk_issue_rw_rq) from [<c03372c4>] (mmc_blk_issue_rq+0x118/0x5ac)
        [<c03372c4>] (mmc_blk_issue_rq) from [<c033781c>] (mmc_queue_thread+0xc4/0x118)
        [<c033781c>] (mmc_queue_thread) from [<c002daf8>] (kthread+0x100/0x118)
        [<c002daf8>] (kthread) from [<c000a580>] (ret_from_fork+0x14/0x34)
        ---[ end trace 594371ddfa284bd6 ]---
      
      This is:
        WARN_ON(host->cmd);
      
      This was fixed on our board by letting atmci_request_end determine what
      state we are in, instead of unconditionally setting it to STATE_IDLE on
      STATE_END_REQUEST.
      Signed-off-by: Jonas Danielsson <jonas@orbital-systems.com>
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      c21991ed
    • M
      kconfig: fix memory leak when EOF is encountered in quotation · 46199110
      Authored by Masahiro Yamada
      [ Upstream commit fbac5977d81cb2b2b7e37b11c459055d9585273c ]
      
      An unterminated string literal followed by a new line is passed to the
      parser (with the "multi-line strings not supported" warning shown) and
      is handled properly there.
      
      On the other hand, an unterminated string literal at the end of a file
      is never passed to the parser, which results in a memory leak.
      
      [Test Code]
      
        ----------(Kconfig begin)----------
        source "Kconfig.inc"
      
        config A
                bool "a"
        -----------(Kconfig end)-----------
      
        --------(Kconfig.inc begin)--------
        config B
                bool "b\No new line at end of file
        ---------(Kconfig.inc end)---------
      
      [Summary from Valgrind]
      
        Before the fix:
      
          LEAK SUMMARY:
             definitely lost: 16 bytes in 1 blocks
             ...
      
        After the fix:
      
          LEAK SUMMARY:
             definitely lost: 0 bytes in 0 blocks
             ...
      
      Eliminate the memory leak path by handling this case. Of course, such
      a Kconfig file is wrong already, so I will add an error message later.
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      46199110
    • M
      kconfig: fix file name and line number of warn_ignored_character() · ba8efcdc
      Authored by Masahiro Yamada
      [ Upstream commit 77c1c0fa8b1477c5799bdad65026ea5ff676da44 ]
      
      Currently, warn_ignored_character() displays an invalid file name and
      line number.
      
      The lexer should use current_file->name and yylineno, while the parser
      should use zconf_curname() and zconf_lineno().
      
      This difference arises because the lexer always runs ahead of the
      parser. The parser needs to look ahead one token to make a
      shift/reduce decision, so the lexer is requested to scan more text
      from the input file.
      
      This commit fixes the warning message from warn_ignored_character().
      
      [Test Code]
      
        ----(Kconfig begin)----
        /
        -----(Kconfig end)-----
      
      [Output]
      
        Before the fix:
      
        <none>:0:warning: ignoring unsupported character '/'
      
        After the fix:
      
        Kconfig:1:warning: ignoring unsupported character '/'
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      ba8efcdc
    • J
      bpf: relax verifier restriction on BPF_MOV | BPF_ALU · 344b51e7
      Authored by Jiong Wang
      [ Upstream commit e434b8cdf788568ba65a0a0fd9f3cb41f3ca1803 ]
      
      Currently, the destination register is marked as unknown for a 32-bit
      sub-register move (BPF_MOV | BPF_ALU) whenever the source register type is
      SCALAR_VALUE.
      
      This is so conservative that some valid programs are rejected. In
      particular, it may turn a constant scalar value into an unknown value,
      which can break some assumptions of the verifier.
      
      For example, test_l4lb_noinline.c has the following C code:
      
          struct real_definition *dst
      
      1:  if (!get_packet_dst(&dst, &pckt, vip_info, is_ipv6))
      2:    return TC_ACT_SHOT;
      3:
      4:  if (dst->flags & F_IPV6) {
      
      get_packet_dst is responsible for initializing "dst" to a valid pointer
      and returning true (1); otherwise it returns false (0). The compiled
      instruction sequence using alu32 will be:
      
        412: (54) (u32) r7 &= (u32) 1
        413: (bc) (u32) r0 = (u32) r7
        414: (95) exit
      
      insn 413, a BPF_MOV | BPF_ALU, will however turn r0 into an unknown value
      even though r7 contains SCALAR_VALUE 1.
      
      This causes trouble when the verifier walks the code path that hasn't
      initialized "dst" inside get_packet_dst. In that case 0 is returned, and
      we would expect the verifier to conclude that line 1 in the above C code
      passes the "if" check and therefore to skip the fall-through path
      starting at line 4. Now, because r0 returned from the callee has become
      an unknown value, the verifier won't skip analyzing the path starting at
      line 4, and "dst->flags" requires dereferencing the pointer "dst", which
      actually hasn't been initialized on this path.
      
      This patch relaxes the code marking the sub-register move destination.
      For a SCALAR_VALUE, it is safe to just copy the value from the source and
      then truncate it to 32 bits.
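
      The effect on branch exploration can be illustrated with a toy model.
      This is a loose sketch, not the verifier: a register is modelled as
      either a known integer constant or UNKNOWN, whereas the real verifier
      tracks ranges and known bits. All names below are hypothetical.

```python
U32_MASK = 0xFFFFFFFF
UNKNOWN = None  # stand-in for "unknown scalar" in this toy model


def mov32_old(src):
    """Before the patch: the destination of a 32-bit move is always unknown."""
    return UNKNOWN


def mov32_new(src):
    """After the patch: copy a known scalar and keep only its low 32 bits."""
    if src is UNKNOWN:
        return UNKNOWN
    return src & U32_MASK


def feasible_branches(r0):
    """Branches of `if (!r0)` that still have to be explored."""
    if r0 is UNKNOWN:
        return {"taken", "fallthrough"}
    return {"taken"} if r0 == 0 else {"fallthrough"}


# The callee returns 0 on the path that never initialized "dst".
print(feasible_branches(mov32_old(0)))
print(feasible_branches(mov32_new(0)))
```

      In the toy model, only the new behaviour lets the caller rule out the
      fall-through path that would dereference the uninitialized pointer.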
      
      A unit test is also included to demonstrate this issue. The test fails
      before this patch.
      
      This relaxation could let the verifier skip more paths for conditional
      comparisons against an immediate. It also lets the verifier record a more
      accurate/strict value for a register in a given state. If that state ends
      up going through exit without rejection and is used for state comparison
      later, then an inaccurate/permissive value might actually be better. So
      the real impact on the number of insns processed by the verifier is
      complex. But overall, without this fix, valid programs could be rejected.
      
      From real benchmarking on kernel selftests and Cilium bpf tests, there is
      no impact on the processed instruction number when tests are compiled with
      default compilation options. There are slight improvements when they are
      compiled with -mattr=+alu32 after this patch.
      
      Also, test_xdp_noinline/-mattr=+alu32 now passes verification. It was
      rejected before this fix.
      
      Insn processed before/after this patch:
      
                              default       -mattr=+alu32
      
      Kernel selftest
      ===
      test_xdp.o              371/371       369/369
      test_l4lb.o             6345/6345     5623/5623
      test_xdp_noinline.o     2971/2971     rejected/2727
      test_tcp_estates.o      429/429       430/430
      
      Cilium bpf
      ===
      bpf_lb-DLB_L3.o:        2085/2085     1685/1687
      bpf_lb-DLB_L4.o:        2287/2287     1986/1982
      bpf_lb-DUNKNOWN.o:      690/690       622/622
      bpf_lxc.o:              95033/95033   N/A
      bpf_netdev.o:           7245/7245     N/A
      bpf_overlay.o:          2898/2898     3085/2947
      
      NOTE:
        - bpf_lxc.o and bpf_netdev.o compiled with -mattr=+alu32 are rejected by
          the verifier due to another issue inside the verifier with supporting
          alu32 binaries.
        - Each Cilium bpf program can generate several processed insn numbers;
          the number above is their sum.
      
      v1->v2:
       - Restrict the change on SCALAR_VALUE.
       - Update benchmark numbers on Cilium bpf tests.
      Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      344b51e7
    • W
      arm64: Fix minor issues with the dcache_by_line_op macro · dfbf8c98
      Authored by Will Deacon
      [ Upstream commit 33309ecda0070506c49182530abe7728850ebe78 ]
      
      The dcache_by_line_op macro suffers from a couple of small problems:
      
      First, the GAS directives that are currently being used rely on
      assembler behavior that is not documented, and probably not guaranteed
      to produce the correct behavior going forward. As a result, we end up
      with some undefined symbols in cache.o:
      
      $ nm arch/arm64/mm/cache.o
               ...
               U civac
               ...
               U cvac
               U cvap
               U cvau
      
      This is due to the fact that the comparisons used to select the
      operation type in the dcache_by_line_op macro are comparing symbols
      not strings, and even though it seems that GAS is doing the right
      thing here (undefined symbols by the same name are equal to each
      other), it seems unwise to rely on this.
      
      Second, when patching in a DC CVAP instruction on CPUs that support it,
      the fallback path consists of a DC CVAU instruction which may be
      affected by CPU errata that require ARM64_WORKAROUND_CLEAN_CACHE.
      
      Solve these issues by unrolling the various maintenance routines and
      using the conditional directives that are documented as operating on
      strings. To avoid the complexity of nested alternatives, we move the
      DC CVAP patching to __clean_dcache_area_pop, falling back to a branch
      to __clean_dcache_area_poc if DCPOP is not supported by the CPU.
      Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Suggested-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      dfbf8c98
    • L
      clk: imx6q: reset exclusive gates on init · 73f0b2e3
      Authored by Lucas Stach
      [ Upstream commit f7542d817733f461258fd3a47d77da35b2d9fc81 ]
      
      The exclusive gates may be set up in the wrong way by software running
      before the clock driver comes up. In that case the exclusive setup is
      locked in its initial state, as the complementary function can't be
      activated without disabling the initial setup first.
      
      To avoid this lock situation, reset the exclusive gates to the off
      state and allow the kernel to provide the proper setup.
      Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
      Reviewed-by: Dong Aisheng <Aisheng.dong@nxp.com>
      Signed-off-by: Stephen Boyd <sboyd@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      73f0b2e3
    • Q
      arm64: kasan: Increase stack size for KASAN_EXTRA · 8f183b33
      Authored by Qian Cai
      [ Upstream commit 6e8830674ea77f57d57a33cca09083b117a71f41 ]
      
      If the kernel is configured with KASAN_EXTRA, the stack size is
      increased significantly due to setting the GCC -fstack-reuse option to
      "none" [1]. As a result, it can trigger a stack overrun quite often with
      32k stack size compiled using GCC 8. For example, this reproducer
      
        https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/madvise/madvise06.c
      
      can trigger a "corrupted stack end detected inside scheduler" very
      reliably with CONFIG_SCHED_STACK_END_CHECK enabled. There are other
      reports at:
      
        https://lore.kernel.org/lkml/1542144497.12945.29.camel@gmx.us/
        https://lore.kernel.org/lkml/721E7B42-2D55-4866-9C1A-3E8D64F33F9C@gmx.us/
      
      There are just too many functions that can have a large stack frame with
      KASAN_EXTRA, because large local variables are used over and over again
      without their stack slots being reused. Some noticeable ones are:
      
      size
      7536 shrink_inactive_list
      7440 shrink_page_list
      6560 fscache_stats_show
      3920 jbd2_journal_commit_transaction
      3216 try_to_unmap_one
      3072 migrate_page_move_mapping
      3584 migrate_misplaced_transhuge_page
      3920 ip_vs_lblcr_schedule
      4304 lpfc_nvme_info_show
      3888 lpfc_debugfs_nvmestat_data.constprop
      
      There are 49 other functions over 2k in size when compiling the kernel
      with "-Wframe-larger-than=" on this machine. Hence, it is too much work to
      change the Makefiles so that each object is compiled without
      -fsanitize-address-use-after-scope individually.
      
      [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715#c23
      
      Signed-off-by: Qian Cai <cai@lca.pw>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      8f183b33
    • D
      selftests: do not macro-expand failed assertion expressions · 656257cf
      Authored by Dmitry V. Levin
      [ Upstream commit b708a3cc9600390ccaa2b68a88087dd265154b2b ]
      
      I've stumbled over the current macro-expand behaviour of the test
      harness:
      
      $ gcc -Wall -xc - <<'__EOF__'
      TEST(macro) {
      	int status = 0;
      	ASSERT_TRUE(WIFSIGNALED(status));
      }
      TEST_HARNESS_MAIN
      __EOF__
      $ ./a.out
      [==========] Running 1 tests from 1 test cases.
      [ RUN      ] global.macro
      <stdin>:4:global.macro:Expected 0 (0) != (((signed char) (((status) & 0x7f) + 1) >> 1) > 0) (0)
      global.macro: Test terminated by assertion
      [     FAIL ] global.macro
      [==========] 0 / 1 tests passed.
      [  FAILED  ]
      
      With this change the output of the same test looks much more
      comprehensible:
      
      [==========] Running 1 tests from 1 test cases.
      [ RUN      ] global.macro
      <stdin>:4:global.macro:Expected 0 (0) != WIFSIGNALED(status) (0)
      global.macro: Test terminated by assertion
      [     FAIL ] global.macro
      [==========] 0 / 1 tests passed.
      [  FAILED  ]
      
      The issue is very similar to the bug fixed in glibc assert(3)
      three years ago:
      https://sourceware.org/bugzilla/show_bug.cgi?id=18604
      
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Will Drewry <wad@chromium.org>
      Cc: linux-kselftest@vger.kernel.org
      Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
      Acked-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Shuah Khan <shuah@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      656257cf
    • B
      scsi: target/core: Make sure that target_wait_for_sess_cmds() waits long enough · 3ad8148c
      Authored by Bart Van Assche
      [ Upstream commit ad669505c4e9db9af9faeb5c51aa399326a80d91 ]
      
      A session must only be released after all code that accesses the session
      structure has finished. Make sure that this is the case by introducing a
      new command counter per session that is only decremented after the
      .release_cmd() callback has finished. This patch fixes the following crash:
      
      BUG: KASAN: use-after-free in do_raw_spin_lock+0x1c/0x130
      Read of size 4 at addr ffff8801534b16e4 by task rmdir/14805
      CPU: 16 PID: 14805 Comm: rmdir Not tainted 4.18.0-rc2-dbg+ #5
      Call Trace:
      dump_stack+0xa4/0xf5
      print_address_description+0x6f/0x270
      kasan_report+0x241/0x360
      __asan_load4+0x78/0x80
      do_raw_spin_lock+0x1c/0x130
      _raw_spin_lock_irqsave+0x52/0x60
      srpt_set_ch_state+0x27/0x70 [ib_srpt]
      srpt_disconnect_ch+0x1b/0xc0 [ib_srpt]
      srpt_close_session+0xa8/0x260 [ib_srpt]
      target_shutdown_sessions+0x170/0x180 [target_core_mod]
      core_tpg_del_initiator_node_acl+0xf3/0x200 [target_core_mod]
      target_fabric_nacl_base_release+0x25/0x30 [target_core_mod]
      config_item_release+0x9c/0x110 [configfs]
      config_item_put+0x26/0x30 [configfs]
      configfs_rmdir+0x3b8/0x510 [configfs]
      vfs_rmdir+0xb3/0x1e0
      do_rmdir+0x262/0x2c0
      do_syscall_64+0x77/0x230
      entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Cc: Nicholas Bellinger <nab@linux-iscsi.org>
      Cc: Mike Christie <mchristi@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: David Disseldorp <ddiss@suse.de>
      Cc: Hannes Reinecke <hare@suse.de>
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      3ad8148c
    • D
      scsi: target: use consistent left-aligned ASCII INQUIRY data · 25d3546a
      Authored by David Disseldorp
      [ Upstream commit 0de263577de5d5e052be5f4f93334e63cc8a7f0b ]
      
      spc5r17.pdf specifies:
      
        4.3.1 ASCII data field requirements
        ASCII data fields shall contain only ASCII printable characters (i.e.,
        code values 20h to 7Eh) and may be terminated with one or more ASCII null
        (00h) characters.  ASCII data fields described as being left-aligned
        shall have any unused bytes at the end of the field (i.e., highest
        offset) and the unused bytes shall be filled with ASCII space characters
        (20h).
      
      LIO currently space-pads the T10 VENDOR IDENTIFICATION and PRODUCT
      IDENTIFICATION fields in the standard INQUIRY data. However, the PRODUCT
      REVISION LEVEL field in the standard INQUIRY data as well as the T10 VENDOR
      IDENTIFICATION field in the INQUIRY Device Identification VPD Page are
      zero-terminated/zero-padded.
      
      Fix this inconsistency by using space-padding for all of the above fields.
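
      The padding rule quoted from the specification can be sketched as
      below. This is an illustration only, not the LIO code: the helper name
      is hypothetical, and the sample strings and the field widths used in
      the example (8 bytes for the vendor identification, 4 bytes for the
      product revision level) are assumptions of the example.

```python
def pad_inquiry_field(value, width):
    """Return value as a left-aligned, space-padded ASCII field of width bytes."""
    data = value.encode("ascii")
    # Only printable ASCII (20h to 7Eh) is allowed in these fields.
    if any(byte < 0x20 or byte > 0x7E for byte in data):
        raise ValueError("non-printable character in ASCII data field")
    if len(data) > width:
        raise ValueError("value does not fit into the field")
    # Unused bytes at the end of the field are filled with spaces (20h).
    return data.ljust(width, b" ")


vendor = pad_inquiry_field("LIO-ORG", 8)
revision = pad_inquiry_field("4.0", 4)
print(vendor, revision)
```

      A zero-padded field would instead end in 00h bytes, which is the
      inconsistency between fields that this commit removes.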
      Signed-off-by: David Disseldorp <ddiss@suse.de>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Bryant G. Ly <bly@catalogicsoftware.com>
      Reviewed-by: Lee Duncan <lduncan@suse.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      25d3546a
    • Y
      net: call sk_dst_reset when set SO_DONTROUTE · 50deccdc
      Authored by yupeng
      [ Upstream commit 0fbe82e628c817e292ff588cd5847fc935e025f2 ]
      
      After SO_DONTROUTE is set to 1, the IP layer should not route packets if
      the dest IP address is not in link scope. But if the socket has cached
      the dst_entry, such packets would be routed until the sk_dst_cache
      expires. So we should clean the sk_dst_cache when a user sets the
      SO_DONTROUTE option. Below are server/client Python scripts which
      can reproduce this issue:
      
      server side code:
      
      ==========================================================================
      import socket
      import struct
      import time
      
      s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
      s.bind(('0.0.0.0', 9000))
      s.listen(1)
      sock, addr = s.accept()
      sock.setsockopt(socket.SOL_SOCKET, socket.SO_DONTROUTE, struct.pack('i', 1))
      while True:
          sock.send(b'foo')
          time.sleep(1)
      ==========================================================================
      
      client side code:
      ==========================================================================
      import socket
      import time
      
      s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
      s.connect(('server_address', 9000))
      while True:
          data = s.recv(1024)
          print(data)
      ==========================================================================
      Signed-off-by: yupeng <yupeng0921@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      50deccdc
    • G
      staging: erofs: fix use-after-free of on-stack `z_erofs_vle_unzip_io' · fd4c7fe1
      Authored by Gao Xiang
      [ Upstream commit 848bd9acdcd00c164b42b14aacec242949ecd471 ]
      
      The root cause is the race as follows:
       Thread #0                         Thread #1
      
       z_erofs_vle_unzip_kickoff         z_erofs_submit_and_unzip
      
                                          struct z_erofs_vle_unzip_io io[]
         atomic_add_return()
                                          wait_event()
                                          [end of function]
         wake_up()
      
      Fix it by taking the waitqueue lock between atomic_add_return and
      wake_up to close the race.
      
      kernel message:
      
      Unable to handle kernel paging request at virtual address 97f7052caa1303dc
      ...
      Workqueue: kverityd verity_work
      task: ffffffe32bcb8000 task.stack: ffffffe3298a0000
      PC is at __wake_up_common+0x48/0xa8
      LR is at __wake_up+0x3c/0x58
      ...
      Call trace:
      ...
      [<ffffff94a08ff648>] __wake_up_common+0x48/0xa8
      [<ffffff94a08ff8b8>] __wake_up+0x3c/0x58
      [<ffffff94a0c11b60>] z_erofs_vle_unzip_kickoff+0x40/0x64
      [<ffffff94a0c118e4>] z_erofs_vle_read_endio+0x94/0x134
      [<ffffff94a0c83c9c>] bio_endio+0xe4/0xf8
      [<ffffff94a1076540>] dec_pending+0x134/0x32c
      [<ffffff94a1076f28>] clone_endio+0x90/0xf4
      [<ffffff94a0c83c9c>] bio_endio+0xe4/0xf8
      [<ffffff94a1095024>] verity_work+0x210/0x368
      [<ffffff94a08c4150>] process_one_work+0x188/0x4b4
      [<ffffff94a08c45bc>] worker_thread+0x140/0x458
      [<ffffff94a08cad48>] kthread+0xec/0x108
      [<ffffff94a0883ab4>] ret_from_fork+0x10/0x1c
      Code: d1006273 54000260 f9400804 b9400019 (b85fc081)
      ---[ end trace be9dde154f677cd1 ]---
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      fd4c7fe1
    • V
      media: venus: core: Set dma maximum segment size · 38be2cba
      Authored by Vivek Gautam
      [ Upstream commit de2563bce7a157f5296bab94f3843d7d64fb14b4 ]
      
      Turning on CONFIG_DMA_API_DEBUG_SG results in the following error:
      
      [  460.308650] ------------[ cut here ]------------
      [  460.313490] qcom-venus aa00000.video-codec: DMA-API: mapping sg segment longer than device claims to support [len=4194304] [max=65536]
      [  460.326017] WARNING: CPU: 3 PID: 3555 at src/kernel/dma/debug.c:1301 debug_dma_map_sg+0x174/0x254
      [  460.338888] Modules linked in: venus_dec venus_enc videobuf2_dma_sg videobuf2_memops hci_uart btqca bluetooth venus_core v4l2_mem2mem videobuf2_v4l2 videobuf2_common ath10k_snoc ath10k_core ath lzo lzo_compress zramjoydev
      [  460.375811] CPU: 3 PID: 3555 Comm: V4L2DecoderThre Tainted: G        W         4.19.1 #82
      [  460.384223] Hardware name: Google Cheza (rev1) (DT)
      [  460.389251] pstate: 60400009 (nZCv daif +PAN -UAO)
      [  460.394191] pc : debug_dma_map_sg+0x174/0x254
      [  460.398680] lr : debug_dma_map_sg+0x174/0x254
      [  460.403162] sp : ffffff80200c37d0
      [  460.406583] x29: ffffff80200c3830 x28: 0000000000010000
      [  460.412056] x27: 00000000ffffffff x26: ffffffc0f785ea80
      [  460.417532] x25: 0000000000000000 x24: ffffffc0f4ea1290
      [  460.423001] x23: ffffffc09e700300 x22: ffffffc0f4ea1290
      [  460.428470] x21: ffffff8009037000 x20: 0000000000000001
      [  460.433936] x19: ffffff80091b0000 x18: 0000000000000000
      [  460.439411] x17: 0000000000000000 x16: 000000000000f251
      [  460.444885] x15: 0000000000000006 x14: 0720072007200720
      [  460.450354] x13: ffffff800af536e0 x12: 0000000000000000
      [  460.455822] x11: 0000000000000000 x10: 0000000000000000
      [  460.461288] x9 : 537944d9c6c48d00 x8 : 537944d9c6c48d00
      [  460.466758] x7 : 0000000000000000 x6 : ffffffc0f8d98f80
      [  460.472230] x5 : 0000000000000000 x4 : 0000000000000000
      [  460.477703] x3 : 000000000000008a x2 : ffffffc0fdb13948
      [  460.483170] x1 : ffffffc0fdb0b0b0 x0 : 000000000000007a
      [  460.488640] Call trace:
      [  460.491165]  debug_dma_map_sg+0x174/0x254
      [  460.495307]  vb2_dma_sg_alloc+0x260/0x2dc [videobuf2_dma_sg]
      [  460.501150]  __vb2_queue_alloc+0x164/0x374 [videobuf2_common]
      [  460.507076]  vb2_core_reqbufs+0xfc/0x23c [videobuf2_common]
      [  460.512815]  vb2_reqbufs+0x44/0x5c [videobuf2_v4l2]
      [  460.517853]  v4l2_m2m_reqbufs+0x44/0x78 [v4l2_mem2mem]
      [  460.523144]  v4l2_m2m_ioctl_reqbufs+0x1c/0x28 [v4l2_mem2mem]
      [  460.528976]  v4l_reqbufs+0x30/0x40
      [  460.532480]  __video_do_ioctl+0x36c/0x454
      [  460.536610]  video_usercopy+0x25c/0x51c
      [  460.540572]  video_ioctl2+0x38/0x48
      [  460.544176]  v4l2_ioctl+0x60/0x74
      [  460.547602]  do_video_ioctl+0x948/0x3520
      [  460.551648]  v4l2_compat_ioctl32+0x60/0x98
      [  460.555872]  __arm64_compat_sys_ioctl+0x134/0x20c
      [  460.560718]  el0_svc_common+0x9c/0xe4
      [  460.564498]  el0_svc_compat_handler+0x2c/0x38
      [  460.568982]  el0_svc_compat+0x8/0x18
      [  460.572672] ---[ end trace ce209b87b2f3af88 ]---
      
      From the above warning one would deduce that the sg segment will overflow
      the device's capacity. In reality, the hardware can accommodate larger
      sg segments.
      So, initialize the max segment size properly to weed out this warning.
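
The 64 KiB limit in the warning matches the fallback the DMA API reports when a driver has not set a maximum segment size. The following is a minimal userspace sketch of that check, not kernel code: every `sketch_` type and function is a hypothetical stand-in, and using UINT_MAX as the raised limit is an assumption made only for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's per-device DMA parameters. */
struct sketch_dma_parms {
	unsigned int max_segment_size;
};

struct sketch_device {
	struct sketch_dma_parms *dma_parms;
};

/* Fall back to 64 KiB when the driver never set a limit. */
static unsigned int sketch_get_max_seg_size(const struct sketch_device *dev)
{
	assert(dev);
	if (dev->dma_parms && dev->dma_parms->max_segment_size)
		return dev->dma_parms->max_segment_size;
	return 65536;
}

/* Returns 1 when a segment of this length would trigger the debug warning. */
static int sketch_seg_too_long(const struct sketch_device *dev, unsigned int len)
{
	return len > sketch_get_max_seg_size(dev);
}

/* What the fix amounts to in this model: attach parameters with a larger limit. */
static void sketch_set_max_seg_size(struct sketch_device *dev,
				    struct sketch_dma_parms *parms,
				    unsigned int size)
{
	assert(dev && parms);
	parms->max_segment_size = size;
	dev->dma_parms = parms;
}
```

In this model, a 4194304-byte segment is flagged while `dma_parms` is unset and passes once `sketch_set_max_seg_size()` has raised the limit.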
      
      Based on a similar patch sent by Sean Paul for mdss:
      https://patchwork.kernel.org/patch/10671457/
      Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org>
      Acked-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
      Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
      Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      38be2cba
    • Y
      ASoC: use dma_ops of parent device for acp_audio_dma · 9df6861a
      Yu Zhao committed
      [ Upstream commit 23aa128bb28d9da69bb1bdb2b70e50128857884a ]
      
      AMD platform device acp_audio_dma can only be created by parent PCI
      device driver (drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c). Pass struct
      device of the parent to snd_pcm_lib_preallocate_pages() so
      dma_alloc_coherent() can use correct dma_ops. Otherwise, it will
      use default dma_ops which is nommu_dma_ops on x86_64 even when
      IOMMU is enabled and set to non passthrough mode.
      
      Though platform device inherits some dma related fields during its
      creation in mfd_add_device(), we can't simply pass its struct device
      to snd_pcm_lib_preallocate_pages() because dma_ops is not among the
      inherited fields. Even if it were, drivers/iommu/amd_iommu.c would
      ignore it because get_device_id() doesn't handle platform devices.
      
      This change shouldn't give us any trouble even if the struct device of the
      parent becomes null or represents some non-PCI device in the future,
      because get_dma_ops() correctly handles null struct device or uses
      the default dma_ops if struct device doesn't have it set.
      Signed-off-by: Yu Zhao <yuzhao@google.com>
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      9df6861a
    • N
      media: firewire: Fix app_info parameter type in avc_ca{,_app}_info · 597a09e0
      Nathan Chancellor committed
      [ Upstream commit b2e9a4eda11fd2cb1e6714e9ad3f455c402568ff ]
      
      Clang warns:
      
      drivers/media/firewire/firedtv-avc.c:999:45: warning: implicit
      conversion from 'int' to 'char' changes value from 159 to -97
      [-Wconstant-conversion]
              app_info[0] = (EN50221_TAG_APP_INFO >> 16) & 0xff;
                          ~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
      drivers/media/firewire/firedtv-avc.c:1000:45: warning: implicit
      conversion from 'int' to 'char' changes value from 128 to -128
      [-Wconstant-conversion]
              app_info[1] = (EN50221_TAG_APP_INFO >>  8) & 0xff;
                          ~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
      drivers/media/firewire/firedtv-avc.c:1040:44: warning: implicit
      conversion from 'int' to 'char' changes value from 159 to -97
      [-Wconstant-conversion]
              app_info[0] = (EN50221_TAG_CA_INFO >> 16) & 0xff;
                          ~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
      drivers/media/firewire/firedtv-avc.c:1041:44: warning: implicit
      conversion from 'int' to 'char' changes value from 128 to -128
      [-Wconstant-conversion]
              app_info[1] = (EN50221_TAG_CA_INFO >>  8) & 0xff;
                          ~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
      4 warnings generated.
      
      Change app_info's type to unsigned char to match the type of the
      member msg in struct ca_msg, which is the only thing passed into the
      app_info parameter in this function.
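
The effect Clang warns about can be reproduced in a few lines of userspace C. This is only a sketch: the `sketch_` names are hypothetical, and the tag value 0x9f8021 is an assumption (the warnings above confirm only the two upper bytes, 159 and 128). The signed variant relies on the usual two's-complement narrowing, which is implementation-defined in C.

```c
#include <assert.h>

/* Assumed tag value; only the bytes 0x9f (159) and 0x80 (128) appear in the warnings. */
#define SKETCH_TAG_APP_INFO 0x9f8021u

/* Unsigned buffer: each stored byte keeps its 0..255 value. */
static void sketch_put_tag_unsigned(unsigned char *buf, unsigned int tag)
{
	assert(buf);
	buf[0] = (tag >> 16) & 0xff;
	buf[1] = (tag >> 8) & 0xff;
	buf[2] = tag & 0xff;
}

/*
 * Signed buffer: bytes above 127 do not fit, so they are narrowed to
 * negative values. The explicit casts make that narrowing visible.
 */
static void sketch_put_tag_signed(signed char *buf, unsigned int tag)
{
	assert(buf);
	buf[0] = (signed char)((tag >> 16) & 0xff);
	buf[1] = (signed char)((tag >> 8) & 0xff);
	buf[2] = (signed char)(tag & 0xff);
}
```

With an unsigned buffer the first two bytes read back as 159 and 128; with a signed one, common compilers yield -97 and -128, the values quoted in the warnings.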
      
      Link: https://github.com/ClangBuiltLinux/linux/issues/105
      Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      597a09e0
    • B
      powerpc/pseries/cpuidle: Fix preempt warning · 3049cdc2
      Breno Leitao committed
      [ Upstream commit 2b038cbc5fcf12a7ee1cc9bfd5da1e46dacdee87 ]
      
      When booting a pseries kernel with PREEMPT enabled, it dumps the
      following warning:
      
         BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
         caller is pseries_processor_idle_init+0x5c/0x22c
         CPU: 13 PID: 1 Comm: swapper/0 Not tainted 4.20.0-rc3-00090-g12201a0128bc-dirty #828
         Call Trace:
         [c000000429437ab0] [c0000000009c8878] dump_stack+0xec/0x164 (unreliable)
         [c000000429437b00] [c0000000005f2f24] check_preemption_disabled+0x154/0x160
         [c000000429437b90] [c000000000cab8e8] pseries_processor_idle_init+0x5c/0x22c
         [c000000429437c10] [c000000000010ed4] do_one_initcall+0x64/0x300
         [c000000429437ce0] [c000000000c54500] kernel_init_freeable+0x3f0/0x500
         [c000000429437db0] [c0000000000112dc] kernel_init+0x2c/0x160
         [c000000429437e20] [c00000000000c1d0] ret_from_kernel_thread+0x5c/0x6c
      
      This happens because the code calls get_lppaca() which calls
      get_paca() and it checks if preemption is disabled through
      check_preemption_disabled().
      
      Preemption should be disabled because the per CPU variable may make no
      sense if there is a preemption (and a CPU switch) after it reads the
      per CPU data and when it is used.
      
      In this device driver specifically, it is not a problem, because this
      code just needs to have access to one lppaca struct, and it does not
      matter if it is the current per CPU lppaca struct or not (i.e. when
      there is a preemption and a CPU migration).
      
      That said, the most appropriate fix seems to be related to avoiding
      the debug_smp_processor_id() call at get_paca(), instead of calling
      preempt_disable() before get_paca().
      Signed-off-by: Breno Leitao <leitao@debian.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      3049cdc2
    • B
      powerpc/xmon: Fix invocation inside lock region · 115a0d66
      Breno Leitao committed
      [ Upstream commit 8d4a862276a9c30a269d368d324fb56529e6d5fd ]
      
      Currently xmon needs to get devtree_lock (through rtas_token()) during its
      invocation (at crash time). If there is a crash while devtree_lock is being
      held, then xmon tries to get the lock but spins forever and never gets into
      the interactive debugger, as in the following case:
      
      	int *ptr = NULL;
      	raw_spin_lock_irqsave(&devtree_lock, flags);
      	*ptr = 0xdeadbeef;
      
      This patch avoids calling rtas_token(), and thus trying to take the same
      lock, at crash time. Instead, the token is obtained at initialization time
      (xmon_init()) and merely consumed at crash time.
      
      This allows xmon to be invoked regardless of whether devtree_lock is
      held.
      Signed-off-by: Breno Leitao <leitao@debian.org>
      Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      115a0d66
    • D
      media: uvcvideo: Refactor teardown of uvc on USB disconnect · 819e2e07
      Daniel Axtens committed
      [ Upstream commit 10e1fdb95809ed21406f53b5b4f064673a1b9ceb ]
      
      Currently, disconnecting a USB webcam while it is in use prints out a
      number of warnings, such as:
      
      WARNING: CPU: 2 PID: 3118 at /build/linux-ezBi1T/linux-4.8.0/fs/sysfs/group.c:237 sysfs_remove_group+0x8b/0x90
      sysfs group ffffffffa7cd0780 not found for kobject 'event13'
      
      This has been noticed before. [0]
      
      This is because of the order in which things are torn down.
      
      If there are no streams active during a USB disconnect:
      
       - uvc_disconnect() is invoked via device_del() through the bus
         notifier mechanism.
      
       - this calls uvc_unregister_video().
      
       - uvc_unregister_video() unregisters the video device for each
         stream,
      
       - because there are no streams open, it calls uvc_delete()
      
       - uvc_delete() calls uvc_status_cleanup(), which cleans up the status
         input device.
      
       - uvc_delete() calls media_device_unregister(), which cleans up the
         media device
      
       - uvc_delete(), uvc_unregister_video() and uvc_disconnect() all
         return, and we end up back in device_del().
      
       - device_del() then cleans up the sysfs folder for the camera with
         dpm_sysfs_remove(). Because uvc_status_cleanup() and
         media_device_unregister() have already been called, this all works
         nicely.
      
      If, on the other hand, there *are* streams active during a USB disconnect:
      
       - uvc_disconnect() is invoked
      
       - this calls uvc_unregister_video()
      
       - uvc_unregister_video() unregisters the video device for each
         stream,
      
       - uvc_unregister_video() and uvc_disconnect() return, and we end up
         back in device_del().
      
       - device_del() then cleans up the sysfs folder for the camera with
         dpm_sysfs_remove(). Because the status input device and the media
         device are children of the USB device, this also deletes their
         sysfs folders.
      
       - Sometime later, the final stream is closed, invoking uvc_release().
      
       - uvc_release() calls uvc_delete()
      
       - uvc_delete() calls uvc_status_cleanup(), which cleans up the status
         input device. Because the sysfs directory has already been removed,
         this causes a WARNing.
      
       - uvc_delete() calls media_device_unregister(), which cleans up the
         media device. Because the sysfs directory has already been removed,
         this causes another WARNing.
      
      To fix this, we need to make sure the devices are always unregistered
      before the end of uvc_disconnect(). To do this, move the unregistration
      into the disconnect path:
      
       - split uvc_status_cleanup() into two parts, one on disconnect that
         unregisters and one on delete that frees.
      
       - move v4l2_device_unregister() and media_device_unregister() into
         the disconnect path.
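
The two teardown orders described above can be compared with a small userspace model. It is a sketch under stated assumptions, not the driver's code: the `sketch_` struct and functions are hypothetical, and a "warning" is simply counted whenever a child device is unregistered after the parent's sysfs directory has been removed.

```c
#include <assert.h>

/* Hypothetical model of the camera state relevant to teardown. */
struct sketch_cam {
	int input_registered;	/* status input device still registered */
	int media_registered;	/* media device still registered */
	int sysfs_present;	/* parent sysfs directory still exists */
	int open_streams;	/* number of streams held open by users */
	int warnings;		/* unregistrations attempted after sysfs removal */
	int freed;		/* memory released */
};

/* Unregister both child devices; doing so without sysfs counts a warning each. */
static void sketch_unregister(struct sketch_cam *c)
{
	assert(c);
	if (c->input_registered) {
		if (!c->sysfs_present)
			c->warnings++;
		c->input_registered = 0;
	}
	if (c->media_registered) {
		if (!c->sysfs_present)
			c->warnings++;
		c->media_registered = 0;
	}
}

/* Old order: unregister only if no stream is open, then sysfs goes away. */
static void sketch_disconnect_old(struct sketch_cam *c)
{
	if (c->open_streams == 0) {
		sketch_unregister(c);
		c->freed = 1;
	}
	c->sysfs_present = 0;	/* device_del() removes the sysfs directory */
}

/* Fixed order: always unregister in the disconnect path, free later if needed. */
static void sketch_disconnect_fixed(struct sketch_cam *c)
{
	sketch_unregister(c);
	if (c->open_streams == 0)
		c->freed = 1;
	c->sysfs_present = 0;
}

/* Closing the last stream runs whatever cleanup is still outstanding. */
static void sketch_release_last_stream(struct sketch_cam *c)
{
	c->open_streams = 0;
	sketch_unregister(c);	/* no-op when already unregistered */
	c->freed = 1;
}
```

In this model, disconnecting with one stream open and then closing it produces two warnings under the old order and none under the fixed order.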
      
      [0]: https://lkml.org/lkml/2016/12/8/657
      
      [Renamed uvc_input_cleanup() to uvc_input_unregister()]
      Signed-off-by: Daniel Axtens <dja@axtens.net>
      Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      819e2e07
    • J
      pstore/ram: Do not treat empty buffers as valid · 265242d8
      Joel Fernandes (Google) committed
      [ Upstream commit 30696378f68a9e3dad6bfe55938b112e72af00c2 ]
      
      The ramoops backend currently calls persistent_ram_save_old() even
      if a buffer is empty. While this appears to work, it does not seem
      like the right thing to do and could lead to future bugs, so let's avoid
      that. It also prevents misleading prints in the logs which claim the
      buffer is valid.
      
      I got something like:
      
      	found existing buffer, size 0, start 0
      
      When I was expecting:
      
      	no valid data in buffer (sig = ...)
      
      This bails out early (and reports with pr_debug()), since it's an
      acceptable state.
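
The early bail-out can be pictured with a tiny userspace model. This is a sketch only: the `sketch_` struct and function are hypothetical, and the real driver's signature check and logging are reduced to a flag and a return code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of a persistent RAM zone header. */
struct sketch_prz {
	int sig_ok;	/* signature matched */
	size_t size;	/* bytes of old data recorded in the header */
	size_t start;	/* write position recorded in the header */
};

enum sketch_verdict {
	SKETCH_INVALID,		/* "no valid data in buffer" */
	SKETCH_EMPTY,		/* valid header but nothing stored: bail out early */
	SKETCH_HAS_DATA		/* "found existing buffer": worth saving */
};

/* Decide whether the old contents should be saved at all. */
static enum sketch_verdict sketch_classify(const struct sketch_prz *p)
{
	assert(p);
	if (!p->sig_ok)
		return SKETCH_INVALID;
	if (p->size == 0)
		return SKETCH_EMPTY;	/* acceptable state, nothing to save */
	return SKETCH_HAS_DATA;
}
```

Only the `SKETCH_HAS_DATA` case would go on to save the old buffer; the empty case returns before any "found existing buffer" message could be printed.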
      Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
      Co-developed-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      265242d8
    • A
      clk: imx: make mux parent strings const · ed99d79a
      A.s. Dong committed
      [ Upstream commit 9e5ef7a57ca75a1b9411c46caeeb6881124284a3 ]
      
      As in commit 2893c379 ("clk: make strings in parent name arrays
      const"), let's make the parent strings const; otherwise we may meet
      the following warning when compiling:
      
      drivers/clk/imx/clk-imx7ulp.c: In function 'imx7ulp_clocks_init':
      drivers/clk/imx/clk-imx7ulp.c:73:35: warning: passing argument 5 of
      	'imx_clk_mux_flags' discards 'const' qualifier from pointer target type
      
        clks[IMX7ULP_CLK_APLL_PRE_SEL] = imx_clk_mux_flags("apll_pre_sel", base + 0x508, 0,
      	1, pll_pre_sels, ARRAY_SIZE(pll_pre_sels), CLK_SET_PARENT_GATE);
                                         ^
      In file included from drivers/clk/imx/clk-imx7ulp.c:23:0:
      drivers/clk/imx/clk.h:200:27: note: expected 'const char **' but argument is
       of type 'const char * const*'
      ...
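
The qualifier mismatch in the warning can be shown outside the kernel. In this sketch the function and array names are hypothetical and the parent names are placeholders; the point is only that a parameter declared `const char * const *` accepts an array whose element pointers are themselves const, whereas `const char **` would discard that qualifier.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Placeholder parent names; both the strings and the pointers to them are const. */
static const char * const sketch_pll_pre_sels[] = { "sosc", "firc" };

/*
 * Taking `const char * const *` promises not to modify the strings or the
 * array slots, so fully const tables can be passed without a warning.
 */
static int sketch_find_parent(const char * const *parents, size_t n,
			      const char *name)
{
	size_t i;

	assert(parents && name);
	for (i = 0; i < n; i++)
		if (strcmp(parents[i], name) == 0)
			return (int)i;
	return -1;	/* not found */
}
```

Declaring the parameter as `const char **` instead would make the call with `sketch_pll_pre_sels` produce the same kind of diagnostic as the one quoted above.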
      
      Cc: Stephen Boyd <sboyd@codeaurora.org>
      Cc: Michael Turquette <mturquette@baylibre.com>
      Cc: Shawn Guo <shawnguo@kernel.org>
      Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
      Signed-off-by: Stephen Boyd <sboyd@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      ed99d79a
    • D
      jffs2: Fix use of uninitialized delayed_work, lockdep breakage · c356972f
      Daniel Santos committed
      [ Upstream commit a788c5272769ddbcdbab297cf386413eeac04463 ]
      
      jffs2_sync_fs makes the assumption that if CONFIG_JFFS2_FS_WRITEBUFFER
      is defined then a write buffer is available and has been initialized.
      However, this is not the case when the mtd device has no
      out-of-band buffer:
      
      int jffs2_nand_flash_setup(struct jffs2_sb_info *c)
      {
              if (!c->mtd->oobsize)
                      return 0;
      ...
      
      The resulting call to cancel_delayed_work_sync passing an uninitialized
      (but zeroed) delayed_work struct forces lockdep to become disabled.
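
The failure mode, and the kind of guard that avoids it, can be modelled in userspace. This is a hedged sketch: the `sketch_` types and functions are hypothetical, the work item is reduced to an "initialized" flag, and the exact condition the upstream patch tests is not shown in this message, so the guard below is an assumption.

```c
#include <assert.h>

/* Hypothetical stand-in for a delayed work item. */
struct sketch_dwork {
	int initialized;
};

/* Hypothetical stand-in for the per-mount state. */
struct sketch_sb {
	unsigned int oobsize;		/* out-of-band size of the mtd device */
	struct sketch_dwork wbuf_dwork;	/* zeroed until setup runs */
};

/* Mirrors the early return quoted above: no OOB area, no write-buffer setup. */
static void sketch_nand_setup(struct sketch_sb *sb)
{
	assert(sb);
	if (!sb->oobsize)
		return;
	sb->wbuf_dwork.initialized = 1;
}

/*
 * Returns 1 if sync would cancel an uninitialized work item (the bug),
 * 0 otherwise. With `guarded` set, sync skips the cancel when the write
 * buffer was never set up.
 */
static int sketch_sync_hits_uninit(const struct sketch_sb *sb, int guarded)
{
	assert(sb);
	if (guarded && !sb->wbuf_dwork.initialized)
		return 0;
	return !sb->wbuf_dwork.initialized;
}
```

For a device with `oobsize == 0`, the unguarded path touches the uninitialized work item and the guarded path does not; for a device with an OOB area both paths are safe.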
      
      [   90.050639] overlayfs: upper fs does not support tmpfile.
      [   90.652264] INFO: trying to register non-static key.
      [   90.662171] the code is fine but needs lockdep annotation.
      [   90.673090] turning off the locking correctness validator.
      [   90.684021] CPU: 0 PID: 1762 Comm: mount_root Not tainted 4.14.63 #0
      [   90.696672] Stack : 00000000 00000000 80d8f6a2 00000038 805f0000 80444600 8fe364f4 805dfbe7
      [   90.713349]         80563a30 000006e2 8068370c 00000001 00000000 00000001 8e2fdc48 ffffffff
      [   90.730020]         00000000 00000000 80d90000 00000000 00000106 00000000 6465746e 312e3420
      [   90.746690]         6b636f6c 03bf0000 f8000000 20676e69 00000000 80000000 00000000 8e2c2a90
      [   90.763362]         80d90000 00000001 00000000 8e2c2a90 00000003 80260dc0 08052098 80680000
      [   90.780033]         ...
      [   90.784902] Call Trace:
      [   90.789793] [<8000f0d8>] show_stack+0xb8/0x148
      [   90.798659] [<8005a000>] register_lock_class+0x270/0x55c
      [   90.809247] [<8005cb64>] __lock_acquire+0x13c/0xf7c
      [   90.818964] [<8005e314>] lock_acquire+0x194/0x1dc
      [   90.828345] [<8003f27c>] flush_work+0x200/0x24c
      [   90.837374] [<80041dfc>] __cancel_work_timer+0x158/0x210
      [   90.847958] [<801a8770>] jffs2_sync_fs+0x20/0x54
      [   90.857173] [<80125cf4>] iterate_supers+0xf4/0x120
      [   90.866729] [<80158fc4>] sys_sync+0x44/0x9c
      [   90.875067] [<80014424>] syscall_common+0x34/0x58
      Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
      Reviewed-by: Hou Tao <houtao1@huawei.com>
      Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      c356972f
    • N
      efi/libstub: Disable some warnings for x86{,_64} · 50063ba9
      Nathan Chancellor committed
      [ Upstream commit 3db5e0ba8b8f4aee631d7ee04b7a11c56cfdc213 ]
      
      When building the kernel with Clang, some disabled warnings appear
      because this Makefile overrides KBUILD_CFLAGS for x86{,_64}. Add them to
      this list so that the build is clean again.
      
      -Wpointer-sign was disabled for the whole kernel before the beginning of Git history.
      
      -Waddress-of-packed-member was disabled for the whole kernel and for
      the early boot code in these commits:
      
        bfb38988 ("kbuild: clang: Disable 'address-of-packed-member' warning")
        20c6c189 ("x86/boot: Disable the address-of-packed-member compiler warning").
      
      -Wgnu was disabled for the whole kernel and for the early boot code in
      these commits:
      
        61163efa ("kbuild: LLVMLinux: Add Kbuild support for building kernel with Clang")
        6c3b56b1 ("x86/boot: Disable Clang warnings about GNU extensions").
      
       [ mingo: Made the changelog more readable. ]
      Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
      Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Reviewed-by: Sedat Dilek <sedat.dilek@gmail.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arend van Spriel <arend.vanspriel@broadcom.com>
      Cc: Bhupesh Sharma <bhsharma@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Eric Snowberg <eric.snowberg@oracle.com>
      Cc: Hans de Goede <hdegoede@redhat.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Jon Hunter <jonathanh@nvidia.com>
      Cc: Julien Thierry <julien.thierry@arm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: YiFei Zhu <zhuyifei1999@gmail.com>
      Cc: linux-efi@vger.kernel.org
      Link: http://lkml.kernel.org/r/20181129171230.18699-8-ard.biesheuvel@linaro.org
      Link: https://github.com/ClangBuiltLinux/linux/issues/112
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      50063ba9
    • C
      rxe: IB_WR_REG_MR does not capture MR's iova field · fded1b0e
      Chuck Lever committed
      [ Upstream commit b024dd0eba6e6d568f69d63c5e3153aba94c23e3 ]
      
      FRWR memory registration is done with a series of calls and WRs.
      1. ULP invokes ib_dma_map_sg()
      2. ULP invokes ib_map_mr_sg()
      3. ULP posts an IB_WR_REG_MR on the Send queue
      
      Step 2 generates an iova. It is permissible for ULPs to change this
      iova (with certain restrictions) between steps 2 and 3.
      
      rxe_map_mr_sg captures the MR's iova but later when rxe processes the
      REG_MR WR, it ignores the MR's iova field. If a ULP alters the MR's iova
      after step 2 but before step 3, rxe never captures that change.
      
      When the remote sends an RDMA Read targeting that MR, rxe looks up the
      R_key, but the altered iova does not match the iova stored in the MR,
      causing the RDMA Read request to fail.
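
The stale-iova problem can be illustrated with a small userspace model. It is a sketch under assumptions: the `sketch_` structs and functions are hypothetical, the three FRWR steps are collapsed into plain function calls, and a lookup succeeds only when the requested iova equals the one the provider has stored.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical view of the MR as the ULP sees it. */
struct sketch_ib_mr {
	uint64_t iova;
};

/* Hypothetical provider-internal state for the same MR. */
struct sketch_rxe_mem {
	uint64_t iova;
};

/* Step 2: mapping generates an iova, recorded on both sides. */
static void sketch_map_mr_sg(struct sketch_ib_mr *mr,
			     struct sketch_rxe_mem *mem, uint64_t addr)
{
	assert(mr && mem);
	mr->iova = addr;
	mem->iova = addr;
}

/*
 * Step 3: processing the REG_MR work request. With `capture_iova` set,
 * the provider re-reads the MR's iova, picking up any ULP change.
 */
static void sketch_process_reg_mr(const struct sketch_ib_mr *mr,
				  struct sketch_rxe_mem *mem, int capture_iova)
{
	assert(mr && mem);
	if (capture_iova)
		mem->iova = mr->iova;
}

/* A remote RDMA Read succeeds only if its iova matches the stored one. */
static int sketch_read_ok(const struct sketch_rxe_mem *mem, uint64_t iova)
{
	return mem->iova == iova;
}
```

If the ULP adjusts `mr->iova` between mapping and posting the work request, the read against the new address fails unless the provider captures the iova again at step 3.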
      Reported-by: Anna Schumaker <schumaker.anna@gmail.com>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      fded1b0e
    • C
      drm/amdgpu: Reorder uvd ring init before uvd resume · e34e54f9
      Chris Wilson committed
      [ Upstream commit 3b34c14fd50c302db091f020f26dd00ede902c80 ]
      
      As amdgpu_uvd_resume() accesses the uvd ring, it must be initialised first
      or else we trigger errors like:
      
      [    5.595963] [drm] Found UVD firmware Version: 1.87 Family ID: 17
      [    5.595969] [drm] PSP loading UVD firmware
      [    5.596266] ------------[ cut here ]------------
      [    5.596268] ODEBUG: assert_init not available (active state 0) object type: timer_list hint:           (null)
      [    5.596285] WARNING: CPU: 0 PID: 507 at lib/debugobjects.c:329 debug_print_object+0x6a/0x80
      [    5.596286] Modules linked in: amdgpu(+) hid_logitech_hidpp(+) chash gpu_sched amd_iommu_v2 ttm drm_kms_helper crc32c_intel drm hid_sony ff_memless igb hid_logitech_dj nvme dca i2c_algo_bit nvme_core wmi pinctrl_amd uas usb_storage
      [    5.596299] CPU: 0 PID: 507 Comm: systemd-udevd Tainted: G        W         4.20.0-0.rc1.git4.1.fc30.x86_64 #1
      [    5.596301] Hardware name: System manufacturer System Product Name/ROG STRIX X470-I GAMING, BIOS 0901 07/23/2018
      [    5.596303] RIP: 0010:debug_print_object+0x6a/0x80
      [    5.596305] Code: 8b 43 10 83 c2 01 8b 4b 14 4c 89 e6 89 15 e6 82 b0 02 4c 8b 45 00 48 c7 c7 60 fd 34 a6 48 8b 14 c5 a0 da 08 a6 e8 6a 6a b8 ff <0f> 0b 5b 83 05 d0 45 3e 01 01 5d 41 5c c3 83 05 c5 45 3e 01 01 c3
      [    5.596306] RSP: 0018:ffffa02ac863f8c0 EFLAGS: 00010282
      [    5.596307] RAX: 0000000000000000 RBX: ffffa02ac863f8e0 RCX: 0000000000000006
      [    5.596308] RDX: 0000000000000007 RSI: ffff9160e9a7bfe8 RDI: ffff9160f91d6c60
      [    5.596310] RBP: ffffffffa6742740 R08: 0000000000000002 R09: 0000000000000000
      [    5.596311] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffa634ff69
      [    5.596312] R13: 00000000000b79d0 R14: ffffffffa80f76d8 R15: 0000000000266000
      [    5.596313] FS:  00007f762abf7940(0000) GS:ffff9160f9000000(0000) knlGS:0000000000000000
      [    5.596314] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [    5.596315] CR2: 000055fdc593f000 CR3: 00000007e999c000 CR4: 00000000003406f0
      [    5.596317] Call Trace:
      [    5.596321]  debug_object_assert_init+0x14a/0x180
      [    5.596327]  del_timer+0x2e/0x90
      [    5.596383]  amdgpu_fence_process+0x47/0x100 [amdgpu]
      [    5.596430]  amdgpu_uvd_resume+0xf6/0x120 [amdgpu]
      [    5.596475]  uvd_v7_0_sw_init+0xe0/0x280 [amdgpu]
      [    5.596523]  amdgpu_device_init.cold.30+0xf97/0x14b6 [amdgpu]
      [    5.596563]  ? amdgpu_driver_load_kms+0x53/0x330 [amdgpu]
      [    5.596604]  amdgpu_driver_load_kms+0x86/0x330 [amdgpu]
      [    5.596614]  drm_dev_register+0x115/0x150 [drm]
      [    5.596654]  amdgpu_pci_probe+0xbd/0x120 [amdgpu]
      [    5.596658]  local_pci_probe+0x41/0x90
      [    5.596661]  pci_device_probe+0x188/0x1a0
      [    5.596666]  really_probe+0xf8/0x3b0
      [    5.596669]  driver_probe_device+0xb3/0xf0
      [    5.596672]  __driver_attach+0xe1/0x110
      [    5.596674]  ? driver_probe_device+0xf0/0xf0
      [    5.596676]  bus_for_each_dev+0x79/0xc0
      [    5.596679]  bus_add_driver+0x155/0x230
      [    5.596681]  ? 0xffffffffc07d9000
      [    5.596683]  driver_register+0x6b/0xb0
      [    5.596685]  ? 0xffffffffc07d9000
      [    5.596688]  do_one_initcall+0x5d/0x2be
      [    5.596691]  ? rcu_read_lock_sched_held+0x79/0x80
      [    5.596693]  ? kmem_cache_alloc_trace+0x264/0x290
      [    5.596695]  ? do_init_module+0x22/0x210
      [    5.596698]  do_init_module+0x5a/0x210
      [    5.596701]  load_module+0x2137/0x2430
      [    5.596703]  ? lockdep_hardirqs_on+0xed/0x180
      [    5.596714]  ? __do_sys_init_module+0x150/0x1a0
      [    5.596715]  __do_sys_init_module+0x150/0x1a0
      [    5.596722]  do_syscall_64+0x60/0x1f0
      [    5.596725]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [    5.596726] RIP: 0033:0x7f762b877dee
      [    5.596728] Code: 48 8b 0d 9d 20 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 6a 20 0c 00 f7 d8 64 89 01 48
      [    5.596729] RSP: 002b:00007ffc777b8558 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
      [    5.596730] RAX: ffffffffffffffda RBX: 000055fdc48da320 RCX: 00007f762b877dee
      [    5.596731] RDX: 00007f762b9f284d RSI: 00000000006c5fc6 RDI: 000055fdc527a060
      [    5.596732] RBP: 00007f762b9f284d R08: 0000000000000003 R09: 0000000000000002
      [    5.596733] R10: 000055fdc48ad010 R11: 0000000000000246 R12: 000055fdc527a060
      [    5.596734] R13: 000055fdc48dca20 R14: 0000000000020000 R15: 0000000000000000
      [    5.596740] irq event stamp: 134618
      [    5.596743] hardirqs last  enabled at (134617): [<ffffffffa513d52e>] console_unlock+0x45e/0x610
      [    5.596744] hardirqs last disabled at (134618): [<ffffffffa50037e8>] trace_hardirqs_off_thunk+0x1a/0x1c
      [    5.596746] softirqs last  enabled at (133146): [<ffffffffa5e00365>] __do_softirq+0x365/0x47c
      [    5.596748] softirqs last disabled at (133139): [<ffffffffa50c64f9>] irq_exit+0x119/0x120
      [    5.596749] ---[ end trace eaee508abfebccdc ]---
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108709
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Alex Deucher <alexdeucher@gmail.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      e34e54f9