1. 01 12月, 2018 40 次提交
    • M
      dax: Avoid losing wakeup in dax_lock_mapping_entry · e7a121e3
      Matthew Wilcox 提交于
      commit 25bbe21bf427a81b8e3ccd480ea0e1d940256156 upstream.
      
      After calling get_unlocked_entry(), you have to call
      put_unlocked_entry() to avoid subsequent waiters losing wakeups.
      
      Fixes: c2a7d2a1 ("filesystem-dax: Introduce dax_lock_mapping_entry()")
      Cc: stable@vger.kernel.org
      Signed-off-by: NMatthew Wilcox <willy@infradead.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e7a121e3
    • M
      mm, page_alloc: check for max order in hot path · 9dec3855
      Michal Hocko 提交于
      [ Upstream commit c63ae43ba53bc432b414fd73dd5f4b01fcb1ab43 ]
      
      Konstantin has noticed that kvmalloc might trigger the following
      warning:
      
        WARNING: CPU: 0 PID: 6676 at mm/vmstat.c:986 __fragmentation_index+0x54/0x60
        [...]
        Call Trace:
         fragmentation_index+0x76/0x90
         compaction_suitable+0x4f/0xf0
         shrink_node+0x295/0x310
         node_reclaim+0x205/0x250
         get_page_from_freelist+0x649/0xad0
         __alloc_pages_nodemask+0x12a/0x2a0
         kmalloc_large_node+0x47/0x90
         __kmalloc_node+0x22b/0x2e0
         kvmalloc_node+0x3e/0x70
         xt_alloc_table_info+0x3a/0x80 [x_tables]
         do_ip6t_set_ctl+0xcd/0x1c0 [ip6_tables]
         nf_setsockopt+0x44/0x60
         SyS_setsockopt+0x6f/0xc0
         do_syscall_64+0x67/0x120
         entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      
      the problem is that we only check for an out of bound order in the slow
      path and the node reclaim might happen from the fast path already.  This
      is fixable by making sure that kvmalloc doesn't ever use kmalloc for
      requests that are larger than KMALLOC_MAX_SIZE but this also shows that
      the code is rather fragile.  A recent UBSAN report just underlines that
      by the following report
      
        UBSAN: Undefined behaviour in mm/page_alloc.c:3117:19
        shift exponent 51 is too large for 32-bit type 'int'
        CPU: 0 PID: 6520 Comm: syz-executor1 Not tainted 4.19.0-rc2 #1
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
        Call Trace:
         __dump_stack lib/dump_stack.c:77 [inline]
         dump_stack+0xd2/0x148 lib/dump_stack.c:113
         ubsan_epilogue+0x12/0x94 lib/ubsan.c:159
         __ubsan_handle_shift_out_of_bounds+0x2b6/0x30b lib/ubsan.c:425
         __zone_watermark_ok+0x2c7/0x400 mm/page_alloc.c:3117
         zone_watermark_fast mm/page_alloc.c:3216 [inline]
         get_page_from_freelist+0xc49/0x44c0 mm/page_alloc.c:3300
         __alloc_pages_nodemask+0x21e/0x640 mm/page_alloc.c:4370
         alloc_pages_current+0xcc/0x210 mm/mempolicy.c:2093
         alloc_pages include/linux/gfp.h:509 [inline]
         __get_free_pages+0x12/0x60 mm/page_alloc.c:4414
         dma_mem_alloc+0x36/0x50 arch/x86/include/asm/floppy.h:156
         raw_cmd_copyin drivers/block/floppy.c:3159 [inline]
         raw_cmd_ioctl drivers/block/floppy.c:3206 [inline]
         fd_locked_ioctl+0xa00/0x2c10 drivers/block/floppy.c:3544
         fd_ioctl+0x40/0x60 drivers/block/floppy.c:3571
         __blkdev_driver_ioctl block/ioctl.c:303 [inline]
         blkdev_ioctl+0xb3c/0x1a30 block/ioctl.c:601
         block_ioctl+0x105/0x150 fs/block_dev.c:1883
         vfs_ioctl fs/ioctl.c:46 [inline]
         do_vfs_ioctl+0x1c0/0x1150 fs/ioctl.c:687
         ksys_ioctl+0x9e/0xb0 fs/ioctl.c:702
         __do_sys_ioctl fs/ioctl.c:709 [inline]
         __se_sys_ioctl fs/ioctl.c:707 [inline]
         __x64_sys_ioctl+0x7e/0xc0 fs/ioctl.c:707
         do_syscall_64+0xc4/0x510 arch/x86/entry/common.c:290
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Note that this is not a kvmalloc path.  It is just that the fast path
      really depends on having sanitzed order as well.  Therefore move the
      order check to the fast path.
      
      Link: http://lkml.kernel.org/r/20181113094305.GM15120@dhcp22.suse.czSigned-off-by: NMichal Hocko <mhocko@suse.com>
      Reported-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Reported-by: NKyungtae Kim <kt0755@gmail.com>
      Acked-by: NVlastimil Babka <vbabka@suse.cz>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Pavel Tatashin <pavel.tatashin@microsoft.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Aaron Lu <aaron.lu@intel.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Byoungyoung Lee <lifeasageek@gmail.com>
      Cc: "Dae R. Jeong" <threeearcat@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      9dec3855
    • Y
      tmpfs: make lseek(SEEK_DATA/SEK_HOLE) return ENXIO with a negative offset · db89fc00
      Yufen Yu 提交于
      [ Upstream commit 1a413646931cb14442065cfc17561e50f5b5bb44 ]
      
      Other filesystems such as ext4, f2fs and ubifs all return ENXIO when
      lseek (SEEK_DATA or SEEK_HOLE) requests a negative offset.
      
      man 2 lseek says
      
      :      EINVAL whence  is  not  valid.   Or: the resulting file offset would be
      :             negative, or beyond the end of a seekable device.
      :
      :      ENXIO  whence is SEEK_DATA or SEEK_HOLE, and the file offset is  beyond
      :             the end of the file.
      
      Make tmpfs return ENXIO under these circumstances as well.  After this,
      tmpfs also passes xfstests's generic/448.
      
      [akpm@linux-foundation.org: rewrite changelog]
      Link: http://lkml.kernel.org/r/1540434176-14349-1-git-send-email-yuyufen@huawei.comSigned-off-by: NYufen Yu <yuyufen@huawei.com>
      Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: William Kucharski <william.kucharski@oracle.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      db89fc00
    • M
      mm, memory_hotplug: check zone_movable in has_unmovable_pages · b44fd126
      Michal Hocko 提交于
      [ Upstream commit 9d7899999c62c1a81129b76d2a6ecbc4655e1597 ]
      
      Page state checks are racy.  Under a heavy memory workload (e.g.  stress
      -m 200 -t 2h) it is quite easy to hit a race window when the page is
      allocated but its state is not fully populated yet.  A debugging patch to
      dump the struct page state shows
      
        has_unmovable_pages: pfn:0x10dfec00, found:0x1, count:0x0
        page:ffffea0437fb0000 count:1 mapcount:1 mapping:ffff880e05239841 index:0x7f26e5000 compound_mapcount: 1
        flags: 0x5fffffc0090034(uptodate|lru|active|head|swapbacked)
      
      Note that the state has been checked for both PageLRU and PageSwapBacked
      already.  Closing this race completely would require some sort of retry
      logic.  This can be tricky and error prone (think of potential endless
      or long taking loops).
      
      Workaround this problem for movable zones at least.  Such a zone should
      only contain movable pages.  Commit 15c30bc0 ("mm, memory_hotplug:
      make has_unmovable_pages more robust") has told us that this is not
      strictly true though.  Bootmem pages should be marked reserved though so
      we can move the original check after the PageReserved check.  Pages from
      other zones are still prone to races but we even do not pretend that
      memory hotremove works for those so pre-mature failure doesn't hurt that
      much.
      
      Link: http://lkml.kernel.org/r/20181106095524.14629-1-mhocko@kernel.org
      Fixes: 15c30bc0 ("mm, memory_hotplug: make has_unmovable_pages more robust")
      Signed-off-by: NMichal Hocko <mhocko@suse.com>
      Reported-by: NBaoquan He <bhe@redhat.com>
      Tested-by: NBaoquan He <bhe@redhat.com>
      Acked-by: NBaoquan He <bhe@redhat.com>
      Reviewed-by: NOscar Salvador <osalvador@suse.de>
      Acked-by: NBalbir Singh <bsingharora@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      b44fd126
    • V
      z3fold: fix possible reclaim races · 51006672
      Vitaly Wool 提交于
      [ Upstream commit ca0246bb97c23da9d267c2107c07fb77e38205c9 ]
      
      Reclaim and free can race on an object which is basically fine but in
      order for reclaim to be able to map "freed" object we need to encode
      object length in the handle.  handle_to_chunks() is then introduced to
      extract object length from a handle and use it during mapping.
      
      Moreover, to avoid racing on a z3fold "headless" page release, we should
      not try to free that page in z3fold_free() if the reclaim bit is set.
      Also, in the unlikely case of trying to reclaim a page being freed, we
      should not proceed with that page.
      
      While at it, fix the page accounting in reclaim function.
      
      This patch supersedes "[PATCH] z3fold: fix reclaim lock-ups".
      
      Link: http://lkml.kernel.org/r/20181105162225.74e8837d03583a9b707cf559@gmail.comSigned-off-by: NVitaly Wool <vitaly.vul@sony.com>
      Signed-off-by: NJongseok Kim <ks77sj@gmail.com>
      Reported-by-by: NJongseok Kim <ks77sj@gmail.com>
      Reviewed-by: NSnild Dolkow <snild@sony.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      51006672
    • A
      efi/arm: Revert deferred unmap of early memmap mapping · 43b2ceb0
      Ard Biesheuvel 提交于
      [ Upstream commit 33412b8673135b18ea42beb7f5117ed0091798b6 ]
      
      Commit:
      
        3ea86495 ("efi/arm: preserve early mapping of UEFI memory map longer for BGRT")
      
      deferred the unmap of the early mapping of the UEFI memory map to
      accommodate the ACPI BGRT code, which looks up the memory type that
      backs the BGRT table to validate it against the requirements of the UEFI spec.
      
      Unfortunately, this causes problems on ARM, which does not permit
      early mappings to persist after paging_init() is called, resulting
      in a WARN() splat. Since we don't support the BGRT table on ARM anway,
      let's revert ARM to the old behaviour, which is to take down the
      early mapping at the end of efi_init().
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-efi@vger.kernel.org
      Fixes: 3ea86495 ("efi/arm: preserve early mapping of UEFI memory ...")
      Link: http://lkml.kernel.org/r/20181114175544.12860-3-ard.biesheuvel@linaro.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      43b2ceb0
    • S
      powerpc/numa: Suppress "VPHN is not supported" messages · f5c632cf
      Satheesh Rajendran 提交于
      [ Upstream commit 437ccdc8ce629470babdda1a7086e2f477048cbd ]
      
      When VPHN function is not supported and during cpu hotplug event,
      kernel prints message 'VPHN function not supported. Disabling
      polling...'. Currently it prints on every hotplug event, it floods
      dmesg when a KVM guest tries to hotplug huge number of vcpus, let's
      just print once and suppress further kernel prints.
      Signed-off-by: NSatheesh Rajendran <sathnaga@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      f5c632cf
    • T
      NFSv4: Fix an Oops during delegation callbacks · b5ccf003
      Trond Myklebust 提交于
      [ Upstream commit e39d8a186ed002854196668cb7562ffdfbc6d379 ]
      
      If the server sends a CB_GETATTR or a CB_RECALL while the filesystem is
      being unmounted, then we can Oops when releasing the inode in
      nfs4_callback_getattr() and nfs4_callback_recall().
      Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      b5ccf003
    • P
      kdb: Use strscpy with destination buffer size · 2bc40f89
      Prarit Bhargava 提交于
      [ Upstream commit c2b94c72d93d0929f48157eef128c4f9d2e603ce ]
      
      gcc 8.1.0 warns with:
      
      kernel/debug/kdb/kdb_support.c: In function ‘kallsyms_symbol_next’:
      kernel/debug/kdb/kdb_support.c:239:4: warning: ‘strncpy’ specified bound depends on the length of the source argument [-Wstringop-overflow=]
           strncpy(prefix_name, name, strlen(name)+1);
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      kernel/debug/kdb/kdb_support.c:239:31: note: length computed here
      
      Use strscpy() with the destination buffer size, and use ellipses when
      displaying truncated symbols.
      
      v2: Use strscpy()
      Signed-off-by: NPrarit Bhargava <prarit@redhat.com>
      Cc: Jonathan Toppins <jtoppins@redhat.com>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      Cc: Daniel Thompson <daniel.thompson@linaro.org>
      Cc: kgdb-bugreport@lists.sourceforge.net
      Reviewed-by: NDaniel Thompson <daniel.thompson@linaro.org>
      Signed-off-by: NDaniel Thompson <daniel.thompson@linaro.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      2bc40f89
    • P
      drm/amdgpu: fix bug with IH ring setup · 4dc84390
      Philip Yang 提交于
      [ Upstream commit c837243ff4017f493c7d6f4ab57278d812a86859 ]
      
      The bug limits the IH ring wptr address to 40bit. When the system memory
      is bigger than 1TB, the bus address is more than 40bit, this causes the
      interrupt cannot be handled and cleared correctly.
      Reviewed-by: NChristian König <christian.koenig@amd.com>
      Signed-off-by: NPhilip Yang <Philip.Yang@amd.com>
      Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      4dc84390
    • O
      RISC-V: Silence some module warnings on 32-bit · 021e2f3f
      Olof Johansson 提交于
      [ Upstream commit ef3a61406618291c46da168ff91acaa28d85944c ]
      
      Fixes:
      
      arch/riscv/kernel/module.c: In function 'apply_r_riscv_32_rela':
      ./include/linux/kern_levels.h:5:18: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 3 has type 'Elf32_Addr' {aka 'unsigned int'} [-Wformat=]
      arch/riscv/kernel/module.c:23:27: note: format string is defined here
      arch/riscv/kernel/module.c: In function 'apply_r_riscv_pcrel_hi20_rela':
      ./include/linux/kern_levels.h:5:18: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 3 has type 'Elf32_Addr' {aka 'unsigned int'} [-Wformat=]
      arch/riscv/kernel/module.c:104:23: note: format string is defined here
      arch/riscv/kernel/module.c: In function 'apply_r_riscv_hi20_rela':
      ./include/linux/kern_levels.h:5:18: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 3 has type 'Elf32_Addr' {aka 'unsigned int'} [-Wformat=]
      arch/riscv/kernel/module.c:146:23: note: format string is defined here
      arch/riscv/kernel/module.c: In function 'apply_r_riscv_got_hi20_rela':
      ./include/linux/kern_levels.h:5:18: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 3 has type 'Elf32_Addr' {aka 'unsigned int'} [-Wformat=]
      arch/riscv/kernel/module.c:190:60: note: format string is defined here
      arch/riscv/kernel/module.c: In function 'apply_r_riscv_call_plt_rela':
      ./include/linux/kern_levels.h:5:18: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 3 has type 'Elf32_Addr' {aka 'unsigned int'} [-Wformat=]
      arch/riscv/kernel/module.c:214:24: note: format string is defined here
      arch/riscv/kernel/module.c: In function 'apply_r_riscv_call_rela':
      ./include/linux/kern_levels.h:5:18: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 3 has type 'Elf32_Addr' {aka 'unsigned int'} [-Wformat=]
      arch/riscv/kernel/module.c:236:23: note: format string is defined here
      Signed-off-by: NOlof Johansson <olof@lixom.net>
      Signed-off-by: NPalmer Dabbelt <palmer@sifive.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      021e2f3f
    • D
      riscv: add missing vdso_install target · fc9b1d7f
      David Abdurachmanov 提交于
      [ Upstream commit f157d411a9eb170d2ee6b766da7a381962017cc9 ]
      
      Building kernel 4.20 for Fedora as RPM fails, because riscv is missing
      vdso_install target in arch/riscv/Makefile.
      Signed-off-by: NDavid Abdurachmanov <david.abdurachmanov@gmail.com>
      Signed-off-by: NPalmer Dabbelt <palmer@sifive.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      fc9b1d7f
    • T
      SUNRPC: Fix a bogus get/put in generic_key_to_expire() · ab1a5206
      Trond Myklebust 提交于
      [ Upstream commit e3d5e573a54dabdc0f9f3cb039d799323372b251 ]
      Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      ab1a5206
    • H
      block: copy ioprio in __bio_clone_fast() and bounce · 487d58a9
      Hannes Reinecke 提交于
      [ Upstream commit ca474b73896bf6e0c1eb8787eb217b0f80221610 ]
      
      We need to copy the io priority, too; otherwise the clone will run
      with a different priority than the original one.
      
      Fixes: 43b62ce3 ("block: move bio io prio to a new field")
      Signed-off-by: NHannes Reinecke <hare@suse.com>
      Signed-off-by: NJean Delvare <jdelvare@suse.de>
      
      Fixed up subject, and ordered stores.
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      487d58a9
    • K
      perf/x86/intel/uncore: Add more IMC PCI IDs for KabyLake and CoffeeLake CPUs · 08f94d06
      Kan Liang 提交于
      [ Upstream commit c10a8de0d32e95b0b8c7c17b6dc09baea5a5a899 ]
      
      KabyLake and CoffeeLake CPUs have the same client uncore events as SkyLake.
      
      Add the PCI IDs for the KabyLake Y, U, S processor lines and CoffeeLake U,
      H, S processor lines.
      Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/20181019170419.378-1-kan.liang@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      08f94d06
    • P
      sched/fair: Fix cpu_util_wake() for 'execl' type workloads · 08fbd4e0
      Patrick Bellasi 提交于
      [ Upstream commit c469933e772132aad040bd6a2adc8edf9ad6f825 ]
      
      A ~10% regression has been reported for UnixBench's execl throughput
      test by Aaron Lu and Ye Xiaolong:
      
        https://lkml.org/lkml/2018/10/30/765
      
      That test is pretty simple, it does a "recursive" execve() syscall on the
      same binary. Starting from the syscall, this sequence is possible:
      
         do_execve()
           do_execveat_common()
             __do_execve_file()
               sched_exec()
                 select_task_rq_fair()          <==| Task already enqueued
                   find_idlest_cpu()
                     find_idlest_group()
                       capacity_spare_wake()    <==| Functions not called from
      		   cpu_util_wake()           | the wakeup path
      
      which means we can end up calling cpu_util_wake() not only from the
      "wakeup path", as its name would suggest. Indeed, the task doing an
      execve() syscall is already enqueued on the CPU we want to get the
      cpu_util_wake() for.
      
      The estimated utilization for a CPU computed in cpu_util_wake() was
      written under the assumption that function can be called only from the
      wakeup path. If instead the task is already enqueued, we end up with a
      utilization which does not remove the current task's contribution from
      the estimated utilization of the CPU.
      This will wrongly assume a reduced spare capacity on the current CPU and
      increase the chances to migrate the task on execve.
      
      The regression is tracked down to:
      
       commit d519329f ("sched/fair: Update util_est only on util_avg updates")
      
      because in that patch we turn on by default the UTIL_EST sched feature.
      However, the real issue is introduced by:
      
       commit f9be3e59 ("sched/fair: Use util_est in LB and WU paths")
      
      Let's fix this by ensuring to always discount the task estimated
      utilization from the CPU's estimated utilization when the task is also
      the current one. The same benchmark of the bug report, executed on a
      dual socket 40 CPUs Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz machine,
      reports these "Execl Throughput" figures (higher the better):
      
         mainline     : 48136.5 lps
         mainline+fix : 55376.5 lps
      
      which correspond to a 15% speedup.
      
      Moreover, since {cpu_util,capacity_spare}_wake() are not really only
      used from the wakeup path, let's remove this ambiguity by using a better
      matching name: {cpu_util,capacity_spare}_without().
      
      Since we are at that, let's also improve the existing documentation.
      Reported-by: NAaron Lu <aaron.lu@intel.com>
      Reported-by: NYe Xiaolong <xiaolong.ye@intel.com>
      Tested-by: NAaron Lu <aaron.lu@intel.com>
      Signed-off-by: NPatrick Bellasi <patrick.bellasi@arm.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
      Cc: Juri Lelli <juri.lelli@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Morten Rasmussen <morten.rasmussen@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Perret <quentin.perret@arm.com>
      Cc: Steve Muckle <smuckle@google.com>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Todd Kjos <tkjos@google.com>
      Cc: Vincent Guittot <vincent.guittot@linaro.org>
      Fixes: f9be3e59 (sched/fair: Use util_est in LB and WU paths)
      Link: https://lore.kernel.org/lkml/20181025093100.GB13236@e110439-lin/Signed-off-by: NIngo Molnar <mingo@kernel.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      08fbd4e0
    • M
      powerpc/io: Fix the IO workarounds code to work with Radix · b7718632
      Michael Ellerman 提交于
      [ Upstream commit 43c6494fa1499912c8177e71450c0279041152a6 ]
      
      Back in 2006 Ben added some workarounds for a misbehaviour in the
      Spider IO bridge used on early Cell machines, see commit
      014da7ff ("[POWERPC] Cell "Spider" MMIO workarounds"). Later these
      were made to be generic, ie. not tied specifically to Spider.
      
      The code stashes a token in the high bits (59-48) of virtual addresses
      used for IO (eg. returned from ioremap()). This works fine when using
      the Hash MMU, but when we're using the Radix MMU the bits used for the
      token overlap with some of the bits of the virtual address.
      
      This is because the maximum virtual address is larger with Radix, up
      to c00fffffffffffff, and in fact we use that high part of the address
      range for ioremap(), see RADIX_KERN_IO_START.
      
      As it happens the bits that are used overlap with the bits that
      differentiate an IO address vs a linear map address. If the resulting
      address lies outside the linear mapping we will crash (see below), if
      not we just corrupt memory.
      
        virtio-pci 0000:00:00.0: Using 64-bit direct DMA at offset 800000000000000
        Unable to handle kernel paging request for data at address 0xc000000080000014
        ...
        CFAR: c000000000626b98 DAR: c000000080000014 DSISR: 42000000 IRQMASK: 0
        GPR00: c0000000006c54fc c00000003e523378 c0000000016de600 0000000000000000
        GPR04: c00c000080000014 0000000000000007 0fffffff000affff 0000000000000030
               ^^^^
        ...
        NIP [c000000000626c5c] .iowrite8+0xec/0x100
        LR [c0000000006c992c] .vp_reset+0x2c/0x90
        Call Trace:
          .pci_bus_read_config_dword+0xc4/0x120 (unreliable)
          .register_virtio_device+0x13c/0x1c0
          .virtio_pci_probe+0x148/0x1f0
          .local_pci_probe+0x68/0x140
          .pci_device_probe+0x164/0x220
          .really_probe+0x274/0x3b0
          .driver_probe_device+0x80/0x170
          .__driver_attach+0x14c/0x150
          .bus_for_each_dev+0xb8/0x130
          .driver_attach+0x34/0x50
          .bus_add_driver+0x178/0x2f0
          .driver_register+0x90/0x1a0
          .__pci_register_driver+0x6c/0x90
          .virtio_pci_driver_init+0x2c/0x40
          .do_one_initcall+0x64/0x280
          .kernel_init_freeable+0x36c/0x474
          .kernel_init+0x24/0x160
          .ret_from_kernel_thread+0x58/0x7c
      
      This hasn't been a problem because CONFIG_PPC_IO_WORKAROUNDS which
      enables this code is usually not enabled. It is only enabled when it's
      selected by PPC_CELL_NATIVE which is only selected by
      PPC_IBM_CELL_BLADE and that in turn depends on BIG_ENDIAN. So in order
      to hit the bug you need to build a big endian kernel, with IBM Cell
      Blade support enabled, as well as Radix MMU support, and then boot
      that on Power9 using Radix MMU.
      
      Still we can fix the bug, so let's do that. We simply use fewer bits
      for the token, taking the union of the restrictions on the address
      from both Hash and Radix, we end up with 8 bits we can use for the
      token. The only user of the token is iowa_mem_find_bus() which only
      supports 8 token values, so 8 bits is plenty for that.
      
      Fixes: 566ca99a ("powerpc/mm/radix: Add dummy radix_enabled()")
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      b7718632
    • J
      floppy: fix race condition in __floppy_read_block_0() · 73fd491d
      Jens Axboe 提交于
      [ Upstream commit de7b75d82f70c5469675b99ad632983c50b6f7e7 ]
      
      LKP recently reported a hang at bootup in the floppy code:
      
      [  245.678853] INFO: task mount:580 blocked for more than 120 seconds.
      [  245.679906]       Tainted: G                T 4.19.0-rc6-00172-ga9f38e1 #1
      [  245.680959] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [  245.682181] mount           D 6372   580      1 0x00000004
      [  245.683023] Call Trace:
      [  245.683425]  __schedule+0x2df/0x570
      [  245.683975]  schedule+0x2d/0x80
      [  245.684476]  schedule_timeout+0x19d/0x330
      [  245.685090]  ? wait_for_common+0xa5/0x170
      [  245.685735]  wait_for_common+0xac/0x170
      [  245.686339]  ? do_sched_yield+0x90/0x90
      [  245.686935]  wait_for_completion+0x12/0x20
      [  245.687571]  __floppy_read_block_0+0xfb/0x150
      [  245.688244]  ? floppy_resume+0x40/0x40
      [  245.688844]  floppy_revalidate+0x20f/0x240
      [  245.689486]  check_disk_change+0x43/0x60
      [  245.690087]  floppy_open+0x1ea/0x360
      [  245.690653]  __blkdev_get+0xb4/0x4d0
      [  245.691212]  ? blkdev_get+0x1db/0x370
      [  245.691777]  blkdev_get+0x1f3/0x370
      [  245.692351]  ? path_put+0x15/0x20
      [  245.692871]  ? lookup_bdev+0x4b/0x90
      [  245.693539]  blkdev_get_by_path+0x3d/0x80
      [  245.694165]  mount_bdev+0x2a/0x190
      [  245.694695]  squashfs_mount+0x10/0x20
      [  245.695271]  ? squashfs_alloc_inode+0x30/0x30
      [  245.695960]  mount_fs+0xf/0x90
      [  245.696451]  vfs_kern_mount+0x43/0x130
      [  245.697036]  do_mount+0x187/0xc40
      [  245.697563]  ? memdup_user+0x28/0x50
      [  245.698124]  ksys_mount+0x60/0xc0
      [  245.698639]  sys_mount+0x19/0x20
      [  245.699167]  do_int80_syscall_32+0x61/0x130
      [  245.699813]  entry_INT80_32+0xc7/0xc7
      
      showing that we never complete that read request. The reason is that
      the completion setup is racy - it initializes the completion event
      AFTER submitting the IO, which means that the IO could complete
      before/during the init. If it does, we are passing garbage to
      complete() and we may sleep forever waiting for the event to
      occur.
      
      Fixes: 7b7b68bb ("floppy: bail out in open() if drive is not responding to block0 read")
      Reviewed-by: NOmar Sandoval <osandov@fb.com>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      73fd491d
    • A
      crypto: simd - correctly take reqsize of wrapped skcipher into account · c587ba48
      Ard Biesheuvel 提交于
      [ Upstream commit 508a1c4df085a547187eed346f1bfe5e381797f1 ]
      
      The simd wrapper's skcipher request context structure consists
      of a single subrequest whose size is taken from the subordinate
      skcipher. However, in simd_skcipher_init(), the reqsize that is
      retrieved is not from the subordinate skcipher but from the
      cryptd request structure, whose size is completely unrelated to
      the actual wrapped skcipher.
      Reported-by: NQian Cai <cai@gmx.us>
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Tested-by: NQian Cai <cai@gmx.us>
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      c587ba48
    • X
      rtc: pcf2127: fix a kmemleak caused in pcf2127_i2c_gather_write · 49bcb041
      Xulin Sun 提交于
      [ Upstream commit 9bde0afb7a906f1dabdba37162551565740b862d ]
      
      pcf2127_i2c_gather_write() allocates memory as local variable
      for i2c_master_send(), after finishing the master transfer,
      the allocated memory should be freed. The kmemleak is reported:
      
      unreferenced object 0xffff80231e7dba80 (size 64):
        comm "hwclock", pid 27762, jiffies 4296880075 (age 356.944s)
        hex dump (first 32 bytes):
          03 00 12 03 19 02 11 13 00 80 98 18 00 00 ff ff ................
          00 50 00 00 00 00 00 00 02 00 00 00 00 00 00 00 .P..............
        backtrace:
          [<ffff000008221398>] create_object+0xf8/0x278
          [<ffff000008a96264>] kmemleak_alloc+0x74/0xa0
          [<ffff00000821070c>] __kmalloc+0x1ac/0x348
          [<ffff0000087ed1dc>] pcf2127_i2c_gather_write+0x54/0xf8
          [<ffff0000085fd9d4>] _regmap_raw_write+0x464/0x850
          [<ffff0000085fe3f4>] regmap_bulk_write+0x1a4/0x348
          [<ffff0000087ed32c>] pcf2127_rtc_set_time+0xac/0xe8
          [<ffff0000087eaad8>] rtc_set_time+0x80/0x138
          [<ffff0000087ebfb0>] rtc_dev_ioctl+0x398/0x610
          [<ffff00000823f2c0>] do_vfs_ioctl+0xb0/0x848
          [<ffff00000823fae4>] SyS_ioctl+0x8c/0xa8
          [<ffff000008083ac0>] el0_svc_naked+0x34/0x38
          [<ffffffffffffffff>] 0xffffffffffffffff
      Signed-off-by: NXulin Sun <xulin.sun@windriver.com>
      Signed-off-by: NAlexandre Belloni <alexandre.belloni@bootlin.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      49bcb041
    • H
      rtc: cmos: Do not export alarm rtc_ops when we do not support alarms · b411f946
      Hans de Goede 提交于
      [ Upstream commit fbb974ba693bbfb4e24a62181ef16d4e45febc37 ]
      
      When there is no IRQ configured for the RTC, the rtc-cmos code does not
      support alarms, all alarm rtc_ops fail with -EIO / -EINVAL.
      
      The rtc-core expects a rtc driver which does not support rtc alarms to
      not have alarm ops at all. Otherwise the wakealarm sysfs attr will read
      as empty rather then returning an error, making it impossible for
      userspace to find out beforehand if alarms are supported.
      
      A system without an IRQ for the RTC before this patch:
      [root@localhost ~]# cat /sys/class/rtc/rtc0/wakealarm
      [root@localhost ~]#
      
      After this patch:
      [root@localhost ~]# cat /sys/class/rtc/rtc0/wakealarm
      cat: /sys/class/rtc/rtc0/wakealarm: No such file or directory
      [root@localhost ~]#
      
      This fixes gnome-session + systemd trying to use suspend-then-hibernate,
      which causes systemd to abort the suspend when writing the RTC alarm fails.
      
      BugLink: https://github.com/systemd/systemd/issues/9988Signed-off-by: NHans de Goede <hdegoede@redhat.com>
      Signed-off-by: NAlexandre Belloni <alexandre.belloni@bootlin.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      b411f946
    • A
      cpufreq: imx6q: add return value check for voltage scale · 121f89dd
      Anson Huang 提交于
      [ Upstream commit 6ef28a04d1ccf718eee069b72132ce4aa1e52ab9 ]
      
      Add return value check for voltage scale when ARM clock
      rate change fail.
      Signed-off-by: NAnson Huang <Anson.Huang@nxp.com>
      Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      121f89dd
    • S
      KVM: PPC: Move and undef TRACE_INCLUDE_PATH/FILE · 8d976d7a
      Scott Wood 提交于
      [ Upstream commit 28c5bcf74fa07c25d5bd118d1271920f51ce2a98 ]
      
      TRACE_INCLUDE_PATH and TRACE_INCLUDE_FILE are used by
      <trace/define_trace.h>, so like that #include, they should
      be outside #ifdef protection.
      
      They also need to be #undefed before defining, in case multiple trace
      headers are included by the same C file.  This became the case on
      book3e after commit cf4a6085151a ("powerpc/mm: Add missing tracepoint for
      tlbie"), leading to the following build error:
      
         CC      arch/powerpc/kvm/powerpc.o
      In file included from arch/powerpc/kvm/powerpc.c:51:0:
      arch/powerpc/kvm/trace.h:9:0: error: "TRACE_INCLUDE_PATH" redefined
      [-Werror]
        #define TRACE_INCLUDE_PATH .
        ^
      In file included from arch/powerpc/kvm/../mm/mmu_decl.h:25:0,
                        from arch/powerpc/kvm/powerpc.c:48:
      ./arch/powerpc/include/asm/trace.h:224:0: note: this is the location of
      the previous definition
        #define TRACE_INCLUDE_PATH asm
        ^
      cc1: all warnings being treated as errors
      Reported-by: NChristian Zigotzky <chzigotzky@xenosoft.de>
      Signed-off-by: NScott Wood <oss@buserror.net>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      8d976d7a
    • Y
      scsi: hisi_sas: Remove set but not used variable 'dq_list' · c7ae5115
      YueHaibing 提交于
      [ Upstream commit e34ff8edcae89922d187425ab0b82e6a039aa371 ]
      
      Fixes gcc '-Wunused-but-set-variable' warning:
      
      drivers/scsi/hisi_sas/hisi_sas_v1_hw.c: In function 'start_delivery_v1_hw':
      drivers/scsi/hisi_sas/hisi_sas_v1_hw.c:907:20: warning:
       variable 'dq_list' set but not used [-Wunused-but-set-variable]
      
      drivers/scsi/hisi_sas/hisi_sas_v2_hw.c: In function 'start_delivery_v2_hw':
      drivers/scsi/hisi_sas/hisi_sas_v2_hw.c:1671:20: warning:
       variable 'dq_list' set but not used [-Wunused-but-set-variable]
      
      drivers/scsi/hisi_sas/hisi_sas_v3_hw.c: In function 'start_delivery_v3_hw':
      drivers/scsi/hisi_sas/hisi_sas_v3_hw.c:889:20: warning:
       variable 'dq_list' set but not used [-Wunused-but-set-variable]
      
      It never used since introduction in commit
      fa222db0 ("scsi: hisi_sas: Don't lock DQ for complete task sending")
      Signed-off-by: NYueHaibing <yuehaibing@huawei.com>
      Acked-by: NJohn Garry <john.garry@huawei.com>
      Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      c7ae5115
    • A
      scsi: lpfc: fix remoteport access · 3d57a04f
      Arnd Bergmann 提交于
      [ Upstream commit f8d294324598ec85bea2779512e48c94cbe4d7c6 ]
      
      The addition of a spinlock in lpfc_debugfs_nodelist_data() introduced
      a bug that lets us not skip NULL pointers correctly, as noticed by
      gcc-8:
      
      drivers/scsi/lpfc/lpfc_debugfs.c: In function 'lpfc_debugfs_nodelist_data.constprop':
      drivers/scsi/lpfc/lpfc_debugfs.c:728:13: error: 'nrport' may be used uninitialized in this function [-Werror=maybe-uninitialized]
         if (nrport->port_role & FC_PORT_ROLE_NVME_INITIATOR)
      
      This changes the logic back to what it was, while keeping the added
      spinlock.
      
      Fixes: 9e210178 ("scsi: lpfc: Synchronize access to remoteport via rport")
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      3d57a04f
    • M
      tools/testing/nvdimm: Fix the array size for dimm devices. · 08609aac
      Masayoshi Mizuma 提交于
      [ Upstream commit af31b04b67f4fd7f639fd465a507c154c46fc9fb ]
      
      KASAN reports following global out of bounds access while
      nfit_test is being loaded. The out of bound access happens
      the following reference to dimm_fail_cmd_flags[dimm]. 'dimm' is
      over than the index value, NUM_DCR (==5).
      
        static int override_return_code(int dimm, unsigned int func, int rc)
        {
                if ((1 << func) & dimm_fail_cmd_flags[dimm]) {
      
      dimm_fail_cmd_flags[] definition:
        static unsigned long dimm_fail_cmd_flags[NUM_DCR];
      
      'dimm' is the return value of get_dimm(), and get_dimm() returns
      the index of handle[] array. The handle[] has 7 index. Let's use
      ARRAY_SIZE(handle) as the array size.
      
      KASAN report:
      
      ==================================================================
      BUG: KASAN: global-out-of-bounds in nfit_test_ctl+0x47bb/0x55b0 [nfit_test]
      Read of size 8 at addr ffffffffc10cbbe8 by task kworker/u41:0/8
      ...
      Call Trace:
       dump_stack+0xea/0x1b0
       ? dump_stack_print_info.cold.0+0x1b/0x1b
       ? kmsg_dump_rewind_nolock+0xd9/0xd9
       print_address_description+0x65/0x22e
       ? nfit_test_ctl+0x47bb/0x55b0 [nfit_test]
       kasan_report.cold.6+0x92/0x1a6
       nfit_test_ctl+0x47bb/0x55b0 [nfit_test]
      ...
      The buggy address belongs to the variable:
       dimm_fail_cmd_flags+0x28/0xffffffffffffa440 [nfit_test]
      ==================================================================
      
      Fixes: 39611e83 ("tools/testing/nvdimm: Make DSM failure code injection...")
      Signed-off-by: NMasayoshi Mizuma <m.mizuma@jp.fujitsu.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      08609aac
    • J
      pinctrl: meson: fix meson8b ao pull register bits · c4b25ef5
      Jerome Brunet 提交于
      [ Upstream commit a1705f02704cd8a24d434bfd0141ee8142ad277a ]
      
      AO pull register definition is inverted between pull (up/down) and
      pull enable. Fixing this allows to properly apply bias setting
      through pinconf
      
      Fixes: 0fefcb68 ("pinctrl: Add support for Meson8b")
      Signed-off-by: NJerome Brunet <jbrunet@baylibre.com>
      Signed-off-by: NLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      c4b25ef5
    • J
      pinctrl: meson: fix meson8 ao pull register bits · 93620bc4
      Jerome Brunet 提交于
      [ Upstream commit e91b162d2868672d06010f34aa83d408db13d3c6 ]
      
      AO pull register definition is inverted between pull (up/down) and
      pull enable. Fixing this allows to properly apply bias setting
      through pinconf
      
      Fixes: 6ac73095 ("pinctrl: add driver for Amlogic Meson SoCs")
      Signed-off-by: NJerome Brunet <jbrunet@baylibre.com>
      Signed-off-by: NLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      93620bc4
    • J
      pinctrl: meson: fix gxl ao pull register bits · c74e3fc6
      Jerome Brunet 提交于
      [ Upstream commit ed3a2b74f3eb34c84c8377353f4730f05acdfd05 ]
      
      AO pull register definition is inverted between pull (up/down) and
      pull enable. Fixing this allows to properly apply bias setting
      through pinconf
      
      Fixes: 0f15f500 ("pinctrl: meson: Add GXL pinctrl definitions")
      Signed-off-by: NJerome Brunet <jbrunet@baylibre.com>
      Signed-off-by: NLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      c74e3fc6
    • J
      pinctrl: meson: fix gxbb ao pull register bits · 5922ab4a
      Jerome Brunet 提交于
      [ Upstream commit 4bc51e1e350cd4707ce6e551a93eae26d40b9889 ]
      
      AO pull register definition is inverted between pull (up/down) and
      pull enable. Fixing this allows to properly apply bias setting
      through pinconf
      
      Fixes: 468c234f ("pinctrl: amlogic: Add support for Amlogic Meson GXBB SoC")
      Signed-off-by: NJerome Brunet <jbrunet@baylibre.com>
      Signed-off-by: NLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      5922ab4a
    • J
      pinctrl: meson: fix pinconf bias disable · 71ab26e9
      Jerome Brunet 提交于
      [ Upstream commit e39f9dd8206ad66992ac0e6218ef1ba746f2cce9 ]
      
      If a bias is enabled on a pin of an Amlogic SoC, calling .pin_config_set()
      with PIN_CONFIG_BIAS_DISABLE will not disable the bias. Instead it will
      force a pull-down bias on the pin.
      
      Instead of the pull type register bank, the driver should access the pull
      enable register bank.
      
      Fixes: 6ac73095 ("pinctrl: add driver for Amlogic Meson SoCs")
      Signed-off-by: NJerome Brunet <jbrunet@baylibre.com>
      Acked-by: NNeil Armstrong <narmstrong@baylibre.com>
      Signed-off-by: NLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      71ab26e9
    • A
      fanotify: fix handling of events on child sub-directory · 20663629
      Amir Goldstein 提交于
      commit b469e7e47c8a075cc08bcd1e85d4365134bdcdd5 upstream.
      
      When an event is reported on a sub-directory and the parent inode has
      a mark mask with FS_EVENT_ON_CHILD|FS_ISDIR, the event will be sent to
      fsnotify() even if the event type is not in the parent mark mask
      (e.g. FS_OPEN).
      
      Further more, if that event happened on a mount or a filesystem with
      a mount/sb mark that does have that event type in their mask, the "on
      child" event will be reported on the mount/sb mark.  That is not
      desired, because user will get a duplicate event for the same action.
      
      Note that the event reported on the victim inode is never merged with
      the event reported on the parent inode, because of the check in
      should_merge(): old_fsn->inode == new_fsn->inode.
      
      Fix this by looking for a match of an actual event type (i.e. not just
      FS_ISDIR) in parent's inode mark mask and by not reporting an "on child"
      event to group if event type is only found on mount/sb marks.
      
      [backport hint: The bug seems to have always been in fanotify, but this
                      patch will only apply cleanly to v4.19.y]
      
      Cc: <stable@vger.kernel.org> # v4.19
      Signed-off-by: NAmir Goldstein <amir73il@gmail.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      [amir: backport to v4.19]
      Signed-off-by: NAmir Goldstein <amir73il@gmail.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      20663629
    • A
      fsnotify: generalize handling of extra event flags · 1dc3c17c
      Amir Goldstein 提交于
      commit 007d1e8395eaa59b0e7ad9eb2b53a40859446a88 upstream.
      
      FS_EVENT_ON_CHILD gets a special treatment in fsnotify() because it is
      not a flag specifying an event type, but rather an extra flags that may
      be reported along with another event and control the handling of the
      event by the backend.
      
      FS_ISDIR is also an "extra flag" and not an "event type" and therefore
      desrves the same treatment. With inotify/dnotify backends it was never
      possible to set FS_ISDIR in mark masks, so it did not matter.
      With fanotify backend, mark adding code jumps through hoops to avoid
      setting the FS_ISDIR in the commulative object mask.
      
      Separate the constant ALL_FSNOTIFY_EVENTS to ALL_FSNOTIFY_FLAGS and
      ALL_FSNOTIFY_EVENTS, so the latter can be used to test for specific
      event types.
      Signed-off-by: NAmir Goldstein <amir73il@gmail.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1dc3c17c
    • M
      IB/hfi1: Eliminate races in the SDMA send error path · 6763372b
      Michael J. Ruhl 提交于
      commit a0e0cb82804a6a21d9067022c2dfdf80d11da429 upstream.
      
      pq_update() can only be called in two places: from the completion
      function when the complete (npkts) sequence of packets has been
      submitted and processed, or from setup function if a subset of the
      packets were submitted (i.e. the error path).
      
      Currently both paths can call pq_update() if an error occurrs.  This
      race will cause the n_req value to go negative, hanging file_close(),
      or cause a crash by freeing the txlist more than once.
      
      Several variables are used to determine SDMA send state.  Most of
      these are unnecessary, and have code inspectible races between the
      setup function and the completion function, in both the send path and
      the error path.
      
      The request 'status' value can be set by the setup or by the
      completion function.  This is code inspectibly racy.  Since the status
      is not needed in the completion code or by the caller it has been
      removed.
      
      The request 'done' value races between usage by the setup and the
      completion function.  The completion function does not need this.
      When the number of processed packets matches npkts, it is done.
      
      The 'has_error' value races between usage of the setup and the
      completion function.  This can cause incorrect error handling and leave
      the n_req in an incorrect value (i.e. negative).
      
      Simplify the code by removing all of the unneeded state checks and
      variables.
      
      Clean up iovs node when it is freed.
      
      Eliminate race conditions in the error path:
      
      If all packets are submitted, the completion handler will set the
      completion status correctly (ok or aborted).
      
      If all packets are not submitted, the caller must wait until the
      submitted packets have completed, and then set the completion status.
      
      These two change eliminate the race condition in the error path.
      Reviewed-by: NMitko Haralanov <mitko.haralanov@intel.com>
      Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: NMichael J. Ruhl <michael.j.ruhl@intel.com>
      Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6763372b
    • E
      ACPICA: AML interpreter: add region addresses in global list during initialization · 87403e35
      Erik Schmauss 提交于
      commit 4abb951b73ff0a8a979113ef185651aa3c8da19b upstream.
      
      The table load process omitted adding the operation region address
      range to the global list. This omission is problematic because the OS
      queries the global list to check for address range conflicts before
      deciding which drivers to load. This commit may result in warning
      messages that look like the following:
      
      [    7.871761] ACPI Warning: system_IO range 0x00000428-0x0000042F conflicts with op_region 0x00000400-0x0000047F (\PMIO) (20180531/utaddress-213)
      [    7.871769] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
      
      However, these messages do not signify regressions. It is a result of
      properly adding address ranges within the global address list.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=200011Tested-by: NJean-Marc Lenoir <archlinux@jihemel.com>
      Signed-off-by: NErik Schmauss <erik.schmauss@intel.com>
      Cc: All applicable <stable@vger.kernel.org>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: Jean Delvare <jdelvare@suse.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      87403e35
    • M
      can: flexcan: remove not needed struct flexcan_priv::tx_mb and struct flexcan_priv::tx_mb_idx · d5a9ba43
      Marc Kleine-Budde 提交于
      commit e05237f9da42ee52e73acea0bb082d788e111229 upstream.
      
      The previous patch changes the TX path to always use the last mailbox
      regardless of the used offload scheme (rx-fifo or timestamp based). This
      means members "tx_mb" and "tx_mb_idx" of the struct flexcan_priv don't
      depend on the offload scheme, so replace them by compile time constants.
      
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d5a9ba43
    • A
      can: flexcan: Always use last mailbox for TX · 24e55897
      Alexander Stein 提交于
      commit cbffaf7aa09edbaea2bc7dc440c945297095e2fd upstream.
      
      Essentially this patch moves the TX mailbox to position 63, regardless
      of timestamp based offloading or RX FIFO. So mainly the iflag register
      usage regarding TX has changed. The rest is consolidating RX FIFO and
      timestamp offloading as they now use both the same TX mailbox.
      
      The reason is a very annoying behavior regarding sending RTR frames when
      _not_ using RX FIFO:
      
      If a TX mailbox sent a RTR frame it becomes a RX mailbox. For that
      reason flexcan_irq disables the TX mailbox again. But if during the time
      the RTR was sent and the TX mailbox is disabled a new CAN frames is
      received, it is lost without notice. The reason is that so-called
      "Move-in" process starts from the lowest mailbox which happen to be a TX
      mailbox set to EMPTY.
      
      Steps to reproduce (I used an imx7d):
      1. generate regular bursts of messages
      2. send a RTR from flexcan with higher priority than burst messages every
         1ms, e.g. cangen -I 0x100 -L 0 -g 1 -R can0
      3. notice a lost message without notification after some seconds
      
      When running an iperf in parallel this problem is occurring even more
      frequently. Using filters is not possible as at least one single CAN-ID
      is allowed. Handling the TX MB during RX is also not possible as there
      is no race-free disable of RX MB.
      
      There is still a slight window when the described problem can occur. But
      for that all RX MB must be in use which is essentially next to an
      overrun. Still there will be no indication if it ever occurs.
      Signed-off-by: NAlexander Stein <alexander.stein@systec-electronic.com>
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      24e55897
    • L
      can: hi311x: Use level-triggered interrupt · 50d94ac1
      Lukas Wunner 提交于
      commit f164d0204b1156a7e0d8d1622c1a8d25752befec upstream.
      
      If the hi3110 shares the SPI bus with another traffic-intensive device
      and packets are received in high volume (by a separate machine sending
      with "cangen -g 0 -i -x"), reception stops after a few minutes and the
      counter in /proc/interrupts stops incrementing.  Bus state is "active".
      Bringing the interface down and back up reconvenes the reception.  The
      issue is not observed when the hi3110 is the sole device on the SPI bus.
      
      Using a level-triggered interrupt makes the issue go away and lets the
      hi3110 successfully receive 2 GByte over the course of 5 days while a
      ks8851 Ethernet chip on the same SPI bus handles 6 GByte of traffic.
      
      Unfortunately the hi3110 datasheet is mum on the trigger type.  The pin
      description on page 3 only specifies the polarity (active high):
      http://www.holtic.com/documents/371-hi-3110_v-rev-kpdf.do
      
      Cc: Mathias Duckeck <m.duckeck@kunbus.de>
      Cc: Akshay Bhat <akshay.bhat@timesys.com>
      Cc: Casey Fitzpatrick <casey.fitzpatrick@timesys.com>
      Signed-off-by: NLukas Wunner <lukas@wunner.de>
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      50d94ac1
    • O
      can: raw: check for CAN FD capable netdev in raw_sendmsg() · bf8295fa
      Oliver Hartkopp 提交于
      commit a43608fa77213ad5ac5f75994254b9f65d57cfa0 upstream.
      
      When the socket is CAN FD enabled it can handle CAN FD frame
      transmissions.  Add an additional check in raw_sendmsg() as a CAN2.0 CAN
      driver (non CAN FD) should never see a CAN FD frame. Due to the commonly
      used can_dropped_invalid_skb() function the CAN 2.0 driver would drop
      that CAN FD frame anyway - but with this patch the user gets a proper
      -EINVAL return code.
      Signed-off-by: NOliver Hartkopp <socketcan@hartkopp.net>
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bf8295fa
    • O
      can: flexcan: handle tx-complete CAN frames via rx-offload infrastructure · 04f98577
      Oleksij Rempel 提交于
      commit ed72bc8bcb9277061e753faf300b20f97323761c upstream.
      
      Current flexcan driver will put TX-ECHO in regular unsorted way, in
      this case TX-ECHO can come after the response to the same TXed message.
      In some cases, for example for J1939 stack, things will break.
      This patch is using new rx-offload API to put the messages just in the
      right place.
      Signed-off-by: NOleksij Rempel <o.rempel@pengutronix.de>
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      04f98577