1. 22 2月, 2019 16 次提交
    • A
      kasan: fix random seed generation for tag-based mode · 3f41b609
      Andrey Konovalov 提交于
      There are two issues with assigning random percpu seeds right now:
      
      1. We use for_each_possible_cpu() to iterate over cpus, but cpumask is
         not set up yet at the moment of kasan_init(), and thus we only set
         the seed for cpu #0.
      
      2. A call to get_random_u32() always returns the same number and produces
         a message in dmesg, since the random subsystem is not yet initialized.
      
      Fix 1 by calling kasan_init_tags() after cpumask is set up.
      
      Fix 2 by using get_cycles() instead of get_random_u32(). This gives us
      lower quality random numbers, but it's good enough, as KASAN is meant to
      be used as a debugging tool and not a mitigation.
      
      Link: http://lkml.kernel.org/r/1f815cc914b61f3516ed4cc9bfd9eeca9bd5d9de.1550677973.git.andreyknvl@google.comSigned-off-by: NAndrey Konovalov <andreyknvl@google.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3f41b609
    • D
      tmpfs: fix link accounting when a tmpfile is linked in · 1062af92
      Darrick J. Wong 提交于
      tmpfs has a peculiarity of accounting hard links as if they were
      separate inodes: so that when the number of inodes is limited, as it is
      by default, a user cannot soak up an unlimited amount of unreclaimable
      dcache memory just by repeatedly linking a file.
      
      But when v3.11 added O_TMPFILE, and the ability to use linkat() on the
      fd, we missed accommodating this new case in tmpfs: "df -i" shows that
      an extra "inode" remains accounted after the file is unlinked and the fd
      closed and the actual inode evicted.  If a user repeatedly links
      tmpfiles into a tmpfs, the limit will be hit (ENOSPC) even after they
      are deleted.
      
      Just skip the extra reservation from shmem_link() in this case: there's
      a sense in which this first link of a tmpfile is then cheaper than a
      hard link of another file, but the accounting works out, and there's
      still good limiting, so no need to do anything more complicated.
      
      Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1902182134370.7035@eggly.anvils
      Fixes: f4e0c30c ("allow the temp files created by open() to be linked to")
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NHugh Dickins <hughd@google.com>
      Reported-by: NMatej Kupljen <matej.kupljen@gmail.com>
      Acked-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1062af92
    • J
      psi: avoid divide-by-zero crash inside virtual machines · 4e37504d
      Johannes Weiner 提交于
      We've been seeing hard-to-trigger psi crashes when running inside VM
      instances:
      
          divide error: 0000 [#1] SMP PTI
          Modules linked in: [...]
          CPU: 0 PID: 212 Comm: kworker/0:2 Not tainted 4.16.18-119_fbk9_3817_gfe944c98d695 #119
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
          Workqueue: events psi_clock
          RIP: 0010:psi_update_stats+0x270/0x490
          RSP: 0018:ffffc90001117e10 EFLAGS: 00010246
          RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8800a35a13f8
          RDX: 0000000000000000 RSI: ffff8800a35a1340 RDI: 0000000000000000
          RBP: 0000000000000658 R08: ffff8800a35a1470 R09: 0000000000000000
          R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
          R13: 0000000000000000 R14: 0000000000000000 R15: 00000000000f8502
          FS:  0000000000000000(0000) GS:ffff88023fc00000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: 00007fbe370fa000 CR3: 00000000b1e3a000 CR4: 00000000000006f0
          DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
          DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
          Call Trace:
           psi_clock+0x12/0x50
           process_one_work+0x1e0/0x390
           worker_thread+0x2b/0x3c0
           ? rescuer_thread+0x330/0x330
           kthread+0x113/0x130
           ? kthread_create_worker_on_cpu+0x40/0x40
           ? SyS_exit_group+0x10/0x10
           ret_from_fork+0x35/0x40
          Code: 48 0f 47 c7 48 01 c2 45 85 e4 48 89 16 0f 85 e6 00 00 00 4c 8b 49 10 4c 8b 51 08 49 69 d9 f2 07 00 00 48 6b c0 64 4c 8b 29 31 d2 <48> f7 f7 49 69 d5 8d 06 00 00 48 89 c5 4c 69 f0 00 98 0b 00 48
      
      The Code-line points to `period` being 0 inside update_stats(), and we
      divide by that when calculating that period's pressure percentage.
      
      The elapsed period should never be 0.  The reason this can happen is due
      to an off-by-one in the idle time / missing period calculation combined
      with a coarse sched_clock() in the virtual machine.
      
      The target time for aggregation is advanced into the future on a fixed
      grid to prevent clock drift.  So when an aggregation runs after some idle
      period, we can not just set it to "now + psi_period", but have to
      calculate the downtime and advance the target time relative to itself.
      
      However, if the aggregator was disabled exactly one psi_period (ns), we
      drop one idle period in the calculation due to a > when we should do >=.
      In that case, next_update will be advanced from 'now - psi_period' to
      'now' when it should be moved to 'now + psi_period'.  The run finishes
      with last_update == next_update == sched_clock().
      
      With hardware clocks, this exact nanosecond match isn't likely in the
      first place; but if it does happen, the clock will still have moved on and
      the period non-zero by the time the worker runs.  A pointlessly short
      period, but besides the extra work, no harm no foul.  However, a slow
      sched_clock() like we have on VMs might not have advanced either by the
      time the worker runs again.  And when we calculate the elapsed period, the
      result, our pressure divisor, will be 0.  Ouch.
      
      Fix this by correctly handling the situation when the elapsed time between
      aggregation runs is precisely two periods, and advance the expiration
      timestamp correctly to period into the future.
      
      Link: http://lkml.kernel.org/r/20190214193157.15788-1-hannes@cmpxchg.orgSigned-off-by: NJohannes Weiner <hannes@cmpxchg.org>
      Reported-by: Łukasz Siudut <lsiudut@fb.com
      Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4e37504d
    • M
      mm: handle lru_add_drain_all for UP properly · 6ea183d6
      Michal Hocko 提交于
      Since for_each_cpu(cpu, mask) added by commit 2d3854a3
      ("cpumask: introduce new API, without changing anything") did not
      evaluate the mask argument if NR_CPUS == 1 due to CONFIG_SMP=n,
      lru_add_drain_all() is hitting WARN_ON() at __flush_work() added by
      commit 4d43d395 ("workqueue: Try to catch flush_work() without
      INIT_WORK().") by unconditionally calling flush_work() [1].
      
      Workaround this issue by using CONFIG_SMP=n specific lru_add_drain_all
      implementation.  There is no real need to defer the implementation to
      the workqueue as the draining is going to happen on the local cpu.  So
      alias lru_add_drain_all to lru_add_drain which does all the necessary
      work.
      
      [akpm@linux-foundation.org: fix various build warnings]
      [1] https://lkml.kernel.org/r/18a30387-6aa5-6123-e67c-57579ecc3f38@roeck-us.net
      Link: http://lkml.kernel.org/r/20190213124334.GH4525@dhcp22.suse.czSigned-off-by: NMichal Hocko <mhocko@suse.com>
      Reported-by: NGuenter Roeck <linux@roeck-us.net>
      Debugged-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6ea183d6
    • M
      mm, page_alloc: fix a division by zero error when boosting watermarks v2 · 94b3334c
      Mel Gorman 提交于
      Yury Norov reported that an arm64 KVM instance could not boot since
      after v5.0-rc1 and could addressed by reverting the patches
      
        1c30844d ("mm: reclaim small amounts of memory when an external
        73444bc4 ("mm, page_alloc: do not wake kswapd with zone lock held")
      
      The problem is that a division by zero error is possible if boosting
      occurs very early in boot if the system has very little memory.  This
      patch avoids the division by zero error.
      
      Link: http://lkml.kernel.org/r/20190213143012.GT9565@techsingularity.net
      Fixes: 1c30844d ("mm: reclaim small amounts of memory when an external fragmentation event occurs")
      Signed-off-by: NMel Gorman <mgorman@techsingularity.net>
      Reported-by: NYury Norov <yury.norov@gmail.com>
      Tested-by: NYury Norov <yury.norov@gmail.com>
      Tested-by: NWill Deacon <will.deacon@arm.com>
      Acked-by: NVlastimil Babka <vbabka@suse.cz>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      94b3334c
    • R
      mm/debug.c: fix __dump_page() for poisoned pages · 311ade0e
      Robin Murphy 提交于
      Evaluating page_mapping() on a poisoned page ends up dereferencing junk
      and making PF_POISONED_CHECK() considerably crashier than intended:
      
          Unable to handle kernel NULL pointer dereference at virtual address 0000000000000006
          Mem abort info:
            ESR = 0x96000005
            Exception class = DABT (current EL), IL = 32 bits
            SET = 0, FnV = 0
            EA = 0, S1PTW = 0
          Data abort info:
            ISV = 0, ISS = 0x00000005
            CM = 0, WnR = 0
          user pgtable: 4k pages, 39-bit VAs, pgdp = 00000000c2f6ac38
          [0000000000000006] pgd=0000000000000000, pud=0000000000000000
          Internal error: Oops: 96000005 [#1] PREEMPT SMP
          Modules linked in:
          CPU: 2 PID: 491 Comm: bash Not tainted 5.0.0-rc1+ #1
          Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Dec 17 2018
          pstate: 00000005 (nzcv daif -PAN -UAO)
          pc : page_mapping+0x18/0x118
          lr : __dump_page+0x1c/0x398
          Process bash (pid: 491, stack limit = 0x000000004ebd4ecd)
          Call trace:
           page_mapping+0x18/0x118
           __dump_page+0x1c/0x398
           dump_page+0xc/0x18
           remove_store+0xbc/0x120
           dev_attr_store+0x18/0x28
           sysfs_kf_write+0x40/0x50
           kernfs_fop_write+0x130/0x1d8
           __vfs_write+0x30/0x180
           vfs_write+0xb4/0x1a0
           ksys_write+0x60/0xd0
           __arm64_sys_write+0x18/0x20
           el0_svc_common+0x94/0xf8
           el0_svc_handler+0x68/0x70
           el0_svc+0x8/0xc
          Code: f9400401 d1000422 f240003f 9a801040 (f9400402)
          ---[ end trace cdb5eb5bf435cecb ]---
      
      Fix that by not inspecting the mapping until we've determined that it's
      likely to be valid.  Now the above condition still ends up stopping the
      kernel, but in the correct manner:
      
          page:ffffffbf20000000 is uninitialized and poisoned
          raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
          raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
          page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
          ------------[ cut here ]------------
          kernel BUG at ./include/linux/mm.h:1006!
          Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
          Modules linked in:
          CPU: 1 PID: 483 Comm: bash Not tainted 5.0.0-rc1+ #3
          Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Dec 17 2018
          pstate: 40000005 (nZcv daif -PAN -UAO)
          pc : remove_store+0xbc/0x120
          lr : remove_store+0xbc/0x120
          ...
      
      Link: http://lkml.kernel.org/r/03b53ee9d7e76cda4b9b5e1e31eea080db033396.1550071778.git.robin.murphy@arm.com
      Fixes: 1c6fb1d8 ("mm: print more information about mapping in __dump_page")
      Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      311ade0e
    • M
      proc, oom: do not report alien mms when setting oom_score_adj · b2b46993
      Michal Hocko 提交于
      Tetsuo has reported that creating a thousands of processes sharing MM
      without SIGHAND (aka alien threads) and setting
      /proc/<pid>/oom_score_adj will swamp the kernel log and takes ages [1]
      to finish.  This is especially worrisome that all that printing is done
      under RCU lock and this can potentially trigger RCU stall or softlockup
      detector.
      
      The primary reason for the printk was to catch potential users who might
      depend on the behavior prior to 44a70ade ("mm, oom_adj: make sure
      processes sharing mm have same view of oom_score_adj") but after more
      than 2 years without a single report I guess it is safe to simply remove
      the printk altogether.
      
      The next step should be moving oom_score_adj over to the mm struct and
      remove all the tasks crawling as suggested by [2]
      
      [1] http://lkml.kernel.org/r/97fce864-6f75-bca5-14bc-12c9f890e740@i-love.sakura.ne.jp
      [2] http://lkml.kernel.org/r/20190117155159.GA4087@dhcp22.suse.cz
      
      Link: http://lkml.kernel.org/r/20190212102129.26288-1-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com>
      Reported-by: NTetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Yong-Taek Lee <ytk.lee@samsung.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b2b46993
    • Q
      slub: fix SLAB_CONSISTENCY_CHECKS + KASAN_SW_TAGS · 338cfaad
      Qian Cai 提交于
      Enabling SLUB_DEBUG's SLAB_CONSISTENCY_CHECKS with KASAN_SW_TAGS
      triggers endless false positives during boot below due to
      check_valid_pointer() checks tagged pointers which have no addresses
      that is valid within slab pages:
      
        BUG radix_tree_node (Tainted: G    B            ): Freelist Pointer check fails
        -----------------------------------------------------------------------------
      
        INFO: Slab objects=69 used=69 fp=0x          (null) flags=0x7ffffffc000200
        INFO: Object @offset=15060037153926966016 fp=0x
      
        Redzone: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 18 6b 06 00 08 80 ff d0  .........k......
        Object : 18 6b 06 00 08 80 ff d0 00 00 00 00 00 00 00 00  .k..............
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Redzone: bb bb bb bb bb bb bb bb                          ........
        Padding: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
        CPU: 0 PID: 0 Comm: swapper/0 Tainted: G    B             5.0.0-rc5+ #18
        Call trace:
          dump_backtrace+0x0/0x450
          show_stack+0x20/0x2c
          __dump_stack+0x20/0x28
          dump_stack+0xa0/0xfc
          print_trailer+0x1bc/0x1d0
          object_err+0x40/0x50
          alloc_debug_processing+0xf0/0x19c
          ___slab_alloc+0x554/0x704
          kmem_cache_alloc+0x2f8/0x440
          radix_tree_node_alloc+0x90/0x2fc
          idr_get_free+0x1e8/0x6d0
          idr_alloc_u32+0x11c/0x2a4
          idr_alloc+0x74/0xe0
          worker_pool_assign_id+0x5c/0xbc
          workqueue_init_early+0x49c/0xd50
          start_kernel+0x52c/0xac4
        FIX radix_tree_node: Marking all objects used
      
      Link: http://lkml.kernel.org/r/20190209044128.3290-1-cai@lca.pwSigned-off-by: NQian Cai <cai@lca.pw>
      Reviewed-by: NAndrey Konovalov <andreyknvl@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      338cfaad
    • A
      kasan, slub: fix more conflicts with CONFIG_SLAB_FREELIST_HARDENED · d36a63a9
      Andrey Konovalov 提交于
      When CONFIG_KASAN_SW_TAGS is enabled, ptr_addr might be tagged.  Normally,
      this doesn't cause any issues, as both set_freepointer() and
      get_freepointer() are called with a pointer with the same tag.  However,
      there are some issues with CONFIG_SLUB_DEBUG code.  For example, when
      __free_slub() iterates over objects in a cache, it passes untagged
      pointers to check_object().  check_object() in turns calls
      get_freepointer() with an untagged pointer, which causes the freepointer
      to be restored incorrectly.
      
      Add kasan_reset_tag to freelist_ptr(). Also add a detailed comment.
      
      Link: http://lkml.kernel.org/r/bf858f26ef32eb7bd24c665755b3aee4bc58d0e4.1550103861.git.andreyknvl@google.comSigned-off-by: NAndrey Konovalov <andreyknvl@google.com>
      Reported-by: NQian Cai <cai@lca.pw>
      Tested-by: NQian Cai <cai@lca.pw>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d36a63a9
    • A
      kasan, slub: fix conflicts with CONFIG_SLAB_FREELIST_HARDENED · 18e50661
      Andrey Konovalov 提交于
      CONFIG_SLAB_FREELIST_HARDENED hashes freelist pointer with the address of
      the object where the pointer gets stored.  With tag based KASAN we don't
      account for that when building freelist, as we call set_freepointer() with
      the first argument untagged.  This patch changes the code to properly
      propagate tags throughout the loop.
      
      Link: http://lkml.kernel.org/r/3df171559c52201376f246bf7ce3184fe21c1dc7.1549921721.git.andreyknvl@google.comSigned-off-by: NAndrey Konovalov <andreyknvl@google.com>
      Reported-by: NQian Cai <cai@lca.pw>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Cc: Kostya Serebryany <kcc@google.com>
      Cc: Evgeniy Stepanov <eugenis@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      18e50661
    • A
      kasan, slub: move kasan_poison_slab hook before page_address · a7101224
      Andrey Konovalov 提交于
      With tag based KASAN page_address() looks at the page flags to see whether
      the resulting pointer needs to have a tag set.  Since we don't want to set
      a tag when page_address() is called on SLAB pages, we call
      page_kasan_tag_reset() in kasan_poison_slab().  However in allocate_slab()
      page_address() is called before kasan_poison_slab().  Fix it by changing
      the order.
      
      [andreyknvl@google.com: fix compilation error when CONFIG_SLUB_DEBUG=n]
        Link: http://lkml.kernel.org/r/ac27cc0bbaeb414ed77bcd6671a877cf3546d56e.1550066133.git.andreyknvl@google.com
      Link: http://lkml.kernel.org/r/cd895d627465a3f1c712647072d17f10883be2a1.1549921721.git.andreyknvl@google.comSigned-off-by: NAndrey Konovalov <andreyknvl@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Evgeniy Stepanov <eugenis@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Kostya Serebryany <kcc@google.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Qian Cai <cai@lca.pw>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a7101224
    • A
      kmemleak: account for tagged pointers when calculating pointer range · a2f77575
      Andrey Konovalov 提交于
      kmemleak keeps two global variables, min_addr and max_addr, which store
      the range of valid (encountered by kmemleak) pointer values, which it
      later uses to speed up pointer lookup when scanning blocks.
      
      With tagged pointers this range will get bigger than it needs to be.  This
      patch makes kmemleak untag pointers before saving them to min_addr and
      max_addr and when performing a lookup.
      
      Link: http://lkml.kernel.org/r/16e887d442986ab87fe87a755815ad92fa431a5f.1550066133.git.andreyknvl@google.comSigned-off-by: NAndrey Konovalov <andreyknvl@google.com>
      Tested-by: NQian Cai <cai@lca.pw>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Evgeniy Stepanov <eugenis@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Kostya Serebryany <kcc@google.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a2f77575
    • A
      kasan, kmemleak: pass tagged pointers to kmemleak · 53128245
      Andrey Konovalov 提交于
      Right now we call kmemleak hooks before assigning tags to pointers in
      KASAN hooks.  As a result, when an objects gets allocated, kmemleak sees a
      differently tagged pointer, compared to the one it sees when the object
      gets freed.  Fix it by calling KASAN hooks before kmemleak's ones.
      
      Link: http://lkml.kernel.org/r/cd825aa4897b0fc37d3316838993881daccbe9f5.1549921721.git.andreyknvl@google.comSigned-off-by: NAndrey Konovalov <andreyknvl@google.com>
      Reported-by: NQian Cai <cai@lca.pw>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Evgeniy Stepanov <eugenis@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Kostya Serebryany <kcc@google.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      53128245
    • A
      kasan: fix assigning tags twice · e1db95be
      Andrey Konovalov 提交于
      When an object is kmalloc()'ed, two hooks are called: kasan_slab_alloc()
      and kasan_kmalloc().  Right now we assign a tag twice, once in each of the
      hooks.  Fix it by assigning a tag only in the former hook.
      
      Link: http://lkml.kernel.org/r/ce8c6431da735aa7ec051fd6497153df690eb021.1549921721.git.andreyknvl@google.comSigned-off-by: NAndrey Konovalov <andreyknvl@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Evgeniy Stepanov <eugenis@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Kostya Serebryany <kcc@google.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Qian Cai <cai@lca.pw>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e1db95be
    • R
      numa: change get_mempolicy() to use nr_node_ids instead of MAX_NUMNODES · 050c17f2
      Ralph Campbell 提交于
      The system call, get_mempolicy() [1], passes an unsigned long *nodemask
      pointer and an unsigned long maxnode argument which specifies the length
      of the user's nodemask array in bits (which is rounded up).  The manual
      page says that if the maxnode value is too small, get_mempolicy will
      return EINVAL but there is no system call to return this minimum value.
      To determine this value, some programs search /proc/<pid>/status for a
      line starting with "Mems_allowed:" and use the number of digits in the
      mask to determine the minimum value.  A recent change to the way this line
      is formatted [2] causes these programs to compute a value less than
      MAX_NUMNODES so get_mempolicy() returns EINVAL.
      
      Change get_mempolicy(), the older compat version of get_mempolicy(), and
      the copy_nodes_to_user() function to use nr_node_ids instead of
      MAX_NUMNODES, thus preserving the defacto method of computing the minimum
      size for the nodemask array and the maxnode argument.
      
      [1] http://man7.org/linux/man-pages/man2/get_mempolicy.2.html
      [2] https://lore.kernel.org/lkml/1545405631-6808-1-git-send-email-longman@redhat.com
      
      Link: http://lkml.kernel.org/r/20190211180245.22295-1-rcampbell@nvidia.com
      Fixes: 4fb8e5b89bcbbbb ("include/linux/nodemask.h: use nr_node_ids (not MAX_NUMNODES) in __nodemask_pr_numnodes()")
      Signed-off-by: NRalph Campbell <rcampbell@nvidia.com>
      Suggested-by: NAlexander Duyck <alexander.duyck@gmail.com>
      Cc: Waiman Long <longman@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      050c17f2
    • A
      revert "initramfs: cleanup incomplete rootfs" · a841c673
      Andrew Morton 提交于
      Revert ff1522bb ("initramfs: cleanup incomplete rootfs").
      
      Andy reports
      
      : This breaks my setup where I have U-boot provided more size of initramfs
      : than needed.  This allows a bit of flexibility to increase or decrease
      : initramfs compressed image without taking care of bootloader.  The proper
      : solution is to do this if we sure that we didn't get enough memory,
      : otherwise I can't consider the error fatal to clean up rootfs.
      
      Fixes: ff1522bb ("initramfs: cleanup incomplete rootfs")
      Reported-by: NAndy Shevchenko <andy.shevchenko@gmail.com>
      Tested-by: NAndy Shevchenko <andy.shevchenko@gmail.com>
      Cc: David Engraf <david.engraf@sysgo.com>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Philippe Ombredanne <pombredanne@nexb.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a841c673
  2. 21 2月, 2019 5 次提交
  3. 20 2月, 2019 3 次提交
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 40e196a9
      Linus Torvalds 提交于
      Pull networking fixes from David Miller:
      
       1) Fix suspend and resume in mt76x0u USB driver, from Stanislaw
          Gruszka.
      
       2) Missing memory barriers in xsk, from Magnus Karlsson.
      
       3) rhashtable fixes in mac80211 from Herbert Xu.
      
       4) 32-bit MIPS eBPF JIT fixes from Paul Burton.
      
       5) Fix for_each_netdev_feature() on big endian, from Hauke Mehrtens.
      
       6) GSO validation fixes from Willem de Bruijn.
      
       7) Endianness fix for dwmac4 timestamp handling, from Alexandre Torgue.
      
       8) More strict checks in tcp_v4_err(), from Eric Dumazet.
      
       9) af_alg_release should NULL out the sk after the sock_put(), from Mao
          Wenan.
      
      10) Missing unlock in mac80211 mesh error path, from Wei Yongjun.
      
      11) Missing device put in hns driver, from Salil Mehta.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits)
        sky2: Increase D3 delay again
        vhost: correctly check the return value of translate_desc() in log_used()
        net: netcp: Fix ethss driver probe issue
        net: hns: Fixes the missing put_device in positive leg for roce reset
        net: stmmac: Fix a race in EEE enable callback
        qed: Fix iWARP syn packet mac address validation.
        qed: Fix iWARP buffer size provided for syn packet processing.
        r8152: Add support for MAC address pass through on RTL8153-BD
        mac80211: mesh: fix missing unlock on error in table_path_del()
        net/mlx4_en: fix spelling mistake: "quiting" -> "quitting"
        net: crypto set sk to NULL when af_alg_release.
        net: Do not allocate page fragments that are not skb aligned
        mm: Use fixed constant in page_frag_alloc instead of size + 1
        tcp: tcp_v4_err() should be more careful
        tcp: clear icsk_backoff in tcp_write_queue_purge()
        net: mv643xx_eth: disable clk on error path in mv643xx_eth_shared_probe()
        qmi_wwan: apply SET_DTR quirk to Sierra WP7607
        net: stmmac: handle endianness in dwmac4_get_timestamp
        doc: Mention MSG_ZEROCOPY implementation for UDP
        mlxsw: __mlxsw_sp_port_headroom_set(): Fix a use of local variable
        ...
      40e196a9
    • K
      sky2: Increase D3 delay again · 1765f5dc
      Kai-Heng Feng 提交于
      Another platform requires even longer delay to make the device work
      correctly after S3.
      
      So increase the delay to 300ms.
      
      BugLink: https://bugs.launchpad.net/bugs/1798921Signed-off-by: NKai-Heng Feng <kai.heng.feng@canonical.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1765f5dc
    • J
      vhost: correctly check the return value of translate_desc() in log_used() · 816db766
      Jason Wang 提交于
      When fail, translate_desc() returns negative value, otherwise the
      number of iovs. So we should fail when the return value is negative
      instead of a blindly check against zero.
      
      Detected by CoverityScan, CID# 1442593:  Control flow issues  (DEADCODE)
      
      Fixes: cc5e7107 ("vhost: log dirty page correctly")
      Acked-by: NMichael S. Tsirkin <mst@redhat.com>
      Reported-by: NStephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: NJason Wang <jasowang@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      816db766
  4. 19 2月, 2019 16 次提交