1. 28 Apr, 2023 1 commit
  2. 27 Apr, 2023 9 commits
  3. 10 Nov, 2022 1 commit
    • mm/mempolicy: fix uninit-value in mpol_rebind_policy() · 7057a3c7
      Committed by Wang Cheng
      stable inclusion
      from stable-v5.10.134
      commit ddb3f0b68863bd1c5f43177eea476bce316d4993
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=ddb3f0b68863bd1c5f43177eea476bce316d4993
      
      --------------------------------
      
      commit 018160ad upstream.
      
      mpol_set_nodemask() (mm/mempolicy.c) does not set up the nodemask when
      pol->mode is MPOL_LOCAL.  Check pol->mode before accessing
      pol->w.cpuset_mems_allowed in mpol_rebind_policy() (mm/mempolicy.c).
      
      BUG: KMSAN: uninit-value in mpol_rebind_policy mm/mempolicy.c:352 [inline]
      BUG: KMSAN: uninit-value in mpol_rebind_task+0x2ac/0x2c0 mm/mempolicy.c:368
       mpol_rebind_policy mm/mempolicy.c:352 [inline]
       mpol_rebind_task+0x2ac/0x2c0 mm/mempolicy.c:368
       cpuset_change_task_nodemask kernel/cgroup/cpuset.c:1711 [inline]
       cpuset_attach+0x787/0x15e0 kernel/cgroup/cpuset.c:2278
       cgroup_migrate_execute+0x1023/0x1d20 kernel/cgroup/cgroup.c:2515
       cgroup_migrate kernel/cgroup/cgroup.c:2771 [inline]
       cgroup_attach_task+0x540/0x8b0 kernel/cgroup/cgroup.c:2804
       __cgroup1_procs_write+0x5cc/0x7a0 kernel/cgroup/cgroup-v1.c:520
       cgroup1_tasks_write+0x94/0xb0 kernel/cgroup/cgroup-v1.c:539
       cgroup_file_write+0x4c2/0x9e0 kernel/cgroup/cgroup.c:3852
       kernfs_fop_write_iter+0x66a/0x9f0 fs/kernfs/file.c:296
       call_write_iter include/linux/fs.h:2162 [inline]
       new_sync_write fs/read_write.c:503 [inline]
       vfs_write+0x1318/0x2030 fs/read_write.c:590
       ksys_write+0x28b/0x510 fs/read_write.c:643
       __do_sys_write fs/read_write.c:655 [inline]
       __se_sys_write fs/read_write.c:652 [inline]
       __x64_sys_write+0xdb/0x120 fs/read_write.c:652
       do_syscall_x64 arch/x86/entry/common.c:51 [inline]
       do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Uninit was created at:
       slab_post_alloc_hook mm/slab.h:524 [inline]
       slab_alloc_node mm/slub.c:3251 [inline]
       slab_alloc mm/slub.c:3259 [inline]
       kmem_cache_alloc+0x902/0x11c0 mm/slub.c:3264
       mpol_new mm/mempolicy.c:293 [inline]
       do_set_mempolicy+0x421/0xb70 mm/mempolicy.c:853
       kernel_set_mempolicy mm/mempolicy.c:1504 [inline]
       __do_sys_set_mempolicy mm/mempolicy.c:1510 [inline]
       __se_sys_set_mempolicy+0x44c/0xb60 mm/mempolicy.c:1507
       __x64_sys_set_mempolicy+0xd8/0x110 mm/mempolicy.c:1507
       do_syscall_x64 arch/x86/entry/common.c:51 [inline]
       do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      KMSAN: uninit-value in mpol_rebind_task (2)
      https://syzkaller.appspot.com/bug?id=d6eb90f952c2a5de9ea718a1b873c55cb13b59dc
      
      This patch also seems to fix the bug below.
      KMSAN: uninit-value in mpol_rebind_mm (2)
      https://syzkaller.appspot.com/bug?id=f2fecd0d7013f54ec4162f60743a2b28df40926b
      
      The uninit-value is pol->w.cpuset_mems_allowed in mpol_rebind_policy().
      When syzkaller reproducer runs to the beginning of mpol_new(),
      
      	    mpol_new() mm/mempolicy.c
      	  do_mbind() mm/mempolicy.c
      	kernel_mbind() mm/mempolicy.c
      
      `mode` is 1(MPOL_PREFERRED), nodes_empty(*nodes) is `true` and `flags`
      is 0. Then
      
      	mode = MPOL_LOCAL;
      	...
      	policy->mode = mode;
      	policy->flags = flags;
      
      will be executed. So in mpol_set_nodemask(),
      
      	    mpol_set_nodemask() mm/mempolicy.c
      	  do_mbind()
      	kernel_mbind()
      
      pol->mode is 4 (MPOL_LOCAL), so the `nodemask` in `pol` is left
      uninitialized, yet it is later accessed in mpol_rebind_policy().
      
      Link: https://lkml.kernel.org/r/20220512123428.fq3wofedp6oiotd4@ppc.localdomain
      Signed-off-by: Wang Cheng <wanngchenng@gmail.com>
      Reported-by: <syzbot+217f792c92599518a2ab@syzkaller.appspotmail.com>
      Tested-by: <syzbot+217f792c92599518a2ab@syzkaller.appspotmail.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      Reviewed-by: Wei Li <liwei391@huawei.com>
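
The guard described in the commit message above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel code: every `toy_*` name is hypothetical, the nodemask is reduced to a 64-bit integer, and the PREFERRED-to-LOCAL conversion is simplified to the one case the message describes (empty node set, flags of 0).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins, not the kernel's definitions.
 * The values 1 (PREFERRED) and 4 (LOCAL) follow the commit text. */
enum toy_mode {
    TOY_MPOL_PREFERRED = 1,
    TOY_MPOL_BIND = 2,
    TOY_MPOL_LOCAL = 4,
};

typedef uint64_t toy_nodemask_t; /* one bit per NUMA node (assumption) */

struct toy_mempolicy {
    enum toy_mode mode;
    toy_nodemask_t cpuset_mems_allowed; /* models pol->w.cpuset_mems_allowed */
};

/* Models the mpol_new() step quoted above: PREFERRED with an empty
 * node set and no flags becomes LOCAL. */
enum toy_mode toy_effective_mode(enum toy_mode mode, toy_nodemask_t nodes,
                                 unsigned flags)
{
    if (mode == TOY_MPOL_PREFERRED && nodes == 0 && flags == 0)
        return TOY_MPOL_LOCAL;
    return mode;
}

/* Models mpol_set_nodemask(): the mask is not set up for LOCAL. */
void toy_set_nodemask(struct toy_mempolicy *pol, toy_nodemask_t cpuset_mems)
{
    if (pol->mode == TOY_MPOL_LOCAL)
        return;
    pol->cpuset_mems_allowed = cpuset_mems;
}

/* Models the guarded rebind: the mode is checked before the mask is
 * read. Returns true only when the stored mask was updated. */
bool toy_rebind_policy(struct toy_mempolicy *pol, toy_nodemask_t newmask)
{
    if (pol == NULL || pol->mode == TOY_MPOL_LOCAL)
        return false; /* never touches the unset mask */
    if (pol->cpuset_mems_allowed == newmask)
        return false; /* nothing to do */
    pol->cpuset_mems_allowed = newmask;
    return true;
}

/* Usage example: replays the reproducer's path through the model. */
void toy_rebind_demo(void)
{
    struct toy_mempolicy pol; /* mask deliberately left unset */

    pol.mode = toy_effective_mode(TOY_MPOL_PREFERRED, 0, 0);
    assert(pol.mode == TOY_MPOL_LOCAL);
    toy_set_nodemask(&pol, 0x3);           /* no-op for LOCAL */
    assert(!toy_rebind_policy(&pol, 0x1)); /* guard returns before any read */
}
```

Without the mode check, the comparison in `toy_rebind_policy()` would read a field that was never written, which is the kind of access KMSAN reported.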
  4. 18 Jul, 2022 1 commit
  5. 06 Jul, 2022 1 commit
  6. 07 Jun, 2022 1 commit
  7. 21 Apr, 2022 1 commit
  8. 26 Jan, 2022 1 commit
    • mm: mempolicy: fix THP allocations escaping mempolicy restrictions · 32a6b9a8
      Committed by Andrey Ryabinin
      stable inclusion
      from stable-v5.10.89
      commit ee6f34215c5dfa2257298cc362cd79e14af5a25a
      bugzilla: 186140 https://gitee.com/openeuler/kernel/issues/I4S8HA
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=ee6f34215c5dfa2257298cc362cd79e14af5a25a
      
      --------------------------------
      
      alloc_pages_vma() may try to allocate THP page on the local NUMA node
      first:
      
      	page = __alloc_pages_node(hpage_node,
      		gfp | __GFP_THISNODE | __GFP_NORETRY, order);
      
      And if the allocation fails it retries allowing remote memory:
      
      	if (!page && (gfp & __GFP_DIRECT_RECLAIM))
      		page = __alloc_pages_node(hpage_node,
      					gfp, order);
      
      However, this retry allocation completely ignores memory policy nodemask
      allowing allocation to escape restrictions.
      
      The first appearance of this bug seems to be the commit ac5b2c18
      ("mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings").
      
      The bug disappeared later in the commit 89c83fb5 ("mm, thp:
      consolidate THP gfp handling into alloc_hugepage_direct_gfpmask") and
      reappeared in a slightly different form in the commit 76e654cc
      ("mm, page_alloc: allow hugepage fallback to remote nodes when
      madvised").
      
      Fix this by passing correct nodemask to the __alloc_pages() call.
      
      The demonstration/reproducer of the problem:
      
          $ mount -oremount,size=4G,huge=always /dev/shm/
          $ echo always > /sys/kernel/mm/transparent_hugepage/defrag
          $ cat mbind_thp.c
          #include <unistd.h>
          #include <sys/mman.h>
          #include <sys/stat.h>
          #include <fcntl.h>
          #include <assert.h>
          #include <stdlib.h>
          #include <stdio.h>
          #include <numaif.h>
      
          #define SIZE (2ULL << 30)
          int main(int argc, char **argv)
          {
              int fd;
              unsigned long long i;
              char *addr;
              pid_t pid;
              char buf[100];
              unsigned long nodemask = 1;
      
              fd = open("/dev/shm/test", O_RDWR|O_CREAT, 0600);
              assert(fd >= 0);
              assert(ftruncate(fd, SIZE) == 0);
      
              addr = mmap(NULL, SIZE, PROT_READ|PROT_WRITE,
                                 MAP_SHARED, fd, 0);
      
              assert(mbind(addr, SIZE, MPOL_BIND, &nodemask, 2, MPOL_MF_STRICT|MPOL_MF_MOVE)==0);
              for (i = 0; i < SIZE; i+=4096) {
                addr[i] = 1;
              }
              pid = getpid();
              snprintf(buf, sizeof(buf), "grep shm /proc/%d/numa_maps", pid);
              system(buf);
              sleep(10000);
      
              return 0;
          }
          $ gcc mbind_thp.c -o mbind_thp -lnuma
          $ numactl -H
          available: 2 nodes (0-1)
          node 0 cpus: 0 2
          node 0 size: 1918 MB
          node 0 free: 1595 MB
          node 1 cpus: 1 3
          node 1 size: 2014 MB
          node 1 free: 1731 MB
          node distances:
          node   0   1
            0:  10  20
            1:  20  10
          $ rm -f /dev/shm/test; taskset -c 0 ./mbind_thp
          7fd970a00000 bind:0 file=/dev/shm/test dirty=524288 active=0 N0=396800 N1=127488 kernelpagesize_kB=4
      
      Link: https://lkml.kernel.org/r/20211208165343.22349-1-arbn@yandex-team.com
      Fixes: ac5b2c18 ("mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings")
      Signed-off-by: Andrey Ryabinin <arbn@yandex-team.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Mel Gorman <mgorman@techsingularity.net>
      Acked-by: David Rientjes <rientjes@google.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
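
The two-step allocation described in the commit message above (local node first, then a retry that may go remote) can be modelled in userspace to show why the retry needs the policy nodemask. This is a rough sketch, not the kernel allocator: all `toy_*` names are hypothetical, the machine has two nodes as in the `numactl -H` output, and the local node is assumed to be inside the bound nodemask.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_NODES 2

/* Free huge pages per node; hypothetical model state. */
struct toy_machine {
    long free_huge[TOY_NR_NODES];
};

/* Takes one huge page from a node if it has any left. */
static bool toy_take(struct toy_machine *m, int node)
{
    if (m->free_huge[node] <= 0)
        return false;
    m->free_huge[node]--;
    return true;
}

/* Returns the node the page came from, or -1 on failure.
 * honor_mask == false models the retry that ignores the policy nodemask;
 * honor_mask == true models a retry that is given the nodemask. */
int toy_alloc_thp(struct toy_machine *m, int local_node,
                  unsigned long nodemask, bool honor_mask)
{
    int node;

    /* First attempt: local node only, in the spirit of __GFP_THISNODE. */
    if (toy_take(m, local_node))
        return local_node;

    /* Retry: remote nodes become eligible. */
    for (node = 0; node < TOY_NR_NODES; node++) {
        if (honor_mask && !(nodemask & (1UL << node)))
            continue; /* node is outside the policy */
        if (toy_take(m, node))
            return node;
    }
    return -1;
}

/* Usage example: node 0 is exhausted and the policy binds to node 0. */
void toy_thp_demo(void)
{
    struct toy_machine m = { { 0, 4 } };
    unsigned long bind_node0 = 1UL << 0;

    /* Retry without the mask escapes to node 1. */
    assert(toy_alloc_thp(&m, 0, bind_node0, false) == 1);
    /* Retry with the mask fails instead of leaving node 0. */
    assert(toy_alloc_thp(&m, 0, bind_node0, true) == -1);
}
```

In the model, a failed masked retry simply returns -1; what a real kernel does after a failed huge page allocation is outside the scope of this sketch.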
  9. 29 Nov, 2021 3 commits
  10. 14 Jul, 2021 6 commits
  11. 03 Nov, 2020 1 commit
  12. 14 Oct, 2020 2 commits
  13. 15 Aug, 2020 1 commit
  14. 13 Aug, 2020 5 commits
  15. 17 Jul, 2020 1 commit
  16. 10 Jun, 2020 3 commits
  17. 04 Jun, 2020 1 commit
  18. 08 Apr, 2020 1 commit
    • mm/mempolicy: Allow lookup_node() to handle fatal signal · ba841078
      Committed by Peter Xu
      lookup_node() uses gup to pin the page and get node information.  It
      checks against ret>=0 assuming the page will be filled in.  However it's
      also possible that gup will return zero, for example, when the thread is
      quickly killed with a fatal signal.  Teach lookup_node() to gracefully
      return an error -EFAULT if it happens.
      
      Meanwhile, initialize "page" to NULL to avoid potential risk of
      exploiting the pointer.
      
      Fixes: 4426e945 ("mm/gup: allow VM_FAULT_RETRY for multiple times")
      Reported-by: syzbot+693dc11fcb53120b5559@syzkaller.appspotmail.com
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
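
The return-value handling described in the commit message above can be sketched in userspace as follows. This is an illustrative model rather than the kernel function: `toy_page`, `toy_gup_fn` and the three `toy_gup_*` helpers are hypothetical stand-ins for a gup call that returns the number of pages pinned, 0 when interrupted, or a negative errno.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical page descriptor carrying only its NUMA node id. */
struct toy_page {
    int nid;
};

/* Hypothetical stand-in for a gup call: returns the number of pages
 * pinned (1), 0 if nothing was pinned, or a negative errno. */
typedef int (*toy_gup_fn)(struct toy_page **page);

static struct toy_page toy_page_on_node1 = { 1 };

/* Success: one page pinned and *page filled in. */
int toy_gup_ok(struct toy_page **page)
{
    *page = &toy_page_on_node1;
    return 1;
}

/* Models a thread killed mid-call: no error code, but no page either. */
int toy_gup_killed(struct toy_page **page)
{
    (void)page;
    return 0;
}

/* Models an ordinary failure reported as a negative errno. */
int toy_gup_denied(struct toy_page **page)
{
    (void)page;
    return -EACCES;
}

/* Returns the node id of the pinned page, or a negative errno. */
int toy_lookup_node(toy_gup_fn gup)
{
    struct toy_page *p = NULL; /* initialized so a stale pointer is never used */
    int ret = gup(&p);

    if (ret == 0)
        return -EFAULT; /* zero pages pinned: p was not filled in */
    if (ret < 0)
        return ret;     /* pass the error through */
    return p->nid;      /* only dereference when a page was pinned */
}

/* Usage example covering the three outcomes. */
void toy_lookup_demo(void)
{
    assert(toy_lookup_node(toy_gup_ok) == 1);
    assert(toy_lookup_node(toy_gup_killed) == -EFAULT);
    assert(toy_lookup_node(toy_gup_denied) == -EACCES);
}
```

A check of the form `ret >= 0` would treat the zero case as success and dereference a pointer that was never filled in, which is why the zero return is handled separately here.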