1. 20 5月, 2023 2 次提交
    • O
      !786 Support userswap feature · 22ff6706
      openeuler-ci-bot 提交于
      Merge Pull Request from: @anred 
       
      This patch series optimizes userswap mainly including swap-in and
      swap-out.
      
      We tested the concurrent scenario of multi-threaded page fault and
      multi-threaded swap-in in the uswap demo;and the remapping in the
      swap-out phase and the copy-free function in the swap-in phase were ok.
      During the test, related debugging functions including CONFIG_DEBUG_VM,
      lockdep, slub debug, kasan and kmemleak are enabled. 
       
      Link:https://gitee.com/openeuler/kernel/pulls/786 
      
      Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com> 
      Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> 
      Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com> 
      22ff6706
    • O
      !790 mm: enable ksm per process and cgroup · dcc34901
      openeuler-ci-bot 提交于
      Merge Pull Request from: @sun_nanyong 
       
      patch 1~6: backport mainline patchset(mm: process/cgroup ksm
      support)and patches it depends on.
      patch 7:Add control file "memory.ksm" to enable ksm per cgroup. 
       
      Link:https://gitee.com/openeuler/kernel/pulls/790 
      
      Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> 
      Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com> 
      Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com> 
      dcc34901
  2. 19 5月, 2023 8 次提交
    • N
      memcg: support ksm merge any mode per cgroup · 0f6fb357
      Nanyong Sun 提交于
      hulk inclusion
      category: feature
      bugzilla: https://gitee.com/openeuler/kernel/issues/I72R0B
      CVE: NA
      
      ----------------------------------------------------------------------
      
      Add control file "memory.ksm" to enable ksm per cgroup.
      Echo to 1 will set all tasks currently in the cgroup to ksm merge
      any mode, which means ksm gets enabled for all vma's of a process.
      Meanwhile echo to 0 will disable ksm for them and unmerge the
      merged pages.
      Cat the file will show the above state and ksm related profits
      of this cgroup.
      Signed-off-by: NNanyong Sun <sunnanyong@huawei.com>
      0f6fb357
    • D
      mm/ksm: unmerge and clear VM_MERGEABLE when setting PR_SET_MEMORY_MERGE=0 · 351ceedb
      David Hildenbrand 提交于
      mainline inclusion
      from mainline-v6.4-rc1
      commit 24139c07
      category: feature
      bugzilla: https://gitee.com/openeuler/kernel/issues/I72R0B
      CVE: NA
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=24139c07f413ef4b555482c758343d71392a19bc
      
      ----------------------------------------------------------------------
      
      Patch series "mm/ksm: improve PR_SET_MEMORY_MERGE=0 handling and cleanup
      disabling KSM", v2.
      
      (1) Make PR_SET_MEMORY_MERGE=0 unmerge pages like setting MADV_UNMERGEABLE
      does, (2) add a selftest for it and (3) factor out disabling of KSM from
      s390/gmap code.
      
      This patch (of 3):
      
      Let's unmerge any KSM pages when setting PR_SET_MEMORY_MERGE=0, and clear
      the VM_MERGEABLE flag from all VMAs -- just like KSM would.  Of course,
      only do that if we previously set PR_SET_MEMORY_MERGE=1.
      
      Link: https://lkml.kernel.org/r/20230422205420.30372-1-david@redhat.com
      Link: https://lkml.kernel.org/r/20230422205420.30372-2-david@redhat.comSigned-off-by: NDavid Hildenbrand <david@redhat.com>
      Acked-by: NStefan Roesch <shr@devkernel.io>
      Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
      Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Janosch Frank <frankja@linux.ibm.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Sven Schnelle <svens@linux.ibm.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Conflicts:
      	mm/ksm.c
      Signed-off-by: NNanyong Sun <sunnanyong@huawei.com>
      351ceedb
    • S
      mm: add new KSM process and sysfs knobs · a098d41e
      Stefan Roesch 提交于
      mainline inclusion
      from mainline-v6.4-rc1
      commit d21077fb
      category: feature
      bugzilla: https://gitee.com/openeuler/kernel/issues/I72R0B
      CVE: NA
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d21077fbc2fc987c2e593c34dc3b4d84e546dc9f
      
      ----------------------------------------------------------------------
      
      This adds the general_profit KSM sysfs knob and the process profit metric
      knobs to ksm_stat.
      
      1) expose general_profit metric
      
         The documentation mentions a general profit metric, however this
         metric is not calculated.  In addition the formula depends on the size
         of internal structures, which makes it more difficult for an
         administrator to make the calculation.  Adding the metric for a better
         user experience.
      
      2) document general_profit sysfs knob
      
      3) calculate ksm process profit metric
      
         The ksm documentation mentions the process profit metric and how to
         calculate it.  This adds the calculation of the metric.
      
      4) mm: expose ksm process profit metric in ksm_stat
      
         This exposes the ksm process profit metric in /proc/<pid>/ksm_stat.
         The documentation mentions the formula for the ksm process profit
         metric, however it does not calculate it.  In addition the formula
         depends on the size of internal structures.  So it makes sense to
         expose it.
      
      5) document new procfs ksm knobs
      
      Link: https://lkml.kernel.org/r/20230418051342.1919757-3-shr@devkernel.ioSigned-off-by: NStefan Roesch <shr@devkernel.io>
      Reviewed-by: NBagas Sanjaya <bagasdotme@gmail.com>
      Acked-by: NDavid Hildenbrand <david@redhat.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Rik van Riel <riel@surriel.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NNanyong Sun <sunnanyong@huawei.com>
      a098d41e
    • S
      mm: add new api to enable ksm per process · 2cd2cdfe
      Stefan Roesch 提交于
      mainline inclusion
      from mainline-v6.4-rc1
      commit d7597f59
      category: feature
      bugzilla: https://gitee.com/openeuler/kernel/issues/I72R0B
      CVE: NA
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d7597f59d1d33e9efbffa7060deb9ee5bd119e62
      
      ----------------------------------------------------------------------
      
      Patch series "mm: process/cgroup ksm support", v9.
      
      So far KSM can only be enabled by calling madvise for memory regions.  To
      be able to use KSM for more workloads, KSM needs to have the ability to be
      enabled / disabled at the process / cgroup level.
      
      Use case 1:
        The madvise call is not available in the programming language.  An
        example for this are programs with forked workloads using a garbage
        collected language without pointers.  In such a language madvise cannot
        be made available.
      
        In addition the addresses of objects get moved around as they are
        garbage collected.  KSM sharing needs to be enabled "from the outside"
        for these type of workloads.
      
      Use case 2:
        The same interpreter can also be used for workloads where KSM brings
        no benefit or even has overhead.  We'd like to be able to enable KSM on
        a workload by workload basis.
      
      Use case 3:
        With the madvise call sharing opportunities are only enabled for the
        current process: it is a workload-local decision.  A considerable number
        of sharing opportunities may exist across multiple workloads or jobs (if
        they are part of the same security domain).  Only a higler level entity
        like a job scheduler or container can know for certain if its running
        one or more instances of a job.  That job scheduler however doesn't have
        the necessary internal workload knowledge to make targeted madvise
        calls.
      
      Security concerns:
      
        In previous discussions security concerns have been brought up.  The
        problem is that an individual workload does not have the knowledge about
        what else is running on a machine.  Therefore it has to be very
        conservative in what memory areas can be shared or not.  However, if the
        system is dedicated to running multiple jobs within the same security
        domain, its the job scheduler that has the knowledge that sharing can be
        safely enabled and is even desirable.
      
      Performance:
      
        Experiments with using UKSM have shown a capacity increase of around 20%.
      
        Here are the metrics from an instagram workload (taken from a machine
        with 64GB main memory):
      
         full_scans: 445
         general_profit: 20158298048
         max_page_sharing: 256
         merge_across_nodes: 1
         pages_shared: 129547
         pages_sharing: 5119146
         pages_to_scan: 4000
         pages_unshared: 1760924
         pages_volatile: 10761341
         run: 1
         sleep_millisecs: 20
         stable_node_chains: 167
         stable_node_chains_prune_millisecs: 2000
         stable_node_dups: 2751
         use_zero_pages: 0
         zero_pages_sharing: 0
      
      After the service is running for 30 minutes to an hour, 4 to 5 million
      shared pages are common for this workload when using KSM.
      
      Detailed changes:
      
      1. New options for prctl system command
         This patch series adds two new options to the prctl system call.
         The first one allows to enable KSM at the process level and the second
         one to query the setting.
      
      The setting will be inherited by child processes.
      
      With the above setting, KSM can be enabled for the seed process of a cgroup
      and all processes in the cgroup will inherit the setting.
      
      2. Changes to KSM processing
         When KSM is enabled at the process level, the KSM code will iterate
         over all the VMA's and enable KSM for the eligible VMA's.
      
         When forking a process that has KSM enabled, the setting will be
         inherited by the new child process.
      
      3. Add general_profit metric
         The general_profit metric of KSM is specified in the documentation,
         but not calculated.  This adds the general profit metric to
         /sys/kernel/debug/mm/ksm.
      
      4. Add more metrics to ksm_stat
         This adds the process profit metric to /proc/<pid>/ksm_stat.
      
      5. Add more tests to ksm_tests and ksm_functional_tests
         This adds an option to specify the merge type to the ksm_tests.
         This allows to test madvise and prctl KSM.
      
         It also adds a two new tests to ksm_functional_tests: one to test
         the new prctl options and the other one is a fork test to verify that
         the KSM process setting is inherited by client processes.
      
      This patch (of 3):
      
      So far KSM can only be enabled by calling madvise for memory regions.  To
      be able to use KSM for more workloads, KSM needs to have the ability to be
      enabled / disabled at the process / cgroup level.
      
      1. New options for prctl system command
      
         This patch series adds two new options to the prctl system call.
         The first one allows to enable KSM at the process level and the second
         one to query the setting.
      
         The setting will be inherited by child processes.
      
         With the above setting, KSM can be enabled for the seed process of a
         cgroup and all processes in the cgroup will inherit the setting.
      
      2. Changes to KSM processing
      
         When KSM is enabled at the process level, the KSM code will iterate
         over all the VMA's and enable KSM for the eligible VMA's.
      
         When forking a process that has KSM enabled, the setting will be
         inherited by the new child process.
      
        1) Introduce new MMF_VM_MERGE_ANY flag
      
           This introduces the new flag MMF_VM_MERGE_ANY flag.  When this flag
           is set, kernel samepage merging (ksm) gets enabled for all vma's of a
           process.
      
        2) Setting VM_MERGEABLE on VMA creation
      
           When a VMA is created, if the MMF_VM_MERGE_ANY flag is set, the
           VM_MERGEABLE flag will be set for this VMA.
      
        3) support disabling of ksm for a process
      
           This adds the ability to disable ksm for a process if ksm has been
           enabled for the process with prctl.
      
        4) add new prctl option to get and set ksm for a process
      
           This adds two new options to the prctl system call
           - enable ksm for all vmas of a process (if the vmas support it).
           - query if ksm has been enabled for a process.
      
      3. Disabling MMF_VM_MERGE_ANY for storage keys in s390
      
         In the s390 architecture when storage keys are used, the
         MMF_VM_MERGE_ANY will be disabled.
      
      Link: https://lkml.kernel.org/r/20230418051342.1919757-1-shr@devkernel.io
      Link: https://lkml.kernel.org/r/20230418051342.1919757-2-shr@devkernel.ioSigned-off-by: NStefan Roesch <shr@devkernel.io>
      Acked-by: NDavid Hildenbrand <david@redhat.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Bagas Sanjaya <bagasdotme@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Conflicts:
      	kernel/sys.c mm/ksm.c mm/mmap.c
      Signed-off-by: NNanyong Sun <sunnanyong@huawei.com>
      2cd2cdfe
    • X
      ksm: add profit monitoring documentation · ac02d6e7
      xu xin 提交于
      mainline inclusion
      from mainline-v6.1-rc1
      commit 21b7bdb5
      category: feature
      bugzilla: https://gitee.com/openeuler/kernel/issues/I72R0B
      CVE: NA
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=21b7bdb504ae6b0a795c8d63818611ce02b532c1
      
      ----------------------------------------------------------------------
      
      Add the description of KSM profit and how to determine it separately in
      system-wide range and inner a single process.
      
      Link: https://lkml.kernel.org/r/20220830144003.299870-1-xu.xin16@zte.com.cnSigned-off-by: Nxu xin <xu.xin16@zte.com.cn>
      Reviewed-by: NXiaokai Ran <ran.xiaokai@zte.com.cn>
      Reviewed-by: NYang Yang <yang.yang29@zte.com.cn>
      Reviewed-by: NBagas Sanjaya <bagasdotme@gmail.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Izik Eidus <izik.eidus@ravellosystems.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Conflicts:
      	Documentation/admin-guide/mm/ksm.rst
      Signed-off-by: NNanyong Sun <sunnanyong@huawei.com>
      ac02d6e7
    • X
      ksm: count allocated ksm rmap_items for each process · 8c3ecf85
      xu xin 提交于
      mainline inclusion
      from mainline-v6.1-rc1
      commit cb4df4ca
      category: feature
      bugzilla: https://gitee.com/openeuler/kernel/issues/I72R0B
      CVE: NA
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=cb4df4cae4f2bd8cf7a32eff81178fce31600f7c
      
      ----------------------------------------------------------------------
      
      Patch series "ksm: count allocated rmap_items and update documentation",
      v5.
      
      KSM can save memory by merging identical pages, but also can consume
      additional memory, because it needs to generate rmap_items to save each
      scanned page's brief rmap information.
      
      To determine how beneficial the ksm-policy (like madvise), they are using
      brings, so we add a new interface /proc/<pid>/ksm_stat for each process
      The value "ksm_rmap_items" in it indicates the total allocated ksm
      rmap_items of this process.
      
      The detailed description can be seen in the following patches' commit
      message.
      
      This patch (of 2):
      
      KSM can save memory by merging identical pages, but also can consume
      additional memory, because it needs to generate rmap_items to save each
      scanned page's brief rmap information.  Some of these pages may be merged,
      but some may not be abled to be merged after being checked several times,
      which are unprofitable memory consumed.
      
      The information about whether KSM save memory or consume memory in
      system-wide range can be determined by the comprehensive calculation of
      pages_sharing, pages_shared, pages_unshared and pages_volatile.  A simple
      approximate calculation:
      
      	profit =~ pages_sharing * sizeof(page) - (all_rmap_items) *
      	         sizeof(rmap_item);
      
      where all_rmap_items equals to the sum of pages_sharing, pages_shared,
      pages_unshared and pages_volatile.
      
      But we cannot calculate this kind of ksm profit inner single-process wide
      because the information of ksm rmap_item's number of a process is lacked.
      For user applications, if this kind of information could be obtained, it
      helps upper users know how beneficial the ksm-policy (like madvise) they
      are using brings, and then optimize their app code.  For example, one
      application madvise 1000 pages as MERGEABLE, while only a few pages are
      really merged, then it's not cost-efficient.
      
      So we add a new interface /proc/<pid>/ksm_stat for each process in which
      the value of ksm_rmap_itmes is only shown now and so more values can be
      added in future.
      
      So similarly, we can calculate the ksm profit approximately for a single
      process by:
      
      	profit =~ ksm_merging_pages * sizeof(page) - ksm_rmap_items *
      		 sizeof(rmap_item);
      
      where ksm_merging_pages is shown at /proc/<pid>/ksm_merging_pages, and
      ksm_rmap_items is shown in /proc/<pid>/ksm_stat.
      
      Link: https://lkml.kernel.org/r/20220830143731.299702-1-xu.xin16@zte.com.cn
      Link: https://lkml.kernel.org/r/20220830143838.299758-1-xu.xin16@zte.com.cnSigned-off-by: Nxu xin <xu.xin16@zte.com.cn>
      Reviewed-by: NXiaokai Ran <ran.xiaokai@zte.com.cn>
      Reviewed-by: NYang Yang <yang.yang29@zte.com.cn>
      Signed-off-by: NCGEL ZTE <cgel.zte@gmail.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Bagas Sanjaya <bagasdotme@gmail.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Izik Eidus <izik.eidus@ravellosystems.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Conflicts:
      	include/linux/mm_types.h
      Signed-off-by: NNanyong Sun <sunnanyong@huawei.com>
      8c3ecf85
    • X
      ksm: count ksm merging pages for each process · 44acbc78
      xu xin 提交于
      mainline inclusion
      from mainline-v5.19-rc1
      commit 76093853
      category: feature
      bugzilla: https://gitee.com/openeuler/kernel/issues/I72R0B
      CVE: NA
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7609385337a4feb6236e42dcd0df2185683ce839
      
      ----------------------------------------------------------------------
      
      Some applications or containers want to use KSM by calling madvise() to
      advise areas of address space to be MERGEABLE.  But they may not know
      which applications are more likely to cause real merges in the
      deployment.  If this patch is applied, it helps them know their
      corresponding number of merged pages, and then optimize their app code.
      
      As current KSM only counts the number of KSM merging pages(e.g.
      ksm_pages_sharing and ksm_pages_shared) of the whole system, we cannot see
      the more fine-grained KSM merging, for the upper application optimization,
      the merging area cannot be set easily according to the KSM page merging
      probability of each process.  Therefore, it is necessary to add extra
      statistical means so that the upper level users can know the detailed KSM
      merging information of each process.
      
      We add a new proc file named as ksm_merging_pages under /proc/<pid>/ to
      indicate the involved ksm merging pages of this process.
      
      [akpm@linux-foundation.org: fix comment typo, remove BUG_ON()s]
      Link: https://lkml.kernel.org/r/20220325082318.2352853-1-xu.xin16@zte.com.cnSigned-off-by: Nxu xin <xu.xin16@zte.com.cn>
      Reported-by: Nkernel test robot <lkp@intel.com>
      Reviewed-by: NYang Yang <yang.yang29@zte.com.cn>
      Reviewed-by: NRan Xiaokai <ran.xiaokai@zte.com.cn>
      Reported-by: NZeal Robot <zealci@zte.com.cn>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Ohhoon Kwon <ohoono.kwon@samsung.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Feng Tang <feng.tang@intel.com>
      Cc: Yang Yang <yang.yang29@zte.com.cn>
      Cc: Ran Xiaokai <ran.xiaokai@zte.com.cn>
      Cc: Zeal Robot <zealci@zte.com.cn>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Conflicts:
      	include/linux/mm_types.h
      Signed-off-by: NNanyong Sun <sunnanyong@huawei.com>
      44acbc78
    • O
      !778 [sync] PR-774: Backport CVEs and bugfixes · 665edcec
      openeuler-ci-bot 提交于
      Merge Pull Request from: @openeuler-sync-bot 
       
      
      Origin pull request: 
      https://gitee.com/openeuler/kernel/pulls/774 
       
      Pull new CVEs:
      CVE-2023-32269
      CVE-2023-2002
      CVE-2023-26544
      CVE-2023-0459
      
      mm bugfixes from Yu Kuai
      fs bugfix from yangerkun
      fs perfs from Zhihao Cheng
      
       
       
      Link:https://gitee.com/openeuler/kernel/pulls/778 
      
      Reviewed-by: Zheng Zengkai <zhengzengkai@huawei.com> 
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> 
      665edcec
  3. 18 5月, 2023 16 次提交
  4. 17 5月, 2023 14 次提交
    • O
      !570 Net: m1600: Support nebula-matrix m1600-series network card · ea365acb
      openeuler-ci-bot 提交于
      Merge Pull Request from: @nebula_matrix 
       
      Add m1600-driver for nebula-matrix m1600 series smart NIC.
      
      M1600-NIC is a series of network interface card for the Data Center Area.
      The driver supports link-speed 10GbE. M1600 devices support SR-IOV.This
      driver is used for both of Physical Function(PF) and Virtual Function(VF).
      M1600 devices support MSI-X interrupt vector for each Tx/Rx queue and
      interrupt moderation. M1600 devices support also various offload features
      such as checksum offload, Receive-Side Scaling(RSS)
      
       
       
      Link:https://gitee.com/openeuler/kernel/pulls/570 
      
      Reviewed-by: Liu Chao <liuchao173@huawei.com> 
      Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> 
      Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com> 
      ea365acb
    • O
      !629 arm64: Add initial support for FEAT_WFxT · 37a96b69
      openeuler-ci-bot 提交于
      Merge Pull Request from: @did-you-collect-the-wool-today 
       
      The ARMv8.7 WFxT feature is a new take on the good old WFI/WFE
      instructions as they behave the same way, only taking an extra timeout
      parameter.
      
      This small series aims at adding the minimal support for this feature,
      enabling it for both the kernel and KVM. 
       
      Link:https://gitee.com/openeuler/kernel/pulls/629 
      
      Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> 
      Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> 
      Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com> 
      37a96b69
    • N
      Net: m1600: Add m1600-driver for nebula-matrix m1600 series smart NIC. · 01e6e207
      nebula_matrix 提交于
      m1600-dirver inclusion
      category: feature
      bugzilla: https://gitee.com/openeuler/kernel/issues/I6RNK1
      CVE: NA
      
      --------------------------------
      
      The m1600 smart NIC is a series of network interface card
      for the Data Center Area. The m1600 network card supports
      checksum, sriov and other hardware offloading.
      Signed-off-by: NYi Chen <open@nebula-matrix.com>
      01e6e207
    • J
      net: hns3: fix reset timeout when enable full VF · 9f6b3672
      Jijie Shao 提交于
      mainlist inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I72OPH
      CVE: NA
      
      Reference: https://patchwork.kernel.org/project/netdevbpf/patch/20230512100014.2522-5-lanhao@huawei.com/
      
      ----------------------------------------------------------------------
      
      The timeout of the cmdq reset command has been increased to
      resolve the reset timeout issue in the full VF scenario.
      The timeout of other cmdq commands remains unchanged.
      
      Fixes: 8d307f8e ("net: hns3: create new set of unified hclge_comm_cmd_send APIs")
      Signed-off-by: NJijie Shao <shaojijie@huawei.com>
      Signed-off-by: NHao Lan <lanhao@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NJiantao Xiao <xiaojiantao1@h-partners.com>
      9f6b3672
    • J
      net: hns3: fix output information incomplete for dumping tx queue info with debugfs · 73361056
      Jie Wang 提交于
      mainlist inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I72OPH
      CVE: NA
      
      Reference: https://patchwork.kernel.org/project/netdevbpf/patch/20230512100014.2522-2-lanhao@huawei.com/
      
      ----------------------------------------------------------------------
      
      In function hns3_dump_tx_queue_info, The print buffer is not enough when
      the tx BD number is configured to 32760. As a result several BD
      information wouldn't be displayed.
      
      So fix it by increasing the tx queue print buffer length.
      
      Fixes: 630a6738 ("net: hns3: adjust string spaces of some parameters of tx bd info in debugfs")
      Signed-off-by: NJie Wang <wangjie125@huawei.com>
      Signed-off-by: NHao Lan <lanhao@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NJiantao Xiao <xiaojiantao1@h-partners.com>
      73361056
    • O
      !772 timekeeping: Make CLOCKSOURCE_VALIDATE_LAST_CYCLE configurable · 824c7466
      openeuler-ci-bot 提交于
      Merge Pull Request from: @liaoyu15 
       
      Currently, only x86 architecture supports the CLOCKSOURCE_VALIDATE_LAST_CYCLE
      option. This option ensures that the timestamps returned by the clocksource are
      monotonically increasing, and helps avoid issues caused by hardware failures.
      
      This commit makes CLOCKSOURCE_VALIDATE_LAST_CYCLE configurable on
      the arm64 architecture, helps increase system stability and reliability. 
       
      Link:https://gitee.com/openeuler/kernel/pulls/772 
      
      Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> 
      Reviewed-by: Wei Li <liwei391@huawei.com> 
      Reviewed-by: Liu Chao <liuchao173@huawei.com> 
      Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com> 
      824c7466
    • Y
      config: make CLOCKSOURCE_VALIDATE_LAST_CYCLE not set by default · c4178985
      Yu Liao 提交于
      openeuler inclusion
      category: feature
      bugzilla: https://gitee.com/openeuler/kernel/issues/I7393E
      CVE: NA
      
      -------------------------------
      
      The ARM ARM states the following (where a CPU is a device):
      
      	The system counter must provide a uniform view of system time. More
      	precisely, it must be impossible for the following sequence of events
      	to show system time going backwards:
      
      	1) Device A reads the time from the system counter.
      
      	2) Device A communicates with another agent in the system, Device B.
      
      	3) After recognizing the communication from Device A, Device B reads
      	   the time from the system counter.
      
      So make CLOCKSOURCE_VALIDATE_LAST_CYCLE unset by default.
      Signed-off-by: NYu Liao <liaoyu15@huawei.com>
      c4178985
    • O
      !768 Backport 5.10.151 LTS · 2ddfdd8a
      openeuler-ci-bot 提交于
      Merge Pull Request from: @sanglipeng 
       
      Backport 5.10.151 LTS patches from upstream.
      
      Total patches: 5 
       
      Link:https://gitee.com/openeuler/kernel/pulls/768 
      
      Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> 
      Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com> 
      2ddfdd8a
    • M
      filemap: Correct the conditions for marking a folio as accessed · 2f9a5b33
      Matthew Wilcox (Oracle) 提交于
      mainline inclusion
      from mainline-5.19-rc4
      commit 5ccc944d
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I6YDHU
      CVE: NA
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=5ccc944dce3df5fd2fd683a7df4fd49d1068eba2
      
      -------------------------------------------------
      
      We had an off-by-one error which meant that we never marked the first page
      in a read as accessed.  This was visible as a slowdown when re-reading
      a file as pages were being evicted from cache too soon.  In reviewing
      this code, we noticed a second bug where a multi-page folio would be
      marked as accessed multiple times when doing reads that were less than
      the size of the folio.
      
      Abstract the comparison of whether two file positions are in the same
      folio into a new function, fixing both of these bugs.
      Reported-by: NYu Kuai <yukuai3@huawei.com>
      Reviewed-by: NKent Overstreet <kent.overstreet@gmail.com>
      Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org>
      
      Conflict: folios is not supported yet
      Signed-off-by: NYu Kuai <yukuai3@huawei.com>
      Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com>
      Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
      (cherry picked from commit 16dd0b34)
      2f9a5b33
    • Y
      Revert "filemap: Correct the conditions for marking a folio as accessed" · fc9bda4f
      Yu Kuai 提交于
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I6YDHU
      CVE: NA
      
      --------------------------------
      
      This reverts commit 939325bb.
      
      Because this commit make a mistake to judge if the page is the same.
      Signed-off-by: NYu Kuai <yukuai3@huawei.com>
      Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com>
      Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
      (cherry picked from commit 728c0094)
      fc9bda4f
    • H
      netrom: Fix use-after-free caused by accept on already connected socket · 98ff0e69
      Hyunwoo Kim 提交于
      stable inclusion
      from stable-v5.10.168
      commit dd6991251a1382a9b4984962a0c7a467e9d71812
      category: bugfix
      bugzilla: https://gitee.com/src-openeuler/kernel/issues/I70OFF
      CVE: CVE-2023-32269
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=dd6991251a1382a9b4984962a0c7a467e9d71812
      
      --------------------------------
      
      [ Upstream commit 61179292 ]
      
      If you call listen() and accept() on an already connect()ed
      AF_NETROM socket, accept() can successfully connect.
      This is because when the peer socket sends data to sendmsg,
      the skb with its own sk stored in the connected socket's
      sk->sk_receive_queue is connected, and nr_accept() dequeues
      the skb waiting in the sk->sk_receive_queue.
      
      As a result, nr_accept() allocates and returns a sock with
      the sk of the parent AF_NETROM socket.
      
      And here use-after-free can happen through complex race conditions:
      ```
                        cpu0                                                     cpu1
                                                                     1. socket_2 = socket(AF_NETROM)
                                                                              .
                                                                              .
                                                                        listen(socket_2)
                                                                        accepted_socket = accept(socket_2)
             2. socket_1 = socket(AF_NETROM)
                  nr_create()    // sk refcount : 1
                connect(socket_1)
                                                                     3. write(accepted_socket)
                                                                          nr_sendmsg()
                                                                          nr_output()
                                                                          nr_kick()
                                                                          nr_send_iframe()
                                                                          nr_transmit_buffer()
                                                                          nr_route_frame()
                                                                          nr_loopback_queue()
                                                                          nr_loopback_timer()
                                                                          nr_rx_frame()
                                                                          nr_process_rx_frame(sk, skb);    // sk : socket_1's sk
                                                                          nr_state3_machine()
                                                                          nr_queue_rx_frame()
                                                                          sock_queue_rcv_skb()
                                                                          sock_queue_rcv_skb_reason()
                                                                          __sock_queue_rcv_skb()
                                                                          __skb_queue_tail(list, skb);    // list : socket_1's sk->sk_receive_queue
             4. listen(socket_1)
                  nr_listen()
                uaf_socket = accept(socket_1)
                  nr_accept()
                  skb_dequeue(&sk->sk_receive_queue);
                                                                     5. close(accepted_socket)
                                                                          nr_release()
                                                                          nr_write_internal(sk, NR_DISCREQ)
                                                                          nr_transmit_buffer()    // NR_DISCREQ
                                                                          nr_route_frame()
                                                                          nr_loopback_queue()
                                                                          nr_loopback_timer()
                                                                          nr_rx_frame()    // sk : socket_1's sk
                                                                          nr_process_rx_frame()  // NR_STATE_3
                                                                          nr_state3_machine()    // NR_DISCREQ
                                                                          nr_disconnect()
                                                                          nr_sk(sk)->state = NR_STATE_0;
             6. close(socket_1)    // sk refcount : 3
                  nr_release()    // NR_STATE_0
                  sock_put(sk);    // sk refcount : 0
                  sk_free(sk);
                close(uaf_socket)
                  nr_release()
                  sock_hold(sk);    // UAF
      ```
      
      KASAN report by syzbot:
      ```
      BUG: KASAN: use-after-free in nr_release+0x66/0x460 net/netrom/af_netrom.c:520
      Write of size 4 at addr ffff8880235d8080 by task syz-executor564/5128
      
      Call Trace:
       <TASK>
       __dump_stack lib/dump_stack.c:88 [inline]
       dump_stack_lvl+0xd1/0x138 lib/dump_stack.c:106
       print_address_description mm/kasan/report.c:306 [inline]
       print_report+0x15e/0x461 mm/kasan/report.c:417
       kasan_report+0xbf/0x1f0 mm/kasan/report.c:517
       check_region_inline mm/kasan/generic.c:183 [inline]
       kasan_check_range+0x141/0x190 mm/kasan/generic.c:189
       instrument_atomic_read_write include/linux/instrumented.h:102 [inline]
       atomic_fetch_add_relaxed include/linux/atomic/atomic-instrumented.h:116 [inline]
       __refcount_add include/linux/refcount.h:193 [inline]
       __refcount_inc include/linux/refcount.h:250 [inline]
       refcount_inc include/linux/refcount.h:267 [inline]
       sock_hold include/net/sock.h:775 [inline]
       nr_release+0x66/0x460 net/netrom/af_netrom.c:520
       __sock_release+0xcd/0x280 net/socket.c:650
       sock_close+0x1c/0x20 net/socket.c:1365
       __fput+0x27c/0xa90 fs/file_table.c:320
       task_work_run+0x16f/0x270 kernel/task_work.c:179
       exit_task_work include/linux/task_work.h:38 [inline]
       do_exit+0xaa8/0x2950 kernel/exit.c:867
       do_group_exit+0xd4/0x2a0 kernel/exit.c:1012
       get_signal+0x21c3/0x2450 kernel/signal.c:2859
       arch_do_signal_or_restart+0x79/0x5c0 arch/x86/kernel/signal.c:306
       exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
       exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
       __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
       syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
       do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      RIP: 0033:0x7f6c19e3c9b9
      Code: Unable to access opcode bytes at 0x7f6c19e3c98f.
      RSP: 002b:00007fffd4ba2ce8 EFLAGS: 00000246 ORIG_RAX: 0000000000000133
      RAX: 0000000000000116 RBX: 0000000000000003 RCX: 00007f6c19e3c9b9
      RDX: 0000000000000318 RSI: 00000000200bd000 RDI: 0000000000000006
      RBP: 0000000000000003 R08: 000000000000000d R09: 000000000000000d
      R10: 0000000000000000 R11: 0000000000000246 R12: 000055555566a2c0
      R13: 0000000000000011 R14: 0000000000000000 R15: 0000000000000000
       </TASK>
      
      Allocated by task 5128:
       kasan_save_stack+0x22/0x40 mm/kasan/common.c:45
       kasan_set_track+0x25/0x30 mm/kasan/common.c:52
       ____kasan_kmalloc mm/kasan/common.c:371 [inline]
       ____kasan_kmalloc mm/kasan/common.c:330 [inline]
       __kasan_kmalloc+0xa3/0xb0 mm/kasan/common.c:380
       kasan_kmalloc include/linux/kasan.h:211 [inline]
       __do_kmalloc_node mm/slab_common.c:968 [inline]
       __kmalloc+0x5a/0xd0 mm/slab_common.c:981
       kmalloc include/linux/slab.h:584 [inline]
       sk_prot_alloc+0x140/0x290 net/core/sock.c:2038
       sk_alloc+0x3a/0x7a0 net/core/sock.c:2091
       nr_create+0xb6/0x5f0 net/netrom/af_netrom.c:433
       __sock_create+0x359/0x790 net/socket.c:1515
       sock_create net/socket.c:1566 [inline]
       __sys_socket_create net/socket.c:1603 [inline]
       __sys_socket_create net/socket.c:1588 [inline]
       __sys_socket+0x133/0x250 net/socket.c:1636
       __do_sys_socket net/socket.c:1649 [inline]
       __se_sys_socket net/socket.c:1647 [inline]
       __x64_sys_socket+0x73/0xb0 net/socket.c:1647
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      Freed by task 5128:
       kasan_save_stack+0x22/0x40 mm/kasan/common.c:45
       kasan_set_track+0x25/0x30 mm/kasan/common.c:52
       kasan_save_free_info+0x2b/0x40 mm/kasan/generic.c:518
       ____kasan_slab_free mm/kasan/common.c:236 [inline]
       ____kasan_slab_free+0x13b/0x1a0 mm/kasan/common.c:200
       kasan_slab_free include/linux/kasan.h:177 [inline]
       __cache_free mm/slab.c:3394 [inline]
       __do_kmem_cache_free mm/slab.c:3580 [inline]
       __kmem_cache_free+0xcd/0x3b0 mm/slab.c:3587
       sk_prot_free net/core/sock.c:2074 [inline]
       __sk_destruct+0x5df/0x750 net/core/sock.c:2166
       sk_destruct net/core/sock.c:2181 [inline]
       __sk_free+0x175/0x460 net/core/sock.c:2192
       sk_free+0x7c/0xa0 net/core/sock.c:2203
       sock_put include/net/sock.h:1991 [inline]
       nr_release+0x39e/0x460 net/netrom/af_netrom.c:554
       __sock_release+0xcd/0x280 net/socket.c:650
       sock_close+0x1c/0x20 net/socket.c:1365
       __fput+0x27c/0xa90 fs/file_table.c:320
       task_work_run+0x16f/0x270 kernel/task_work.c:179
       exit_task_work include/linux/task_work.h:38 [inline]
       do_exit+0xaa8/0x2950 kernel/exit.c:867
       do_group_exit+0xd4/0x2a0 kernel/exit.c:1012
       get_signal+0x21c3/0x2450 kernel/signal.c:2859
       arch_do_signal_or_restart+0x79/0x5c0 arch/x86/kernel/signal.c:306
       exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
       exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
       __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
       syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
       do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      ```
      
      To fix this issue, nr_listen() returns -EINVAL for sockets that
      successfully nr_connect().
      
      Reported-by: syzbot+caa188bdfc1eeafeb418@syzkaller.appspotmail.com
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: NHyunwoo Kim <v4bel@theori.io>
      Reviewed-by: NKuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NZiyang Xuan <william.xuanziyang@huawei.com>
      Reviewed-by: NLiu Jian <liujian56@huawei.com>
      Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com>
      Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
      (cherry picked from commit 40307a4a)
      98ff0e69
    • R
      bluetooth: Perform careful capability checks in hci_sock_ioctl() · a9fe21d6
      Ruihan Li 提交于
      mainline inclusion
      from mainline-v6.4-rc1
      commit 25c150ac
      category: bugfix
      bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6WHKQ
      CVE: CVE-2023-2002
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=25c150ac103a4ebeed0319994c742a90634ddf18
      
      ----------------------------------------
      
      Previously, capability was checked using capable(), which verified that the
      caller of the ioctl system call had the required capability. In addition,
      the result of the check would be stored in the HCI_SOCK_TRUSTED flag,
      making it persistent for the socket.
      
      However, malicious programs can abuse this approach by deliberately sharing
      an HCI socket with a privileged task. The HCI socket will be marked as
      trusted when the privileged task occasionally makes an ioctl call.
      
      This problem can be solved by using sk_capable() to check capability, which
      ensures that not only the current task but also the socket opener has the
      specified capability, thus reducing the risk of privilege escalation
      through the previously identified vulnerability.
      
      Cc: stable@vger.kernel.org
      Fixes: f81f5b2d ("Bluetooth: Send control open and close messages for HCI raw sockets")
      Signed-off-by: NRuihan Li <lrh2000@pku.edu.cn>
      Signed-off-by: NLuiz Augusto von Dentz <luiz.von.dentz@intel.com>
      Signed-off-by: NZiyang Xuan <william.xuanziyang@huawei.com>
      Reviewed-by: NLiu Jian <liujian56@huawei.com>
      Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com>
      Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
      (cherry picked from commit ec557f34)
      a9fe21d6
    • D
      fs/ntfs3: Delete duplicate condition in ntfs_read_mft() · 7387a959
      Dan Carpenter 提交于
      mainline inclusion
      from mainline-v6.2-rc1
      commit 65801516
      category: bugfix
      bugzilla: 188526, https://gitee.com/src-openeuler/kernel/issues/I71SYO
      CVE: NA
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=658015167a8432b88f5d032e9d85d8fd50e5bf2c
      
      --------------------------------
      
      There were two patches which addressed the same bug and added the same
      condition:
      
      commit 6db62086 ("fs/ntfs3: Validate data run offset")
      commit 887bfc54 ("fs/ntfs3: Fix slab-out-of-bounds read in run_unpack")
      
      Delete one condition.
      Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: NKonstantin Komarov <almaz.alexandrovich@paragon-software.com>
      Signed-off-by: NZhaoLong Wang <wangzhaolong1@huawei.com>
      Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com>
      Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
      Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
      (cherry picked from commit 1ca5b2ca)
      7387a959
    • H
      fs/ntfs3: Fix slab-out-of-bounds read in run_unpack · 0800886c
      Hawkins Jiawei 提交于
      mainline inclusion
      from mainline-v6.2-rc1
      commit 887bfc54
      category: bugfix
      bugzilla: 188526, https://gitee.com/src-openeuler/kernel/issues/I71SYO
      CVE: CVE-2023-26544
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=887bfc546097fbe8071dac13b2fef73b77920899
      
      --------------------------------
      
      Syzkaller reports slab-out-of-bounds bug as follows:
      ==================================================================
      BUG: KASAN: slab-out-of-bounds in run_unpack+0x8b7/0x970 fs/ntfs3/run.c:944
      Read of size 1 at addr ffff88801bbdff02 by task syz-executor131/3611
      
      [...]
      Call Trace:
       <TASK>
       __dump_stack lib/dump_stack.c:88 [inline]
       dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
       print_address_description mm/kasan/report.c:317 [inline]
       print_report.cold+0x2ba/0x719 mm/kasan/report.c:433
       kasan_report+0xb1/0x1e0 mm/kasan/report.c:495
       run_unpack+0x8b7/0x970 fs/ntfs3/run.c:944
       run_unpack_ex+0xb0/0x7c0 fs/ntfs3/run.c:1057
       ntfs_read_mft fs/ntfs3/inode.c:368 [inline]
       ntfs_iget5+0xc20/0x3280 fs/ntfs3/inode.c:501
       ntfs_loadlog_and_replay+0x124/0x5d0 fs/ntfs3/fsntfs.c:272
       ntfs_fill_super+0x1eff/0x37f0 fs/ntfs3/super.c:1018
       get_tree_bdev+0x440/0x760 fs/super.c:1323
       vfs_get_tree+0x89/0x2f0 fs/super.c:1530
       do_new_mount fs/namespace.c:3040 [inline]
       path_mount+0x1326/0x1e20 fs/namespace.c:3370
       do_mount fs/namespace.c:3383 [inline]
       __do_sys_mount fs/namespace.c:3591 [inline]
       __se_sys_mount fs/namespace.c:3568 [inline]
       __x64_sys_mount+0x27f/0x300 fs/namespace.c:3568
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
       [...]
       </TASK>
      
      The buggy address belongs to the physical page:
      page:ffffea00006ef600 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1bbd8
      head:ffffea00006ef600 order:3 compound_mapcount:0 compound_pincount:0
      flags: 0xfff00000010200(slab|head|node=0|zone=1|lastcpupid=0x7ff)
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff88801bbdfe00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
       ffff88801bbdfe80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      >ffff88801bbdff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
                         ^
       ffff88801bbdff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
       ffff88801bbe0000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      ==================================================================
      
      Kernel will tries to read record and parse MFT from disk in
      ntfs_read_mft().
      
      Yet the problem is that during enumerating attributes in record,
      kernel doesn't check whether run_off field loading from the disk
      is a valid value.
      
      To be more specific, if attr->nres.run_off is larger than attr->size,
      kernel will passes an invalid argument run_buf_size in
      run_unpack_ex(), which having an integer overflow. Then this invalid
      argument will triggers the slab-out-of-bounds Read bug as above.
      
      This patch solves it by adding the sanity check between
      the offset to packed runs and attribute size.
      
      link: https://lore.kernel.org/all/0000000000009145fc05e94bd5c3@google.com/#t
      Reported-and-tested-by: syzbot+8d6fbb27a6aded64b25b@syzkaller.appspotmail.com
      Signed-off-by: NHawkins Jiawei <yin31149@gmail.com>
      Signed-off-by: NKonstantin Komarov <almaz.alexandrovich@paragon-software.com>
      Signed-off-by: NZhaoLong Wang <wangzhaolong1@huawei.com>
      Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com>
      Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
      Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
      (cherry picked from commit cf4fe9cf)
      0800886c