1. 30 Apr, 2022 4 commits
  2. 23 Mar, 2022 1 commit
  3. 18 Mar, 2022 1 commit
  4. 22 Jan, 2022 1 commit
  5. 07 Nov, 2021 1 commit
  6. 19 Oct, 2021 1 commit
    • ocfs2: mount fails with buffer overflow in strlen · b15fa922
      Committed by Valentin Vidic
      Starting with kernel 5.11 built with CONFIG_FORTIFY_SOURCE, mounting an
      ocfs2 filesystem with either o2cb or pcmk cluster stack fails with the
      trace below.  Problem seems to be that strings for cluster stack and
      cluster name are not guaranteed to be null terminated in the disk
      representation, while strlcpy assumes that the source string is always
      null terminated.  This causes a read outside of the source string
      triggering the buffer overflow detection.
      
        detected buffer overflow in strlen
        ------------[ cut here ]------------
        kernel BUG at lib/string.c:1149!
        invalid opcode: 0000 [#1] SMP PTI
        CPU: 1 PID: 910 Comm: mount.ocfs2 Not tainted 5.14.0-1-amd64 #1
          Debian 5.14.6-2
        RIP: 0010:fortify_panic+0xf/0x11
        ...
        Call Trace:
         ocfs2_initialize_super.isra.0.cold+0xc/0x18 [ocfs2]
         ocfs2_fill_super+0x359/0x19b0 [ocfs2]
         mount_bdev+0x185/0x1b0
         legacy_get_tree+0x27/0x40
         vfs_get_tree+0x25/0xb0
         path_mount+0x454/0xa20
         __x64_sys_mount+0x103/0x140
         do_syscall_64+0x3b/0xc0
         entry_SYSCALL_64_after_hwframe+0x44/0xae
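
The failure mode described above (a string helper running past a fixed-width field that has no terminating NUL) can be sketched in userspace C. This is only an illustration under stated assumptions, not the change made by this commit: `copy_fixed_field` and its parameters are hypothetical names, and the approach shown is a bounded search with `memchr` followed by `memcpy`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy a fixed-width field that may lack a terminating NUL into dst,
 * always NUL-terminating the result. Reads at most src_size bytes.
 * Returns the number of bytes copied (excluding the terminator).
 * Hypothetical helper, for illustration only.
 */
size_t copy_fixed_field(char *dst, size_t dst_size,
                        const char *src, size_t src_size)
{
    assert(dst_size > 0);

    /* Find the field's length without reading past src_size bytes. */
    const char *nul = memchr(src, '\0', src_size);
    size_t len = nul ? (size_t)(nul - src) : src_size;

    /* Truncate if needed, leaving room for the terminator. */
    if (len > dst_size - 1)
        len = dst_size - 1;

    memcpy(dst, src, len);
    dst[len] = '\0';
    return len;
}
```

Unlike a helper that calls `strlen` on the source, this never inspects bytes beyond the declared width of the field, so a field filled to its last byte is handled safely.
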
      
      Link: https://lkml.kernel.org/r/20210929180654.32460-1-vvidic@valentin-vidic.from.hr
      Signed-off-by: Valentin Vidic <vvidic@valentin-vidic.from.hr>
      Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Changwei Ge <gechangwei@live.cn>
      Cc: Gang He <ghe@suse.com>
      Cc: Jun Piao <piaojun@huawei.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  7. 07 May, 2021 1 commit
  8. 25 Feb, 2021 1 commit
  9. 15 Nov, 2020 1 commit
    • ocfs2: initialize ip_next_orphan · f5785283
      Committed by Wengang Wang
      Though the problem was found on an older 4.1.12 kernel, I think upstream
      has the same issue.
      
      On one node in the cluster, there is the following call trace:
      
         # cat /proc/21473/stack
         __ocfs2_cluster_lock.isra.36+0x336/0x9e0 [ocfs2]
         ocfs2_inode_lock_full_nested+0x121/0x520 [ocfs2]
         ocfs2_evict_inode+0x152/0x820 [ocfs2]
         evict+0xae/0x1a0
         iput+0x1c6/0x230
         ocfs2_orphan_filldir+0x5d/0x100 [ocfs2]
         ocfs2_dir_foreach_blk+0x490/0x4f0 [ocfs2]
         ocfs2_dir_foreach+0x29/0x30 [ocfs2]
         ocfs2_recover_orphans+0x1b6/0x9a0 [ocfs2]
         ocfs2_complete_recovery+0x1de/0x5c0 [ocfs2]
         process_one_work+0x169/0x4a0
         worker_thread+0x5b/0x560
         kthread+0xcb/0xf0
         ret_from_fork+0x61/0x90
      
      The above stack is not reasonable: the final iput() shouldn't happen in
      the ocfs2_orphan_filldir() function.  Looking at the code,
      
        2067         /* Skip inodes which are already added to recover list, since dio may
        2068          * happen concurrently with unlink/rename */
        2069         if (OCFS2_I(iter)->ip_next_orphan) {
        2070                 iput(iter);
        2071                 return 0;
        2072         }
        2073
      
      The logic assumes the inode is already on the recovery list when it sees
      that ip_next_orphan is non-NULL, so it skips this inode after dropping
      the reference that was taken in ocfs2_iget().
      
      However, if the inode were already on the recovery list, it would hold
      another reference, and the iput() at line 2070 would not be the final
      iput (dropping the last reference).  So I don't think the inode is really
      on the recovery list (there is no vmcore to confirm).
      
      Note that ocfs2_queue_orphans(), though not shown in the call trace,
      holds the cluster lock on the orphan directory while looking up
      unlinked inodes.  The on-disk inode eviction can involve a lot of
      I/O, which may take a long time to finish.  That means this node can hold
      the cluster lock for a very long time, which can cause lock requests
      (from other nodes) on the orphan directory to hang for a long time.
      
      Looking further at ip_next_orphan, I found that it is not initialized
      when a new ocfs2_inode_info structure is allocated.
      
      This causes reflink operations from some nodes to hang for a very long
      time waiting for the cluster lock on the orphan directory.
      
      Fix: initialize ip_next_orphan as NULL.
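
To see why the missing initialization matters, here is a minimal userspace sketch. It is not the ocfs2 code: `struct demo_inode` and both functions are hypothetical stand-ins that mirror the shape of the problem, where a pointer field doubles as an "already queued" marker and must therefore start out as NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an in-memory inode structure. */
struct demo_inode {
    struct demo_inode *next_orphan; /* non-NULL means "on the recovery list" */
    int refcount;
};

/* Allocate an inode and initialize every field that is tested later. */
struct demo_inode *demo_inode_alloc(void)
{
    struct demo_inode *inode = malloc(sizeof(*inode));

    if (!inode)
        return NULL;
    /* Without this line the check below would read indeterminate memory. */
    inode->next_orphan = NULL;
    inode->refcount = 1;
    return inode;
}

/* The kind of test that misfires when the field holds stale garbage. */
bool demo_inode_on_recovery_list(const struct demo_inode *inode)
{
    return inode->next_orphan != NULL;
}
```

A freshly allocated inode then reliably reports that it is not on the list, so the caller does not skip it and drop its reference by mistake.
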
      Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Changwei Ge <gechangwei@live.cn>
      Cc: Gang He <ghe@suse.com>
      Cc: Jun Piao <piaojun@huawei.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lkml.kernel.org/r/20201109171746.27884-1-wen.gang.wang@oracle.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  10. 08 Aug, 2020 1 commit
  11. 03 Jun, 2020 1 commit
    • ocfs2: mount shared volume without ha stack · 912f655d
      Committed by Gang He
      Usually we create and use an ocfs2 shared volume on top of an HA stack.
      A pcmk-based HA stack includes the DLM, corosync and pacemaker
      services.
      
      Customers complained that they could not mount an existing ocfs2 volume
      on a single node without the HA stack, e.g. in a single-node
      backup/restore scenario.
      
      In such cases, the customers just want to access the data on the
      existing ocfs2 volume quickly, and do not want to restart or set up the
      HA stack.
      
      So I'd like to add a mount option, "nocluster".  If users mount an ocfs2
      shared volume with this option, the mount will not depend on the
      HA-related services; the command mounts the existing ocfs2 volume
      directly (like a local mount), avoiding the need to set up the HA stack.
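
As a rough illustration of how a flag such as "nocluster" might be detected in a comma-separated option string, consider the userspace sketch below. It is an assumption-laden example, not the parsing code ocfs2 uses: `has_mount_option` is a hypothetical helper, and the option strings in the usage note are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if the comma-separated option string contains the exact
 * option name (e.g. "nocluster"), not merely a substring of another option.
 * Hypothetical helper, for illustration only.
 */
bool has_mount_option(const char *options, const char *name)
{
    size_t name_len = strlen(name);
    const char *p = options;

    while (*p != '\0') {
        /* Length of the current comma-delimited token. */
        size_t tok_len = strcspn(p, ",");

        if (tok_len == name_len && strncmp(p, name, name_len) == 0)
            return true;

        p += tok_len;
        if (*p == ',')
            p++; /* skip the separator */
    }
    return false;
}
```

With this helper, `has_mount_option("rw,nocluster,noatime", "nocluster")` is true, while an option that merely starts with the same letters, such as "noclusterx", does not match.
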
      Signed-off-by: Gang He <ghe@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Joseph Qi <jiangqi903@gmail.com>
      Cc: Changwei Ge <gechangwei@live.cn>
      Cc: Jun Piao <piaojun@huawei.com>
      Link: http://lkml.kernel.org/r/20200423053300.22661-1-ghe@suse.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  12. 03 Apr, 2020 1 commit
  13. 04 Nov, 2019 1 commit
  14. 25 Sep, 2019 1 commit
  15. 13 Jul, 2019 1 commit
  16. 31 May, 2019 1 commit
  17. 02 May, 2019 1 commit
  18. 07 Apr, 2019 1 commit
    • block: remove CONFIG_LBDAF · 72deb455
      Committed by Christoph Hellwig
      Currently support for 64-bit sector_t and blkcnt_t is optional on 32-bit
      architectures.  These types are required to support block device and/or
      file sizes larger than 2 TiB, and have generally defaulted to on for
      a long time.  Enabling the option only increases the i386 tinyconfig
      size by 145 bytes, and many data structures already always use
      64-bit values for their in-core and on-disk data structures anyway,
      so there should not be a large change in dynamic memory usage either.
      
      Dropping this option removes a somewhat weird non-default config that
      has caused various bugs or compiler warnings when actually used.
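
The 2 TiB figure mentioned above follows from simple arithmetic, sketched here under the assumption that sectors are counted in the conventional 512-byte unit. The function name and the `SECTOR_SIZE_BYTES` macro are illustrative, not kernel identifiers.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE_BYTES 512u /* assumed: the conventional 512-byte sector unit */

/*
 * Number of bytes addressable when sector numbers are limited to `bits` bits.
 * Illustrative helper; valid for 1 <= bits <= 32.
 */
uint64_t max_addressable_bytes(unsigned bits)
{
    assert(bits >= 1 && bits <= 32);
    return ((uint64_t)1 << bits) * SECTOR_SIZE_BYTES;
}
```

For a 32-bit sector number this gives 2^32 sectors of 2^9 bytes each, i.e. 2^41 bytes, which is exactly 2 TiB. Anything larger needs a 64-bit sector type.
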
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  19. 06 Mar, 2019 1 commit
  20. 06 Apr, 2018 3 commits
  21. 01 Feb, 2018 2 commits
  22. 28 Nov, 2017 1 commit
    • Rename superblock flags (MS_xyz -> SB_xyz) · 1751e8a6
      Committed by Linus Torvalds
      This is a pure automated search-and-replace of the internal kernel
      superblock flags.
      
      The s_flags are now called SB_*, with the names and the values for the
      moment mirroring the MS_* flags that they're equivalent to.
      
      Note how the MS_xyz flags are the ones passed to the mount system call,
      while the SB_xyz flags are what we then use in sb->s_flags.
      
      The script to do this was:
      
          # places to look in; re security/*: it generally should *not* be
          # touched (that stuff parses mount(2) arguments directly), but
          # there are two places where we really deal with superblock flags.
          FILES="drivers/mtd drivers/staging/lustre fs ipc mm \
                  include/linux/fs.h include/uapi/linux/bfs_fs.h \
                  security/apparmor/apparmorfs.c security/apparmor/include/lib.h"
          # the list of MS_... constants
          SYMS="RDONLY NOSUID NODEV NOEXEC SYNCHRONOUS REMOUNT MANDLOCK \
                DIRSYNC NOATIME NODIRATIME BIND MOVE REC VERBOSE SILENT \
                POSIXACL UNBINDABLE PRIVATE SLAVE SHARED RELATIME KERNMOUNT \
                I_VERSION STRICTATIME LAZYTIME SUBMOUNT NOREMOTELOCK NOSEC BORN \
                ACTIVE NOUSER"
      
          SED_PROG=
          for i in $SYMS; do SED_PROG="$SED_PROG -e s/MS_$i/SB_$i/g"; done
      
          # we want files that contain at least one of MS_...,
          # with fs/namespace.c and fs/pnode.c excluded.
          L=$(for i in $SYMS; do git grep -w -l MS_$i $FILES; done| sort|uniq|grep -v '^fs/namespace.c'|grep -v '^fs/pnode.c')
      
          for f in $L; do sed -i $f $SED_PROG; done
      Requested-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  23. 16 Nov, 2017 1 commit
  24. 07 Sep, 2017 1 commit
  25. 17 Jul, 2017 1 commit
    • VFS: Convert sb->s_flags & MS_RDONLY to sb_rdonly(sb) · bc98a42c
      Committed by David Howells
      Firstly by applying the following with coccinelle's spatch:
      
      	@@ expression SB; @@
      	-SB->s_flags & MS_RDONLY
      	+sb_rdonly(SB)
      
      to effect the conversion to sb_rdonly(sb), then by applying:
      
      	@@ expression A, SB; @@
      	(
      	-(!sb_rdonly(SB)) && A
      	+!sb_rdonly(SB) && A
      	|
      	-A != (sb_rdonly(SB))
      	+A != sb_rdonly(SB)
      	|
      	-A == (sb_rdonly(SB))
      	+A == sb_rdonly(SB)
      	|
      	-!(sb_rdonly(SB))
      	+!sb_rdonly(SB)
      	|
      	-A && (sb_rdonly(SB))
      	+A && sb_rdonly(SB)
      	|
      	-A || (sb_rdonly(SB))
      	+A || sb_rdonly(SB)
      	|
      	-(sb_rdonly(SB)) != A
      	+sb_rdonly(SB) != A
      	|
      	-(sb_rdonly(SB)) == A
      	+sb_rdonly(SB) == A
      	|
      	-(sb_rdonly(SB)) && A
      	+sb_rdonly(SB) && A
      	|
      	-(sb_rdonly(SB)) || A
      	+sb_rdonly(SB) || A
      	)
      
      	@@ expression A, B, SB; @@
      	(
      	-(sb_rdonly(SB)) ? 1 : 0
      	+sb_rdonly(SB)
      	|
      	-(sb_rdonly(SB)) ? A : B
      	+sb_rdonly(SB) ? A : B
      	)
      
      to remove left over excess bracketage and finally by applying:
      
      	@@ expression A, SB; @@
      	(
      	-(A & MS_RDONLY) != sb_rdonly(SB)
      	+(bool)(A & MS_RDONLY) != sb_rdonly(SB)
      	|
      	-(A & MS_RDONLY) == sb_rdonly(SB)
      	+(bool)(A & MS_RDONLY) == sb_rdonly(SB)
      	)
      
      to make comparisons against the result of sb_rdonly() (which is a bool)
      work correctly.
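
The reason for the final `(bool)` cast can be shown with a small standalone sketch. Everything here is hypothetical: `DEMO_FLAG` is an invented flag whose mask is deliberately not 1 so that the mismatch is visible, and `struct demo_sb` and the helper functions only imitate the style of `sb_rdonly()`.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_FLAG 0x10u /* hypothetical flag whose mask is not 1 */

struct demo_sb {
    unsigned long s_flags;
};

/* Helper in the style of sb_rdonly(): reports the flag as a bool. */
static bool demo_flag_set(const struct demo_sb *sb)
{
    return sb->s_flags & DEMO_FLAG;
}

/* Buggy: compares the raw masked value (0 or 0x10) with a bool (0 or 1). */
bool same_state_buggy(unsigned long requested, const struct demo_sb *sb)
{
    return (requested & DEMO_FLAG) == demo_flag_set(sb);
}

/* Fixed: normalize the masked value to bool before comparing. */
bool same_state_fixed(unsigned long requested, const struct demo_sb *sb)
{
    return (bool)(requested & DEMO_FLAG) == demo_flag_set(sb);
}
```

When the flag is set on both sides, the buggy version compares 0x10 with 1 and reports a difference, while the fixed version compares true with true.
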
      Signed-off-by: David Howells <dhowells@redhat.com>
  26. 05 Jun, 2017 1 commit
  27. 02 Mar, 2017 1 commit
  28. 13 Dec, 2016 1 commit
  29. 30 Nov, 2016 1 commit
  30. 08 Oct, 2016 1 commit
  31. 27 Jul, 2016 1 commit
  32. 08 Jun, 2016 1 commit
  33. 05 Apr, 2016 1 commit
    • mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Committed by Kirill A. Shutemov
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time
      ago with the promise that one day it would be possible to implement the
      page cache with bigger chunks than PAGE_SIZE.
      
      This promise never materialized, and it is unlikely that it ever will.
      
      We have many places where PAGE_CACHE_SIZE is assumed to be equal to
      PAGE_SIZE.  And it's a constant source of confusion whether
      PAGE_CACHE_* or PAGE_* constants should be used in a particular case,
      especially on the border between fs and mm.
      
      Globally switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
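
The first two items in the list are safe to apply mechanically because the shift amount is zero whenever the two constants are equal, as the message says they are assumed to be. A small sketch with illustrative `DEMO_` macros (the value 12 is an assumption; real kernels define PAGE_SHIFT per architecture):

```c
#include <assert.h>

/* Illustrative values only; real kernels define PAGE_SHIFT per architecture. */
#define DEMO_PAGE_SHIFT       12
#define DEMO_PAGE_CACHE_SHIFT DEMO_PAGE_SHIFT /* assumed equal, as the message describes */

/* Old style: convert a page-cache index to a page index. */
unsigned long index_old(unsigned long idx)
{
    return idx << (DEMO_PAGE_CACHE_SHIFT - DEMO_PAGE_SHIFT); /* shift by 0 */
}

/* New style: the shift is dropped because it was a no-op. */
unsigned long index_new(unsigned long idx)
{
    return idx;
}
```

Both functions return the same value for every input, which is why replacing the shifted expression with the bare expression does not change behavior.
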
      
      This patch contains automated changes generated with coccinelle using
      the script below.  For some reason, coccinelle doesn't patch header files.
      I've called spatch for them manually.
      
      The only adjustment after coccinelle is a revert of the changes to the
      PAGE_CACHE_ALIGN definition: we are going to drop it later.
      
      There are a few places in the code that coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation will
      also be addressed in a separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  34. 26 Mar, 2016 1 commit
    • ocfs2: fix occurring deadlock by changing ocfs2_wq from global to local · 35ddf78e
      Committed by jiangyiwen
      This patch fixes a deadlock, as follows:
      
        Node 1                Node 2                  Node 3
      1)volume a and b are    only mount vol a        only mount vol b
        mounted
      
      2)                      start to mount b        start to mount a
      
      3)                      check hb of Node 3      check hb of Node 2
                              in vol a, qs_holds++    in vol b, qs_holds++
      
      4) -------------------- all nodes' network down --------------------
      
      5)                      progress of mount b     the same situation as
                              failed, and then call   Node 2
                              ocfs2_dismount_volume.
                              but the process is hung,
                              since there is a work
                              in ocfs2_wq that cannot be
                              completed. This work is
                              about vol a, because
                              ocfs2_wq is a global wq.
                              BTW, the work which is
                              scheduled in ocfs2_wq is
                              ocfs2_orphan_scan_work,
                              and the context in this work
                              needs to take the inode lock
                              of orphan_dir. Because the
                              lockres owner is Node 1 and
                              all nodes' networks went down
                              at the same time, it can't
                              get the inode lock.

      6)                      Why can't this node be fenced
                              when the network is disconnected?
                              Because the mount process
                              is hung, which leaves qs_holds
                              not equal to 0.
      
      All work items in ocfs2_wq are tied to a specific super block.
      
      The solution is therefore to change ocfs2_wq from global to local.  In
      other words, move it into struct ocfs2_super.
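
The idea of moving a queue from global scope into the per-volume structure can be pictured with the heavily simplified model below. It is not kernel code and uses no real workqueue API: `struct demo_wq`, `struct demo_volume` and both functions are hypothetical, and "pending" is just a counter standing in for queued work.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for a work queue. */
struct demo_wq {
    int pending; /* work items that have not completed */
};

/* Per-volume state owning its own queue, instead of sharing a global one. */
struct demo_volume {
    const char *name;
    struct demo_wq wq;
};

/* Queue one work item on this volume's own queue. */
void demo_queue_work(struct demo_volume *vol)
{
    vol->wq.pending++;
}

/* Dismount only has to wait for work queued on this volume's own queue. */
bool demo_can_dismount(const struct demo_volume *vol)
{
    return vol->wq.pending == 0;
}
```

In this model, a work item stuck on volume a leaves `demo_can_dismount` false for a but true for b, whereas with a single shared queue the stuck item would hold up both.
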
      Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com>
      Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
      Cc: Xue jiufei <xuejiufei@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>