提交 · 73cc86252bd8a547349d2856497eceaf4f2d1766 · openeuler / Kernel

05 4月, 2016 1 次提交

GFS2: Get rid of dead code in inode_go_demote_ok · 73cc8625

由 Bob Peterson 提交于 4月 01, 2016

Function inode_go_demote_ok had some code that was only executed
if gl_holders was not empty. However, if gl_holders was not empty,
the only caller, demote_ok(), returns before inode_go_demote_ok
would ever be called. Therefore, it's dead code, so I removed it.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Acked-by: NSteven Whitehouse <swhiteho@redhat.com>

73cc8625

24 3月, 2016 1 次提交

GFS2: ignore unlock failures after withdraw · 3e11e530

由 Benjamin Marzinski 提交于 3月 23, 2016

After gfs2 has withdrawn the filesystem, it may still have many locks not
in the unlocked state.  If it is using lock_dlm, it will failed trying
the unlocks since it has already unmounted the lock manager. Instead, it
should set the SDF_SKIP_DLM_UNLOCK flag on withdraw, to signal that
it can skip the lock_manager on unlocks, and failback to lock_nolock
style unlocking.
Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: NBob Peterson <rpeterso@redhat.com>

3e11e530

16 3月, 2016 21 次提交

autofs4: use pr_xxx() macros directly for logging · 8a78d593

由 Ian Kent 提交于 3月 15, 2016

Use the standard pr_xxx() log macros directly for log prints instead of
the AUTOFS_XXX() macros.
Signed-off-by: NIan Kent <ikent@redhat.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

8a78d593

autofs4: change log print macros to not insert newline · 90967c87

由 Ian Kent 提交于 3月 15, 2016

Common kernel coding practice is to include the newline of log prints
within the log text rather than hidden away in a macro.

To avoid introducing inconsistencies as changes are made change the log
macros to not include the newline.
Signed-off-by: NIan Kent <raven@themaw.net>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

90967c87

autofs4: make autofs log prints consistent · cab49f9e

由 Ian Kent 提交于 3月 15, 2016

Use the pr_*() print in AUTOFS_*() macros instead of printks and include
the module name in log message macros.  Also use the AUTOFS_*() macros
everywhere instead of raw printks.
Signed-off-by: NIan Kent <raven@themaw.net>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

cab49f9e

autofs4: fix some white space errors · 0266725a

由 Ian Kent 提交于 3月 15, 2016

Fix some white space format errors.
Signed-off-by: NIan Kent <raven@themaw.net>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0266725a

autofs4: fix invalid ioctl return in autofs4_root_ioctl_unlocked() · e3cd8067

由 Ian Kent 提交于 3月 15, 2016

The return from an ioctl if an invalid ioctl is passed in should be
EINVAL not ENOSYS.
Signed-off-by: NIan Kent <raven@themaw.net>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e3cd8067

autofs4: fix coding style line length in autofs4_wait() · b3f67a98

由 Ian Kent 提交于 3月 15, 2016

The need for this is questionable but checkpatch.pl complains about the
line length and it's a straightfoward change.
Signed-off-by: NIan Kent <raven@themaw.net>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b3f67a98

autofs4: fix coding style problem in autofs4_get_set_timeout() · aa330ddc

由 Ian Kent 提交于 3月 15, 2016

Refactor autofs4_get_set_timeout() to eliminate coding style error.
Signed-off-by: NIan Kent <raven@themaw.net>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

aa330ddc

autofs4: coding style fixes · e9a7c2f1

由 Ian Kent 提交于 3月 15, 2016

Try and make the coding style completely consistent throughtout the
autofs module and inline with kernel coding style recommendations.
Signed-off-by: NIan Kent <raven@themaw.net>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e9a7c2f1

autofs: show pipe inode in mount options · c83aa55d

由 Stanislav Kinsburskiy 提交于 3月 15, 2016

This is required for CRIU (Checkpoint Restart In Userspace) to migrate a
mount point when write end in user space is closed.

Below is a brief description of the problem.

To migrate a non-catatonic autofs mount point, one has to restore the
control pipe between kernel and autofs master process.

One of the autofs masters is systemd, which closes pipe write end after
passing it to the kernel with mount call.

To be able to restore the systemd control pipe one has to know which
read pipe end in systemd corresponds to the write pipe end in the
kernel.  The pipe "fd" in mount options is not enough because it was
closed and probably replaced by some other descriptor.

Thus, some other attribute is required to be able to find the read pipe
end.  The best attribute to use to find the correct pipe end is inode
number becuase it's unique for the whole system and can't be reused
while the autofs mount exists.

This attribute can also be used to recognize a situation where an autofs
mount has no master (no process with specified "pgrp" or no file
descriptor with "pipe_ino", specified in autofs mount options).
Signed-off-by: NStanislav Kinsburskiy <skinsbursky@virtuozzo.com>
Signed-off-by: NIan Kent <raven@themaw.net>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c83aa55d

mm: simplify lock_page_memcg() · 62cccb8c

由 Johannes Weiner 提交于 3月 15, 2016

Now that migration doesn't clear page->mem_cgroup of live pages anymore,
it's safe to make lock_page_memcg() and the memcg stat functions take
pages, and spare the callers from memcg objects.

[akpm@linux-foundation.org: fix warnings]
Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
Suggested-by: NVladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: NVladimir Davydov <vdavydov@virtuozzo.com>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

62cccb8c

mm: memcontrol: generalize locking for the page->mem_cgroup binding · 81f8c3a4

由 Johannes Weiner 提交于 3月 15, 2016

These patches tag the page cache radix tree eviction entries with the
memcg an evicted page belonged to, thus making per-cgroup LRU reclaim
work properly and be as adaptive to new cache workingsets as global
reclaim already is.

This should have been part of the original thrash detection patch
series, but was deferred due to the complexity of those patches.

This patch (of 5):

So far the only sites that needed to exclude charge migration to
stabilize page->mem_cgroup have been per-cgroup page statistics, hence
the name mem_cgroup_begin_page_stat().  But per-cgroup thrash detection
will add another site that needs to ensure page->mem_cgroup lifetime.

Rename these locking functions to the more generic lock_page_memcg() and
unlock_page_memcg().  Since charge migration is a cgroup1 feature only,
we might be able to delete it at some point, and these now easy to
identify locking sites along with it.
Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
Suggested-by: NVladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: NVladimir Davydov <vdavydov@virtuozzo.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

81f8c3a4

fs/mpage.c:mpage_readpages(): use lru_to_page() helper · 02c43638

由 Andrew Morton 提交于 3月 15, 2016

Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

02c43638

ocfs2/dlm: fix a variable overflow problem in dlmdomain.c · 8d67d3c2

由 Jun Piao 提交于 3月 15, 2016

In dlm_send_join_cancels(), node is defined with type unsigned int, but
initialized with -1, this will lead variable overflow.  Although this
won't cause any runtime problem, the code looks a little uncoordinated.
Signed-off-by: NJun Piao <piaojun@huawei.com>
Reviewed-by: NJoseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

8d67d3c2

ocfs2: fix a tiny race that leads file system read-only · 814ce694

由 Jiufei Xue 提交于 3月 15, 2016

when o2hb detect a node down, it first set the dead node to recovery map
and create ocfs2rec which will replay journal for dead node.  o2hb
thread then call dlm_do_local_recovery_cleanup() to delete the lock for
dead node.  After the lock of dead node is gone, locks for other nodes
can be granted and may modify the meta data without replaying journal of
the dead node.  The detail is described as follows.

     N1                         N2                   N3(master)
modify the extent tree of
inode, and commit
dirty metadata to journal,
then goes down.
                                                 o2hb thread detects
                                                 N1 goes down, set
                                                 recovery map and
                                                 delete the lock of N1.

                                                 dlm_thread flush ast
                                                 for the lock of N2.
                        do not detect the death
                        of N1, so recovery map is
                        empty.

                        read inode from disk
                        without replaying
                        the journal of N1 and
                        modify the extent tree
                        of the inode that N1
                        had modified.
                                                 ocfs2rec recover the
                                                 journal of N1.
                                                 The modification of N2
                                                 is lost.

The modification of N1 and N2 are not serial, and it will lead to
read-only file system.  We can set recovery_waiting flag to the lock
resource after delete the lock for dead node to prevent other node from
getting the lock before dlm recovery.  After dlm recovery, the recovery
map on N2 is not empty, ocfs2_inode_lock_full_nested() will wait for ocfs2
recovery.
Signed-off-by: NJiufei Xue <xuejiufei@huawei.com>
Reviewed-by: NJoseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

814ce694

ocfs2/dlm: return EINVAL when the lockres on migration target is in DROPPING_REF state · d277f33e

由 xuejiufei 提交于 3月 15, 2016

If master migrate this lock resource to node when it happened to purge
it, a new lock resource will be created and inserted into hash list.  If
then master goes down, the lock resource being purged is recovered, so
there exist two lock resource with different owner.  So return error to
master if the lock resource is in DROPPING state, master will retry to
migrate this lock resource.
Signed-off-by: Nxuejiufei <xuejiufei@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: NJoseph Qi <joseph.qi@huawei.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d277f33e

ocfs2/dlm: clear DROPPING_REF flag when the master goes down · 8c034396

由 xuejiufei 提交于 3月 15, 2016

If the master goes down after return in-progress for deref message.  The
lock resource on non-master node can not be purged.  Clear the
DROPPING_REF flag and recovery it.
Signed-off-by: Nxuejiufei <xuejiufei@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: NJoseph Qi <joseph.qi@huawei.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

8c034396

ocfs2/dlm: return in progress if master can not clear the refmap bit right now · 842b90b6

由 xuejiufei 提交于 3月 15, 2016

Master returns in-progress to non-master node when it can not clear the
refmap bit right now.  And non-master node will not purge the lock
resource until receiving deref done message.
Signed-off-by: Nxuejiufei <xuejiufei@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: NJoseph Qi <joseph.qi@huawei.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

842b90b6

ocfs2/dlm: add DEREF_DONE message · 60d663cb

由 xuejiufei 提交于 3月 15, 2016

This series of patches is to fix the dis-order issue of setting/clearing
refmap bit described below.

Node 1                               Node 2(master)
dlmlock
dlm_do_master_request
                                dlm_master_request_handler
                                -> dlm_lockres_set_refmap_bit
dlmlock succeed
dlmunlock succeed

dlm_purge_lockres
                                dlm_deref_handler
                                -> find lock resource is in
                                   DLM_LOCK_RES_SETREF_INPROG state,
                                   so dispatch a deref work
dlm_purge_lockres succeed.

call dlmlock again
dlm_do_master_request
                                dlm_master_request_handler
                                -> dlm_lockres_set_refmap_bit

                                deref work trigger, call
                                dlm_lockres_clear_refmap_bit
                                to clear Node 1 from refmap

                                dlm_purge_lockres succeed

dlm_send_remote_lock_request
                                return DLM_IVLOCKID because
                                the lockres is not exist
BUG if the lockres is $RECOVERY

This series of patches add a new message to keep the order of set and
clear.  Other nodes can purge the lock resource only after the refmap bit
on master is cleared.

This patch is to add DEREF_DONE message and corresponding handler.  Node
can purge the lock resource after receiving this message.  As a new
message is added, so increase the minor number of dlm protocol version.
Signed-off-by: Nxuejiufei <xuejiufei@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: NJoseph Qi <joseph.qi@huawei.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

60d663cb

ocfs2/dlm: fix a typo in dlmcommon.h · 39b29af0

由 Joseph Qi 提交于 3月 15, 2016

Refer to cluster/tcp.h, NET_MAX_PAYLOAD_BYTES is a typo for
O2NET_MAX_PAYLOAD_BYTES.

Since currently DLM_MIG_LOCKRES_RESERVED is not actually used, it won't
cause any problem.  But we'd better correct it for further use.
Signed-off-by: NJoseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

39b29af0

ocfs2: use spinlock_irqsave() to downconvert lock in ocfs2_osb_dump() · bfd97a03

由 jiangyiwen 提交于 3月 15, 2016

Commit a75e9cca ("ocfs2: use spinlock irqsave for downconvert lock")
missed an unmodified place in ocfs2_osb_dump(), so it still exists a
deadlock scenario.

    ocfs2_wake_downconvert_thread
    ocfs2_rw_unlock
    ocfs2_dio_end_io
    dio_complete
    .....
    bio_endio
    req_bio_endio
    ....
    scsi_io_completion
    blk_done_softirq
    __do_softirq
    do_softirq
    irq_exit
    do_IRQ
    ocfs2_osb_dump
    cat /sys/kernel/debug/ocfs2/${uuid}/fs_state

This patch still uses spin_lock_irqsave() - replace spin_lock() to solve
this situation.
Signed-off-by: NYiwen Jiang <jiangyiwen@huawei.com>
Reviewed-by: NJoseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

bfd97a03

ocfs2/cluster: replace the interrupt safe spinlocks with common ones · 4d548f61

由 jiangyiwen 提交于 3月 15, 2016

There actually no hardware or software interrupts in the context which
using o2hb_live_lock, so we don't need to worry about race conditions
caused by irq/softirq with spinlock held.  Turning off irq is not good
for system performance after all.  Just replace them with a non
interrupt safe function.
Signed-off-by: NYiwen Jiang <jiangyiwen@huawei.com>
Reviewed-by: NJoseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4d548f61

15 3月, 2016 5 次提交

GFS2: Eliminate parameter non_block on gfs2_inode_lookup · 73b462d2

由 Bob Peterson 提交于 12月 18, 2015

Now that we're not filtering out I_FREEING inodes from our lookups
anymore, we can eliminate the non_block parameter from the lookup
function.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Acked-by: NSteven Whitehouse <swhiteho@redhat.com>

73b462d2

GFS2: Don't filter out I_FREEING inodes anymore · ff34245d

由 Bob Peterson 提交于 3月 27, 2015

This patch basically reverts a very old patch from 2008,
7a9f53b3, with the title
"Alternate gfs2_iget to avoid looking up inodes being freed".
The original patch was designed to avoid a deadlock caused by lock
ordering with try_rgrp_unlink. The patch forced the function to not
find inodes that were being removed by VFS. The problem is, that
made it impossible for nodes to delete their own unlinked dinodes
after a certain point in time, because the inode needed was not found
by this filtering process. There is no longer a need for the patch,
since function try_rgrp_unlink no longer locks the inode: All it does
is queue the glock onto the delete work_queue, so there should be no
more deadlock.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

ff34245d

GFS2: Prevent delete work from occurring on glocks used for create · a4923865

由 Bob Peterson 提交于 12月 07, 2015

This patch tries to prevent delete work (queued via iopen callback)
from executing if the glock is currently being used to create
a new inode.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Acked-by: NSteven Whitehouse <swhiteho@redhat.com>

a4923865

GFS2: Fix direct IO write rounding error · 2df6f471

由 Bob Peterson 提交于 1月 27, 2016

The fsx test in xfstests was failing because it was using direct IO
writes which were using a bad calculation. It was using
loff_t lstart = offset & (PAGE_CACHE_SIZE - 1); when it should be
loff_t lstart = offset & ~(PAGE_CACHE_SIZE - 1);
Thus, the write at offset 0x67e00 was calculating lstart to be
0xe00, the address of our corruption. Instead, it should have been
0x67000. This patch fixes the calculation.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Acked-by: NSteven Whitehouse <swhiteho@redhat.com>

2df6f471

gfs2: avoid uninitialized variable warning · 67893f12

由 Arnd Bergmann 提交于 1月 26, 2016

We get a bogus warning about a potential uninitialized variable
use in gfs2, because the compiler does not figure out that we
never use the leaf number if get_leaf_nr() returns an error:

fs/gfs2/dir.c: In function 'get_first_leaf':
fs/gfs2/dir.c:802:9: warning: 'leaf_no' may be used uninitialized in this function [-Wmaybe-uninitialized]
fs/gfs2/dir.c: In function 'dir_split_leaf':
fs/gfs2/dir.c:1021:8: warning: 'leaf_no' may be used uninitialized in this function [-Wmaybe-uninitialized]

Changing the 'if (!error)' to 'if (!IS_ERR_VALUE(error))' is
sufficient to let gcc understand that this is exactly the same
condition as in IS_ERR() so it can optimize the code path enough
to understand it.
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Signed-off-by: NBob Peterson <rpeterso@redhat.com>

67893f12

14 3月, 2016 4 次提交

ext4: clean up error handling in the MMP support · 03046886

由 vikram.jadhav07 提交于 3月 13, 2016

There is memory leak as both caller function kmmpd() and callee
read_mmp_block() not releasing bh_check  (i.e buffer_head).
Given patch fixes this problem.

[ Additional changes suggested by Andreas Dilger -- TYT ]
Signed-off-by: NJadhav Vikram <vikramjadhavpucsd2007@gmail.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

03046886

jbd2: do not fail journal because of frozen_buffer allocation failure · 490c1b44

由 Michal Hocko 提交于 3月 13, 2016

Journal transaction might fail prematurely because the frozen_buffer
is allocated by GFP_NOFS request:
[   72.440013] do_get_write_access: OOM for frozen_buffer
[   72.440014] EXT4-fs: ext4_reserve_inode_write:4729: aborting transaction: Out of memory in __ext4_journal_get_write_access
[   72.440015] EXT4-fs error (device sda1) in ext4_reserve_inode_write:4735: Out of memory
(...snipped....)
[   72.495559] do_get_write_access: OOM for frozen_buffer
[   72.495560] EXT4-fs: ext4_reserve_inode_write:4729: aborting transaction: Out of memory in __ext4_journal_get_write_access
[   72.496839] do_get_write_access: OOM for frozen_buffer
[   72.496841] EXT4-fs: ext4_reserve_inode_write:4729: aborting transaction: Out of memory in __ext4_journal_get_write_access
[   72.505766] Aborting journal on device sda1-8.
[   72.505851] EXT4-fs (sda1): Remounting filesystem read-only

This wasn't a problem until "mm: page_alloc: do not lock up GFP_NOFS
allocations upon OOM" because small GPF_NOFS allocations never failed.
This allocation seems essential for the journal and GFP_NOFS is too
restrictive to the memory allocator so let's use __GFP_NOFAIL here to
emulate the previous behavior.
Signed-off-by: NMichal Hocko <mhocko@suse.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

490c1b44

ext4: use __GFP_NOFAIL in ext4_free_blocks() · adb7ef60

由 Konstantin Khlebnikov 提交于 3月 13, 2016

This might be unexpected but pages allocated for sbi->s_buddy_cache are
charged to current memory cgroup. So, GFP_NOFS allocation could fail if
current task has been killed by OOM or if current memory cgroup has no
free memory left. Block allocator cannot handle such failures here yet.
Signed-off-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

adb7ef60

ext4: fix compile error while opening the macro DOUBLE_CHECK · a2821e34

由 Aihua Zhang 提交于 3月 13, 2016

the error is:
    fs/ext4/mballoc.c:475:43: error: 'struct ext4_group_info' has
no member named 'bb_bitmap'.
    so, the definition of macro DOUBLE_CHECK should before
'struct ext4_group_info', I fixed it, and I moved the macro
AGGRESSIVE_CHECK together, because I think they shoule be together.
Signed-off-by: NAihua Zhang <zhangaihua1@huawei.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

a2821e34

13 3月, 2016 2 次提交

ext4: print ext4 mount option data_err=abort correctly · 7915a861

由 Ales Novak 提交于 3月 12, 2016

If data_err=abort option is specified for an ext3/ext4 mount,
/proc/mounts does show it as "(null)". This is caused by token2str()
returning NULL for Opt_data_err_abort (due to its pattern containing
'=').
Signed-off-by: NAles Novak <alnovak@suse.cz>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

7915a861

ext4: fix NULL pointer dereference in ext4_mark_inode_dirty() · 5e1021f2

由 Eryu Guan 提交于 3月 12, 2016

ext4_reserve_inode_write() in ext4_mark_inode_dirty() could fail on
error (e.g. EIO) and iloc.bh can be NULL in this case. But the error is
ignored in the following "if" condition and ext4_expand_extra_isize()
might be called with NULL iloc.bh set, which triggers NULL pointer
dereference.

This is uncovered by commit 8b4953e1 ("ext4: reserve code points for
the project quota feature"), which enlarges the ext4_inode size, and
run the following script on new kernel but with old mke2fs:

  #/bin/bash
  mnt=/mnt/ext4
  devname=ext4-error
  dev=/dev/mapper/$devname
  fsimg=/home/fs.img

  trap cleanup 0 1 2 3 9 15

  cleanup()
  {
          umount $mnt >/dev/null 2>&1
          dmsetup remove $devname
          losetup -d $backend_dev
          rm -f $fsimg
          exit 0
  }

  rm -f $fsimg
  fallocate -l 1g $fsimg
  backend_dev=`losetup -f --show $fsimg`
  devsize=`blockdev --getsz $backend_dev`

  good_tab="0 $devsize linear $backend_dev 0"
  error_tab="0 $devsize error $backend_dev 0"

  dmsetup create $devname --table "$good_tab"

  mkfs -t ext4 $dev
  mount -t ext4 -o errors=continue,strictatime $dev $mnt

  dmsetup load $devname --table "$error_tab" && dmsetup resume $devname
  echo 3 > /proc/sys/vm/drop_caches
  ls -l $mnt
  exit 0

[ Patch changed to simplify the function a tiny bit. -- Ted ]
Signed-off-by: NEryu Guan <guaneryu@gmail.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

5e1021f2

11 3月, 2016 1 次提交

pstore: Add support for 64 Bit address space · 764fd639

由 Wiebe, Wladislav (Nokia - DE/Ulm) 提交于 11月 13, 2015

Some architectures have their reserved RAM in 64 Bit address space.
Therefore convert mem_address module parameter to ullong.
Signed-off-by: NWladislav Wiebe <wladislav.wiebe@nokia.com>
Acked-by: NKees Cook <keescook@chromium.org>
Signed-off-by: NTony Luck <tony.luck@intel.com>

764fd639

10 3月, 2016 5 次提交

ext4: drop unneeded BUFFER_TRACE in ext4_delete_inline_entry() · a8ed9b86

由 Geliang Tang 提交于 3月 10, 2016

BUFFER_TRACE info "call ext4_handle_dirty_metadata" doesn't match the
code, so drop it.
Signed-off-by: NGeliang Tang <geliangtang@163.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

a8ed9b86

ext4: fix misspellings in comments. · b8a07463

由 Adam Buchbinder 提交于 3月 09, 2016

Signed-off-by: NAdam Buchbinder <adam.buchbinder@gmail.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

b8a07463

jbd2: fix FS corruption possibility in jbd2_journal_destroy() on umount path · c0a2ad9b

由 OGAWA Hirofumi 提交于 3月 09, 2016

On umount path, jbd2_journal_destroy() writes latest transaction ID
(->j_tail_sequence) to be used at next mount.

The bug is that ->j_tail_sequence is not holding latest transaction ID
in some cases. So, at next mount, there is chance to conflict with
remaining (not overwritten yet) transactions.

	mount (id=10)
	write transaction (id=11)
	write transaction (id=12)
	umount (id=10) <= the bug doesn't write latest ID

	mount (id=10)
	write transaction (id=11)
	crash

	mount
	[recovery process]
		transaction (id=11)
		transaction (id=12) <= valid transaction ID, but old commit
                                       must not replay

Like above, this bug become the cause of recovery failure, or FS
corruption.

So why ->j_tail_sequence doesn't point latest ID?

Because if checkpoint transactions was reclaimed by memory pressure
(i.e. bdev_try_to_free_page()), then ->j_tail_sequence is not updated.
(And another case is, __jbd2_journal_clean_checkpoint_list() is called
with empty transaction.)

So in above cases, ->j_tail_sequence is not pointing latest
transaction ID at umount path. Plus, REQ_FLUSH for checkpoint is not
done too.

So, to fix this problem with minimum changes, this patch updates
->j_tail_sequence, and issue REQ_FLUSH.  (With more complex changes,
some optimizations would be possible to avoid unnecessary REQ_FLUSH
for example though.)

BTW,

	journal->j_tail_sequence =
		++journal->j_transaction_sequence;

Increment of ->j_transaction_sequence seems to be unnecessary, but
ext3 does this.
Signed-off-by: NOGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org

c0a2ad9b

ext4: more efficient SEEK_DATA implementation · 2d90c160

由 Jan Kara 提交于 3月 09, 2016

Using SEEK_DATA in a huge sparse file can easily lead to sotflockups as
ext4_seek_data() iterates hole block-by-block. Fix the problem by using
returned hole size from ext4_map_blocks() and thus skip the hole in one
go.

Update also SEEK_HOLE implementation to follow the same pattern as
SEEK_DATA to make future maintenance easier.

Furthermore we add cond_resched() to both ext4_seek_data() and
ext4_seek_hole() to avoid softlockups in case evil user creates huge
fragmented file and we have to go through lots of extents.
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

2d90c160

ext4: cleanup handling of bh->b_state in DAX mmap · e3fb8eb1

由 Jan Kara 提交于 3月 09, 2016

ext4_dax_mmap_get_block() updates bh->b_state directly instead of using
ext4_update_bh_state(). This is mostly a cosmetic issue since DAX code
always passes on-stack buffer_head but clean this up to make code more
uniform.
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

e3fb8eb1

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功