- 19 2月, 2015 22 次提交
-
-
由 Yan, Zheng 提交于
Signed-off-by: NYan, Zheng <zyan@redhat.com>
-
由 Yan, Zheng 提交于
So that MDS can check if any request is already completed and process completed requests in clientreplay stage. When completed requests are processed in clientreplay stage, MDS can avoid sending traceless replies. Signed-off-by: NYan, Zheng <zyan@redhat.com>
-
由 Ilya Dryomov 提交于
Signed-off-by: NIlya Dryomov <idryomov@redhat.com>
-
由 Chaitanya Huilgol 提交于
TCP_NODELAY socket option set on connection sockets, disables Nagle’s algorithm and improves latency characteristics. tcp_nodelay(default)/notcp_nodelay option flags provided to enable/disable setting the socket option. Signed-off-by: NChaitanya Huilgol <chaitanya.huilgol@sandisk.com> [idryomov@redhat.com: NO_TCP_NODELAY -> TCP_NODELAY, minor adjustments] Signed-off-by: NIlya Dryomov <idryomov@redhat.com>
-
由 Ilya Dryomov 提交于
If the clone is resized down to 0, it becomes standalone. If such resize is carried over while an image is mapped we would detect this and call rbd_dev_parent_put() which means "let go of all parent state, including the spec(s) of parent images(s)". This leads to a mismatch between "rbd info" and sysfs parent fields, so a fix is in order. # rbd create --image-format 2 --size 1 foo # rbd snap create foo@snap # rbd snap protect foo@snap # rbd clone foo@snap bar # DEV=$(rbd map bar) # rbd resize --allow-shrink --size 0 bar # rbd resize --size 1 bar # rbd info bar | grep parent parent: rbd/foo@snap Before: # cat /sys/bus/rbd/devices/0/parent (no parent image) After: # cat /sys/bus/rbd/devices/0/parent pool_id 0 pool_name rbd image_id 10056b8b4567 image_name foo snap_id 2 snap_name snap overlap 0 Signed-off-by: NIlya Dryomov <idryomov@redhat.com> Reviewed-by: NJosh Durgin <jdurgin@redhat.com> Reviewed-by: NAlex Elder <elder@linaro.org>
-
由 Yan, Zheng 提交于
ceph_handle_snapdir() checks ceph_mdsc_do_request()'s return value and creates snapdir inode if it's -ENOENT Signed-off-by: NYan, Zheng <zyan@redhat.com>
-
由 Yan, Zheng 提交于
ceph_add_cap() calls __check_cap_issue(), which clears directory inode' complete flag. so we should set the complete flag for empty directory should be set after calling ceph_add_cap(). Signed-off-by: NYan, Zheng <zyan@redhat.com>
-
由 Yan, Zheng 提交于
Signed-off-by: NYan, Zheng <zyan@redhat.com>
-
由 Yan, Zheng 提交于
remove all unsupported operations from {inode,file}_operations. Signed-off-by: NYan, Zheng <zyan@redhat.com>
-
由 Yan, Zheng 提交于
struct timespec uses 'long' to present second and nanosecond. 'long' is 64 bits on 64bits machine. ceph MDS expects time stamp to be encoded as struct ceph_timespec, which uses 'u32' to present second and nanosecond. Signed-off-by: NYan, Zheng <zyan@redhat.com>
-
由 Yan, Zheng 提交于
when inode has inline data but its size > PAGE_SIZE (it was truncated to larger size), previous direct read code return -EIO. This patch adds code to return zeros for data whose offset > PAGE_SIZE. Signed-off-by: NYan, Zheng <zyan@redhat.com>
-
由 Yan, Zheng 提交于
use an atomic variable to track number of sessions, this can avoid block operation inside wait loops. Signed-off-by: NYan, Zheng <zyan@redhat.com>
-
由 Yan, Zheng 提交于
we should not do block operation in wait_event_interruptible()'s condition check function, but reading inline data can block. so move the read inline data code to ceph_get_caps() Signed-off-by: NYan, Zheng <zyan@redhat.com>
-
由 Yan, Zheng 提交于
check_cap_flush() calls mutex_lock(), which may block. So we can't use it as condition check function for wait_event(); Signed-off-by: NYan, Zheng <zyan@redhat.com>
-
由 Ilya Dryomov 提交于
header_rwsem should be released on errors. Also remove useless rbd_dev->mapping.size != rbd_dev->header.image_size test. Signed-off-by: NIlya Dryomov <idryomov@redhat.com>
-
由 Yan, Zheng 提交于
When snaprealm is created, its initial reference count is zero. But in some rare cases, the newly created snaprealm is not referenced by anyone. This causes snaprealm with zero reference count not freed. The fix is set reference count of newly snaprealm to 1. The reference is return the function who requests to create the snaprealm. When the function finishes its job, it releases the reference. Signed-off-by: NYan, Zheng <zyan@redhat.com>
-
由 Yan, Zheng 提交于
A bug is found in striped_read() of fs/ceph/file.c. striped_read() calls ceph_zero_pape_vector_range(). The first argument, page_align + read + ret, passed to ceph_zero_pape_vector_range() is wrong. When a file has holes, this wrong parameter may cause memory corruption either in kernal space or user space. Kernel space memory may be corrupted in the case of non direct IO; user space memory may be corrupted in the case of direct IO. In the latter case, the application doing direct IO may crash due to memory corruption, as we have experienced. The correct value should be initial_align + read + ret, where intial_align = o_direct ? buf_align : io_align. Compared with page_align, the current page offest, initial_align is the initial page offest, which should be used to calculate the page and offset in ceph_zero_pape_vector_range(). Reported-by: Ncaifeng zhu <zhucaifeng@unissoft-nj.com> Signed-off-by: NYan, Zheng <zyan@redhat.com>
-
由 Rickard Strandqvist 提交于
Remove the function ceph_get_cached_acl() that is not used anywhere. This was partially found by using a static code analysis program called cppcheck. Signed-off-by: NRickard Strandqvist <rickard_strandqvist@spectrumdigital.se> Reviewed-by: NYan, Zheng <zyan@redhat.com>
-
由 Rickard Strandqvist 提交于
It's been largely superseded by dup_token() and unused for over 2 years, identified by cppcheck. Signed-off-by: NRickard Strandqvist <rickard_strandqvist@spectrumdigital.se> [idryomov@redhat.com: changelog] Signed-off-by: NIlya Dryomov <idryomov@redhat.com>
-
由 Yan, Zheng 提交于
mark session as readonly and wake up all cap waiters. Signed-off-by: NYan, Zheng <zyan@redhat.com>
-
由 Ilya Dryomov 提交于
Signed-off-by: NIlya Dryomov <idryomov@redhat.com>
-
由 Ilya Dryomov 提交于
On Mon, Dec 22, 2014 at 5:35 PM, Sage Weil <sage@newdream.net> wrote: > On Mon, 22 Dec 2014, Ilya Dryomov wrote: >> Actually, pool op stuff has been unused for over two years - looks like >> it was added for rbd create_snap and that got ripped out in 2012. It's >> unlikely we'd ever need to manage pools or snaps from the kernel client >> so I think it makes sense to nuke it. Sage? > > Yep! Signed-off-by: NIlya Dryomov <idryomov@redhat.com>
-
- 09 2月, 2015 5 次提交
-
-
由 Linus Torvalds 提交于
-
git://git.rocketboards.org/linux-socfpga-next由 Linus Torvalds 提交于
Pull nios2 fix from Ley Foon Tan: "This fixes incorrect behavior of some user programs" * tag 'nios2-fixes-v3.19-final' of git://git.rocketboards.org/linux-socfpga-next: nios2: fix unhandled signals
-
git://git.kvack.org/~bcrl/aio-fixes由 Linus Torvalds 提交于
Pull aio nested sleep annotation from Ben LaHaise, * git://git.kvack.org/~bcrl/aio-fixes: aio: annotate aio_read_event_ring for sleep patterns
-
由 Linus Torvalds 提交于
Merge tag 'trace-fixes-v3.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull ftrace fixes from Steven Rostedt: "During testing Sedat Dilek hit a "suspicious RCU usage" splat that pointed out a real bug. During suspend and resume the tlb_flush tracepoint is called when the CPU is going offline. As the CPU has been noted as offline, RCU is ignoring that CPU, which means that it can not use RCU protected locks. When tracepoints are activated, they require RCU locking, and if RCU is ignoring a CPU that runs a tracepoint, there is a chance that the tracepoint could cause corruption. The solution was to change the tracepoint into a TRACE_EVENT_CONDITION() which allows us to check a condition to determine if the tracepoint should be called or not. If the condition is not met, the rcu protected code will not be executed. By adding the condition "cpu_online(smp_processor_id())", this will prevent the RCU protected code from being executed if the CPU is marked offline. After adding this, another bug was discovered. As RCU checks rcu callers, if a rcu call is not done, there is no check (obviously). We found that tracepoints could be added in RCU ignored locations and not have lockdep complain until the tracepoint is activated. This missed places where tracepoints were added in places they should not have been. To fix this, code was added in 3.18 that if lockdep is enabled, any tracepoint will still call the rcu checks even if the tracepoint is not enabled. The bug here, is that the check does not take the CONDITION into account. As the condition may prevent tracepoints from being activated in RCU ignored areas (as the one patch does), we get false positives when we enable lockdep and hit a tracepoint that the condition prevents it from being called in a RCU ignored location. The fix for this is to add the CONDITION to the rcu checks, even if the tracepoint is not enabled" * tag 'trace-fixes-v3.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: x86/tlb/trace: Do not trace on CPU that is offline tracing: Add condition check to RCU lockdep checks
-
由 Chung-Ling Tang 提交于
Follow other architectures for user fault handling. Signed-off-by: NChung-Ling Tang <cltang@codesourcery.com> Acked-by: NLey Foon Tan <lftan@altera.com>
-
- 08 2月, 2015 4 次提交
-
-
由 Steven Rostedt (Red Hat) 提交于
When taking a CPU down for suspend and resume, a tracepoint may be called when the CPU has been designated offline. As tracepoints require RCU for protection, they must not be called if the current CPU is offline. Unfortunately, trace_tlb_flush() is called in this scenario as was noted by LOCKDEP: ... Disabling non-boot CPUs ... intel_pstate CPU 1 exiting =============================== smpboot: CPU 1 didn't die... [ INFO: suspicious RCU usage. ] 3.19.0-rc7-next-20150204.1-iniza-small #1 Not tainted ------------------------------- include/trace/events/tlb.h:35 suspicious rcu_dereference_check() usage! other info that might help us debug this: RCU used illegally from offline CPU! rcu_scheduler_active = 1, debug_locks = 0 no locks held by swapper/1/0. stack backtrace: CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.19.0-rc7-next-20150204.1-iniza-small #1 Hardware name: SAMSUNG ELECTRONICS CO., LTD. 530U3BI/530U4BI/530U4BH/530U3BI/530U4BI/530U4BH, BIOS 13XK 03/28/2013 0000000000000001 ffff88011a44fe18 ffffffff817e370d 0000000000000011 ffff88011a448290 ffff88011a44fe48 ffffffff810d6847 ffff8800c66b9600 0000000000000001 ffff88011a44c000 ffffffff81cb3900 ffff88011a44fe78 Call Trace: [<ffffffff817e370d>] dump_stack+0x4c/0x65 [<ffffffff810d6847>] lockdep_rcu_suspicious+0xe7/0x120 [<ffffffff810b71a5>] idle_task_exit+0x205/0x2c0 [<ffffffff81054c4e>] play_dead_common+0xe/0x50 [<ffffffff81054ca5>] native_play_dead+0x15/0x140 [<ffffffff8102963f>] arch_cpu_idle_dead+0xf/0x20 [<ffffffff810cd89e>] cpu_startup_entry+0x37e/0x580 [<ffffffff81053e20>] start_secondary+0x140/0x150 intel_pstate CPU 2 exiting ... By converting the tlb_flush tracepoint to a TRACE_EVENT_CONDITION where the condition is cpu_online(smp_processor_id()), we can avoid calling RCU protected code when the CPU is offline. Link: http://lkml.kernel.org/r/CA+icZUUGiGDoL5NU8RuxKzFjoLjEKRtUWx=JB8B9a0EQv-eGzQ@mail.gmail.com Cc: stable@vger.kernel.org # 3.17+ Fixes: d17d8f9d "x86/mm: Add tracepoints for TLB flushes" Reported-by: NSedat Dilek <sedat.dilek@gmail.com> Tested-by: NSedat Dilek <sedat.dilek@gmail.com> Suggested-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: NDave Hansen <dave@sr71.net> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Steven Rostedt (Red Hat) 提交于
The trace_tlb_flush() tracepoint can be called when a CPU is going offline. When a CPU is offline, RCU is no longer watching that CPU and since the tracepoint is protected by RCU, it must not be called. To prevent the tlb_flush tracepoint from being called when the CPU is offline, it was converted to a TRACE_EVENT_CONDITION where the condition checks if the CPU is online before calling the tracepoint. Unfortunately, this was not enough to stop lockdep from complaining about it. Even though the RCU protected code of the tracepoint will never be called, the condition is hidden within the tracepoint, and even though the condition prevents RCU code from being called, the lockdep checks are outside the tracepoint (this is to test tracepoints even when they are not enabled). Even though tracepoints should be checked to be RCU safe when they are not enabled, the condition should still be considered when checking RCU. Link: http://lkml.kernel.org/r/CA+icZUUGiGDoL5NU8RuxKzFjoLjEKRtUWx=JB8B9a0EQv-eGzQ@mail.gmail.com Fixes: 3a630178 "tracing: generate RCU warnings even when tracepoints are disabled" Cc: stable@vger.kernel.org # 3.18+ Acked-by: NDave Hansen <dave@sr71.net> Reported-by: NSedat Dilek <sedat.dilek@gmail.com> Tested-by: NSedat Dilek <sedat.dilek@gmail.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband由 Linus Torvalds 提交于
Pull one more infiniband revert from Roland Dreier: "One more last-second RDMA change for 3.19: Yann realized that the previous revert of new userspace ABI did not go far enough, and we're still exposing a change that we don't want. Revert even closer to 3.18 interface to make sure we get things right in the long run" Yann Droneaud pipes up: "I hope this could go in v3.19 as, at this stage, we don't want to expose any bits of this ABI in a released kernel" * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: Revert "IB/core: Add support for extended query device caps"
-
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs由 Linus Torvalds 提交于
Pull btrfs fix from Chris Mason: "Forrest Liu tracked down a missing blk_finish_plug in the btrfs logging code. This isn't a new bug, and it's hard to hit. But, it's safe enough for inclusion now, and in my for-linus branch" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: add missing blk_finish_plug in btrfs_sync_log()
-
- 07 2月, 2015 6 次提交
-
-
由 Linus Torvalds 提交于
Merge branches 'timers-urgent-for-linus' and 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer and x86 fix from Ingo Molnar: "A CLOCK_TAI early expiry fix and an x86 microcode driver oops fix" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: hrtimer: Fix incorrect tai offset calculation for non high-res timer systems * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, microcode: Return error from driver init code when loader is disabled
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip由 Linus Torvalds 提交于
Pull scheduler fixes from Ingo Molnar: "Misc fixes" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/deadline: Fix deadline parameter modification handling sched/wait: Remove might_sleep() from wait_event_cmd() sched: Fix crash if cpuset_cpumask_can_shrink() is passed an empty cpumask sched/fair: Avoid using uninitialized variable in preferred_group_nid()
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip由 Linus Torvalds 提交于
Pull core kernel fixes from Ingo Molnar: "Two liblockdep fixes and a CPU hotplug race fix" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tools/liblockdep: don't include host headers tools/liblockdep: ignore generated .so file smpboot: Add missing get_online_cpus() in smpboot_register_percpu_thread()
-
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound由 Linus Torvalds 提交于
Pull sound fixes from Takashi Iwai: "Hopefully the final pull request for 3.19: this ended up with a slightly higher volume than wished, but I put them all as they are either stable or 3.19 regression fixes. Most of commits are from ASoC, and have been stewed for a while in linux-next. The only change in the common code is the regression fixes for ASoC AC97 stuff wrt device registrations. The rest are device-specific, mostly small fixes in various ASoC drivers and ak411x on ice1724 boards" * tag 'sound-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ASoC: Intel: fix sst firmware path for cht-bsw-rt5672 ARM: dts: Fix I2S1, I2S2 compatible for exynos4 SoCs ASoC: sgtl5000: add delay before first I2C access MAINTAINERS: ASoC: add maintainer for Intel BDW/HSW ASoC driver ASoC: atmel_ssc_dai: fix the setting for DSP mode ASoC: sgtl5000: Use shift mask when setting codec mode ASoC: tlv320aic3x: Fix data delay configuration ALSA: ak411x: Fix stall in work callback ASoC: Intel: Used lock version to update shim registers ASoC: wm8731: init mutex in i2c init path ASoC: atmel_ssc_dai: fix start event for I2S mode ASoC: rt5640: Add RT5642 ACPI ID for Intel Baytrail ASoC: wm97xx: Reset AC'97 device before registering it ASoC: Add support for allocating AC'97 device before registering it
-
由 Linus Torvalds 提交于
Merge misc fixes from Andrew Morton: "7 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm/debug_pagealloc: fix build failure on ppc and some other archs nilfs2: fix deadlock of segment constructor over I_SYNC flag MAINTAINERS: remove SUPERH website memcg, shmem: fix shmem migration to use lrucare mm: export "high_memory" symbol on !MMU .mailmap: update Konstantin Khlebnikov's email address mm: pagewalk: call pte_hole() for VM_PFNMAP during walk_page_range
-
git://git.linux-mips.org/pub/scm/ralf/upstream-linus由 Linus Torvalds 提交于
Pull MIPS fixes from Ralf Baechle: "The pending MIPS fixes for 3.19. All across the field and nothing particularly severe or dramatic" * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (23 commits) IRQCHIP: mips-gic: Avoid rerouting timer IRQs for smp-cmp MIPS: Fix syscall_get_nr for the syscall exit tracing. MIPS: elf2ecoff: Ignore PT_MIPS_ABIFLAGS program headers. MIPS: elf2ecoff: Rewrite main processing loop to switch. MIPS: fork: Fix MSA/FPU/DSP context duplication race MIPS: Fix C0_Pagegrain[IEC] support. MIPS: traps: Fix inline asm ctc1 missing .set hardfloat MIPS: mipsregs.h: Add write_32bit_cp1_register() MIPS: Fix kernel lockup or crash after CPU offline/online MIPS: OCTEON: fix kernel crash when offlining a CPU MIPS: ARC: Fix build error. MIPS: IRQ: Fix disable_irq on CPU IRQs MIPS: smp-mt,smp-cmp: Enable all HW IRQs on secondary CPUs MIPS: Fix restart of indirect syscalls MIPS: ELF: fix loading o32 binaries on 64-bit kernels MIPS: mips-cm: Fix sparse warnings MIPS: Kconfig: Fix recursive dependency. MIPS: Compat: Fix build error if CONFIG_MIPS32_COMPAT but no compat ABI. MIPS: JZ4740: Fixup #include's (sparse) MIPS: Wire up execveat(2). ...
-
- 06 2月, 2015 3 次提交
-
-
由 Yann Droneaud 提交于
While commit 7e36ef82 ("IB/core: Temporarily disable ex_query_device uverb") is correct as it makes the extended QUERY_DEVICE uverb (which came as part of commit 5a77abf9 ("IB/core: Add support for extended query device caps") and commit 860f10a7 ("IB/core: Add flags for on demand paging support")) not available to userspace, it doesn't address the initial issue regarding ib_copy_to_udata() [1][2]. Additionally, further discussions around this new uverb seems to conclude it would require a different data structure than the one currently described in <rdma/ib_user_verbs.h> [3]. Both of these issues require a revert of the changes, so this patch partially reverts commit 8cdd312c ("IB/mlx5: Implement the ODP capability query verb") and commit 860f10a7 ("IB/core: Add flags for on demand paging support") and fully reverts commit 5a77abf9 ("IB/core: Add support for extended query device caps"). [1] "Re: [PATCH v3 06/17] IB/core: Add support for extended query device caps" http://mid.gmane.org/1418733236.2779.26.camel@opteya.com [2] "Re: [PATCH] IB/core: Temporarily disable ex_query_device uverb" http://mid.gmane.org/1423067503.3030.83.camel@opteya.com [3] "RE: [PATCH v1 1/5] IB/uverbs: ex_query_device: answer must not depend on request's comp_mask" http://mid.gmane.org/2807E5FD2F6FDA4886F6618EAC48510E0CC12C30@CRSMSX101.amr.corp.intel.com Cc: Eli Cohen <eli@mellanox.com> Cc: Haggai Eran <haggaie@mellanox.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Shachar Raindel <raindel@mellanox.com> Signed-off-by: NYann Droneaud <ydroneaud@opteya.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Joonsoo Kim 提交于
Kim Phillips reported following build failure. LD init/built-in.o mm/built-in.o: In function `free_pages_prepare': mm/page_alloc.c:770: undefined reference to `.kernel_map_pages' mm/built-in.o: In function `prep_new_page': mm/page_alloc.c:933: undefined reference to `.kernel_map_pages' mm/built-in.o: In function `map_pages': mm/compaction.c:61: undefined reference to `.kernel_map_pages' make: *** [vmlinux] Error 1 Reason for this problem is that commit 031bc574 ("mm/debug-pagealloc: make debug-pagealloc boottime configurable") forgot to remove the old declaration of kernel_map_pages() for some architectures. This patch removes them to fix build failure. Reported-by: NKim Phillips <kim.phillips@freescale.com> Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: David Miller <davem@davemloft.net> Cc: David Howells <dhowells@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ryusuke Konishi 提交于
Nilfs2 eventually hangs in a stress test with fsstress program. This issue was caused by the following deadlock over I_SYNC flag between nilfs_segctor_thread() and writeback_sb_inodes(): nilfs_segctor_thread() nilfs_segctor_thread_construct() nilfs_segctor_unlock() nilfs_dispose_list() iput() iput_final() evict() inode_wait_for_writeback() * wait for I_SYNC flag writeback_sb_inodes() * set I_SYNC flag on inode->i_state __writeback_single_inode() do_writepages() nilfs_writepages() nilfs_construct_dsync_segment() nilfs_segctor_sync() * wait for completion of segment constructor inode_sync_complete() * clear I_SYNC flag after __writeback_single_inode() completed writeback_sb_inodes() calls do_writepages() for dirty inodes after setting I_SYNC flag on inode->i_state. do_writepages() in turn calls nilfs_writepages(), which can run segment constructor and wait for its completion. On the other hand, segment constructor calls iput(), which can call evict() and wait for the I_SYNC flag on inode_wait_for_writeback(). Since segment constructor doesn't know when I_SYNC will be set, it cannot know whether iput() will block or not unless inode->i_nlink has a non-zero count. We can prevent evict() from being called in iput() by implementing sop->drop_inode(), but it's not preferable to leave inodes with i_nlink == 0 for long periods because it even defers file truncation and inode deallocation. So, this instead resolves the deadlock by calling iput() asynchronously with a workqueue for inodes with i_nlink == 0. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: Al Viro <viro@zeniv.linux.org.uk> Tested-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-