- 25 6月, 2015 5 次提交
-
-
由 Yan, Zheng 提交于
Signed-off-by: NYan, Zheng <zyan@redhat.com>
-
由 Yan, Zheng 提交于
In most cases that snap context is needed, we are holding reference of CEPH_CAP_FILE_WR. So we can set ceph inode's i_head_snapc when getting the CEPH_CAP_FILE_WR reference, and make codes get snap context from i_head_snapc. This makes the code simpler. Another benefit of this change is that we can handle snap notification more elegantly. Especially when snap context is updated while someone else is doing write. The old queue cap_snap code may set cap_snap's context to ether the old context or the new snap context, depending on if i_head_snapc is set. The new queue capp_snap code always set cap_snap's context to the old snap context. Signed-off-by: NYan, Zheng <zyan@redhat.com>
-
由 Yan, Zheng 提交于
Cached_context in ceph_snap_realm is directly accessed by uninline_data() and get_pool_perm(). This is racy in theory. both uninline_data() and get_pool_perm() do not modify existing object, they only create new object. So we can pass the empty snap context to them. Unlike cached_context in ceph_snap_realm, we do not need to protect the empty snap context. Signed-off-by: NYan, Zheng <zyan@redhat.com>
-
由 Yan, Zheng 提交于
Signed-off-by: NYan, Zheng <zyan@redhat.com>
-
由 Yan, Zheng 提交于
Signed-off-by: NYan, Zheng <zyan@redhat.com> Reviewed-by: NAlex Elder <elder@linaro.org>
-
- 22 4月, 2015 1 次提交
-
-
由 Yan, Zheng 提交于
For CEPH_OSD_CMPXATTR_MODE_U64, OSD expects the u64 to be encoded as string in object's xattr. Signed-off-by: NYan, Zheng <zyan@redhat.com>
-
- 20 4月, 2015 1 次提交
-
-
由 Taesoo Kim 提交于
When ceph_update_writeable_page fails (including -EAGAIN), it unlocks (w/ unlock_page) the page but does not 'release' (w/ page_cache_release) properly. Upon error, properly set *pagep to NULL, indicating an error. Signed-off-by: NTaesoo Kim <tsgatesv@gmail.com> Signed-off-by: NYan, Zheng <zyan@redhat.com>
-
- 12 4月, 2015 1 次提交
-
-
由 Omar Sandoval 提交于
Now that no one is using rw, remove it completely. Signed-off-by: NOmar Sandoval <osandov@osandov.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 19 2月, 2015 1 次提交
-
-
由 Yan, Zheng 提交于
when inode has inline data but its size > PAGE_SIZE (it was truncated to larger size), previous direct read code return -EIO. This patch adds code to return zeros for data whose offset > PAGE_SIZE. Signed-off-by: NYan, Zheng <zyan@redhat.com>
-
- 11 2月, 2015 1 次提交
-
-
由 Kirill A. Shutemov 提交于
Nobody uses it anymore. [akpm@linux-foundation.org: fix filemap_xip.c] Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 09 1月, 2015 1 次提交
-
-
由 Ilya Dryomov 提交于
len is size_t, should be printed with %zu. Signed-off-by: NIlya Dryomov <idryomov@redhat.com>
-
- 18 12月, 2014 6 次提交
-
-
由 Dan Carpenter 提交于
Probably this code was syncing a lot more often then intended because the do_sync variable wasn't set to zero. Cc: stable@vger.kernel.org # v3.11+ Fixes: c62988ec ('ceph: avoid meaningless calling ceph_caps_revoking if sync_mode == WB_SYNC_ALL.') Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NIlya Dryomov <idryomov@redhat.com>
-
由 Yan, Zheng 提交于
Before any data write, convert inline data to normal data and set i_inline_version to CEPH_INLINE_NONE. The OSD request that saves inline data to object contains 3 operations (CMPXATTR, WRITE and SETXATTR). It compares a xattr named 'inline_version' to prevent old data overwrites newer data. Signed-off-by: NYan, Zheng <zyan@redhat.com>
-
由 Yan, Zheng 提交于
we can't use getattr to fetch inline data while holding Fr cap, because it can cause deadlock. If we need to sync read inline data, drop cap refs first, then use getattr to fetch inline data. Signed-off-by: NYan, Zheng <zyan@redhat.com>
-
由 Yan, Zheng 提交于
we can't use getattr to fetch inline data after getting Fcr caps, because it can cause deadlock. The solution is try bringing inline data to page cache when not holding any cap, and hope the inline data page is still there after getting the Fcr caps. If the page is still there, pin it in page cache for later IO. Signed-off-by: NYan, Zheng <zyan@redhat.com>
-
由 Yan, Zheng 提交于
Request reply and cap message can contain inline data. add inline data to the page cache if there is Fc cap. Signed-off-by: NYan, Zheng <zyan@redhat.com>
-
由 Yan, Zheng 提交于
allow specifying position of extent operation in multi-operations osd request. This is required for cephfs to convert inline data to normal data (compare xattr, then write object). Signed-off-by: NYan, Zheng <zyan@redhat.com> Reviewed-by: NIlya Dryomov <idryomov@redhat.com>
-
- 15 10月, 2014 1 次提交
-
-
由 Chao Yu 提交于
Both ceph_update_writeable_page and ceph_setattr will verify file size with max size ceph supported. There are two caller for ceph_update_writeable_page, ceph_write_begin and ceph_page_mkwrite. For ceph_write_begin, we have already verified the size in generic_write_checks of ceph_write_iter; for ceph_page_mkwrite, we have no chance to change file size when mmap. Likewise we have already verified the size in inode_change_ok when we call ceph_setattr. So let's remove the redundant code for max file size verification. Signed-off-by: NChao Yu <chao2.yu@samsung.com> Reviewed-by: NYan, Zheng <zyan@redhat.com>
-
- 07 6月, 2014 1 次提交
-
-
由 Fabian Frederick 提交于
Update the last pr_warning callsites in fs branch Signed-off-by: NFabian Frederick <fabf@skynet.be> Cc: Sage Weil <sage@inktank.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 06 6月, 2014 1 次提交
-
-
由 Zhang Zhen 提交于
If the return value of ceph_osdc_readpages() is not negative, it is certainly greater than or equal to zero. Remove the useless condition judgment and redundant braces. Signed-off-by: NZhang Zhen <zhenzhang.zhang@huawei.com> Reviewed-by: NYan, Zheng <zheng.z.yan@intel.com>
-
- 07 5月, 2014 1 次提交
-
-
由 Al Viro 提交于
unmodified, for now Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 29 1月, 2014 1 次提交
-
-
由 Ilya Dryomov 提交于
PAGE_CACHE_SIZE is unsigned long on all architectures, however size_t is either unsigned int or unsigned long. Rather than change format strings, cast PAGE_CACHE_SIZE to size_t to be in line with dout()s in ceph_page_mkwrite(). Cc: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: NIlya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: NSage Weil <sage@inktank.com>
-
- 01 1月, 2014 2 次提交
-
-
由 Li Wang 提交于
Currently, if one new page allocated into fscache in readpage(), however, with no data read into due to error encountered during reading from OSDs, the slot in fscache is not uncached. This patch fixes this. Signed-off-by: NLi Wang <liwang@ubuntukylin.com> Reviewed-by: NMilosz Tanski <milosz@adfin.com>
-
由 Yan, Zheng 提交于
Adds cap check to the page fault handler. The check prevents page fault handler from adding new page to the page cache while Fcb caps are being revoked. This solves Fc revoking hang in multiple clients mmap IO workload. Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com> Reviewed-by: NSage Weil <sage@inktank.com>
-
- 14 12月, 2013 2 次提交
-
-
由 Li Wang 提交于
Clean up if error occurred rather than going through normal process Signed-off-by: NLi Wang <liwang@ubuntukylin.com> Signed-off-by: NYunchuan Wen <yunchuanwen@ubuntukylin.com> Signed-off-by: NSage Weil <sage@inktank.com>
-
由 Li Wang 提交于
If the length of data to be read in readpage() is exactly PAGE_CACHE_SIZE, the original code does not flush d-cache for data consistency after finishing reading. This patches fixes this. Signed-off-by: NLi Wang <liwang@ubuntukylin.com> Signed-off-by: NSage Weil <sage@inktank.com>
-
- 24 11月, 2013 1 次提交
-
-
由 Li Wang 提交于
ceph_osdc_readpages() returns number of bytes read, currently, the code only allocate full-zero page into fscache, this patch fixes this. Signed-off-by: NLi Wang <liwang@ubuntukylin.com> Reviewed-by: NMilosz Tanski <milosz@adfin.com> Reviewed-by: NSage Weil <sage@inktank.com>
-
- 07 9月, 2013 3 次提交
-
-
由 Milosz Tanski 提交于
Previous patch that allowed us to cleanup most of the issues with pages marked as private_2 when calling ceph_readpages. However, there seams to be a case in the error case clean up in start read that still trigers this from time to time. I've only seen this one a couple times. BUG: Bad page state in process petabucket pfn:335b82 page:ffffea000cd6e080 count:0 mapcount:0 mapping: (null) index:0x0 page flags: 0x200000000001000(private_2) Call Trace: [<ffffffff81563442>] dump_stack+0x46/0x58 [<ffffffff8112c7f7>] bad_page+0xc7/0x120 [<ffffffff8112cd9e>] free_pages_prepare+0x10e/0x120 [<ffffffff8112e580>] free_hot_cold_page+0x40/0x160 [<ffffffff81132427>] __put_single_page+0x27/0x30 [<ffffffff81132d95>] put_page+0x25/0x40 [<ffffffffa02cb409>] ceph_readpages+0x2e9/0x6f0 [ceph] [<ffffffff811313cf>] __do_page_cache_readahead+0x1af/0x260 Signed-off-by: NMilosz Tanski <milosz@adfin.com> Signed-off-by: NSage Weil <sage@inktank.com>
-
由 Milosz Tanski 提交于
In some cases the ceph readapages code code bails without filling all the pages already marked by fscache. When we return back to readahead code this causes a BUG. Signed-off-by: NMilosz Tanski <milosz@adfin.com>
-
由 Milosz Tanski 提交于
Adding support for fscache to the Ceph filesystem. This would bring it to on par with some of the other network filesystems in Linux (like NFS, AFS, etc...) In order to mount the filesystem with fscache the 'fsc' mount option must be passed. Signed-off-by: NMilosz Tanski <milosz@adfin.com> Signed-off-by: NSage Weil <sage@inktank.com>
-
- 28 8月, 2013 1 次提交
-
-
由 Sha Zhengju 提交于
Following we will begin to add memcg dirty page accounting around __set_page_dirty_{buffers,nobuffers} in vfs layer, so we'd better use vfs interface to avoid exporting those details to filesystems. Since vfs set_page_dirty() should be called under page lock, here we don't need elaborate codes to handle racy anymore, and two WARN_ON() are added to detect such exceptions. Thanks very much for Sage and Yan Zheng's coaching! I tested it in a two server's ceph environment that one is client and the other is mds/osd/mon, and run the following fsx test from xfstests: ./fsx 1MB -N 50000 -p 10000 -l 1048576 ./fsx 10MB -N 50000 -p 10000 -l 10485760 ./fsx 100MB -N 50000 -p 10000 -l 104857600 The fsx does lots of mmap-read/mmap-write/truncate operations and the tests completed successfully without triggering any of WARN_ON. Signed-off-by: NSha Zhengju <handai.szj@taobao.com> Reviewed-by: NSage Weil <sage@inktank.com>
-
- 16 8月, 2013 1 次提交
-
-
由 Milosz Tanski 提交于
The invalidatepage code bails if it encounters a non-zero page offset. The current logic that does is non-obvious with multiple if statements. This should be logically and functionally equivalent. Signed-off-by: NMilosz Tanski <milosz@adfin.com> Reviewed-by: NSage Weil <sage@inktank.com>
-
- 10 8月, 2013 1 次提交
-
-
由 Milosz Tanski 提交于
The early bug checks are moot because the VMA layer ensures those things. 1. It will not call invalidatepage unless PagePrivate (or PagePrivate2) are set 2. It will not call invalidatepage without taking a PageLock first. 3. Guantrees that the inode page is mapped. Signed-off-by: NMilosz Tanski <milosz@adfin.com> Reviewed-by: NSage Weil <sage@inktank.com>
-
- 04 7月, 2013 2 次提交
-
-
由 majianpeng 提交于
Signed-off-by: NJianpeng Ma <majianpeng@gmail.com> Reviewed-by: NSage Weil <sage@inktank.com>
-
由 Yan, Zheng 提交于
The client can receive truncate request from MDS at any time. So the page writeback code need to get i_size, truncate_seq and truncate_size atomically Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com> Reviewed-by: NSage Weil <sage@inktank.com>
-
- 22 5月, 2013 2 次提交
-
-
由 Lukas Czerner 提交于
->invalidatepage() aop now accepts range to invalidate so we can make use of it in ceph_invalidatepage(). Signed-off-by: NLukas Czerner <lczerner@redhat.com> Acked-by: NSage Weil <sage@inktank.com> Cc: ceph-devel@vger.kernel.org
-
由 Lukas Czerner 提交于
Currently there is no way to truncate partial page where the end truncate point is not at the end of the page. This is because it was not needed and the functionality was enough for file system truncate operation to work properly. However more file systems now support punch hole feature and it can benefit from mm supporting truncating page just up to the certain point. Specifically, with this functionality truncate_inode_pages_range() can be changed so it supports truncating partial page at the end of the range (currently it will BUG_ON() if 'end' is not at the end of the page). This commit changes the invalidatepage() address space operation prototype to accept range to be invalidated and update all the instances for it. We also change the block_invalidatepage() in the same way and actually make a use of the new length argument implementing range invalidation. Actual file system implementations will follow except the file systems where the changes are really simple and should not change the behaviour in any way .Implementation for truncate_page_range() which will be able to accept page unaligned ranges will follow as well. Signed-off-by: NLukas Czerner <lczerner@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Hugh Dickins <hughd@google.com>
-
- 02 5月, 2013 3 次提交
-
-
由 Alex Elder 提交于
In the incremental move toward supporting distinct data items in an osd request some of the functions had "write_request" parameters to indicate, basically, whether the data belonged to in_data or the out_data. Now that we maintain the data fields in the op structure there is no need to indicate the direction, so get rid of the "write_request" parameters. Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-
由 Yan, Zheng 提交于
ceph_writepages_start() reads inode->i_size in two places. It can get different values between successive read, because truncate can change inode->i_size at any time. The race can lead to mismatch between data length of osd request and pages marked as writeback. When osd request finishes, it clear writeback page according to its data length. So some pages can be left in writeback state forever. The fix is only read inode->i_size once, save its value to a local variable and use the local variable when i_size is needed. Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com> Reviewed-by: NAlex Elder <elder@inktank.com>
-
由 Alex Elder 提交于
This ends up being a rather large patch but what it's doing is somewhat straightforward. Basically, this is replacing two calls with one. The first of the two calls is initializing a struct ceph_osd_data with data (either a page array, a page list, or a bio list); the second is setting an osd request op so it associates that data with one of the op's parameters. In place of those two will be a single function that initializes the op directly. That means we sort of fan out a set of the needed functions: - extent ops with pages data - extent ops with pagelist data - extent ops with bio list data and - class ops with page data for receiving a response We also have define another one, but it's only used internally: - class ops with pagelist data for request parameters Note that we *still* haven't gotten rid of the osd request's r_data_in and r_data_out fields. All the osd ops refer to them for their data. For now, these data fields are pointers assigned to the appropriate r_data_* field when these new functions are called. Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-