Commits · 4531126753aaf936e2674d28245400c6559ef0ee · openanolis / cloud-kernel

26 Mar, 2016 11 commits

ceph: remove unnecessary NULL check · 45311267

Authored by Yan, Zheng on Mar 10, 2016

If page->mapping is NULL, releasepage() callback does not get called.
Remove the unnecessary NULL check to keep static code analysis tools
happy.
Signed-off-by: Yan, Zheng <zyan@redhat.com>

ceph: avoid updating directory inode's i_size accidentally · a3d714c3

Authored by Yan, Zheng on Feb 26, 2016

Directory inode's i_size is used by readdir cache.
Signed-off-by: Yan, Zheng <zyan@redhat.com>

ceph: fix race during filling readdir cache · af5e5eb5

Authored by Yan, Zheng on Feb 26, 2016

Readdir cache uses page cache to save dentry pointers. When adding
dentry pointers to the middle of a page, we need to make sure the page
already exists. Otherwise the beginning part of the page will contain
invalid pointers.
Signed-off-by: Yan, Zheng <zyan@redhat.com>

ceph: kill ceph_empty_snapc · 34b759b4

Authored by Ilya Dryomov on Feb 16, 2016

ceph_empty_snapc->num_snaps == 0 at all times.  Passing such a snapc to
ceph_osdc_alloc_request() (possibly through ceph_osdc_new_request()) is
equivalent to passing NULL, as ceph_osdc_alloc_request() uses it only
for sizing the request message.

Further, in all four cases the subsequent ceph_osdc_build_request() is
passed NULL for snapc, meaning that 0 is encoded for seq and num_snaps
and making ceph_empty_snapc entirely useless.  The two cases where it
actually mattered were removed in commits 86056090 ("ceph: avoid
sending unnessesary FLUSHSNAP message") and 23078637 ("ceph: fix
queuing inode to mdsdir's snaprealm").
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Yan, Zheng <zyan@redhat.com>

ceph: fix a wrong comparison · ce435593

Authored by Anton Protopopov on Feb 10, 2016

A negative value rc was compared to the positive value ENOENT in the
finish_read() function.
Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
Signed-off-by: Yan, Zheng <zyan@redhat.com>
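The class of bug described in this commit can be shown in a few lines of userspace C. This is only an illustrative sketch: `is_hole_buggy()` and `is_hole_fixed()` are hypothetical names, not functions from fs/ceph. It assumes the kernel convention that a status value is zero or a negative errno, so a comparison against the bare positive `ENOENT` can never match.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Buggy form: rc carries a negative errno, but ENOENT is positive. */
bool is_hole_buggy(int rc)
{
	assert(rc <= 0);	/* kernel-style status: 0 or -errno */
	return rc == ENOENT;
}

/* Fixed form: compare against the negated constant. */
bool is_hole_fixed(int rc)
{
	assert(rc <= 0);	/* kernel-style status: 0 or -errno */
	return rc == -ENOENT;
}
```

With `rc = -ENOENT`, the buggy variant reports false while the fixed variant reports true.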

ceph: replace CURRENT_TIME by current_fs_time() · 8bbd4714

Authored by Deepa Dinamani on Feb 02, 2016

CURRENT_TIME macro is not appropriate for filesystems as it
doesn't use the right granularity for filesystem timestamps.
Use current_fs_time() instead.
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: Yan, Zheng <zyan@redhat.com>

ceph: scattered page writeback · 5b64640c

Authored by Yan, Zheng on Jan 07, 2016

This patch makes ceph_writepages_start() try using a single OSD request
to write all dirty pages within a strip unit. When a nonconsecutive
dirty page is found, ceph_writepages_start() tries to add a new write
operation to the existing OSD request. If it succeeds, it uses the new
operation to write back the dirty page.
Signed-off-by: Yan, Zheng <zyan@redhat.com>
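The grouping idea in this commit can be sketched in userspace C. The code below is not taken from ceph_writepages_start(): `count_requests()` and its `max_ops` parameter (a cap on write operations per request) are hypothetical, and strip-unit boundaries are ignored. It assumes the dirty page indices arrive sorted. Each run of consecutive indices becomes one write operation, and operations are packed into the current request until the cap is reached.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count the requests needed to write the dirty pages whose sorted
 * indices are in pages[0..n). A run of consecutive indices forms one
 * write operation; a request holds at most max_ops operations.
 */
size_t count_requests(const unsigned long *pages, size_t n, size_t max_ops)
{
	size_t requests = 0, ops = 0;

	assert(max_ops > 0);
	for (size_t i = 0; i < n; i++) {
		if (i > 0 && pages[i] == pages[i - 1] + 1)
			continue;	/* page extends the current operation */
		if (ops == 0 || ops == max_ops) {
			requests++;	/* no room left: start a new request */
			ops = 0;
		}
		ops++;			/* new operation in the current request */
	}
	return requests;
}
```

For the indices 1, 2, 3, 7, 8, 20 there are three runs, so one request suffices when `max_ops` is at least 3, while `max_ops = 1` falls back to one request per run.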

ceph: remove useless BUG_ON · a587d71b

Authored by Yan, Zheng on Jan 27, 2016

ceph_osdc_start_request() never returns -EOLDSNAP
Signed-off-by: Yan, Zheng <zyan@redhat.com>

ceph: don't enable rbytes mount option by default · 133e9156

Authored by Yan, Zheng on Jan 25, 2016

When rbytes mount option is enabled, directory size is recursive
size. Recursive size is not updated instantly. This can cause
directory size to change between successive stat(1) calls.
Signed-off-by: Yan, Zheng <zyan@redhat.com>

ceph: encode ctime in cap message · d1eee0c0

Authored by Yan, Zheng on Jan 22, 2016

Signed-off-by: Yan, Zheng <zyan@redhat.com>

libceph: revamp subs code, switch to SUBSCRIBE2 protocol · 82dcabad

Authored by Ilya Dryomov on Jan 19, 2016

It is currently hard-coded in the mon_client that mdsmap and monmap
subs are continuous, while osdmap sub is always "onetime". To better
handle full clusters/pools in the osd_client, we need to be able to
issue continuous osdmap subs. Revamp subs code to allow us to specify
for each sub whether it should be continuous or not.

Although not strictly required for the above, switch to SUBSCRIBE2
protocol while at it, eliminating the ambiguity between a request for
"every map since X" and a request for "just the latest" when we don't
have a map yet (i.e. have epoch 0). SUBSCRIBE2 feature bit is now
required - it's been supported since pre-argonaut (2010).

Move "got mdsmap" call to the end of ceph_mdsc_handle_map() - calling
it before we validate the epoch and successfully install the new map
can mess up mon_client sub state.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

14 Mar, 2016 2 commits

ceph_fill_trace(): don't bother with d_instantiate(dn, NULL) · f8b31710

Authored by Al Viro on Mar 07, 2016

... and use d_add(dn, NULL) in case we need to hash a negative
unhashed rather than using d_rehash() directly.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ceph: don't bother with d_rehash() in splice_dentry() · f7380af0

Authored by Al Viro on Mar 06, 2016

d_splice_alias() guarantees that it'll be always hashed
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

05 Mar, 2016 1 commit

ceph: initial CEPH_FEATURE_FS_FILE_LAYOUT_V2 support · 5ea5c5e0

Authored by Yan, Zheng on Feb 14, 2016

Add support for the format change of MClientReply/MClientCaps.
Also add code that denies access to inodes with pool_ns layouts.
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>

05 Feb, 2016 2 commits

ceph: fix snap context leak in error path · db6aed70

Authored by Yan, Zheng on Jan 26, 2016

Signed-off-by: Yan, Zheng <zyan@redhat.com>

ceph: checking for IS_ERR instead of NULL · 1418bf07

Authored by Dan Carpenter on Jan 26, 2016

ceph_osdc_alloc_request() returns NULL on error, it never returns error
pointers.

Fixes: 5be0389d ('ceph: re-send AIO write request when getting -EOLDSNAP error')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
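Why an IS_ERR-style test misses a NULL return can be sketched in userspace C. Everything below is a stand-in: `is_err()` mimics the kernel's `IS_ERR()` range check, and `alloc_request()`, `submit_buggy()` and `submit_fixed()` are hypothetical helpers, not the libceph API. Error pointers occupy the top `MAX_ERRNO` addresses, so NULL is never classified as one.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace stand-in for IS_ERR(): true only for the top MAX_ERRNO addresses. */
bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical allocator that reports failure as NULL. */
void *alloc_request(bool fail)
{
	static char storage[16];

	return fail ? NULL : storage;
}

/* Buggy caller: the error-pointer test never fires for NULL. */
int submit_buggy(bool fail)
{
	void *req = alloc_request(fail);

	if (is_err(req))
		return -ENOMEM;
	return 0;	/* failure goes unnoticed; req may be NULL here */
}

/* Fixed caller: test for NULL, matching what the allocator returns. */
int submit_fixed(bool fail)
{
	void *req = alloc_request(fail);

	if (!req)
		return -ENOMEM;
	return 0;
}
```

When the allocation fails, `submit_buggy()` still reports success, whereas `submit_fixed()` returns `-ENOMEM`.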

23 Jan, 2016 1 commit

wrappers for ->i_mutex access · 5955102c

Authored by Al Viro on Jan 22, 2016

parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
inode_foo(inode) being mutex_foo(&inode->i_mutex).

Please, use those for access to ->i_mutex; over the coming cycle
->i_mutex will become rwsem, with ->lookup() done with it held
only shared.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

22 Jan, 2016 5 commits

ceph: use i_size_{read,write} to get/set i_size · 99c88e69

Authored by Yan, Zheng on Dec 30, 2015

Cap message from MDS can update i_size. In that case, we don't
hold i_mutex. So it's unsafe to directly access inode->i_size
even while holding i_mutex.
Signed-off-by: Yan, Zheng <zyan@redhat.com>

ceph: re-send AIO write request when getting -EOLDSNAP error · 5be0389d

Authored by Yan, Zheng on Dec 24, 2015

When receiving -EOLDSNAP from OSD, we need to re-send the corresponding
write request. Due to a locking issue, we can't send a new request inside
another OSD request's complete callback. So we use a worker to re-send
the request for AIO write.
Signed-off-by: Yan, Zheng <zyan@redhat.com>

ceph: Asynchronous IO support · c8fe9b17

Authored by Yan, Zheng on Dec 23, 2015

The basic idea of AIO support is simple, just call kiocb::ki_complete()
in OSD request's complete callback. But there are several special cases.

When IO spans multiple objects, we need to wait until all OSD requests
are complete, then call kiocb::ki_complete(). Error handling in this case
is tricky too. For simplicity, AIO that both spans multiple objects and
extends i_size is not allowed.

Another special case is checking EOF for reading (other clients can write
to the file and extend i_size concurrently). For simplicity, the
direct-IO/AIO code path does not do the check and falls back to normal
sync read instead.
Signed-off-by: Yan, Zheng <zyan@redhat.com>

ceph: Avoid to propagate the invalid page point · 458c4703

Authored by Minfei Huang on Dec 19, 2015

The variable pagep still receives the invalid page pointer even when
ceph_update_writeable_page() fails.

To fix this issue, assign the page to pagep only when
ceph_update_writeable_page() does not fail.
Signed-off-by: Minfei Huang <mnfhuang@gmail.com>
Signed-off-by: Yan, Zheng <zyan@redhat.com>

ceph: fix double page_unlock() in page_mkwrite() · f9cac5ac

Authored by Yan, Zheng on Dec 17, 2015

ceph_update_writeable_page() unlocks the page on errors, so
page_mkwrite() should not unlock the page again.
Signed-off-by: Yan, Zheng <zyan@redhat.com>

15 Jan, 2016 1 commit

kmemcg: account certain kmem allocations to memcg · 5d097056

Authored by Vladimir Davydov on Jan 14, 2016

Mark those kmem allocations that are known to be easily triggered from
userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
memcg.  For the list, see below:

 - threadinfo
 - task_struct
 - task_delay_info
 - pid
 - cred
 - mm_struct
 - vm_area_struct and vm_region (nommu)
 - anon_vma and anon_vma_chain
 - signal_struct
 - sighand_struct
 - fs_struct
 - files_struct
 - fdtable and fdtable->full_fds_bits
 - dentry and external_name
 - inode for all filesystems. This is the most tedious part, because
   most filesystems overwrite the alloc_inode method.

The list is far from complete, so feel free to add more objects.
Nevertheless, it should be close to "account everything" approach and
keep most workloads within bounds.  Malevolent users will be able to
breach the limit, but this was possible even with the former "account
everything" approach (simply because it did not account everything in
fact).

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Greg Thelen <gthelen@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

09 Dec, 2015 1 commit

replace ->follow_link() with new method that could stay in RCU mode · 6b255391

Authored by Al Viro on Nov 17, 2015

new method: ->get_link(); replacement of ->follow_link().  The differences
are:
	* inode and dentry are passed separately
	* might be called both in RCU and non-RCU mode;
the former is indicated by passing it a NULL dentry.
	* when called that way it isn't allowed to block
and should return ERR_PTR(-ECHILD) if it needs to be called
in non-RCU mode.

It's a flagday change - the old method is gone, all in-tree instances
converted.  Conversion isn't hard; said that, so far very few instances
do not immediately bail out when called in RCU mode.  That'll change
in the next commits.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

07 Dec, 2015 1 commit

posix acls: Remove duplicate xattr name definitions · 97d79299

Authored by Andreas Gruenbacher on Dec 02, 2015

Remove POSIX_ACL_XATTR_{ACCESS,DEFAULT} and GFS2_POSIX_ACL_{ACCESS,DEFAULT}
and replace them with the definitions in <include/uapi/linux/xattr.h>.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

07 Nov, 2015 1 commit

mm, fs: introduce mapping_gfp_constraint() · c62d2555

Authored by Michal Hocko on Nov 06, 2015

There are many places which use mapping_gfp_mask to restrict a more
generic gfp mask which would be used for allocations which are not
directly related to the page cache but they are performed in the same
context.

Let's introduce a helper function which makes the restriction explicit and
easier to track.  This patch doesn't introduce any functional changes.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Michal Hocko <mhocko@suse.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

03 Nov, 2015 7 commits

libceph: msg signing callouts don't need con argument · 79dbd1ba

Authored by Ilya Dryomov on Oct 26, 2015

We can use msg->con instead - at the point we sign an outgoing message
or check the signature on the incoming one, msg->con is always set.  We
wouldn't know how to sign a message without an associated session (i.e.
msg->con == NULL) and being able to sign a message using an explicitly
provided authorizer is of no use.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

ceph: make fsync() wait unsafe requests that created/modified inode · 68cd5b4b

Authored by Yan, Zheng on Oct 27, 2015

If we get an unsafe reply for a request that created/modified an inode,
add the unsafe request to a list in the newly created/modified
inode. So we can make fsync() wait for these unsafe requests.
Signed-off-by: Yan, Zheng <zyan@redhat.com>

ceph: add request to i_unsafe_dirops when getting unsafe reply · 4c06ace8

Authored by Yan, Zheng on Oct 27, 2015

Previously we added the request to i_unsafe_dirops when registering
the request. So ceph_fsync() also waited for incomplete requests.
This is unnecessary; ceph_fsync() only needs to wait for unsafe
requests.
Signed-off-by: Yan, Zheng <zyan@redhat.com>

ceph: don't invalidate page cache when inode is no longer used · 5e804ac4

Authored by Yan, Zheng on Oct 26, 2015

ceph_check_caps() invalidates page cache when inode is not used
by any open file. This behaviour is not friendly for workloads
that repeatedly read files.
Signed-off-by: Yan, Zheng <zyan@redhat.com>

ceph: combine as many iovec as possile into one OSD request · b5b98989

Authored by Zhu, Caifeng on Oct 08, 2015

Both ceph_sync_direct_write and ceph_sync_read iterate iovec elements
one by one, sending one OSD request for each iovec. This is sub-optimal.
We can combine several iovecs into one page vector, and send an OSD
request for the whole page vector.
Signed-off-by: Zhu, Caifeng <zhucaifeng@unissoft-nj.com>
Signed-off-by: Yan, Zheng <zyan@redhat.com>

ceph: fix message length computation · 777d738a

Authored by Arnd Bergmann on Sep 30, 2015

create_request_message() computes the maximum length of a message,
but uses the wrong type for the time stamp: sizeof(struct timespec)
may be 8 or 16 depending on the architecture, while sizeof(struct
ceph_timespec) is always 8, and that is what gets put into the
message.

Found while auditing the uses of timespec for y2038 problems.

Fixes: b8e69066 ("ceph: include time stamp in every MDS request")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Yan, Zheng <zyan@redhat.com>
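The size mismatch described in this commit can be sketched in userspace C. `struct wire_timespec` is a stand-in modeled on the two 32-bit fields of the on-wire time stamp, and `len_from_native()` and `len_from_wire()` are hypothetical helpers rather than the code of create_request_message(). Only the wire type has the same size on every architecture, so only it gives a length that matches what is actually encoded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Stand-in for the on-wire time stamp: two 32-bit fields, always 8 bytes. */
struct wire_timespec {
	uint32_t tv_sec;
	uint32_t tv_nsec;
};

/* Length computed from the in-memory type: differs between architectures. */
size_t len_from_native(size_t header_len)
{
	return header_len + sizeof(struct timespec);
}

/* Length computed from the wire type: matches what is put in the message. */
size_t len_from_wire(size_t header_len)
{
	return header_len + sizeof(struct wire_timespec);
}
```

On a typical 64-bit build `sizeof(struct timespec)` is 16, so the native computation overestimates the message by 8 bytes.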

ceph: fix a comment typo · 1291fb95

Authored by Geliang Tang on Sep 30, 2015

Signed-off-by: Geliang Tang <geliangtang@163.com>
Signed-off-by: Yan, Zheng <zyan@redhat.com>

23 Oct, 2015 1 commit

Move locks API users to locks_lock_inode_wait() · 4f656367

Authored by Benjamin Coddington on Oct 22, 2015

Instead of having users check for FL_POSIX or FL_FLOCK to call the correct
locks API function, use the check within locks_lock_inode_wait().  This
allows for some later cleanup.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>

11 Sep, 2015 1 commit

mm: mark most vm_operations_struct const · 7cbea8dc

Authored by Kirill A. Shutemov on Sep 09, 2015

With two exceptions (drm/qxl and drm/radeon) all vm_operations_struct
structs should be constant.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

09 Sep, 2015 5 commits

ceph: improve readahead for file holes · 43838685

Authored by Yan, Zheng on Sep 07, 2015

When readahead encounters file holes, the OSD reply returns error -ENOENT
and finish_read() skips adding pages to the page cache. So readahead
does not work for file holes. The fix is adding zero pages to the
page cache when -ENOENT is returned.
Signed-off-by: Yan, Zheng <zyan@redhat.com>
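The fix described above can be sketched in userspace C. This is not the code of finish_read(): `finish_page()` and `PAGE_SZ` are hypothetical, and the page cache is reduced to a yes/no decision about whether a page buffer may be kept. The point is that `-ENOENT` is treated as a hole, which reads as zeros, instead of as a failure.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define PAGE_SZ 4096	/* hypothetical page size for this sketch */

/*
 * Decide what to do with one page after a read completes.
 * rc is the read status: bytes read, or a negative errno.
 * Returns 1 if the page may be added to the cache, 0 if it is skipped.
 */
int finish_page(int rc, unsigned char *page)
{
	if (rc == -ENOENT) {
		/* Missing object: the range is a hole, so cache a zero page. */
		memset(page, 0, PAGE_SZ);
		return 1;
	}
	if (rc < 0)
		return 0;	/* real error: keep the page out of the cache */
	return 1;		/* data was read into the page */
}
```

Before the fix, the `-ENOENT` case would have taken the same path as any other error, so pages over holes were never cached.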

ceph: get inode size for each append write · 55b0b31c

Authored by Yan, Zheng on Sep 07, 2015

Signed-off-by: Yan, Zheng <zyan@redhat.com>

ceph: cleanup use of ceph_msg_get · 5fdb1389

Authored by Jianpeng Ma on Aug 18, 2015

Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
Signed-off-by: Yan, Zheng <zyan@redhat.com>

ceph: no need to get parent inode in ceph_open · e36d571d

Authored by Jianpeng Ma on Aug 18, 2015

The parent inode is needed only when creating a new inode.  For ceph_open,
the target inode already exists.
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
Signed-off-by: Yan, Zheng <zyan@redhat.com>

ceph: remove the useless judgement · a43137f7

Authored by Jianpeng Ma on Aug 18, 2015

err != 0 is already handled. So skip this.
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
Signed-off-by: Yan, Zheng <zyan@redhat.com>
