Commits · ff3d0046625c1b37df37beb8477135d44dae2823 · openanolis / cloud-kernel

12 Feb, 2013 · 3 commits

ceph: Convert struct ceph_mds_request to use kuid_t and kgid_t · ff3d0046

Committed by Eric W. Biederman on Jan 31, 2013

Hold the uid and gid for a pending ceph mds request using the types
kuid_t and kgid_t.  When a request message is finally created convert
the kuid_t and kgid_t values into uids and gids in the initial user
namespace.

Cc: Sage Weil <sage@inktank.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

ceph: Translate inode uid and gid attributes to/from kuids and kgids. · ab871b90

Committed by Eric W. Biederman on Jan 31, 2013

- In fill_inode() translate uids and gids in the initial user namespace
  into kuids and kgids stored in inode->i_uid and inode->i_gid.

- In ceph_setattr() if they have changed convert inode->i_uid and
  inode->i_gid into initial user namespace uids and gids for
  transmission.

Cc: Sage Weil <sage@inktank.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
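
The two directions of translation described in the inode attribute commit above can be illustrated in user space. This is a minimal sketch under stated assumptions, not the ceph code: `kuid_t` and `kgid_t` are re-declared here as stand-in wrapper structs, the `*_init` helpers are hypothetical and model only the initial user namespace (where the mapping is the identity), and `struct inode_sketch` is invented for the example. In the kernel itself the conversions are done with helpers such as `make_kuid()` and `from_kuid()` against `&init_user_ns`.

```c
#include <stdint.h>

/* Stand-ins for the kernel's opaque ID types. Wrapping the raw value in a
 * struct keeps it from being mixed up with a plain integer by accident. */
typedef struct { uint32_t val; } kuid_t;
typedef struct { uint32_t val; } kgid_t;

/* Hypothetical helpers that model only the initial user namespace,
 * where the mapping between wire IDs and kernel IDs is the identity. */
kuid_t make_kuid_init(uint32_t uid) { kuid_t k = { uid }; return k; }
kgid_t make_kgid_init(uint32_t gid) { kgid_t k = { gid }; return k; }
uint32_t from_kuid_init(kuid_t k) { return k.val; }
uint32_t from_kgid_init(kgid_t k) { return k.val; }

/* Invented for the example; only the two ownership fields matter. */
struct inode_sketch {
	kuid_t i_uid;
	kgid_t i_gid;
};

/* Receive path (fill_inode-like): wire values become kuid/kgid. */
void fill_inode_sketch(struct inode_sketch *inode,
		       uint32_t wire_uid, uint32_t wire_gid)
{
	inode->i_uid = make_kuid_init(wire_uid);
	inode->i_gid = make_kgid_init(wire_gid);
}

/* Send path (setattr-like): kuid/kgid become wire values again. */
void encode_owner_sketch(const struct inode_sketch *inode,
			 uint32_t *wire_uid, uint32_t *wire_gid)
{
	*wire_uid = from_kuid_init(inode->i_uid);
	*wire_gid = from_kgid_init(inode->i_gid);
}
```

Because the wrapper types are structs, assigning a raw `uint32_t` to `i_uid` without going through a conversion helper fails to compile, which is the property the conversion relies on.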

ceph: Translate between uid and gids in cap messages and kuids and kgids · 05cb11c1

Committed by Eric W. Biederman on Jan 31, 2013

- Make the uid and gid arguments of send_cap_msg() used to compose
  ceph_mds_caps messages of type kuid_t and kgid_t.

- Pass inode->i_uid and inode->i_gid in __send_cap to send_cap_msg()
  through variables of type kuid_t and kgid_t.

- Modify struct ceph_cap_snap to store uids and gids in types kuid_t
  and kgid_t.  This allows capturing inode->i_uid and inode->i_gid in
  ceph_queue_cap_snap() without loss and passing them to
  __ceph_flush_snaps() where they are removed from struct
  ceph_cap_snap and passed to send_cap_msg().

- In handle_cap_grant() translate uids and gids in the initial user
  namespace stored in struct ceph_mds_cap into kuids and kgids
  before setting inode->i_uid and inode->i_gid.

Cc: Sage Weil <sage@inktank.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

19 Dec, 2012 · 1 commit

ceph: fix dentry reference leak in ceph_encode_fh() · f6af75da

Committed by Cyril Roelandt on Dec 18, 2012

dput() was not called in the error path.

Signed-off-by: Cyril Roelandt <tipecaml@gmail.com>
Cc: Sage Weil <sage@inktank.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

18 Dec, 2012 · 1 commit

lseek: the "whence" argument is called "whence" · 965c8e59

Committed by Andrew Morton on Dec 17, 2012

But the kernel decided to call it "origin" instead.  Fix most of the
sites.

Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

13 Dec, 2012 · 9 commits

libceph: Unlock unprocessed pages in start_read() error path · 8884d53d

Committed by David Zafman on Dec 03, 2012

Function start_read() can get an error before processing all pages.
It must not only release the remaining pages, but unlock them too.

This fixes http://tracker.newdream.net/issues/3370

Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>

ceph: call handle_cap_grant() for cap import message · 0e5e1774

Committed by Yan, Zheng on Nov 19, 2012

If the client sends a cap message that requests a new max size while
caps are being exported, the exporting MDS will drop the message quietly.
The client may then wait forever for the reply that updates the max
size. Calling handle_cap_grant() for the cap import message avoids
this issue.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Sage Weil <sage@inktank.com>

ceph: Fix __ceph_do_pending_vmtruncate · a85f50b6

Committed by Yan, Zheng on Nov 19, 2012

We should set i_truncate_pending to 0 after the page cache is truncated
to i_truncate_size.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Sage Weil <sage@inktank.com>

ceph: Don't add dirty inode to dirty list if caps is in migration · 0685235f

Committed by Yan, Zheng on Nov 19, 2012

Add the dirty inode to the cap_dirty_migrating list instead. This keeps
ceph_flush_dirty_caps() from entering an infinite loop.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Sage Weil <sage@inktank.com>

ceph: Fix infinite loop in __wake_requests · ed75ec2c

Committed by Yan, Zheng on Nov 19, 2012

__wake_requests() will enter an infinite loop if we use it to wake
requests in the session->s_waiting list: __wake_requests() deletes
requests from the list and __do_request() adds requests back to
the list.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Sage Weil <sage@inktank.com>
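
The __wake_requests commit message above describes the loop but not the remedy. One common way to break this kind of cycle, sketched here as an assumption rather than as the actual patch, is to detach the waiting list into a temporary list first and walk only the detached copy, so anything re-queued during the walk is not visited again. All names below (`struct req`, `struct queue`, `do_request`, `wake_requests`) are hypothetical user-space stand-ins.

```c
#include <stddef.h>

struct req {
	int id;
	int done;          /* nonzero once the request can complete */
	struct req *next;
};

struct queue {
	struct req *head;
	struct req *tail;
};

void enqueue(struct queue *q, struct req *r)
{
	r->next = NULL;
	if (q->tail)
		q->tail->next = r;
	else
		q->head = r;
	q->tail = r;
}

struct req *dequeue(struct queue *q)
{
	struct req *r = q->head;

	if (r) {
		q->head = r->next;
		if (!q->head)
			q->tail = NULL;
		r->next = NULL;
	}
	return r;
}

/* Stand-in for __do_request(): a request that still cannot make
 * progress is put back on the waiting list. */
void do_request(struct queue *waiting, struct req *r)
{
	if (!r->done)
		enqueue(waiting, r);
}

/* Detach the whole list, then walk the detached copy. Requests that
 * are re-queued land on the (now empty) original list and are not
 * revisited, so the walk always terminates. Returns the number woken. */
int wake_requests(struct queue *waiting)
{
	struct queue tmp = *waiting;
	struct req *r;
	int woken = 0;

	waiting->head = NULL;
	waiting->tail = NULL;

	while ((r = dequeue(&tmp)) != NULL) {
		do_request(waiting, r);
		woken++;
	}
	return woken;
}
```

Walking `waiting` directly would never finish when no request can complete, because every dequeue is followed by an enqueue on the same list.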

ceph: Don't update i_max_size when handling non-auth cap · 5e62ad30

Committed by Yan, Zheng on Nov 19, 2012

The cap from non-auth mds doesn't have a meaningful max_size value.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Sage Weil <sage@inktank.com>

bdi_register: add __printf verification, fix arg mismatch · d2cc4dde

Committed by Joe Perches on Nov 29, 2012

__printf is useful to verify format and arguments.

Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Alex Elder <elder@inktank.com>

libceph: remove 'osdtimeout' option · 83aff95e

Committed by Sage Weil on Nov 28, 2012

This would reset a connection with any OSD that had an outstanding
request that was taking more than N seconds.  The idea was that if the
OSD was buggy, the client could compensate by resending the request.

In reality, this only served to hide server bugs, and we haven't
actually seen such a bug in quite a while.  Moreover, the userspace
client code never did this.

More importantly, often the request is taking a long time because the
OSD is trying to recover, or overloaded, and killing the connection
and retrying would only make the situation worse by giving the OSD
more work to do.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>

ceph: fix dentry reference leak in ceph_encode_fh(). · cfc84c9f

Committed by Cyril Roelandt on Nov 20, 2012

dput() was not called in the error path.

Signed-off-by: Cyril Roelandt <tipecaml@gmail.com>
Reviewed-by: Alex Elder <elder@inktank.com>

06 Nov, 2012 · 1 commit

ceph: Fix i_size update race · 22cddde1

Committed by Sage Weil on Nov 05, 2012

ceph_aio_write() has an optimization that marks cap CEPH_CAP_FILE_WR
dirty before data is copied to page cache and inode size is updated.
If ceph_check_caps() flushes the dirty cap before the inode size is
updated, MDS can miss the new inode size. The fix is to move
ceph_{get,put}_cap_refs() into ceph_write_{begin,end}() and call
__ceph_mark_dirty_caps() after inode size is updated.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Sage Weil <sage@inktank.com>

04 Nov, 2012 · 1 commit

ceph: Hold caps_list_lock when adjusting caps_{use, total}_count · 4d1d0534

Committed by Yan, Zheng on Nov 03, 2012

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Sage Weil <sage@inktank.com>

29 Oct, 2012 · 1 commit

ceph: fix dentry reference leak in encode_fh() · 52eb5a90

Committed by David Zafman on Oct 18, 2012

Call to d_find_alias() needs a corresponding dput()

This fixes http://tracker.newdream.net/issues/3271

Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

27 Oct, 2012 · 2 commits

ceph: Fix NULL ptr crash in strlen() · b000056a

Committed by David Zafman on Oct 25, 2012

set_request_path_attr() checks for NULL ptr before calling strlen()

This fixes http://tracker.newdream.net/issues/3404

Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

ceph: fix dentry reference leak in encode_fh() · 0f9831a8

Committed by David Zafman on Oct 18, 2012

Call to d_find_alias() needs a corresponding dput()

This fixes http://tracker.newdream.net/issues/3271

Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

10 Oct, 2012 · 1 commit

tmpfs,ceph,gfs2,isofs,reiserfs,xfs: fix fh_len checking · 35c2a7f4

Committed by Hugh Dickins on Oct 07, 2012

Fuzzing with trinity oopsed on the 1st instruction of shmem_fh_to_dentry(),
	u64 inum = fid->raw[2];
which is unhelpfully reported as at the end of shmem_alloc_inode():

BUG: unable to handle kernel paging request at ffff880061cd3000
IP: [<ffffffff812190d0>] shmem_alloc_inode+0x40/0x40
Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Call Trace:
 [<ffffffff81488649>] ? exportfs_decode_fh+0x79/0x2d0
 [<ffffffff812d77c3>] do_handle_open+0x163/0x2c0
 [<ffffffff812d792c>] sys_open_by_handle_at+0xc/0x10
 [<ffffffff83a5f3f8>] tracesys+0xe1/0xe6

Right, tmpfs is being stupid to access fid->raw[2] before validating that
fh_len includes it: the buffer kmalloc'ed by do_sys_name_to_handle() may
fall at the end of a page, and the next page not be present.

But some other filesystems (ceph, gfs2, isofs, reiserfs, xfs) are being
careless about fh_len too, in fh_to_dentry() and/or fh_to_parent(), and
could oops in the same way: add the missing fh_len checks to those.
Reported-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Sage Weil <sage@inktank.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
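
The fh_len fix above comes down to validating the handle length before reading any word of the handle. A minimal user-space sketch of that ordering follows. It is an illustration only: the function name, the constant, and the way the inode number is assembled from words 1 and 2 are assumptions for this example, and the length is taken to be counted in 32-bit words.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed for this sketch: the decoder reads words 0..2 of the handle. */
#define FH_WORDS_NEEDED 3

/* Returns 0 and stores the inode number on success, or -1 when the
 * handle is too short to contain raw[2]. */
int decode_fh_sketch(const uint32_t *raw, int fh_len, uint64_t *inum)
{
	/* Validate the length first: raw[2] may lie past the end of the
	 * caller's buffer when fh_len is smaller than three words. */
	if (raw == NULL || fh_len < FH_WORDS_NEEDED)
		return -1;

	*inum = ((uint64_t)raw[2] << 32) | raw[1];
	return 0;
}
```

The important part is the order of operations: nothing dereferences `raw[2]` until the length check has passed.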

09 Oct, 2012 · 1 commit

mm: kill vma flag VM_CAN_NONLINEAR · 0b173bc4

Committed by Konstantin Khlebnikov on Oct 08, 2012

Move actual pte filling for non-linear file mappings into the new special
vma operation: ->remap_pages().

Filesystems must implement this method to get non-linear mapping support;
a filesystem that uses filemap_fault() can use generic_file_remap_pages().

Now device drivers can implement this method and obtain nonlinear vma support.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Carsten Otte <cotte@de.ibm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>	#arch/tile
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Eric Paris <eparis@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Venkatesh Pallipadi <venki@google.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

03 Oct, 2012 · 3 commits

ceph: avoid 32-bit page index overflow · 6285bc23

Committed by Alex Elder on Oct 02, 2012

A pgoff_t is defined (by default) to have type (unsigned long).  On
architectures such as i686 that's a 32-bit type.  The ceph address
space code was attempting to produce 64 bit offsets by shifting a
page's index by PAGE_CACHE_SHIFT, but the result was not what was
desired because the shift occurred before the result got promoted
to 64 bits.

Fix this by converting all uses of page->index used in this way to
use the page_offset() macro, which ensures the 64-bit result has the
intended value.

This fixes http://tracker.newdream.net/issues/3112

Reported-by: Mohamed Pakkeer <pakkeer.mohideen@realimage.com>
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
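
The promotion problem described in the page index overflow commit above can be reproduced in user space. The sketch below is an illustration, not the kernel code: `uint32_t` stands in for a 32-bit `pgoff_t`, `PAGE_SHIFT_SKETCH` is an assumed shift of 12 (4 KiB pages), and both function names are hypothetical.

```c
#include <stdint.h>

#define PAGE_SHIFT_SKETCH 12   /* assumed: 4 KiB pages */

/* Buggy order: the shift is done in 32 bits, so the high bits are
 * already lost by the time the result is widened to 64 bits. */
uint64_t offset_shift_then_widen(uint32_t index)
{
	return (uint64_t)(index << PAGE_SHIFT_SKETCH);
}

/* Fixed order: widen first, then shift, so no bits are discarded.
 * This is the ordering a page_offset()-style helper provides. */
uint64_t offset_widen_then_shift(uint32_t index)
{
	return (uint64_t)index << PAGE_SHIFT_SKETCH;
}
```

For page index 0x100000, the page that starts at the 4 GiB mark, the first function returns 0 and the second returns 0x100000000. For small indexes the two agree, which is why the bug only shows up on large files.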

ceph: return EIO on invalid layout on GET_DATALOC ioctl · 457712a0

Committed by Sage Weil on Sep 24, 2012

If the user calls GET_DATALOC on a file with an invalid (e.g.,
zeroed) layout, return EIO to userland.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>

fs: push rcu_barrier() from deactivate_locked_super() to filesystems · 8c0a8537

Committed by Kirill A. Shutemov on Sep 26, 2012

There's no reason to call rcu_barrier() on every
deactivate_locked_super().  We only need to make sure that all delayed rcu
free inodes are flushed before we destroy related cache.

Removing rcu_barrier() from deactivate_locked_super() affects some fast
paths.  E.g.  on my machine exit_group() of a last process in IPC
namespace takes 0.07538s.  rcu_barrier() takes 0.05188s of that time.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

02 Oct, 2012 · 4 commits

ceph: propagate layout error on osd request creation · 6816282d

Committed by Sage Weil on Sep 24, 2012

If we are creating an osd request and get an invalid layout, return
an EINVAL to the caller.  We switch up the return to have an error
code instead of NULL implying -ENOMEM.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
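
The change of return convention in the layout error commit above, from "NULL implies -ENOMEM" to an encoded error code, follows the kernel's error-pointer idiom. The sketch below re-creates that idiom in user space under stated assumptions: the `*_sketch` helpers imitate the kernel's `ERR_PTR()`, `PTR_ERR()` and `IS_ERR()`, while `struct osd_req_sketch`, `new_request_sketch()` and the "stripe unit of zero is invalid" rule are invented for the example.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* User-space imitations of the kernel's error-pointer helpers: small
 * negative errno values are encoded in the top page of the address
 * space, where no valid object can live. */
void *ERR_PTR_sketch(long err) { return (void *)(intptr_t)err; }
long PTR_ERR_sketch(const void *p) { return (long)(intptr_t)p; }
int IS_ERR_sketch(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Invented request type for the example. */
struct osd_req_sketch {
	uint32_t stripe_unit;
};

/* Returns a valid request, or an encoded error: -EINVAL for a layout
 * this sketch treats as invalid, -ENOMEM for allocation failure. */
struct osd_req_sketch *new_request_sketch(uint32_t stripe_unit)
{
	struct osd_req_sketch *req;

	if (stripe_unit == 0)
		return ERR_PTR_sketch(-EINVAL);

	req = malloc(sizeof(*req));
	if (!req)
		return ERR_PTR_sketch(-ENOMEM);

	req->stripe_unit = stripe_unit;
	return req;
}
```

A caller checks `IS_ERR_sketch(req)` instead of `req == NULL` and recovers the specific reason with `PTR_ERR_sketch(req)`, so an invalid layout is no longer reported as an out-of-memory condition.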

ceph: convert to use le32_add_cpu() · b905a7f8

Committed by Wei Yongjun on Sep 28, 2012

Convert cpu_to_le32(le32_to_cpu(E1) + E2) to use le32_add_cpu().

dpatch engine is used to auto generate this patch.
(https://github.com/weiyj/dpatch)

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Sage Weil <sage@inktank.com>
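
The equivalence behind the le32_add_cpu() conversion above can be checked in user space. This is a sketch under stated assumptions: the three `*_sketch` functions are portable stand-ins written for this example (the kernel's own helpers are `le32_to_cpu()`, `cpu_to_le32()` and `le32_add_cpu()`), and a little-endian value is represented as a `uint32_t` whose bytes in memory are in little-endian order.

```c
#include <stdint.h>

/* Read a little-endian 32-bit value byte by byte, so the result is
 * correct on both little-endian and big-endian hosts. */
uint32_t le32_to_cpu_sketch(uint32_t le)
{
	const uint8_t *b = (const uint8_t *)&le;

	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Store a host value with its bytes laid out in little-endian order. */
uint32_t cpu_to_le32_sketch(uint32_t v)
{
	uint32_t le;
	uint8_t *b = (uint8_t *)&le;

	b[0] = (uint8_t)(v & 0xff);
	b[1] = (uint8_t)((v >> 8) & 0xff);
	b[2] = (uint8_t)((v >> 16) & 0xff);
	b[3] = (uint8_t)((v >> 24) & 0xff);
	return le;
}

/* In-place add on a little-endian field: the open-coded pattern
 * cpu_to_le32(le32_to_cpu(E1) + E2) wrapped in one helper. */
void le32_add_cpu_sketch(uint32_t *var, uint32_t val)
{
	*var = cpu_to_le32_sketch(le32_to_cpu_sketch(*var) + val);
}
```

The helper form reads the field once, names the intent, and leaves less room for mismatched conversions than the open-coded expression.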

ceph: Fix oops when handling mdsmap that decreases max_mds · 3e8f43a0

Committed by Yan, Zheng on Sep 20, 2012

When i >= newmap->m_max_mds, ceph_mdsmap_get_addr(newmap, i) returns
NULL. Passing NULL to memcmp() triggers an oops.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Sage Weil <sage@inktank.com>

ceph: let path portion of mount "device" be optional · c98f533c

Committed by Alex Elder on Aug 09, 2012

A recent change to /sbin/mountall causes any trailing '/' character
in the "device" (or fs_spec) field in /etc/fstab to be stripped.  As
a result, an entry for a ceph mount that intends to mount the root
of the name space ends up with no path portion, and the ceph mount
option processing code rejects this.

That is, an entry in /etc/fstab like:
    cephserver:port:/ /mnt ceph defaults 0 0
provides to the ceph code just "cephserver:port:" as the "device,"
and that gets rejected.

Although this is a bug in /sbin/mountall, we can have the ceph mount
code support an empty/nonexistent path, interpreting it to mean the
root of the name space.

RFC 5952 offers recommendations for how to express IPv6 addresses,
and recommends the usage found in RFC 3986 (which specifies the
format for URI's) for representing both IPv4 and IPv6 addresses that
include port numbers.  (See in particular the definition of
"authority" found in the Appendix of RFC 3986.)

According to those standards, no host specification will ever
contain a '/' character.  As a result, it is sufficient to scan a
provided "device" from an /etc/fstab entry for the first '/'
character, and if it's found, treat that as the beginning of the
path.  If no '/' character is present, we can treat the entire
string as the monitor host specification(s), and assume the path
to be the root of the name space.  We'll still require a ':' to
separate the host portion from the (possibly empty) path portion.

This means that we can more formally define how ceph will interpret
the "device" it's provided when processing a mount request:

    "device" will look like:
        <server_spec>[,<server_spec>...]:[<path>]
    where
        <server_spec> is <ip>[:<port>]
        <path> is optional, but if present must begin with '/'

This addresses http://tracker.newdream.net/issues/2919

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
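
The "device" grammar given in the mount commit above can be turned into a small parser. The following is a minimal sketch of the scanning rule only (first '/' starts the path, a ':' must separate hosts from path, a missing path means the root). The function name and its interface are hypothetical, it does not validate the individual `<server_spec>` entries, and it is not the ceph mount code.

```c
#include <stddef.h>
#include <string.h>

/* Splits "device" into a host portion and a path.
 * On success returns 0, sets *hosts_len to the length of the host
 * portion (without the separating ':') and *path to the path, which
 * is "/" when the device has no path portion.
 * Returns -1 when the ':' separator or the host portion is missing. */
int split_device_sketch(const char *dev, size_t *hosts_len,
			const char **path)
{
	/* No host specification contains '/', so the first '/' (if any)
	 * is the beginning of the path. */
	const char *slash = strchr(dev, '/');
	const char *hosts_end = slash ? slash : dev + strlen(dev);

	/* A ':' must still separate hosts from the (possibly empty) path. */
	if (hosts_end == dev || hosts_end[-1] != ':')
		return -1;

	*hosts_len = (size_t)(hosts_end - dev) - 1;
	if (*hosts_len == 0)
		return -1;

	*path = slash ? slash : "/";
	return 0;
}
```

With this rule, "cephserver:port:" and "cephserver:port:/" both yield the host portion "cephserver:port" and the path "/", which is the behavior the commit message asks for.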

27 Sep, 2012 · 1 commit

ceph: don't abuse d_delete() on failure exits · 2744c171

Committed by Al Viro on Sep 26, 2012

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

22 Aug, 2012 · 2 commits

ceph: avoid divide by zero in __validate_layout() · 45f2e081

Committed by Sage Weil on Aug 21, 2012

If "l->stripe_unit" is zero the mod on the next line will cause a
divide by zero bug.  This comes from the copy_from_user() in
ceph_ioctl_set_layout_policy().  Passing 0 is valid, though (it means
"do not change") so avoid the % check in that case.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
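
The guard described in the __validate_layout() commit above is a short-circuit in front of the modulo. A minimal sketch, assuming a simplified check that only compares an object size against the stripe unit; the function name and its two-parameter interface are hypothetical and the real validation covers more fields.

```c
#include <errno.h>
#include <stdint.h>

/* Returns 0 when the layout is acceptable, -EINVAL otherwise.
 * A stripe_unit of 0 means "do not change", so it is accepted and,
 * crucially, never used as the right-hand side of '%'. */
int validate_layout_sketch(uint32_t stripe_unit, uint32_t object_size)
{
	if (stripe_unit != 0 && object_size % stripe_unit != 0)
		return -EINVAL;
	return 0;
}
```

Because `&&` evaluates left to right and stops at the first false operand, the modulo is only computed when `stripe_unit` is nonzero.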

ceph: tolerate (and warn on) extraneous dentry from mds · 6c5e50fa

Committed by Sage Weil on Aug 21, 2012

If the MDS gives us a dentry and we weren't prepared to handle it,
WARN_ON_ONCE instead of crashing.

Reported-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>

21 Aug, 2012 · 1 commit

libceph: delay debugfs initialization until we learn global_id · d1c338a5

Committed by Sage Weil on Aug 19, 2012

The debugfs directory includes the cluster fsid and our unique global_id.
We need to delay the initialization of the debug entry until we have
learned both the fsid and our global_id from the monitor or else the
second client can't create its debugfs entry and will fail (and multiple
client instances aren't properly reflected in debugfs).

Reported by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>

03 Aug, 2012 · 1 commit

ceph: simplify+fix atomic_open · 5ef50c3b

Committed by Sage Weil on Jul 31, 2012

The initial ->atomic_open op was carried over from the old intent code,
which was incomplete and didn't really work.  Replace it with a fresh
method.  In particular:

 * always attempt to do an atomic open+lookup, both for the create case
   and for lookups of existing files.
 * fix symlink handling by returning 1 to the VFS so that we can follow
   the link to its destination. This fixes a longstanding ceph bug (#2392).

Signed-off-by: Sage Weil <sage@inktank.com>

31 Jul, 2012 · 6 commits

ceph: define snap counts as u32 everywhere · aa711ee3

Committed by Alex Elder on Jul 13, 2012

There are two structures in which a count of snapshots is
maintained:

    struct ceph_snap_context {
	...
        u32 num_snaps;
	...
    }
and
    struct ceph_snap_realm {
	...
        u32 num_prior_parent_snaps;   /*  had prior to parent_since */
	...
        u32 num_snaps;
	...
    }

These fields never take on negative values (e.g., to hold special
meaning), and so are really inherently unsigned.  Furthermore they
take their value from over-the-wire or on-disk formatted 32-bit
values.

So change their definition to have type u32, and change some spots
elsewhere in the code to account for this change.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

ceph: fix potential double free · 21ec6ffa

Committed by Alan Cox on Jul 20, 2012

We re-run the loop but we don't re-set the attrs pointer back to NULL.

Signed-off-by: Alan Cox <alan@linux.intel.com>
Reviewed-by: Alex Elder <elder@inktank.com>

ceph: close old con before reopening on mds reconnect · a53aab64

Committed by Sage Weil on Jul 30, 2012

When we detect a mds session reset, close the old ceph_connection before
reopening it.  This ensures we clean up the old socket properly and keep
the ceph_connection state correct.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>

libceph: move feature bits to separate header · 1fe60e51

Committed by Sage Weil on Jul 30, 2012

This is simply cleanup that will keep things more closely synced with the
userland code.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>

ceph: Push file_update_time() into ceph_page_mkwrite() · 3ca9c3bd

Committed by Jan Kara on Jun 12, 2012

CC: Sage Weil <sage@newdream.net>
CC: ceph-devel@vger.kernel.org
Acked-by: Sage Weil <sage@newdream.net>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ceph: clean up useless d_parent checks · 8842b3be

Committed by Sage Weil on Jun 07, 2012

d_parent is never NULL, and IS_ROOT() is the proper way to check for a
(non-self-referential) parent.

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Sage Weil <sage@inktank.com>

14 Jul, 2012 · 1 commit

VFS: Pass mount flags to sget() · 9249e17f

Committed by David Howells on Jun 25, 2012

Pass mount flags to sget() so that it can use them in initialising a new
superblock before the set function is called.  They could also be passed to the
compare function.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

openanolis / cloud-kernel · last synced successfully more than 1 year ago