Commits · ba65dc5ef16f82fba77869cecf7a7d515f61446b · openeuler / raspberrypi-kernel

10 Jun, 2016 1 commit

Authored by Al Viro on Jun 10, 2016

d_walk() relies upon the tree not getting rearranged under it without
rename_lock being touched.  And we do grab rename_lock around the
places that change the tree topology.  Unfortunately, branch reordering
is just as bad from d_walk() POV and we have two places that do it
without touching rename_lock - one in handling of cursors (for ramfs-style
directories) and another in autofs.  autofs one is a separate story; this
commit deals with the cursors.
	* mark cursor dentries explicitly at allocation time
	* make __dentry_kill() leave ->d_child.next pointing to the next
non-cursor sibling, making sure that it won't be moved around unnoticed
before the parent is relocked on ascend-to-parent path in d_walk().
	* make d_walk() skip cursors explicitly; strictly speaking it's
not necessary (all callbacks we pass to d_walk() are no-ops on cursors),
but it makes analysis easier.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ba65dc5e
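
The last bullet in the entry above (skipping cursors explicitly during the walk) can be pictured with a small user-space sketch. The `node` structure, the `NODE_CURSOR` flag, and both helper names are hypothetical stand-ins, not the kernel's dentry definitions; the sketch only shows the idea of stepping over marked entries in a sibling list.

```c
#include <stddef.h>

/* Hypothetical flag: set once at allocation time to mark a cursor entry. */
#define NODE_CURSOR 0x1u

/* Hypothetical stand-in for an entry on a parent's list of children. */
struct node {
	unsigned int flags;
	struct node *next;	/* next sibling, or NULL at the end of the list */
};

/* Return the first non-cursor entry at or after n, or NULL if none remain. */
static struct node *next_real(struct node *n)
{
	while (n && (n->flags & NODE_CURSOR))
		n = n->next;
	return n;
}

/* Walk the sibling list and count only the real (non-cursor) entries. */
static int count_real(struct node *head)
{
	int count = 0;
	struct node *n;

	for (n = next_real(head); n; n = next_real(n->next))
		count++;
	return count;
}
```

Because the marking happens at allocation time, the walker never has to guess whether an entry is a cursor; it only tests one flag.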

08 Jun, 2016 1 commit

coredump: fix dumping through pipes · 1607f09c

Authored by Mateusz Guzik on Jun 05, 2016

The offset in the core file used to be tracked with ->written field of
the coredump_params structure. The field was retired in favour of
file->f_pos.

However, ->f_pos is not maintained for pipes which leads to breakage.

Restore explicit tracking of the offset in coredump_params. Introduce
->pos field for this purpose since ->written was already reused.

Fixes: a0083939 ("get rid of coredump_params->written").
Reported-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
Signed-off-by: Mateusz Guzik <mguzik@redhat.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

1607f09c

29 May, 2016 6 commits

<linux/hash.h>: Add support for architecture-specific functions · 468a9428

Authored by George Spelvin on May 26, 2016

This is just the infrastructure; there are no users yet.

This is modelled on CONFIG_ARCH_RANDOM; a CONFIG_ symbol declares
the existence of <asm/hash.h>.

That file may define its own versions of various functions, and define
HAVE_* symbols (no CONFIG_ prefix!) to suppress the generic ones.

Included is a self-test (in lib/test_hash.c) that verifies the basics.
It is NOT in general required that the arch-specific functions compute
the same thing as the generic, but if a HAVE_* symbol is defined with
the value 1, then equality is tested.
Signed-off-by: George Spelvin <linux@sciencehorizons.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Cc: Philippe De Muyter <phdm@macq.eu>
Cc: linux-m68k@lists.linux-m68k.org
Cc: Alistair Francis <alistai@xilinx.com>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: uclinux-h8-devel@lists.sourceforge.jp

468a9428
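
The override mechanism described in the entry above can be sketched in plain C. All names here (`mix32`, `mix32_generic`, `HAVE_ARCH_MIX32`, `mix32_selftest`) are hypothetical and chosen for illustration; they are not the symbols the commit adds. The sketch assumes an architecture header that supplies its own function and defines the HAVE_ symbol to 1, which by the rule quoted above means the self-test should compare it against the generic version.

```c
#include <stdint.h>

/* Multiplier used by the sketch; an assumption, not taken from the commit. */
#define MIX32_MULTIPLIER 0x61C88647u

/* Generic version: always compiled so that a self-test can compare to it. */
static inline uint32_t mix32_generic(uint32_t val)
{
	return val * MIX32_MULTIPLIER;
}

/* What a hypothetical arch header might provide. Defining the symbol with
 * the value 1 claims "my result is identical to the generic one". */
#define HAVE_ARCH_MIX32 1
static inline uint32_t mix32(uint32_t val)
{
	/* Same product, computed through a 64-bit multiply and truncated. */
	return (uint32_t)((uint64_t)val * MIX32_MULTIPLIER);
}

/* Back in the generic header: fall back only if the arch stayed silent. */
#ifndef HAVE_ARCH_MIX32
#define mix32 mix32_generic
#endif

/* Self-test: equality is checked only when the symbol is defined as 1.
 * Returns 1 on pass, 0 on mismatch. */
static int mix32_selftest(uint32_t val)
{
#if defined(HAVE_ARCH_MIX32) && HAVE_ARCH_MIX32 == 1
	return mix32(val) == mix32_generic(val);
#else
	(void)val;
	return 1;
#endif
}
```

An architecture whose function deliberately computes something different would define the symbol with another value, and the equality check would be compiled out.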

Eliminate bad hash multipliers from hash_32() and hash_64() · ef703f49

Authored by George Spelvin on May 26, 2016

The "simplified" prime multipliers made very bad hash functions, so get rid
of them.  This completes the work of 689de1d6.

To avoid the inefficiency which was the motivation for the "simplified"
multipliers, hash_64() on 32-bit systems is changed to use a different
algorithm.  It makes two calls to hash_32() instead.

drivers/media/usb/dvb-usb-v2/af9015.c uses the old GOLDEN_RATIO_PRIME_32
for some horrible reason, so it inherits a copy of the old definition.
Signed-off-by: George Spelvin <linux@sciencehorizons.net>
Cc: Antti Palosaari <crope@iki.fi>
Cc: Mauro Carvalho Chehab <m.chehab@samsung.com>

ef703f49
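
A rough user-space sketch of hashing a 64-bit value with only 32-bit multiplies, as the entry above describes for 32-bit systems. The function names, the multiplier constant, and in particular the way the two halves are combined (mix the high half, XOR it into the low half, then hash the result) are assumptions made for illustration, not a copy of the kernel code.

```c
#include <stdint.h>

/* Multiplier used by the sketch; treat the exact value as an assumption. */
#define SKETCH_GOLDEN_RATIO_32 0x61C88647u

/* 32x32 -> 32-bit multiplicative mixing step. */
static inline uint32_t mul32_sketch(uint32_t val)
{
	return val * SKETCH_GOLDEN_RATIO_32;
}

/* Keep the top `bits` bits of the mixed value (bits must be in 1..32). */
static inline uint32_t hash32_sketch(uint32_t val, unsigned int bits)
{
	return mul32_sketch(val) >> (32 - bits);
}

/* Hash 64 bits with two 32-bit multiplies: one for the high half, one for
 * the combined value. No 64x64-bit multiply is needed. */
static inline uint32_t hash64_on32_sketch(uint64_t val, unsigned int bits)
{
	uint32_t lo = (uint32_t)val;
	uint32_t hi = (uint32_t)(val >> 32);

	return hash32_sketch(lo ^ mul32_sketch(hi), bits);
}
```

The return type is 32 bits wide, which matches the "Change hash_64() return value to 32 bits" entry in this list.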

Change hash_64() return value to 32 bits · 92d56774

Authored by George Spelvin on May 26, 2016

That's all that's ever asked for, and it makes the return
type of hash_long() consistent.

It also allows (upcoming patch) an optimized implementation
of hash_64 on 32-bit machines.

I tried adding a BUILD_BUG_ON to ensure the number of bits requested
was never more than 32 (most callers use a compile-time constant), but
adding <linux/bug.h> to <linux/hash.h> breaks the tools/perf compiler
unless tools/perf/MANIFEST is updated, and understanding that code base
well enough to update it is too much trouble.  I did the rest of an
allyesconfig build with such a check, and nothing tripped.
Signed-off-by: George Spelvin <linux@sciencehorizons.net>

92d56774

<linux/sunrpc/svcauth.h>: Define hash_str() in terms of hashlen_string() · 917ea166

Authored by George Spelvin on May 20, 2016

Finally, the first use of previous two patches: eliminate the
separate ad-hoc string hash functions in the sunrpc code.

Now hash_str() is a wrapper around hash_string(), and hash_mem() is
likewise a wrapper around full_name_hash().

Note that sunrpc code *does* call hash_mem() with a zero length, which
is why the previous patch needed to handle that in full_name_hash().
(Thanks, Bruce, for finding that!)

This also eliminates the only caller of hash_long which asks for
more than 32 bits of output.

The comment about the quality of hashlen_string() and full_name_hash()
is jumping the gun by a few patches; they aren't very impressive now,
but will be improved greatly later in the series.
Signed-off-by: George Spelvin <linux@sciencehorizons.net>
Tested-by: J. Bruce Fields <bfields@redhat.com>
Acked-by: J. Bruce Fields <bfields@redhat.com>
Cc: Jeff Layton <jlayton@poochiereds.net>
Cc: linux-nfs@vger.kernel.org

917ea166

fs/namei.c: Add hashlen_string() function · fcfd2fbf

Authored by George Spelvin on May 20, 2016

We'd like to make more use of the highly-optimized dcache hash functions
throughout the kernel, rather than have every subsystem create its own,
and a function that hashes basic null-terminated strings is required
for that.

(The name is to emphasize that it returns both hash and length.)

It's actually useful in the dcache itself, specifically d_alloc_name().
Other uses in the next patch.

full_name_hash() is also tweaked to make it more generally useful:
1) Take a "char *" rather than "unsigned char *" argument, to
   be consistent with hash_name().
2) Handle zero-length inputs.  If we want more callers, we don't want
   to make them worry about corner cases.
Signed-off-by: George Spelvin <linux@sciencehorizons.net>

fcfd2fbf
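
The entry above notes that the function's name stresses that it returns both a hash and a length. One way to picture such a combined return value is to pack the two into a single 64-bit word. The helper names below are hypothetical, and the toy string hash is a placeholder, not the dcache algorithm.

```c
#include <stdint.h>

/* Pack a 32-bit hash (low half) and a 32-bit length (high half). */
static inline uint64_t hashlen_pack(uint32_t hash, uint32_t len)
{
	return ((uint64_t)len << 32) | hash;
}

static inline uint32_t hashlen_get_hash(uint64_t hl)
{
	return (uint32_t)hl;
}

static inline uint32_t hashlen_get_len(uint64_t hl)
{
	return (uint32_t)(hl >> 32);
}

/* Hash a NUL-terminated string and report its length in one pass.
 * The multiply-by-31 hash is a stand-in chosen for brevity. */
static uint64_t hashlen_string_sketch(const char *s)
{
	uint32_t hash = 0;
	uint32_t len = 0;

	while (s[len]) {
		hash = hash * 31 + (unsigned char)s[len];
		len++;
	}
	return hashlen_pack(hash, len);
}
```

A zero-length input falls out naturally here (hash 0, length 0), which is the kind of corner case the entry says callers should not have to worry about.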

Pull out string hash to <linux/stringhash.h> · f4bcbe79

Authored by George Spelvin on May 20, 2016

... so they can be used without the rest of <linux/dcache.h>

The hashlen_* macros will make sense next patch.
Signed-off-by: George Spelvin <linux@sciencehorizons.net>

f4bcbe79

28 May, 2016 5 commits

switch ->setxattr() to passing dentry and inode separately · 3767e255

Authored by Al Viro on May 27, 2016

smack ->d_instantiate() uses ->setxattr(), so to be able to call it before
we'd hashed the new dentry and attached it to inode, we need ->setxattr()
instances getting the inode as an explicit argument rather than obtaining
it from dentry.

Similar change for ->getxattr() had been done in commit ce23e640.  Unlike
->getxattr() (which is used by both selinux and smack instances of
->d_instantiate()) ->setxattr() is used only by smack one and unfortunately
it got missed back then.
Reported-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Tested-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

3767e255

make IS_ERR_VALUE() complain about non-pointer-sized arguments · aa00edc1

Authored by Linus Torvalds on May 27, 2016

Now that the allmodconfig x86-64 build is clean wrt IS_ERR_VALUE() uses
on integers, add a cast to a pointer and back to the argument, so that
any new mis-uses of IS_ERR_VALUE() will cause warnings like

warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]

so that we don't re-introduce any bogus uses.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

aa00edc1
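
The cast trick from the entry above can be tried in user space. This is a sketch under assumptions: the macro and constant carry a `_SKETCH` suffix to make clear they are illustrative, the kernel's branch-prediction hint is left out, and a target where `unsigned long` and pointers have the same size is assumed.

```c
/* Largest errno value treated as an error code in this sketch. */
#define MAX_ERRNO_SKETCH 4095

/* Round-trip the argument through a pointer type before comparing.
 * For a pointer-sized argument this changes nothing; for a narrower
 * integer the compiler can flag the size mismatch in the cast. */
#define IS_ERR_VALUE_SKETCH(x) \
	((unsigned long)(void *)(x) >= (unsigned long)-MAX_ERRNO_SKETCH)

/* Pointer-sized argument: compiles cleanly, returns 1 for error values. */
static int is_err_ulong(unsigned long v)
{
	return IS_ERR_VALUE_SKETCH(v);
}
```

Passing a plain `int` to the macro on a 64-bit target is the kind of misuse that would produce the `-Wint-to-pointer-cast` warning quoted in the entry.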

mm: remove more IS_ERR_VALUE abuses · 5d22fc25

Authored by Linus Torvalds on May 27, 2016

The do_brk() and vm_brk() return value was "unsigned long" and returned
the starting address on success, and an error value on failure.  The
reasons are entirely historical, and go back to it basically behaving
like the mmap() interface does.

However, nobody actually wanted that interface, and it causes totally
pointless IS_ERR_VALUE() confusion.

What every single caller actually wants is just the simpler integer
return of zero for success and negative error number on failure.

So just convert to that much clearer and more common calling convention,
and get rid of all the IS_ERR_VALUE() uses wrt vm_brk().
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

5d22fc25

mm: fix section mismatch warning · 7ded384a

Authored by Linus Torvalds on May 27, 2016

The register_page_bootmem_info_node() function needs to be marked __init
in order to avoid a new warning introduced by commit f65e91df ("mm:
use early_pfn_to_nid in register_page_bootmem_info_node").

Otherwise you'll get a warning about how a non-init function calls
early_pfn_to_nid (which is __meminit)

Cc: Yang Shi <yang.shi@linaro.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

7ded384a

switch xattr_handler->set() to passing dentry and inode separately · 59301226

Authored by Al Viro on May 27, 2016

preparation for similar switch in ->setxattr() (see the next commit for
rationale).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

59301226

27 May, 2016 4 commits

mm: oom_reaper: remove some bloat · 7ef949d7

Authored by Michal Hocko on May 26, 2016

mmput_async is currently used only from the oom_reaper which is defined
only for CONFIG_MMU.  We can save work_struct in mm_struct for
!CONFIG_MMU.

[akpm@linux-foundation.org: fix typo, per Minchan]
Link: http://lkml.kernel.org/r/20160520061658.GB19172@dhcp22.suse.cz
Reported-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

7ef949d7

mm: slub: remove unused virt_to_obj() · d96c84f8

Authored by Andrey Ryabinin on May 26, 2016

It's unused since commit 7ed2f9e6 ("mm, kasan: SLAB support")

Link: http://lkml.kernel.org/r/1464020961-2242-1-git-send-email-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d96c84f8

seqlock: fix raw_read_seqcount_latch() · 50755bc1

Authored by Alexey Dobriyan on May 26, 2016

lockless_dereference() is supposed to take pointer not integer.

Link: http://lkml.kernel.org/r/20160521201448.GA7429@p183.telecom.by
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

50755bc1

misc: at24: Fix typo in at24 header file · 868b2072

Authored by Moritz Fischer on May 23, 2016

This commit fixes a simple typo s/mvmem/nvmem in the
example.
Signed-off-by: Moritz Fischer <moritz.fischer@ettus.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>

868b2072

26 May, 2016 23 commits

add down_write_killable_nested() · 887bddfa

Authored by Al Viro on May 26, 2016

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

887bddfa

ceph: make logical calculation functions return bool · 3b33f692

Authored by Zhang Zhuoyu on Mar 25, 2016

This patch makes several logical calculation functions return bool to
improve readability, since these particular functions only use 0/1
as their return value.

No functional change.
Signed-off-by: Zhang Zhuoyu <zhangzhuoyu@cmss.chinamobile.com>

3b33f692

ceph: using hash value to compose dentry offset · f3c4ebe6

Authored by Yan, Zheng on Apr 29, 2016

If the MDS sorts dentries in a dirfrag in hash order, we use the hash value
to compose the dentry offset. The dentry offset is:

  (0xff << 52) | ((24 bits hash) << 28) |
  (the nth entry that has a hash collision)

This offset is stable across directory fragmentation. This also means
there is no need to reset the readdir offset if the directory gets
fragmented in the middle of readdir.
Signed-off-by: Yan, Zheng <zyan@redhat.com>

f3c4ebe6
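
The offset layout given in the entry above can be written out as a few helpers. The function and macro names are hypothetical; only the bit positions (a 0xff marker at bit 52, a 24-bit hash at bit 28, and the collision index in the low 28 bits) come from the formula in the commit text.

```c
#include <stdint.h>

/* Marker in bits 52..59 that tags an offset as hash-ordered. */
#define HASH_ORDER_MARK ((uint64_t)0xff << 52)

/* Compose an offset from a 24-bit hash and the index of the entry among
 * those sharing that hash. Inputs are masked to their field widths. */
static inline uint64_t make_hash_offset(uint32_t hash, uint32_t nth)
{
	return HASH_ORDER_MARK |
	       ((uint64_t)(hash & 0xffffff) << 28) |
	       (nth & 0xfffffff);
}

/* Recover the 24-bit hash field. */
static inline uint32_t offset_hash(uint64_t off)
{
	return (uint32_t)(off >> 28) & 0xffffff;
}

/* Recover the collision index from the low 28 bits. */
static inline uint32_t offset_index(uint64_t off)
{
	return (uint32_t)off & 0xfffffff;
}
```

Since the offset depends only on the name hash and the collision index, it does not change when the directory is split into different fragments, which is the stability property the entry relies on.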

ceph: define 'end/complete' in readdir reply as bit flags · 956d39d6

Authored by Yan, Zheng on Apr 27, 2016

Set a flag in the readdir request which indicates that the client interprets
'end/complete' as bit flags, so that the MDS can return additional flags in
the readdir reply.
Signed-off-by: Yan, Zheng <zyan@redhat.com>

956d39d6

libceph: support for subscribing to "mdsmap.<id>" maps · 737cc81e

Authored by Ilya Dryomov on May 26, 2016

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

737cc81e

libceph: replace ceph_monc_request_next_osdmap() · 7cca78c9

Authored by Ilya Dryomov on Apr 28, 2016

... with a wrapper around maybe_request_map() - no need for two
osdmap-specific functions.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

7cca78c9

libceph: pool deletion detection · 4609245e

Authored by Ilya Dryomov on Apr 28, 2016

This adds the "map check" infrastructure for sending osdmap version
checks on CALC_TARGET_POOL_DNE and completing in-flight requests with
-ENOENT if the target pool doesn't exist or has just been deleted.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

4609245e

libceph: async MON client generic requests · d0b19705

Authored by Ilya Dryomov on Apr 28, 2016

For map check, we are going to need to send CEPH_MSG_MON_GET_VERSION
messages asynchronously and get a callback on completion.  Refactor MON
client to allow firing off generic requests asynchronously and add an
async variant of ceph_monc_get_version().  ceph_monc_do_statfs() is
switched over and remains sync.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

d0b19705

libceph: support for checking on status of watch · b07d3c4b

Authored by Ilya Dryomov on Apr 28, 2016

Implement ceph_osdc_watch_check() to be able to check on status of
watch.  Note that the time it takes for a watch/notify event to get
delivered through the notify_wq is taken into account.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

b07d3c4b

libceph: support for sending notifies · 19079203

Authored by Ilya Dryomov on Apr 28, 2016

Implement ceph_osdc_notify() for sending notifies.

Due to the fact that the current messenger can't do read-in into
pagelists (it can only do write-out from them), I had to go with a page
vector for a NOTIFY_COMPLETE payload, for now.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

19079203

libceph, rbd: ceph_osd_linger_request, watch/notify v2 · 922dab61

Authored by Ilya Dryomov on May 26, 2016

This adds support and switches rbd to a new, more reliable version of
watch/notify protocol.  As with the OSD client update, this is mostly
about getting the right structures linked into the right places so that
reconnects are properly sent when needed.  watch/notify v2 also
requires sending regular pings to the OSDs - send_linger_ping().

A major change from the old watch/notify implementation is the
introduction of ceph_osd_linger_request - linger requests no longer
piggy back on ceph_osd_request.  ceph_osd_event has been merged into
ceph_osd_linger_request.

All the details are now hidden within libceph, the interface consists
of a simple pair of watch/unwatch functions and ceph_osdc_notify_ack().
ceph_osdc_watch() does return ceph_osd_linger_request, but only to keep
the lifetime management simple.

ceph_osdc_notify_ack() accepts an optional data payload, which is
relayed back to the notifier.

Portions of this patch are loosely based on work by Douglas Fuller
<dfuller@redhat.com> and Mike Christie <michaelc@cs.wisc.edu>.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

922dab61

libceph: a major OSD client update · 5aea3dcd

Authored by Ilya Dryomov on Apr 28, 2016

This is a major sync up, up to ~Jewel.  The highlights are:

- per-session request trees (vs a global per-client tree)
- per-session locking (vs a global per-client rwlock)
- homeless OSD session
- no ad-hoc global per-client lists
- support for pool quotas
- foundation for watch/notify v2 support
- foundation for map check (pool deletion detection) support

The switchover is incomplete: lingering requests can be set up and
torn down but aren't ever reestablished.  This functionality is
restored with the introduction of the new lingering infrastructure
(ceph_osd_linger_request, linger_work, etc) in a later commit.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

5aea3dcd

libceph: protect osdc->osd_lru list with a spinlock · 9dd2845c

Authored by Ilya Dryomov on Apr 28, 2016

OSD client is getting moved from the big per-client lock to a set of
per-session locks. The big rwlock would only be held for read most of
the time, so a global osdc->osd_lru needs additional protection.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

9dd2845c

libceph: handle_one_map() · 42c1b124

Authored by Ilya Dryomov on Apr 28, 2016

Separate osdmap handling from decoding and iterating over a bag of maps
in a fresh MOSDMap message.  This sets up the scene for the updated OSD
client.

Of particular importance here is the addition of pi->was_full, which
can be used to answer "did this pool go full -> not-full in this map?".
This is the key bit for supporting pool quotas.

We won't be able to downgrade map_sem for much longer, so drop
downgrade_write().
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

42c1b124

libceph: allocate dummy osdmap in ceph_osdc_init() · e5253a7b

Authored by Ilya Dryomov on Apr 28, 2016

This leads to a simpler osdmap handling code, particularly when dealing
with pi->was_full, which is introduced in a later commit.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

e5253a7b

libceph: redo callbacks and factor out MOSDOpReply decoding · fe5da05e

Authored by Ilya Dryomov on Apr 28, 2016

If you specify ACK | ONDISK and set ->r_unsafe_callback, both
->r_callback and ->r_unsafe_callback(true) are called on ack.  This is
very confusing.  Redo this so that only one of them is called:

    ->r_unsafe_callback(true), on ack
    ->r_unsafe_callback(false), on commit

or

    ->r_callback, on ack|commit

Decode everything in decode_MOSDOpReply() to reduce clutter.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

fe5da05e

libceph: drop msg argument from ceph_osdc_callback_t · 85e084fe

Authored by Ilya Dryomov on Apr 28, 2016

finish_read(), its only user, uses it to get to hdr.data_len, which is
what ->r_result is set to on success. This gains us the ability to
safely call callbacks from contexts other than reply, e.g. map check.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

85e084fe

libceph: switch to calc_target(), part 2 · bb873b53

Authored by Ilya Dryomov on May 26, 2016

The crux of this is getting rid of ceph_osdc_build_request(), so that
MOSDOp can be encoded not before but after calc_target() calculates the
actual target. Encoding now happens within ceph_osdc_start_request().

Also nuked is the accompanying bunch of pointers into the encoded
buffer that was used to update fields on each send - instead, the
entire front is re-encoded. If we want to support target->name_len !=
base->name_len in the future, there is no other way, because oid is
surrounded by other fields in the encoded buffer.

Encoding OSD ops and adding data items to the request message were
mixed together in osd_req_encode_op(). While we want to re-encode OSD
ops, we don't want to add duplicate data items to the message when
resending, so all calls to ceph_osdc_msg_data_add() are factored out
into a new setup_request_data().
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

bb873b53

libceph: switch to calc_target(), part 1 · a66dd383

Authored by Ilya Dryomov on Apr 28, 2016

Replace __calc_request_pg() and most of __map_request() with
calc_target() and start using req->r_t.

ceph_osdc_build_request() however still encodes base_oid, because it's
called before calc_target() is and target_oid is empty at that point in
time; a printf in osdc_show() also shows base_oid.  This is fixed in
"libceph: switch to calc_target(), part 2".
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

a66dd383

libceph: introduce ceph_osd_request_target, calc_target() · 63244fa1

Authored by Ilya Dryomov on Apr 28, 2016

Introduce ceph_osd_request_target, containing all mapping-related
fields of ceph_osd_request and calc_target() for calculating mappings
and populating it.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

63244fa1

libceph: pi->min_size, pi->last_force_request_resend · 04812acf

Authored by Ilya Dryomov on Apr 28, 2016

Add and decode pi->min_size and pi->last_force_request_resend.  These
are going to be used by calc_target().
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

04812acf

libceph: make pgid_cmp() global · f984cb76

Authored by Ilya Dryomov on Apr 28, 2016

calc_target() code is going to need to know how to compare PGs.  Take
lhs and rhs pgid by const * while at it.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

f984cb76
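
A comparator that takes both placement-group ids by `const *`, as the entry above mentions, might look roughly like the following. The `pg_sketch` structure and the ordering rule (pool first, then seed) are assumptions for illustration; the commit text only states the by-pointer signature.

```c
#include <stdint.h>

/* Hypothetical stand-in for a placement-group id. */
struct pg_sketch {
	uint64_t pool;
	uint32_t seed;
};

/* Three-way compare: negative, zero, or positive, like strcmp().
 * Taking const pointers avoids copying both structs on every call. */
static int pg_compare_sketch(const struct pg_sketch *lhs,
			     const struct pg_sketch *rhs)
{
	if (lhs->pool < rhs->pool)
		return -1;
	if (lhs->pool > rhs->pool)
		return 1;
	if (lhs->seed < rhs->seed)
		return -1;
	if (lhs->seed > rhs->seed)
		return 1;
	return 0;
}
```

Explicit comparisons are used instead of subtraction so that large unsigned values cannot wrap around and yield the wrong sign.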

libceph: rename ceph_calc_pg_primary() · f81f1633

Authored by Ilya Dryomov on Apr 28, 2016

Rename ceph_calc_pg_primary() to ceph_pg_to_acting_primary() to
emphasise that it returns acting primary.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

f81f1633