提交 · 2d0cfbec2a95c16818960fda1dfa815fd1a62070 · openeuler / raspberrypi-kernel

30 11月, 2013 12 次提交

sysfs, kernfs: remove sysfs_add_one() · 2d0cfbec

由 Tejun Heo 提交于 11月 28, 2013

sysfs_add_one() is a wrapper around __sysfs_add_one() which prints out
duplicate name warning if __sysfs_add_one() fails with -EEXIST.  The
previous kernfs conversions moved all dup warnings to sysfs interface
functions and sysfs_add_one() doesn't have any user left.

Remove sysfs_add_one() and update __sysfs_add_one() to take its name.

This patch doesn't make any functional changes.
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

2d0cfbec

sysfs, kernfs: introduce kernfs_create_file[_ns]() · 496f7394

由 Tejun Heo 提交于 11月 28, 2013

Introduce kernfs interface to create a file which takes and returns
sysfs_dirents.

The actual file creation part is separated out from
sysfs_add_file_mode_ns() into kernfs_create_file_ns().  The former now
only decides the kernfs_ops to use and the file's size and invokes the
latter.

This patch doesn't introduce behavior changes.

v2: Dummy implementation for !CONFIG_SYSFS updated to return -ENOSYS.
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

496f7394

sysfs, kernfs: remove SYSFS_KOBJ_BIN_ATTR · a7dc66df

由 Tejun Heo 提交于 11月 28, 2013

After kernfs_ops and sysfs_dirent->s_attr.size addition, the
distinction between SYSFS_KOBJ_BIN_ATTR and SYSFS_KOBJ_ATTR is only
necessary while creating files to decide which kernfs_ops to use.
Afterwards, they behave exactly the same.

This patch removes SYSFS_KOBJ_BIN_ATTR along with sysfs_is_bin().
sysfs_add_file[_mode_ns]() are updated to take bool @is_bin instead of
@type.

This patch doesn't introduce any behavior changes.  This completely
isolates the distinction between the two sysfs file types in the sysfs
layer proper.
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

a7dc66df

sysfs, kernfs: add sysfs_dirent->s_attr.size · 471bd7b7

由 Tejun Heo 提交于 11月 28, 2013

sysfs sets the size of regular files unconditionally at PAGE_SIZE and
takes the size of bin files from bin_attribute.  The latter is a
pretty bad interface which forces bin_attribute users to create a
separate copy of bin_attribute for each instance of the file -
e.g. pci resource files.

Add sysfs_dirent->s_attr.size so that the size can be specified
separately.  This unifies inode init paths of ATTR and BIN_ATTR
identical and allows for generic size handling for kernfs.

Unfortunately, this grows the size of sysfs_dirent by sizeof(loff_t).
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

471bd7b7

sysfs, kernfs: introduce kernfs_ops · f6acf8bb

由 Tejun Heo 提交于 11月 28, 2013

We're in the process of separating out core sysfs functionality into
kernfs which will deal with sysfs_dirents directly.  This patch
introduces kernfs_ops which hosts methods kernfs users implement and
updates fs/sysfs/file.c such that sysfs_kf_*() functions populate
kernfs_ops and kernfs_file_*() functions call the matching entries
from kernfs_ops.

kernfs_ops contains the following groups of methods.

* seq_show() - for kernfs files which use seq_file for reads.

* read() - for direct read implementations.  Used iff seq_show() is
  not implemented.

* write() - for writes.

* mmap() - for mmaps.

Notes:

* sysfs_elem_attr->ops is added so that kernfs_ops can be accessed
  from sysfs_dirent.  kernfs_ops() helper is added to verify locking
  and access the field.

* SYSFS_FLAG_HAS_(SEQ_SHOW|MMAP) added.  sd->s_attr->ops is accessible
  only while holding active_ref and there are cases where we want to
  take different actions depending on which ops are implemented.
  These two flags cache whether the two ops are implemented for those.

* kernfs_file_*() no longer test sysfs type but chooses different
  behaviors depending on which methods in kernfs_ops are implemented.
  The conversions are trivial except for the open path.  As
  kernfs_file_open() now decides whether to allow read/write accesses
  depending on the kernfs_ops implemented, the presence of methods in
  kobjs and attribute_bin should be propagated to kernfs_ops.
  sysfs_add_file_mode_ns() is updated so that it propagates presence /
  absence of the callbacks through _empty, _ro, _wo, _rw kernfs_ops.
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

f6acf8bb

sysfs, kernfs: move sysfs_open_file to include/linux/kernfs.h · dd8a5b03

由 Tejun Heo 提交于 11月 28, 2013

sysfs_open_file will be used as the primary handle for kernfs methods.
Move its definition from fs/sysfs/file.c to include/linux/kernfs.h and
mark the public and private fields.

This is pure relocation.
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

dd8a5b03

sysfs, kernfs: prepare open, release, poll paths for kernfs · c6fb4495

由 Tejun Heo 提交于 11月 28, 2013

We're in the process of separating out core sysfs functionality into
kernfs which will deal with sysfs_dirents directly.  This patch
prepares the rest - open, release and poll.  There isn't much to do.
Just renaming is enough.  As sysfs_file_operations and
sysfs_bin_operations are identical now, use the same file_operations
for both - kernfs_file_operations.

This patch doesn't introduce any behavior changes.
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

c6fb4495

sysfs, kernfs: prepare mmap path for kernfs · fdbffaa4

由 Tejun Heo 提交于 11月 28, 2013

We're in the process of separating out core sysfs functionality into
kernfs which will deal with sysfs_dirents directly. This patch
rearranges mmap path so that the kernfs and sysfs parts are separate.

sysfs_kf_bin_mmap() which handles the interaction with bin_attribute
mmap method is factored out of sysfs_bin_mmap(), which is renamed to
kernfs_file_mmap(). All vma ops are renamed accordingly.

sysfs_bin_mmap() is updated such that it can be used for both file
types. This will eventually allow using the same file_operations for
both file types, which is necessary to separate out kernfs.

This patch doesn't introduce any behavior changes.
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

fdbffaa4

sysfs, kernfs: prepare write path for kernfs · 50b38ca0

由 Tejun Heo 提交于 11月 28, 2013

We're in the process of separating out core sysfs functionality into
kernfs which will deal with sysfs_dirents directly.  This patch
rearranges write path so that the kernfs and sysfs parts are separate.

kernfs_file_write() handles all boilerplate work including buffer
management and locking and invokes sysfs_kf_write() or
sysfs_kf_bin_write() depending on the file type which deals with the
interaction with kobj store or bin_attribute write method.

While this patch changes the order of some operations, it shouldn't
change any visible behavior.
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

50b38ca0

sysfs, kernfs: prepare read path for kernfs · c2b19daf

由 Tejun Heo 提交于 11月 28, 2013

We're in the process of separating out core sysfs functionality into
kernfs which will deal with sysfs_dirents directly.  This patch
rearranges read path so that the kernfs and sysfs parts are separate.

* Regular file read path is refactored such that
  kernfs_seq_start/next/stop/show() handle all the boilerplate work
  including locking and updating event count for poll, while
  sysfs_kf_seq_show() deals with interaction with kobj show method.

* Bin file read path is refactored such that kernfs_file_direct_read()
  handles all the boilerplate work including buffer management and
  locking, while sysfs_kf_bin_read() deals with interaction with
  bin_attribute read method.

kernfs_file_read() is added.  It invokes either the seq_file or direct
read path depending on the file type.  This will eventually allow
using the same file_operations for both file types, which is necessary
to separate out kernfs.

While this patch changes the order of some operations, it shouldn't
change any visible behavior.

v2: Dropped unnecessary zeroing of @count from sysfs_kf_seq_show().
    Add comments explaining single_open() behavior.  Both suggested by
    Pavel.

v3: seq_stop() is called even after seq_start() failed.
    kernfs_seq_start() updated so that it doesn't unlock
    sysfs_open_file->mutex on failure so that kernfs_seq_stop()
    doesn't try to unlock an already unlocked mutex.  Reported by
    Fengguang.
Signed-off-by: NTejun Heo <tj@kernel.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

c2b19daf

sysfs, kernfs: introduce kernfs_create_dir[_ns]() · 93b2b8e4

由 Tejun Heo 提交于 11月 28, 2013

Introduce kernfs interface to manipulate a directory which takes and
returns sysfs_dirents.

create_dir() is renamed to kernfs_create_dir_ns() and its argumantes
and return value are updated.  create_dir() usages are replaced with
kernfs_create_dir_ns() and sysfs_create_subdir() usages are replaced
with kernfs_create_dir().  Dup warnings are handled explicitly by
sysfs users of the kernfs interface.

sysfs_enable_ns() is renamed to kernfs_enable_ns().

This patch doesn't introduce any behavior changes.

v2: Dummy implementation for !CONFIG_SYSFS updated to return -ENOSYS.

v3: kernfs_enable_ns() added.

v4: Refreshed on top of "sysfs: drop kobj_ns_type handling, take #2"
    so that this patch removes sysfs_enable_ns().
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

93b2b8e4

sysfs, kernfs: replace sysfs_dirent->s_dir.kobj and ->s_attr.[bin_]attr with ->priv · 7c6e2d36

由 Tejun Heo 提交于 11月 28, 2013

A directory sysfs_dirent points to the associated kobj.  A regular or
bin file points to the associated [bin_]attribute.  This patch
replaces sysfs_dirent->s_dir.kobj and ->s_attr.[bin_]attr with void *
->priv.

This is to prepare for kernfs interface so that sysfs can specify the
private data in the same way for directories and files.  This lower
debuggability but not by much - the whole thing was overlaid in a
union anyway.  If debuggability becomes an issue, we can later add
->priv accessors which explicitly check for the sysfs_dirent type and
performs casting.

This patch doesn't introduce any behavior difference.
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

7c6e2d36

29 11月, 2013 1 次提交

fix bogus path_put() of nd->root after some unlazy_walk() failures · d870b4a1

由 Al Viro 提交于 11月 29, 2013

Failure to grab reference to parent dentry should go through the
same cleanup as nd->seq mismatch.  As it is, we might end up with
caller thinking it needs to path_put() nd->root, with obvious
nasty results once we'd hit that bug enough times to drive the
refcount of root dentry all the way to zero...
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

d870b4a1

28 11月, 2013 9 次提交

remove obsolete references to powertweak · dad33750

由 Dave Jones 提交于 11月 27, 2013

This tool hasn't been maintained in over a decade, and is pretty much
useless these days.  Let's pretend it never happened.

Also remove a long-dead email address.
Signed-off-by: NDave Jones <davej@fedoraproject.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

dad33750

sysfs, kernfs: introduce kernfs_setattr() · 5d60418e

由 Tejun Heo 提交于 11月 23, 2013

Introduce kernfs setattr interface - kernfs_setattr().

sysfs_sd_setattr() is renamed to __kernfs_setattr() and
kernfs_setattr() is a simple wrapper around it with sysfs_mutex
locking.  sysfs_chmod_file() is updated to get an explicit ref on
kobj->sd and then invoke kernfs_setattr() so that it doesn't have to
use internal interface.

This patch doesn't introduce any behavior differences.

v2: Dummy implementation for !CONFIG_SYSFS updated to return -ENOSYS.
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

5d60418e

sysfs, kernfs: introduce kernfs_rename[_ns]() · 890ece16

由 Tejun Heo 提交于 11月 23, 2013

Introduce kernfs rename interface, krenfs_rename[_ns]().

This is just rename of sysfs_rename(). No functional changes.
Function comment is added to kernfs_rename_ns() and @new_parent_sd is
renamed to @new_parent for consistency with other kernfs interfaces.

v2: Dummy implementation for !CONFIG_SYSFS updated to return -ENOSYS.
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

890ece16

sysfs, kernfs: introduce kernfs_create_link() · 5d0e26bb

由 Tejun Heo 提交于 11月 23, 2013

Separate out kernfs symlink interface - kernfs_create_link() - which
takes and returns sysfs_dirents, from sysfs_do_create_link_sd().
sysfs_do_create_link_sd() now just determines the parent and target
sysfs_dirents and invokes the new interface and handles dup warning.

This patch doesn't introduce behavior changes.

v2: Dummy implementation for !CONFIG_SYSFS updated to return -ENOSYS.
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

5d0e26bb

sysfs, kernfs: introduce kernfs_remove[_by_name[_ns]]() · 879f40d1

由 Tejun Heo 提交于 11月 23, 2013

Introduce kernfs removal interfaces - kernfs_remove() and
kernfs_remove_by_name[_ns]().

These are just renames of sysfs_remove() and sysfs_hash_and_remove().
No functional changes.

v2: Dummy kernfs_remove_by_name_ns() for !CONFIG_SYSFS updated to
    return -ENOSYS instead of 0.
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

879f40d1

sysfs, kernfs: add skeletons for kernfs · b8441ed2

由 Tejun Heo 提交于 11月 24, 2013

Core sysfs implementation will be separated into kernfs so that it can
be used by other non-kobject users.

This patch creates fs/kernfs/ directory and makes boilerplate changes.
kernfs interface will be directly based on sysfs_dirent and its
forward declaration is moved to include/linux/kernfs.h which is
included from include/linux/sysfs.h.  sysfs core implementation will
be gradually separated out and moved to kernfs.

This patch doesn't introduce any functional changes.

v2: mount.c added.
Signed-off-by: NTejun Heo <tj@kernel.org>
Cc: linux-fsdevel@vger.kernel.org
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

b8441ed2

sysfs: make __sysfs_add_one() fail if the parent isn't a directory · ae2108ad

由 Tejun Heo 提交于 11月 23, 2013

Currently the kobject based interface guarantees that a parent
sysfs_dirent is always a directory; however, the planned kernfs
interface will be directly based on sysfs_dirents and the caller may
specify non-directory node as the parent.  Add an explicit check in
__sysfs_add_one() so that such attempts fail with -EINVAL.
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

ae2108ad

sysfs: drop kobj_ns_type handling, take #2 · c84a3b27

由 Tejun Heo 提交于 11月 23, 2013

The way namespace tags are implemented in sysfs is more complicated
than necessary.  As each tag is a pointer value and required to be
non-NULL under a namespace enabled parent, there's no need to record
separately what type each tag is.  If multiple namespace types are
needed, which currently aren't, we can simply compare the tag to a set
of allowed tags in the superblock assuming that the tags, being
pointers, won't have the same value across multiple types.

This patch rips out kobj_ns_type handling from sysfs.  sysfs now has
an enable switch to turn on namespace under a node.  If enabled, all
children are required to have non-NULL namespace tags and filtered
against the super_block's tag.

kobject namespace determination is now performed in
lib/kobject.c::create_dir() making sysfs_read_ns_type() unnecessary.
The sanity checks are also moved.  create_dir() is restructured to
ease such addition.  This removes most kobject namespace knowledge
from sysfs proper which will enable proper separation and layering of
sysfs.

This is the second try.  The first one was cb26a311 ("sysfs: drop
kobj_ns_type handling") which tried to automatically enable namespace
if there are children with non-NULL namespace tags; however, it was
broken for symlinks as they should inherit the target's tag iff
namespace is enabled in the parent.  This led to namespace filtering
enabled incorrectly for wireless net class devices through phy80211
symlinks and thus network configuration failure.  a1212d27
("Revert "sysfs: drop kobj_ns_type handling"") reverted the commit.

This shouldn't introduce any behavior changes, for real.

v2: Dummy implementation of sysfs_enable_ns() for !CONFIG_SYSFS was
    missing and caused build failure.  Reported by kbuild test robot.
Signed-off-by: NTejun Heo <tj@kernel.org>
Reported-by: NLinus Torvalds <torvalds@linux-foundation.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Kay Sievers <kay@vrfy.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

c84a3b27

Revert "sysfs: handle duplicate removal attempts in sysfs_remove_group()" · 81440e73

由 Greg Kroah-Hartman 提交于 11月 27, 2013

This reverts commit 54d71145.

The root cause of these "inverted" sysfs removals have now been found,
so there is no need for this patch.  Keep this functionality around so
that this type of error doesn't show up in driver code again.

Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

81440e73

25 11月, 2013 1 次提交

[CIFS] Do not use btrfs refcopy ioctl for SMB2 copy offload · f19e84df

由 Steve French 提交于 11月 24, 2013

Change cifs.ko to using CIFS_IOCTL_COPYCHUNK instead
of BTRFS_IOC_CLONE to avoid confusion about whether
copy-on-write is required or optional for this operation.

SMB2/SMB3 copyoffload had used the BTRFS_IOC_CLONE ioctl since
they both speed up copy by offloading the copy rather than
passing many read and write requests back and forth and both have
identical syntax (passing file handles), but for SMB2/SMB3
CopyChunk the server is not required to use copy-on-write
to make a copy of the file (although some do), and Christoph
has commented that since CopyChunk does not require
copy-on-write we should not reuse BTRFS_IOC_CLONE.

This patch renames the ioctl to use a cifs specific IOCTL
CIFS_IOCTL_COPYCHUNK.  This ioctl is particularly important
for SMB2/SMB3 since large file copy over the network otherwise
can be very slow, and with this is often more than 100 times
faster putting less load on server and client.

Note that if a copy syscall is ever introduced, depending on
its requirements/format it could end up using one of the other
three methods that CIFS/SMB2/SMB3 can do for copy offload,
but this method is particularly useful for file copy
and broadly supported (not just by Samba server).
Signed-off-by: NSteve French <smfrench@gmail.com>
Reviewed-by: NJeff Layton <jlayton@redhat.com>
Reviewed-by: NDavid Disseldorp <ddiss@samba.org>

f19e84df

24 11月, 2013 8 次提交

ceph: allocate non-zero page to fscache in readpage() · ff638b7d

由 Li Wang 提交于 11月 09, 2013

ceph_osdc_readpages() returns number of bytes read, currently,
the code only allocate full-zero page into fscache, this patch
fixes this.
Signed-off-by: NLi Wang <liwang@ubuntukylin.com>
Reviewed-by: NMilosz Tanski <milosz@adfin.com>
Reviewed-by: NSage Weil <sage@inktank.com>

ff638b7d

ceph: wake up 'safe' waiters when unregistering request · fc55d2c9

由 Yan, Zheng 提交于 10月 31, 2013

We also need to wake up 'safe' waiters if error occurs or request
aborted. Otherwise sync(2)/fsync(2) may hang forever.
Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: NSage Weil <sage@inktank.com>

fc55d2c9

ceph: cleanup aborted requests when re-sending requests. · eb1b8af3

由 Yan, Zheng 提交于 9月 26, 2013

Aborted requests usually get cleared when the reply is received.
If MDS crashes, no reply will be received. So we need to cleanup
aborted requests when re-sending requests.
Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: NGreg Farnum <greg@inktank.com>
Signed-off-by: NSage Weil <sage@inktank.com>

eb1b8af3

ceph: handle race between cap reconnect and cap release · 99a9c273

由 Yan, Zheng 提交于 9月 22, 2013

When a cap get released while composing the cap reconnect message.
We should skip queuing the release message if the cap hasn't been
added to the cap reconnect message.
Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: NSage Weil <sage@inktank.com>

99a9c273

ceph: set caps count after composing cap reconnect message · 44c99757

由 Yan, Zheng 提交于 9月 22, 2013

It's possible that some caps get released while composing the cap
reconnect message.
Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: NSage Weil <sage@inktank.com>

44c99757

ceph: queue cap release in __ceph_remove_cap() · a096b09a

由 Yan, Zheng 提交于 9月 22, 2013

call __queue_cap_release() in __ceph_remove_cap(), this avoids
acquiring s_cap_lock twice.
Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: NSage Weil <sage@inktank.com>

a096b09a

sysfs: use a separate locking class for open files depending on mmap · 027a485d

由 Tejun Heo 提交于 11月 17, 2013

The following two commits implemented mmap support in the regular file
path and merged bin file support into the regular path.

 73d97146 ("sysfs: copy bin mmap support from fs/sysfs/bin.c to fs/sysfs/file.c")
 3124eb16 ("sysfs: merge regular and bin file handling")

After the merge, the following commands trigger a spurious lockdep
warning.  "test-mmap-read" simply mmaps the file and dumps the
content.

  $ cat /sys/block/sda/trace/act_mask
  $ test-mmap-read /sys/devices/pci0000\:00/0000\:00\:03.0/resource0 4096

  ======================================================
  [ INFO: possible circular locking dependency detected ]
  3.12.0-work+ #378 Not tainted
  -------------------------------------------------------
  test-mmap-read/567 is trying to acquire lock:
   (&of->mutex){+.+.+.}, at: [<ffffffff8120a8df>] sysfs_bin_mmap+0x4f/0x120

  but task is already holding lock:
   (&mm->mmap_sem){++++++}, at: [<ffffffff8114b399>] vm_mmap_pgoff+0x49/0xa0

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #3 (&mm->mmap_sem){++++++}:
  ...
  -> #2 (sr_mutex){+.+.+.}:
  ...
  -> #1 (&bdev->bd_mutex){+.+.+.}:
  ...
  -> #0 (&of->mutex){+.+.+.}:
  ...

  other info that might help us debug this:

  Chain exists of:
   &of->mutex --> sr_mutex --> &mm->mmap_sem

   Possible unsafe locking scenario:

	 CPU0                    CPU1
	 ----                    ----
    lock(&mm->mmap_sem);
				 lock(sr_mutex);
				 lock(&mm->mmap_sem);
    lock(&of->mutex);

   *** DEADLOCK ***

  1 lock held by test-mmap-read/567:
   #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff8114b399>] vm_mmap_pgoff+0x49/0xa0

  stack backtrace:
  CPU: 3 PID: 567 Comm: test-mmap-read Not tainted 3.12.0-work+ #378
  Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
   ffffffff81ed41a0 ffff880009441bc8 ffffffff81611ad2 ffffffff81eccb80
   ffff880009441c08 ffffffff8160f215 ffff880009441c60 ffff880009c75208
   0000000000000000 ffff880009c751e0 ffff880009c75208 ffff880009c74ac0
  Call Trace:
   [<ffffffff81611ad2>] dump_stack+0x4e/0x7a
   [<ffffffff8160f215>] print_circular_bug+0x2b0/0x2bf
   [<ffffffff8109ca0a>] __lock_acquire+0x1a3a/0x1e60
   [<ffffffff8109d6ba>] lock_acquire+0x9a/0x1d0
   [<ffffffff81615547>] mutex_lock_nested+0x67/0x3f0
   [<ffffffff8120a8df>] sysfs_bin_mmap+0x4f/0x120
   [<ffffffff8115d363>] mmap_region+0x3b3/0x5b0
   [<ffffffff8115d8ae>] do_mmap_pgoff+0x34e/0x3d0
   [<ffffffff8114b3ba>] vm_mmap_pgoff+0x6a/0xa0
   [<ffffffff8115be3e>] SyS_mmap_pgoff+0xbe/0x250
   [<ffffffff81008282>] SyS_mmap+0x22/0x30
   [<ffffffff8161a4d2>] system_call_fastpath+0x16/0x1b

This happens because one file nests sr_mutex, which nests mm->mmap_sem
under it, under of->mutex while mmap implementation naturally nests
of->mutex under mm->mmap_sem.  The warning is false positive as
of->mutex is per open-file and the two paths belong to two different
files.  This warning didn't trigger before regular and bin file
supports were merged because only bin file supported mmap and the
other side of locking happened only on regular files which used
equivalent but separate locking.

It'd be best if we give separate locking classes per file but we can't
easily do that.  Let's differentiate on ->mmap() for now.  Later we'll
add explicit file operations struct and can add per-ops lockdep key
there.
Signed-off-by: NTejun Heo <tj@kernel.org>
Reported-by: NDave Jones <davej@redhat.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

027a485d

sysfs: handle duplicate removal attempts in sysfs_remove_group() · 54d71145

由 Mika Westerberg 提交于 11月 20, 2013

Commit bcdde7e2 (sysfs: make __sysfs_remove_dir() recursive) changed
the behavior so that directory removals will be done recursively. This
means that the sysfs group might already be removed if its parent directory
has been removed.

The current code outputs warnings similar to following log snippet when it
detects that there is no group for the given kobject:

 WARNING: CPU: 0 PID: 4 at fs/sysfs/group.c:214 sysfs_remove_group+0xc6/0xd0()
 sysfs group ffffffff81c6f1e0 not found for kobject 'host7'
 Modules linked in:
 CPU: 0 PID: 4 Comm: kworker/0:0 Not tainted 3.12.0+ #13
 Hardware name:                  /D33217CK, BIOS GKPPT10H.86A.0042.2013.0422.1439 04/22/2013
 Workqueue: kacpi_hotplug acpi_hotplug_work_fn
  0000000000000009 ffff8801002459b0 ffffffff817daab1 ffff8801002459f8
  ffff8801002459e8 ffffffff810436b8 0000000000000000 ffffffff81c6f1e0
  ffff88006d440358 ffff88006d440188 ffff88006e8b4c28 ffff880100245a48
 Call Trace:
  [<ffffffff817daab1>] dump_stack+0x45/0x56
  [<ffffffff810436b8>] warn_slowpath_common+0x78/0xa0
  [<ffffffff81043727>] warn_slowpath_fmt+0x47/0x50
  [<ffffffff811ad319>] ? sysfs_get_dirent_ns+0x49/0x70
  [<ffffffff811ae526>] sysfs_remove_group+0xc6/0xd0
  [<ffffffff81432f7e>] dpm_sysfs_remove+0x3e/0x50
  [<ffffffff8142a0d0>] device_del+0x40/0x1b0
  [<ffffffff8142a24d>] device_unregister+0xd/0x20
  [<ffffffff8144131a>] scsi_remove_host+0xba/0x110
  [<ffffffff8145f526>] ata_host_detach+0xc6/0x100
  [<ffffffff8145f578>] ata_pci_remove_one+0x18/0x20
  [<ffffffff812e8f48>] pci_device_remove+0x28/0x60
  [<ffffffff8142d854>] __device_release_driver+0x64/0xd0
  [<ffffffff8142d8de>] device_release_driver+0x1e/0x30
  [<ffffffff8142d257>] bus_remove_device+0xf7/0x140
  [<ffffffff8142a1b1>] device_del+0x121/0x1b0
  [<ffffffff812e43d4>] pci_stop_bus_device+0x94/0xa0
  [<ffffffff812e437b>] pci_stop_bus_device+0x3b/0xa0
  [<ffffffff812e437b>] pci_stop_bus_device+0x3b/0xa0
  [<ffffffff812e44dd>] pci_stop_and_remove_bus_device+0xd/0x20
  [<ffffffff812fc743>] trim_stale_devices+0x73/0xe0
  [<ffffffff812fc78b>] trim_stale_devices+0xbb/0xe0
  [<ffffffff812fc78b>] trim_stale_devices+0xbb/0xe0
  [<ffffffff812fcb6e>] acpiphp_check_bridge+0x7e/0xd0
  [<ffffffff812fd90d>] hotplug_event+0xcd/0x160
  [<ffffffff812fd9c5>] hotplug_event_work+0x25/0x60
  [<ffffffff81316749>] acpi_hotplug_work_fn+0x17/0x22
  [<ffffffff8105cf3a>] process_one_work+0x17a/0x430
  [<ffffffff8105db29>] worker_thread+0x119/0x390
  [<ffffffff8105da10>] ? manage_workers.isra.25+0x2a0/0x2a0
  [<ffffffff81063a5d>] kthread+0xcd/0xf0
  [<ffffffff81063990>] ? kthread_create_on_node+0x180/0x180
  [<ffffffff817eb33c>] ret_from_fork+0x7c/0xb0
  [<ffffffff81063990>] ? kthread_create_on_node+0x180/0x180

On this particular machine I see ~16 of these message during Thunderbolt
hot-unplug.

Fix this in similar way that was done for sysfs_remove_one() by checking
if the parent directory has already been removed and bailing out early.
Signed-off-by: NMika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

54d71145

22 11月, 2013 2 次提交

configfs: fix race between dentry put and lookup · 76ae281f

由 Junxiao Bi 提交于 11月 21, 2013

A race window in configfs, it starts from one dentry is UNHASHED and end
before configfs_d_iput is called.  In this window, if a lookup happen,
since the original dentry was UNHASHED, so a new dentry will be
allocated, and then in configfs_attach_attr(), sd->s_dentry will be
updated to the new dentry.  Then in configfs_d_iput(),
BUG_ON(sd->s_dentry != dentry) will be triggered and system panic.

sys_open:                     sys_close:
 ...                           fput
                                dput
                                 dentry_kill
                                  __d_drop <--- dentry unhashed here,
                                           but sd->dentry still point
                                           to this dentry.

 lookup_real
  configfs_lookup
   configfs_attach_attr---> update sd->s_dentry
                            to new allocated dentry here.

                                   d_kill
                                     configfs_d_iput <--- BUG_ON(sd->s_dentry != dentry)
                                                     triggered here.

To fix it, change configfs_d_iput to not update sd->s_dentry if
sd->s_count > 2, that means there are another dentry is using the sd
beside the one that is going to be put.  Use configfs_dirent_lock in
configfs_attach_attr to sync with configfs_d_iput.

With the following steps, you can reproduce the bug.

1. enable ocfs2, this will mount configfs at /sys/kernel/config and
   fill configure in it.

2. run the following script.
	while [ 1 ]; do cat /sys/kernel/config/cluster/$your_cluster_name/idle_timeout_ms > /dev/null; done &
	while [ 1 ]; do cat /sys/kernel/config/cluster/$your_cluster_name/idle_timeout_ms > /dev/null; done &
Signed-off-by: NJunxiao Bi <junxiao.bi@oracle.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

76ae281f

GFS2: Fix ref count bug relating to atomic_open · ea0341e0

由 Steven Whitehouse 提交于 11月 21, 2013

In the case that atomic_open calls finish_no_open() with
the dentry that was supplied to gfs2_atomic_open() an
extra reference count is required. This patch fixes that
issue preventing a bug trap triggering at umount time.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

ea0341e0

21 11月, 2013 7 次提交

GFS2: fix potential NULL pointer dereference · e3c4269d

由 Michal Nazarewicz 提交于 11月 12, 2013

Commit [e66cf161: GFS2: Use lockref for glocks] replaced call:
    atomic_read(&gi->gl->gl_ref) == 0
with:
    __lockref_is_dead(&gl->gl_lockref)
therefore changing how gl is accessed, from gi->gl to plan gl.
However, gl can be a NULL pointer, and so gi->gl needs to be
used instead (which is guaranteed not to be NULL because fo
the while loop checking that condition).
Signed-off-by: NMichal Nazarewicz <mina86@mina86.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

e3c4269d

btrfs: update kconfig help text · 4204617d

由 David Sterba 提交于 11月 20, 2013

Reflect the current status. Portions of the text taken from the
wiki pages.
Signed-off-by: NDavid Sterba <dsterba@suse.cz>
Signed-off-by: NChris Mason <chris.mason@fusionio.com>

4204617d

btrfs: fix bio_size_ok() for max_sectors > 0xffff · 475bf36f

由 Akinobu Mita 提交于 11月 18, 2013

The data type of max_sectors in queue settings is unsigned int.  But
this value is stored to the local variable whose type is unsigned short
in bio_size_ok().  This can cause unexpected result when max_sectors >
0xffff.

Cc: Chris Mason <chris.mason@fusionio.com>
Cc: linux-btrfs@vger.kernel.org
Signed-off-by: NAkinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: NChris Mason <chris.mason@fusionio.com>

475bf36f

btrfs: Use trace condition for get_extent tracepoint · 4cd8587c

由 Steven Rostedt 提交于 11月 14, 2013

Doing an if statement to test some condition to know if we should
trigger a tracepoint is pointless when tracing is disabled. This just
adds overhead and wastes a branch prediction. This is why the
TRACE_EVENT_CONDITION() was created. It places the check inside the jump
label so that the branch does not happen unless tracing is enabled.

That is, instead of doing:

	if (em)
		trace_btrfs_get_extent(root, em);

Which is basically this:

	if (em)
		if (static_key(trace_btrfs_get_extent)) {

Using a TRACE_EVENT_CONDITION() we can just do:

	trace_btrfs_get_extent(root, em);

And the condition trace event will do:

	if (static_key(trace_btrfs_get_extent)) {
		if (em) {
			...

The static key is a non conditional jump (or nop) that is faster than
having to check if em is NULL or not.
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
Signed-off-by: NChris Mason <chris.mason@fusionio.com>

4cd8587c

btrfs: fix typo in the log message · 52a15759

由 Anand Jain 提交于 11月 14, 2013

Signed-off-by: NAnand Jain <anand.jain@oracle.com>
Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
Signed-off-by: NChris Mason <chris.mason@fusionio.com>

52a15759

Btrfs: fix list delete warning when removing ordered root from the list · 931aa877

由 Miao Xie 提交于 11月 14, 2013

Commit b0244199 "Btrfs: don't wait for
the completion of all the ordered extents" introduced a bug that broke
the ordered root list:
 WARNING: CPU: 1 PID: 7119 at lib/list_debug.c:59 __list_del_entry+0x5a/0x98()

It is because we forgot to return the roots in the splice list to the
ordered list of the fs. Fix it.
Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com>
Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
Signed-off-by: NChris Mason <chris.mason@fusionio.com>

931aa877

Btrfs: print bytenr instead of page pointer in check-int · 56d140f5

由 Stefan Behrens 提交于 11月 13, 2013

The page pointer information was useless. The bytenr is what you
want when you search for submitted write bios.

Additionally, a new bit in the print mask is added that allows
to selectively enable the check-int submit_bio verbose mode. Before,
the global verbose mode had to be enabled leading to many million
useless lines in the kernel log.

And a comment is added that explains that LOG_BUF_SHIFT needs to
be set to a really high value.
Signed-off-by: NStefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
Signed-off-by: NChris Mason <chris.mason@fusionio.com>

56d140f5