1. 12 Jun, 2009 · 40 commits
    • J
      HID: use debugfs for events/reports dumping · cd667ce2
      Jiri Kosina committed
      This is a follow-up patch to the one implementing rdesc representation in debugfs
      rather than being dependent on compile-time CONFIG_HID_DEBUG setting.
      
      The API of the appropriate formatting functions is slightly modified -- if
      they are passed seq_file pointer, the one-shot output for 'rdesc' file mode
      is used, and therefore the message is formatted into the corresponding seq_file
      immediately.
      
      Otherwise the called function allocates a new buffer, formats the text into the
      buffer and returns the pointer to it, so that it can be queued into the ring-buffer
      of the processes blocked waiting on input on 'events' file in debugfs.
      
      'debug' parameter to the 'hid' module is now used solely for the purpose of internal
      driver state debugging (parser, transport, etc).
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      cd667ce2
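The dual-mode formatting described in the entry above can be sketched in userspace C. This is only an illustration under stated assumptions: `FILE *` stands in for the kernel's seq_file pointer, and the function name `fmt_usage()`, its arguments, and the output format are hypothetical, not taken from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical formatter with two output modes:
 *  - 'out' non-NULL: one-shot mode, the text goes straight to the stream
 *    and NULL is returned;
 *  - 'out' NULL: the text is returned in a newly allocated buffer that the
 *    caller can queue for waiting readers and must free().
 * Note that NULL is also returned if the allocation fails.
 */
char *fmt_usage(FILE *out, unsigned page, unsigned usage)
{
    char tmp[32];
    int len = snprintf(tmp, sizeof(tmp), "%04x.%04x", page, usage);

    assert(len > 0 && (size_t)len < sizeof(tmp));

    if (out) {
        fputs(tmp, out);
        return NULL;
    }

    char *buf = malloc((size_t)len + 1);
    if (!buf)
        return NULL;
    memcpy(buf, tmp, (size_t)len + 1); /* copy including the terminator */
    return buf;
}
```

A caller that wants the queued form passes NULL and owns the returned string; a caller that already has a stream passes it and ignores the return value.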
    • J
      HID: use debugfs for report dumping descriptor · a635f9dd
      Jiri Kosina committed
      It is a little bit inconvenient for people who have some non-standard
      HID hardware (usually violating the HID specification) to have to
      recompile kernel with CONFIG_HID_DEBUG to be able to see kernel's perspective
      of the HID report descriptor and observe the parsed events. Plus the messages
      are then mixed up inconveniently with the rest of the dmesg stuff.
      
      This patch implements /sys/kernel/debug/hid/<device>/rdesc file, which
      represents the kernel's view of report descriptor (both the raw report
      descriptor data and parsed contents).
      
      With all the device-specific debug data being available through debugfs, there
      is no need for keeping CONFIG_HID_DEBUG, as the 'debug' parameter to the
      hid module will now output only driver-specific debugging messages, which have
      an absolutely minimal memory footprint: just a few error messages and one global
      flag (hid_debug).
      
      We use the current set of output formatting functions. The ones that need to be
      used both for one-shot rdesc seq_file and also for continuous flow of data
      (individual reports, as being sent by the device) distinguish according to the
      passed seq_file parameter: if it is NULL, they still output to the kernel ringbuffer,
      otherwise the corresponding seq_file is used for output.
      
      The format of the output is preserved.
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      a635f9dd
    • A
      fs/qnx4: sanitize includes · 964f5369
      Al Viro committed
      fs-internal parts of qnx4_fs.h taken to fs/qnx4/qnx4.h, includes adjusted,
      qnx4_fs.h doesn't need unifdef anymore.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      964f5369
    • A
      Sanitize qnx4 fsync handling · 79d25767
      Al Viro committed
      * have directory operations use mark_buffer_dirty_inode(),
        so that sync_mapping_buffers() would get those.
      * make qnx4_write_inode() honour its last argument.
      * get rid of insane copies of very ancient "walk the indirect blocks"
        in qnx4/fsync - they never matched the actual fs layout and, fortunately,
        never'd been called.  Again, all this junk is not needed; ->fsync()
        should just do sync_mapping_buffers + sync_inode (and if we implement
        block allocation for qnx4, we'll need to use mark_buffer_dirty_inode()
        for extent blocks)
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      79d25767
    • A
      New helper - simple_fsync() · d5aacad5
      Al Viro committed
      writes associated buffers, then does sync_inode() to write
      the inode itself (and to make it clean).  Depends on
      ->write_inode() honouring the second argument.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      d5aacad5
    • M
      linux/magic.h: move cramfs magic out of cramfs_fs.h · 8688b863
      Mike Frysinger committed
      Signed-off-by: Mike Frysinger <vapier@gentoo.org>
      CC: Alexander Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      8688b863
    • T
      fs: Rearrange inode structure elements to avoid waste due to padding · 28ad0c11
      Theodore Ts'o committed
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: linux-fsdevel@vger.kernel.org
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      28ad0c11
    • T
      fs: Remove i_cindex from struct inode · 9fd5746f
      Theodore Ts'o committed
      The only users of the i_cindex element in the inode structure are
      the firewire drivers.  Removing it is part of an attempt to slim down the
      inode structure to save memory: since a typical Linux system will
      have hundreds of thousands if not millions of inodes cached, a
      reduction in the size of the inode has high leverage.
      
      The firewire driver does not need i_cindex in any fast path, so it's
      simple enough to calculate when it is needed, instead of wasting space
      in the inode structure.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: krh@redhat.com
      Cc: stefanr@s5r6.in-berlin.de
      Cc: linux-fsdevel@vger.kernel.org
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      9fd5746f
    • A
      Trim a bit of crap from fs.h · 62c6943b
      Al Viro committed
      do_remount_sb() is fs/internal.h fodder, fsync_no_super() is long gone.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      62c6943b
    • A
      dcache: extrace and use d_unlinked() · f3da392e
      Alexey Dobriyan committed
      d_unlinked() will be used in the medium term to ban checkpointing when an opened
      but unlinked file is detected, and in the long term to detect such a situation
      and special-case it.
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      f3da392e
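The entry above does not spell out what d_unlinked() tests. One plausible reading, and it is an assumption here, is "the dentry is no longer hashed and is not a filesystem root". The sketch below models that predicate in userspace; `struct dentry_model` and all `model_*` names are hypothetical stand-ins, not kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced model of a dentry. */
struct dentry_model {
    struct dentry_model *parent; /* a root dentry is its own parent */
    bool hashed;                 /* still reachable through the name hash */
};

static bool model_is_root(const struct dentry_model *d)
{
    return d->parent == d;
}

static bool model_unhashed(const struct dentry_model *d)
{
    return !d->hashed;
}

/* Assumed predicate: unhashed, but not merely because it is a root. */
bool model_unlinked(const struct dentry_model *d)
{
    assert(d != NULL);
    return model_unhashed(d) && !model_is_root(d);
}
```

The root check matters in this model because a root is never hashed under a parent, so "unhashed" alone would misreport it as unlinked.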
    • J
      quota: Introduce writeout_quota_sb() (version 4) · c3f8a40c
      Jan Kara committed
      Introduce this function which just writes all the quota structures but
      avoids all the syncing and cache pruning work to expose quota structures
      to userspace. Use this function from __sync_filesystem when wait == 0.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      c3f8a40c
    • C
      quota: cleanup dquota sync functions (version 4) · 850b201b
      Christoph Hellwig committed
      Currently the VFS calls vfs_dq_sync to sync out disk quotas for a given
      superblock.  This is a small wrapper around sync_dquots which for the
      case of a non-NULL superblock is a small wrapper around quota_sync_sb.
      
      Just make quota_sync_sb global (rename it to sync_quota_sb) and call it
      directly.  Also call it directly for those cases in quota.c that have a
      superblock and leave sync_dquots purely an iterator over sync_quota_sb and
      remove its superblock argument.
      
      To make this nicer move the check for the lack of a quota_sync method
      from the callers into sync_quota_sb.
      
      [folded build fix from Alexander Beregalov <a.beregalov@gmail.com>]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      850b201b
    • J
      vfs: Rename fsync_super() to sync_filesystem() (version 4) · 60b0680f
      Jan Kara committed
      Rename the function so that it better describes what it really does. Also
      remove the unnecessary include of buffer_head.h.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      60b0680f
    • J
      vfs: Move syncing code from super.c to sync.c (version 4) · c15c54f5
      Jan Kara committed
      Move sync_filesystems(), __fsync_super(), fsync_super() from
      super.c to sync.c where it fits better.
      
      [build fixes folded]
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      c15c54f5
    • J
      vfs: Make sys_sync() use fsync_super() (version 4) · 5cee5815
      Jan Kara committed
      It is unnecessarily fragile to have two places (fsync_super() and do_sync())
      doing data integrity sync of the filesystem. Alter __fsync_super() to
      accommodate needs of both callers and use it. So after this patch
      __fsync_super() is the only place where we gather all the calls needed to
      properly send all data on a filesystem to disk.
      
      Nice bonus is that we get a complete livelock avoidance and write_supers()
      is now only used for periodic writeback of superblocks.
      
      sync_blockdevs() introduced a couple of patches ago is gone now.
      
      [build fixes folded]
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      5cee5815
    • J
      vfs: Make __fsync_super() a static function (version 4) · 429479f0
      Jan Kara committed
      __fsync_super() does the same thing as fsync_super(). So change the only
      caller to use fsync_super() and make __fsync_super() static. This removes
      an unnecessarily duplicated call to sync_blockdev() and prepares the ground
      for the changes to __fsync_super() in the following patches.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      429479f0
    • C
      remove s_async_list · 876a9f76
      Christoph Hellwig committed
      Remove the unused s_async_list in the superblock, a leftover of the
      broken async inode deletion code that leaked into mainline.  Having this
      in the middle of the sync/unmount path is not helpful for the following
      cleanups.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      876a9f76
    • N
      fs: introduce mnt_clone_write · 96029c4e
      npiggin@suse.de committed
      This patch speeds up lmbench lat_mmap test by about another 2% after the
      first patch.
      
      Before:
       avg = 462.286
       std = 5.46106
      
      After:
       avg = 453.12
       std = 9.58257
      
      (50 runs of each, stddev gives a reasonable confidence)
      
      It does this by introducing mnt_clone_write, which avoids some heavyweight
      operations of mnt_want_write if called on a vfsmount which we know already
      has a write count; and mnt_want_write_file, which can call mnt_clone_write
      if the file is open for write.
      
      After these two patches, mnt_want_write and mnt_drop_write go from 7% on
      the profile down to 1.3% (including mnt_clone_write).
      
      [AV: mnt_want_write_file() should take file alone and derive mnt from it;
      not only all callers have that form, but that's the only mnt about which
      we know that it's already held for write if file is opened for write]
      
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      96029c4e
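The "about another 2%" figure can be checked from the averages quoted in this entry. The helper below is a small illustrative sketch; the function name `percent_reduction()` is hypothetical, and the only inputs are the avg values given in the log.

```c
#include <assert.h>

/* Relative reduction of 'after' with respect to 'before', in percent. */
double percent_reduction(double before, double after)
{
    assert(before > 0.0);
    return (before - after) / before * 100.0;
}
```

With the numbers from this entry, `percent_reduction(462.286, 453.12)` comes out just under 2. Applied to the averages in the "fs: mnt_want_write speedup" entry, `percent_reduction(501.9, 462.286)` comes out just under 8, matching the "about 8%" stated there.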
    • N
      fs: mnt_want_write speedup · d3ef3d73
      npiggin@suse.de committed
      This patch speeds up lmbench lat_mmap test by about 8%. lat_mmap is set up
      basically to mmap a 64MB file on tmpfs, fault in its pages, then unmap it.
      A microbenchmark yes, but it exercises some important paths in the mm.
      
      Before:
       avg = 501.9
       std = 14.7773
      
      After:
       avg = 462.286
       std = 5.46106
      
      (50 runs of each, stddev gives a reasonable confidence, but there is quite
      a bit of variation there still)
      
      It does this by removing the complex per-cpu locking and counter-cache and
      replacing them with a percpu counter in struct vfsmount. This makes the code
      much simpler, and avoids spinlocks (although the msync is still pretty
      costly, unfortunately). It results in about 900 bytes smaller code too. It
      does increase the size of a vfsmount, however.
      
      It should also give a speedup on large systems if CPUs are frequently operating
      on different mounts (because the existing scheme has to operate on an atomic in
      the struct vfsmount when switching between mounts). But I'm most interested in
      the single threaded path performance for the moment.
      
      [AV: minor cleanup]
      
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      d3ef3d73
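The idea of a per-CPU writer counter, as described in the entry above, can be modelled in userspace. This is a sketch under assumptions, not the patch itself: `struct mount_model`, `MODEL_NR_CPUS`, and the `model_*` functions are hypothetical, and the model ignores preemption, memory ordering, and the read-only transition that real code has to handle.

```c
#include <assert.h>

#define MODEL_NR_CPUS 4 /* arbitrary size for the sketch */

/* Hypothetical mount with one writer counter slot per CPU. */
struct mount_model {
    int writers[MODEL_NR_CPUS];
};

/* Fast path: touch only the slot of the CPU we are running on. */
void model_inc_writers(struct mount_model *m, int cpu)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    m->writers[cpu]++;
}

void model_dec_writers(struct mount_model *m, int cpu)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    m->writers[cpu]--;
}

/* Slow path: the true count is the sum over all slots. */
int model_count_writers(const struct mount_model *m)
{
    int sum = 0;

    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        sum += m->writers[cpu];
    return sum;
}
```

A single slot may go negative when a writer is taken on one CPU and dropped on another; only the sum is meaningful, which is why the cheap increment and decrement need no shared lock in this model.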
    • A
      Move junk from proc_fs.h to fs/proc/internal.h · 3174c21b
      Al Viro committed
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      3174c21b
    • A
      switch lookup_mnt() · 1c755af4
      Al Viro committed
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      1c755af4
    • A
      switch follow_down() · 9393bd07
      Al Viro committed
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      9393bd07
    • A
      Switch collect_mounts() to struct path · 589ff870
      Al Viro committed
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      589ff870
    • A
      switch follow_up() to struct path · bab77ebf
      Al Viro committed
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      bab77ebf
    • A
      switch rqst_exp_parent() · e64c390c
      Al Viro committed
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      e64c390c
    • A
      switch rqst_exp_get_by_name() · 91c9fa8f
      Al Viro committed
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      91c9fa8f
    • A
      Cache root in nameidata · 2a737871
      Al Viro committed
      New field: nd->root.  When pathname resolution wants to know the root,
      check if nd->root.mnt is non-NULL; use nd->root if it is, otherwise
      copy current->fs->root there.  After path_walk() is finished, we check
      if we'd got a cached value in nd->root and drop it.  Before calling
      path_walk() we should either set nd->root.mnt to NULL *or* copy (and
      pin down) some path to nd->root.  In the latter case we won't be
      looking at current->fs->root at all.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      2a737871
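The caching rule in the entry above (use nd->root if its mnt is non-NULL, otherwise copy the process root into it, then drop it after the walk) can be sketched as a small userspace model. The types and names `path_model`, `walk_model`, `walk_root()`, and `walk_done()` are hypothetical; reference counting, which the kernel would need when pinning a path, is only noted in comments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced model of a (mount, dentry) pair. */
struct path_model {
    const void *mnt;    /* NULL means "no root cached yet" */
    const void *dentry;
};

/* Plays the role of the lookup state holding nd->root. */
struct walk_model {
    struct path_model root;
};

/*
 * Return the root to use for this walk, filling the cache on first use.
 * 'fs_root' stands in for current->fs->root.
 */
const struct path_model *walk_root(struct walk_model *nd,
                                   const struct path_model *fs_root)
{
    assert(nd != NULL && fs_root != NULL && fs_root->mnt != NULL);
    if (!nd->root.mnt)
        nd->root = *fs_root; /* real code would also take a reference */
    return &nd->root;
}

/* After the walk: drop whatever was cached. */
void walk_done(struct walk_model *nd)
{
    nd->root.mnt = NULL;     /* real code would also drop the reference */
    nd->root.dentry = NULL;
}
```

If the caller pre-loads `nd->root` before the walk, `walk_root()` never looks at `fs_root`, which mirrors the "in the latter case we won't be looking at current->fs->root at all" remark.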
    • J
      reiserfs: allow exposing privroot w/ xattrs enabled · 73422811
      Jeff Mahoney committed
      This patch adds an -oexpose_privroot option to allow access to the privroot.
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      73422811
    • E
      fsnotify: move events should indicate the event was on a child · ff52cc21
      Eric Paris committed
      fsnotify tells its listeners explicitly when an event happened on the given
      inode versus on the child of the given inode (see __fsnotify_parent).
      However, the semantics of fsnotify_move() are such that we deliver events
      directly to the two parent directories in question (old_dir and new_dir)
      without using the __fsnotify_parent() call.  fsnotify should be
      adding FS_EVENT_ON_CHILD for the notifications to these parents.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      ff52cc21
    • E
      inotify: reimplement inotify using fsnotify · 63c882a0
      Eric Paris committed
      Reimplement inotify_user using fsnotify.  This should be feature for feature
      exactly the same as the original inotify_user.  This does not make any changes
      to the in kernel inotify feature used by audit.  Those patches (and the eventual
      removal of in kernel inotify) will come after the new inotify_user proves to be
      working correctly.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      63c882a0
    • E
      fsnotify: handle filesystem unmounts with fsnotify marks · 164bc619
      Eric Paris committed
      When an fs is unmounted with an fsnotify mark entry attached to one of its
      inodes we need to destroy that mark entry and we also (like inotify) send
      an unmount event.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      164bc619
    • E
      fsnotify: allow groups to add private data to events · e4aff117
      Eric Paris committed
      inotify needs per group information attached to events.  This patch allows
      groups to attach private information and implements a callback so that
      information can be freed when an event is being destroyed.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      e4aff117
    • E
      fsnotify: add correlations between events · 47882c6f
      Eric Paris committed
      The standard inotify events include a correlation cookie between the two
      events of a dentry move operation.  This patch includes the same behaviour
      in fsnotify events.  It is needed so that inotify userspace can be
      implemented on top of fsnotify.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      47882c6f
    • E
      fsnotify: include pathnames with entries when possible · 62ffe5df
      Eric Paris committed
      When inotify wants to send events to a directory about a child it includes
      the name of the original file.  This patch collects that filename and makes
      it available for notification.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      62ffe5df
    • E
      fsnotify: generic notification queue and waitq · a2d8bc6c
      Eric Paris committed
      inotify needs to do async notification in which event information is stored
      on a queue until the listener is ready to receive it.  This patch
      implements a generic notification queue for inotify (and later fanotify) to
      store events to be sent at a later time.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      a2d8bc6c
    • E
      dnotify: reimplement dnotify using fsnotify · 3c5119c0
      Eric Paris committed
      Reimplement dnotify using fsnotify.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      3c5119c0
    • E
      fsnotify: parent event notification · c28f7e56
      Eric Paris committed
      inotify and dnotify both use a similar parent notification mechanism.  We
      add a generic parent notification mechanism to fsnotify for both of these
      to use.  This new mechanism also adds the dentry flag optimization which
      exists for inotify to dnotify.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      c28f7e56
    • E
      fsnotify: add marks to inodes so groups can interpret how to handle those inodes · 3be25f49
      Eric Paris committed
      This patch creates a way for fsnotify groups to attach marks to inodes.
      These marks have little meaning to the generic fsnotify infrastructure
      and thus their meaning should be interpreted by the group that attached
      them to the inode's list.
      
      dnotify and inotify will make use of these markings to indicate which
      inodes are of interest to their respective groups.  But this implementation
      has the useful property that in the future other listeners could actually
      use the marks for the exact opposite reason, aka to indicate which inodes
      it had NO interest in.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      3be25f49
    • E
      fsnotify: unified filesystem notification backend · 90586523
      Eric Paris committed
      fsnotify is a backend for filesystem notification.  fsnotify does
      not provide any userspace interface but does provide the basis
      needed for other notification schemes such as dnotify.  fsnotify
      can be extended to be the backend for inotify or the upcoming
      fanotify.  fsnotify provides a mechanism for "groups" to register for
      some set of filesystem events and to then deliver those events to
      those groups for processing.
      
      fsnotify has a number of benefits, the first being actually shrinking the size
      of an inode.  Before fsnotify, to support both dnotify and inotify, an inode had
      
              unsigned long           i_dnotify_mask; /* Directory notify events */
              struct dnotify_struct   *i_dnotify; /* for directory notifications */
              struct list_head        inotify_watches; /* watches on this inode */
              struct mutex            inotify_mutex;  /* protects the watches list */
      
      But with fsnotify this same functionality (and more) is done with just
      
              __u32                   i_fsnotify_mask; /* all events for this inode */
              struct hlist_head       i_fsnotify_mark_entries; /* marks on this inode */
      
      That's right, inotify, dnotify, and fanotify all in 64 bits.  We used that
      much space just in inotify_watches alone, before this patch set.
      
      fsnotify object lifetime and locking is MUCH better than what we have today.
      inotify locking is incredibly complex.  See 8f7b0ba1 as an example of
      what's been busted since inception.  inotify needs to know internal semantics
      of superblock destruction and unmounting to function.  The inode pinning and
      vfs contortions are horrible.
      
      No fsnotify implementers do allocation under locks.  This means things like
      f04b30de which (due to an overabundance of caution) changes GFP_KERNEL to
      GFP_NOFS can be reverted.  There are no longer any allocation rules when using
      or implementing your own fsnotify listener.
      
      fsnotify paves the way for fanotify.  In brief fanotify is a notification
      mechanism that delivers to the listener both an 'event' and an open file descriptor
      to the object in question.  This means that fanotify is pathname agnostic.
      Some on lkml may not care for the original companies or users that pushed for
      TALPA, but fanotify was designed with flexibility and input for other users in
      mind.  The readahead group expressed interest in fanotify as it could be used
      to profile disk access on boot without breaking the audit system.  The desktop
      search groups have also expressed interest in fanotify as it solves a number
      of the race conditions and problems present with managing inotify when more
      than a limited number of specific files are of interest.  fanotify can provide
      for a userspace access control system which makes it a clean interface for AV
      vendors to hook without trying to do binary patching on the syscall table,
      LSM, and everywhere else they do their things today.  With this patch series
      fanotify can be implemented in less than 1200 lines of easy to review code,
      almost all of which is the socket based user interface.
      
      This patch series builds fsnotify to the point that it can implement
      dnotify and inotify_user.  Patches exist and will be sent soon after
      acceptance to finish the in kernel inotify conversion (audit) and implement
      fanotify.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      90586523
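The "groups register for some set of events" mechanism from the entry above can be illustrated with a deliberately simplified userspace model. Everything here is an assumption made for the sketch: the event bits `EV_MODIFY` and `EV_DELETE`, the fixed-size mark array, and the `model_*` names are hypothetical, and the real code uses a mark list with locking and per-mark state that this model leaves out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EV_MODIFY 0x1u /* hypothetical event bits */
#define EV_DELETE 0x2u
#define MODEL_MAX_MARKS 4

/* A listener group: which events it wants, and how many it has received. */
struct group_model {
    uint32_t mask;
    unsigned delivered;
};

/* An inode: the union of all interested masks, plus the attached marks. */
struct inode_model {
    uint32_t fsnotify_mask;
    struct group_model *marks[MODEL_MAX_MARKS];
    size_t nr_marks;
};

/* Attach a group's mark to the inode; returns -1 when the model is full. */
int model_add_mark(struct inode_model *inode, struct group_model *group)
{
    if (inode->nr_marks >= MODEL_MAX_MARKS)
        return -1;
    inode->marks[inode->nr_marks++] = group;
    inode->fsnotify_mask |= group->mask;
    return 0;
}

/* Deliver an event to every group whose mask matches it. */
void model_notify(struct inode_model *inode, uint32_t event)
{
    if (!(inode->fsnotify_mask & event))
        return; /* cheap early exit: nobody cares about this event */

    for (size_t i = 0; i < inode->nr_marks; i++) {
        if (inode->marks[i]->mask & event)
            inode->marks[i]->delivered++;
    }
}
```

Keeping a combined mask on the inode lets the common "nobody is listening" case return after a single test, which is one way to read the role of i_fsnotify_mask in the field list quoted above.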
    • Y
      x86: remove some alloc_bootmem_cpumask_var calling · 38c7fed2
      Yinghai Lu committed
      Now that we set up the slab allocator earlier, we can get rid of some
      alloc_bootmem_cpumask_var() calls in boot code.
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
      38c7fed2