1. 25 Feb 2014, 1 commit
    • fsnotify: Allocate overflow events with proper type · ff57cd58
      Committed by Jan Kara
      Commit 7053aee2 "fsnotify: do not share events between notification
      groups" used an overflow event statically allocated in a group with the
      size of the generic notification event. This causes problems because
      some code looks at type-specific parts of the event structure, gets
      confused by the random data it sees there, and crashes.
      
      Fix the problem by allocating the overflow event with the type
      corresponding to the group type so that code cannot get confused.
      Signed-off-by: Jan Kara <jack@suse.cz>
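The problem described in ff57cd58 can be sketched in userspace C. This is a minimal model, not the kernel code: `generic_event`, `typed_event`, `alloc_overflow_event()`, `event_wd()` and the `EVENT_OVERFLOW` value are all hypothetical names chosen for illustration. It shows why an overflow event that is later accessed through `container_of()` has to be allocated with the size of the type-specific structure rather than the generic one.

```c
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-ins for the kernel structures; all names are illustrative. */
struct generic_event {
    unsigned int mask;
};

struct typed_event {                /* e.g. an inotify-style event */
    struct generic_event base;      /* generic part, embedded */
    int wd;                         /* type-specific field */
};

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

#define EVENT_OVERFLOW 0x4000u      /* illustrative flag value */

/*
 * Allocate the overflow event with the full type-specific size, so that
 * type-specific code reading ->wd sees initialized memory instead of
 * whatever happens to follow a bare struct generic_event.
 */
struct generic_event *alloc_overflow_event(void)
{
    struct typed_event *ev = calloc(1, sizeof(*ev));

    if (!ev)
        return NULL;
    ev->base.mask = EVENT_OVERFLOW;
    ev->wd = -1;
    return &ev->base;
}

/* Type-specific code that recovers the outer structure from the generic part. */
int event_wd(struct generic_event *gev)
{
    return container_of(gev, struct typed_event, base)->wd;
}
```

Had `alloc_overflow_event()` allocated only `sizeof(struct generic_event)`, the read in `event_wd()` would touch memory outside the allocation, which is the kind of confusion the commit message describes.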
  2. 18 Feb 2014, 1 commit
    • inotify: Fix reporting of cookies for inotify events · 45a22f4c
      Committed by Jan Kara
      My rework of handling of notification events (namely commit 7053aee2
      "fsnotify: do not share events between notification groups") broke
      sending of cookies with inotify events. We didn't propagate the value
      passed to fsnotify() properly and passed 4 uninitialized bytes to
      userspace instead (so it is also an information leak). Sadly I didn't
      notice this during my testing because inotify cookies aren't used very
      much and LTP inotify tests ignore them.
      
      Fix the problem by passing the cookie value properly.
      
      Fixes: 7053aee2
      Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
  3. 29 Jan 2014, 1 commit
  4. 22 Jan 2014, 2 commits
    • fsnotify: remove .should_send_event callback · 83c4c4b0
      Committed by Jan Kara
      After removing event structure creation from the generic layer there is
      no reason for separate .should_send_event and .handle_event callbacks.
      So just remove the first one.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Eric Paris <eparis@parisplace.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fsnotify: do not share events between notification groups · 7053aee2
      Committed by Jan Kara
      Currently fsnotify framework creates one event structure for each
      notification event and links this event into all interested notification
      groups.  This is done so that we save memory when several notification
      groups are interested in the event.  However the need for event
      structure shared between inotify & fanotify bloats the event structure
      so the result is often higher memory consumption.
      
      Another problem is that fsnotify framework keeps path references with
      outstanding events so that fanotify can return open file descriptors
      with its events.  This has the undesirable effect that filesystem cannot
      be unmounted while there are outstanding events - a regression for
      inotify compared to a situation before it was converted to fsnotify
      framework.  For fanotify this problem is hard to avoid and users of
      fanotify should kind of expect this behavior when they ask for file
      descriptors from notified files.
      
      This patch changes fsnotify and its users to create separate event
      structure for each group.  This allows for much simpler code (~400 lines
      removed by this patch) and also smaller event structures.  For example
      on 64-bit system original struct fsnotify_event consumes 120 bytes, plus
      additional space for file name, additional 24 bytes for second and each
      subsequent group linking the event, and additional 32 bytes for each
      inotify group for private data.  After the conversion inotify event
      consumes 48 bytes plus space for file name which is considerably less
      memory unless file names are long and there are several groups
      interested in the events (both of which are uncommon).  Fanotify event
      fits in 56 bytes after the conversion (fanotify doesn't care about file
      names so its events don't have to have it allocated).  A win unless
      there are four or more fanotify groups interested in the event.
      
      The conversion also solves the problem with unmount when only inotify is
      used as we don't have to grab path references for inotify events.
      
      [hughd@google.com: fanotify: fix corruption preventing startup]
      Signed-off-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Eric Paris <eparis@parisplace.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
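The memory figures quoted in 7053aee2 can be checked with a few lines of arithmetic. The sketch below uses only the byte counts stated in the commit message (64-bit system, file name space excluded). The function names are hypothetical.

```c
#include <stddef.h>

/*
 * Old scheme: one shared event of 120 bytes, plus 24 bytes for the second
 * and each subsequent group linking it, plus 32 bytes of private data for
 * each inotify group among them.
 */
size_t shared_event_bytes(unsigned groups, unsigned inotify_groups)
{
    if (groups == 0)
        return 0;
    return 120u + 24u * (groups - 1) + 32u * inotify_groups;
}

/* New scheme: one 56-byte event per interested fanotify group. */
size_t per_group_fanotify_bytes(unsigned groups)
{
    return 56u * groups;
}

/* New scheme: one 48-byte event per interested inotify group. */
size_t per_group_inotify_bytes(unsigned groups)
{
    return 48u * groups;
}
```

With these numbers, three fanotify groups cost 168 bytes under either scheme, and four cost 192 bytes shared versus 224 bytes per group, which matches the statement that the change is a win unless four or more fanotify groups are interested in the event. A single inotify group drops from 152 bytes to 48 bytes, before file name space.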
  5. 30 Apr 2013, 1 commit
  6. 12 Dec 2012, 8 commits
    • fsnotify: make fasync generic for both inotify and fanotify · 0a6b6bd5
      Committed by Eric Paris
      inotify is supposed to support async signal notification when information
      is available on the inotify fd.  This patch moves that support to generic
      fsnotify functions so it can be used by all notification mechanisms.
      Signed-off-by: Eric Paris <eparis@redhat.com>
    • fsnotify: change locking order · 6960b0d9
      Committed by Lino Sanfilippo
      On Mon, Aug 01, 2011 at 04:38:22PM -0400, Eric Paris wrote:
      >
      > I finally built and tested a v3.0 kernel with these patches (I know I'm
      > SOOOOOO far behind).  Not what I hoped for:
      >
      > > [  150.937798] VFS: Busy inodes after unmount of tmpfs. Self-destruct in 5 seconds.  Have a nice day...
      > > [  150.945290] BUG: unable to handle kernel NULL pointer dereference at 0000000000000070
      > > [  150.946012] IP: [<ffffffff810ffd58>] shmem_free_inode+0x18/0x50
      > > [  150.946012] PGD 2bf9e067 PUD 2bf9f067 PMD 0
      > > [  150.946012] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
      > > [  150.946012] CPU 0
      > > [  150.946012] Modules linked in: nfs lockd fscache auth_rpcgss nfs_acl sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables ext4 jbd2 crc16 joydev ata_piix i2c_piix4 pcspkr uinput ipv6 autofs4 usbhid [last unloaded: scsi_wait_scan]
      > > [  150.946012]
      > > [  150.946012] Pid: 2764, comm: syscall_thrash Not tainted 3.0.0+ #1 Red Hat KVM
      > > [  150.946012] RIP: 0010:[<ffffffff810ffd58>]  [<ffffffff810ffd58>] shmem_free_inode+0x18/0x50
      > > [  150.946012] RSP: 0018:ffff88002c2e5df8  EFLAGS: 00010282
      > > [  150.946012] RAX: 000000004e370d9f RBX: 0000000000000000 RCX: ffff88003a029438
      > > [  150.946012] RDX: 0000000033630a5f RSI: 0000000000000000 RDI: ffff88003491c240
      > > [  150.946012] RBP: ffff88002c2e5e08 R08: 0000000000000000 R09: 0000000000000000
      > > [  150.946012] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88003a029428
      > > [  150.946012] R13: ffff88003a029428 R14: ffff88003a029428 R15: ffff88003499a610
      > > [  150.946012] FS:  00007f5a05420700(0000) GS:ffff88003f600000(0000) knlGS:0000000000000000
      > > [  150.946012] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      > > [  150.946012] CR2: 0000000000000070 CR3: 000000002a662000 CR4: 00000000000006f0
      > > [  150.946012] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      > > [  150.946012] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      > > [  150.946012] Process syscall_thrash (pid: 2764, threadinfo ffff88002c2e4000, task ffff88002bfbc760)
      > > [  150.946012] Stack:
      > > [  150.946012]  ffff88003a029438 ffff88003a029428 ffff88002c2e5e38 ffffffff81102f76
      > > [  150.946012]  ffff88003a029438 ffff88003a029598 ffffffff8160f9c0 ffff88002c221250
      > > [  150.946012]  ffff88002c2e5e68 ffffffff8115e9be ffff88002c2e5e68 ffff88003a029438
      > > [  150.946012] Call Trace:
      > > [  150.946012]  [<ffffffff81102f76>] shmem_evict_inode+0x76/0x130
      > > [  150.946012]  [<ffffffff8115e9be>] evict+0x7e/0x170
      > > [  150.946012]  [<ffffffff8115ee40>] iput_final+0xd0/0x190
      > > [  150.946012]  [<ffffffff8115ef33>] iput+0x33/0x40
      > > [  150.946012]  [<ffffffff81180205>] fsnotify_destroy_mark_locked+0x145/0x160
      > > [  150.946012]  [<ffffffff81180316>] fsnotify_destroy_mark+0x36/0x50
      > > [  150.946012]  [<ffffffff81181937>] sys_inotify_rm_watch+0x77/0xd0
      > > [  150.946012]  [<ffffffff815aca52>] system_call_fastpath+0x16/0x1b
      > > [  150.946012] Code: 67 4a 00 b8 e4 ff ff ff eb aa 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 48 83 ec 10 48 89 1c 24 4c 89 64 24 08 48 8b 9f 40 05 00 00
      > > [  150.946012]  83 7b 70 00 74 1c 4c 8d a3 80 00 00 00 4c 89 e7 e8 d2 5d 4a
      > > [  150.946012] RIP  [<ffffffff810ffd58>] shmem_free_inode+0x18/0x50
      > > [  150.946012]  RSP <ffff88002c2e5df8>
      > > [  150.946012] CR2: 0000000000000070
      >
      > Looks at aweful lot like the problem from:
      > http://www.spinics.net/lists/linux-fsdevel/msg46101.html
      >
      
      I tried to reproduce this bug with your test program, but without success.
      However, if I understand correctly, this occurs because we don't hold any locks
      when we call iput() in mark_destroy(), right?
      With the patches you tested, iput() is also not called under any lock, since the
      group's mark_mutex is released temporarily before iput() is called.  This was done
      because the original code behaves similarly.
      However, since we now have a mutex as the biggest lock, we can do what you
      suggested (http://www.spinics.net/lists/linux-fsdevel/msg46107.html) and
      call iput() with the mutex held to avoid the race.
      The patch below implements this. It uses nested locking to avoid a deadlock in case
      we do the final iput() on an inode which still holds marks and thus would take
      the mutex again when calling fsnotify_inode_delete() in destroy_inode().
      Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
      Signed-off-by: Eric Paris <eparis@redhat.com>
    • fsnotify: dont put marks on temporary list when clearing marks by group · 64c20d2a
      Committed by Lino Sanfilippo
      In clear_marks_by_group_flags() the mark list of a group is iterated and the
      marks are put on a temporary list.
      Since we introduced fsnotify_destroy_mark_locked() we don't need the temporary
      list any more and can remove the marks while the mark list is iterated and
      the mark list mutex is held.
      Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
      Signed-off-by: Eric Paris <eparis@redhat.com>
    • fsnotify: introduce locked versions of fsnotify_add_mark() and fsnotify_remove_mark() · d5a335b8
      Committed by Lino Sanfilippo
      This patch introduces fsnotify_add_mark_locked() and fsnotify_remove_mark_locked(),
      which are essentially the same as fsnotify_add_mark() and fsnotify_remove_mark() but
      assume that the caller has already taken the group's mark mutex.
      Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
      Signed-off-by: Eric Paris <eparis@redhat.com>
    • fsnotify: pass group to fsnotify_destroy_mark() · e2a29943
      Committed by Lino Sanfilippo
      In fsnotify_destroy_mark(), don't get the group from the passed mark anymore,
      but pass the group itself as an additional parameter to the function.
      Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
      Signed-off-by: Eric Paris <eparis@redhat.com>
    • fsnotify: use a mutex instead of a spinlock to protect a groups mark list · 986ab098
      Committed by Lino Sanfilippo
      Replace the group's mark_lock spinlock with a mutex. Using a mutex instead
      of a spinlock gives more flexibility (i.e. it allows sleeping while the
      lock is held).
      Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
      Signed-off-by: Eric Paris <eparis@redhat.com>
    • fsnotify: introduce fsnotify_get_group() · 98612952
      Committed by Lino Sanfilippo
      Introduce fsnotify_get_group() which increments the reference counter of a group.
      Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
      Signed-off-by: Eric Paris <eparis@redhat.com>
    • inotify, fanotify: replace fsnotify_put_group() with fsnotify_destroy_group() · d8153d4d
      Committed by Lino Sanfilippo
      Currently fsnotify_put_group() decrements the ref count of a group and calls
      fsnotify_destroy_group() if it becomes 0. Since a group's ref count is set to 1
      only at group creation and never increased after that, a call to
      fsnotify_put_group() always results in a call to fsnotify_destroy_group().
      With this patch fsnotify_destroy_group() is called directly.
      Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
      Signed-off-by: Eric Paris <eparis@redhat.com>
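The reference-counting argument in d8153d4d (a count that starts at 1 and is never incremented makes every "put" a destroy) and the get helper introduced in 98612952 can be modelled in userspace. This is a sketch using C11 atomics, not the kernel implementation. `struct group`, `group_alloc()`, `group_get()` and `group_put()` are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Userspace stand-in for a notification group; names are hypothetical. */
struct group {
    atomic_int refcnt;
};

/* A new group starts with exactly one reference, held by its creator. */
struct group *group_alloc(void)
{
    struct group *g = malloc(sizeof(*g));

    if (g)
        atomic_init(&g->refcnt, 1);
    return g;
}

/* Take an additional reference. */
void group_get(struct group *g)
{
    atomic_fetch_add(&g->refcnt, 1);
}

/*
 * Drop one reference. Returns true when this call dropped the last
 * reference and freed the group.
 */
bool group_put(struct group *g)
{
    if (atomic_fetch_sub(&g->refcnt, 1) == 1) {
        free(g);
        return true;
    }
    return false;
}
```

Without any call to `group_get()`, the first `group_put()` always frees the group, which is the observation behind calling the destroy function directly. Once a get helper exists, a put can return without destroying.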
  7. 31 May 2012, 1 commit
    • fsnotify: handle subfiles' perm events · a4f9a9a6
      Committed by Naohiro Aota
      Recently I have been working on fanotify and found the following strange
      behavior.
      
      I wrote a program that sets an fanotify_mark on "/tmp/block" and responds
      with FAN_DENY to all notified events.
      
      fanotify_mask = FAN_ALL_EVENTS | FAN_ALL_PERM_EVENTS | FAN_EVENT_ON_CHILD:
      $ cd /tmp/block; cat foo
      cat: foo: Operation not permitted
      
      Operation on the file is blocked as expected.
      
      But,
      
      fanotify_mask = FAN_ALL_PERM_EVENTS | FAN_EVENT_ON_CHILD:
      $ cd /tmp/block; cat foo
      aaa
      
      It's not blocked anymore.  This is confusing behavior.  Also reading
      commit "fsnotify: call fsnotify_parent in perm events", it seems like
      fsnotify should handle subfiles' perm events as well as the other notify
      events.
      
      With this patch, regardless of FAN_ALL_EVENTS set or not:
      $ cd /tmp/block; cat foo
      cat: foo: Operation not permitted
      
      Operation on the file is now blocked properly.
      
      FS_OPEN_PERM and FS_ACCESS_PERM are not listed in FS_EVENTS_POSS_ON_CHILD.
      Due to the fsnotify_inode_watches_children() check, if you specify only
      these events as fsnotify_mask, you don't get subfiles' perm events
      notified.
      
      This patch adds the events to FS_EVENTS_POSS_ON_CHILD so that they are notified
      even if only these events are specified in fsnotify_mask.
      Signed-off-by: Naohiro Aota <naota@elisp.net>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
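The mask check that a4f9a9a6 describes can be sketched as follows. This is a simplified userspace model: the `EV_*` bit values and both `POSS_ON_CHILD_*` sets are illustrative stand-ins for the kernel's FS_* definitions (which include more events), and `watches_children()` is a hypothetical name that follows the two-step logic the message attributes to fsnotify_inode_watches_children().

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values; the kernel's real FS_* definitions may differ. */
#define EV_ACCESS       0x00000001u
#define EV_OPEN         0x00000020u
#define EV_OPEN_PERM    0x00010000u
#define EV_ACCESS_PERM  0x00020000u
#define EV_ON_CHILD     0x08000000u

/* Events that may be reported for children, before and after the patch. */
#define POSS_ON_CHILD_OLD (EV_ACCESS | EV_OPEN)
#define POSS_ON_CHILD_NEW (POSS_ON_CHILD_OLD | EV_OPEN_PERM | EV_ACCESS_PERM)

/*
 * A directory mark watches its children only if it asked for child events
 * and at least one requested event is in the "possible on child" set.
 */
bool watches_children(uint32_t mark_mask, uint32_t poss_on_child)
{
    if (!(mark_mask & EV_ON_CHILD))
        return false;
    return (mark_mask & poss_on_child) != 0;
}
```

With a mask of only the two permission events plus the on-child flag, `watches_children()` is false against `POSS_ON_CHILD_OLD` and true against `POSS_ON_CHILD_NEW`. Adding an ordinary event such as `EV_OPEN` makes the old check pass too, which mirrors the difference between the two shell sessions in the commit message.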
  8. 27 Jul 2011, 1 commit
  9. 07 Jan 2011, 1 commit
  10. 08 Dec 2010, 1 commit
    • fanotify: on group destroy allow all waiters to bypass permission check · 09e5f14e
      Committed by Lino Sanfilippo
      When fanotify_release() is called, there may still be processes waiting for
      access permission. Currently only processes for which an event has already been
      queued into the group's access list will be woken up.  Processes for which no
      event has been queued will continue to sleep and thus cause a deadlock when
      fsnotify_put_group() is called.
      Furthermore there is a race allowing further processes to be waiting on the
      access wait queue after wake_up (if they arrive before clear_marks_by_group()
      is called).
      This patch corrects this by setting a flag to inform processes that the group
      is about to be destroyed and thus not to wait for access permission.
      
      [additional changelog from eparis]
      Let's think about the 4 relevant code paths from the PoV of the
      'operator', 'listener', 'responder' and 'closer'.  The 'operator' is the
      process doing an action (like open/read) which could require permission.
      The 'listener' is the task (or in this case thread) tasked with reading from
      the fanotify file descriptor.  The 'responder' is the thread responsible
      for responding to access requests.  The 'closer' is the thread attempting to
      close the fanotify file descriptor.
      
      The 'operator' is going to end up in:
      fanotify_handle_event()
        get_response_from_access()
          (THIS BLOCKS WAITING ON USERSPACE)
      
      The interesting code path for the 'listener':
      fanotify_read()
        copy_event_to_user()
          prepare_for_access_response()
            (THIS CREATES AN fanotify_response_event)
      
      The 'responder' code path:
      fanotify_write()
        process_access_response()
          (REMOVE A fanotify_response_event, SET RESPONSE, WAKE UP 'operator')
      
      The 'closer':
      fanotify_release()
        (SUPPOSED TO CLEAN UP THE REST OF THIS MESS)
      
      What we have today is that in the closer we remove all of the
      fanotify_response_events and set a bit so no more response events are
      ever created in prepare_for_access_response().
      
      The bug is that we never wake all of the operators up and tell them to
      move along.  You fix that in fanotify_get_response_from_access().  You
      also fix other operators which haven't gotten there yet.  So I agree
      that's a good fix.
      [/additional changelog from eparis]
      
      [remove additional changes to minimize patch size]
      [move initialization so it was inside CONFIG_FANOTIFY_PERMISSION]
      Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
      Signed-off-by: Eric Paris <eparis@redhat.com>
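The operator/responder/closer interaction described in 09e5f14e can be modelled with a mutex, a condition variable and a bypass flag. This is a userspace sketch with pthreads, not the fanotify code. `struct perm_group`, `struct perm_event`, `wait_for_response()`, `respond()` and `group_release()` are hypothetical names, and treating a bypass as "allow" is an assumption based on the related commit 2eebf582, which grants permission for outstanding events.

```c
#include <pthread.h>
#include <stdbool.h>

enum { RESP_NONE = 0, RESP_ALLOW = 1, RESP_DENY = 2 };

struct perm_group {
    pthread_mutex_t lock;
    pthread_cond_t  wake;
    bool bypass_perm;           /* set once the group is being torn down */
};

struct perm_event {
    int response;               /* RESP_NONE until a responder answers */
};

/* 'operator': block until answered, or until the group is going away. */
int wait_for_response(struct perm_group *g, struct perm_event *ev)
{
    int resp;

    pthread_mutex_lock(&g->lock);
    while (ev->response == RESP_NONE && !g->bypass_perm)
        pthread_cond_wait(&g->wake, &g->lock);
    /* No answer means the group is shutting down: let the operation through. */
    resp = (ev->response == RESP_NONE) ? RESP_ALLOW : ev->response;
    pthread_mutex_unlock(&g->lock);
    return resp;
}

/* 'responder': record the decision and wake the waiting operator. */
void respond(struct perm_group *g, struct perm_event *ev, int response)
{
    pthread_mutex_lock(&g->lock);
    ev->response = response;
    pthread_cond_broadcast(&g->wake);
    pthread_mutex_unlock(&g->lock);
}

/* 'closer': set the flag first, then wake every waiter. */
void group_release(struct perm_group *g)
{
    pthread_mutex_lock(&g->lock);
    g->bypass_perm = true;
    pthread_cond_broadcast(&g->wake);
    pthread_mutex_unlock(&g->lock);
}
```

Because the flag is checked under the same lock that protects the wait, an operator that arrives after `group_release()` returns immediately instead of sleeping forever, which covers the late-arrival race mentioned in the commit message as well as the waiters that were already asleep.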
  11. 29 Oct 2010, 7 commits
  12. 23 Aug 2010, 1 commit
    • fanotify: flush outstanding perm requests on group destroy · 2eebf582
      Committed by Eric Paris
      When an fanotify listener is closing it may cause a deadlock between the
      listener and the original task doing an fs operation.  If the original task
      is waiting for a permissions response it will be holding the srcu lock.  The
      listener cannot clean up and exit until after that srcu lock is synchronized.
      Thus deadlock.  The fix introduced here is to stop accepting new permissions
      events when a listener is shutting down and to grant permission for all
      outstanding events.  Thus the original task will eventually release the srcu
      lock and the listener can complete shutdown.
      Reported-by: Andreas Gruenbacher <agruen@suse.de>
      Cc: Andreas Gruenbacher <agruen@suse.de>
      Signed-off-by: Eric Paris <eparis@redhat.com>
  13. 13 Aug 2010, 1 commit
  14. 28 Jul 2010, 13 commits