1. 22 Jan, 2014 1 commit
    • fsnotify: do not share events between notification groups · 7053aee2
      Committed by Jan Kara
      Currently fsnotify framework creates one event structure for each
      notification event and links this event into all interested notification
      groups.  This is done so that we save memory when several notification
      groups are interested in the event.  However the need for event
      structure shared between inotify & fanotify bloats the event structure
      so the result is often higher memory consumption.
      
      Another problem is that the fsnotify framework keeps path references with
      outstanding events so that fanotify can return open file descriptors
      with its events.  This has the undesirable effect that a filesystem cannot
      be unmounted while there are outstanding events - a regression for
      inotify compared to the situation before it was converted to the fsnotify
      framework.  For fanotify this problem is hard to avoid and users of
      fanotify should expect this behavior when they ask for file
      descriptors from notified files.
      
      This patch changes fsnotify and its users to create a separate event
      structure for each group.  This allows for much simpler code (~400 lines
      removed by this patch) and also smaller event structures.  For example,
      on a 64-bit system the original struct fsnotify_event consumes 120 bytes, plus
      additional space for file name, additional 24 bytes for second and each
      subsequent group linking the event, and additional 32 bytes for each
      inotify group for private data.  After the conversion inotify event
      consumes 48 bytes plus space for file name which is considerably less
      memory unless file names are long and there are several groups
      interested in the events (both of which are uncommon).  Fanotify event
      fits in 56 bytes after the conversion (fanotify doesn't care about file
      names so its events don't have to have it allocated).  A win unless
      there are four or more fanotify groups interested in the event.
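
The byte counts quoted above can be turned into a small calculator. This is only a sketch of the arithmetic in the commit message, not kernel code: the function names are hypothetical, the constants are the 64-bit figures quoted above, and the fanotify comparison assumes no file name is stored in either scheme.

```c
#include <stddef.h>

/* Sizes quoted in the commit message for a 64-bit system, in bytes. */
enum {
    OLD_EVENT_BASE      = 120, /* shared struct fsnotify_event */
    OLD_PER_EXTRA_GROUP = 24,  /* second and each subsequent group linking the event */
    OLD_INOTIFY_PRIVATE = 32,  /* private data for each inotify group */
    NEW_INOTIFY_EVENT   = 48,  /* per-group inotify event, file name excluded */
    NEW_FANOTIFY_EVENT  = 56   /* per-group fanotify event, no file name */
};

/* Old scheme, inotify: one shared event holding one copy of the name. */
size_t old_inotify_bytes(size_t groups, size_t name_len)
{
    if (groups == 0)
        return 0;
    return OLD_EVENT_BASE + name_len
         + OLD_PER_EXTRA_GROUP * (groups - 1)
         + OLD_INOTIFY_PRIVATE * groups;
}

/* New scheme, inotify: every group gets its own event and its own name copy. */
size_t new_inotify_bytes(size_t groups, size_t name_len)
{
    return groups * (NEW_INOTIFY_EVENT + name_len);
}

/* Old scheme, fanotify (assumption: name space ignored). */
size_t old_fanotify_bytes(size_t groups)
{
    if (groups == 0)
        return 0;
    return OLD_EVENT_BASE + OLD_PER_EXTRA_GROUP * (groups - 1);
}

/* New scheme, fanotify: one 56-byte event per group. */
size_t new_fanotify_bytes(size_t groups)
{
    return groups * NEW_FANOTIFY_EVENT;
}
```

Under these assumptions the two fanotify schemes cost the same at three groups (168 bytes each), and the old shared event becomes cheaper from four groups on, which matches the break-even point stated above.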
      
      The conversion also solves the problem with unmount when only inotify is
      used as we don't have to grab path references for inotify events.
      
      [hughd@google.com: fanotify: fix corruption preventing startup]
      Signed-off-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Eric Paris <eparis@parisplace.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 13 Jun, 2013 1 commit
    • kernel/audit_tree.c:audit_add_tree_rule(): protect `rule' from kill_rules() · 736f3203
      Committed by Chen Gang
      audit_add_tree_rule() must first set 'rule->tree = NULL;' to protect
      the rule itself from being freed in kill_rules().
      
      The reason is that once the tree is killed, the 'rule' itself may already
      have been released, and we should not access it.  One example: we add a
      rule to an inode just as another task is deleting this inode.
      
      The work flow for adding a rule:
      
          audit_receive() -> (need audit_cmd_mutex lock)
            audit_receive_skb() ->
              audit_receive_msg() ->
                audit_receive_filter() ->
                  audit_add_rule() ->
                    audit_add_tree_rule() -> (need audit_filter_mutex lock)
                      ...
                      unlock audit_filter_mutex
                      get_tree()
                      ...
                      iterate_mounts() -> (iterate all related inodes)
                        tag_mount() ->
                          tag_trunk() ->
                            create_trunk() -> (assume it is 1st rule)
                              fsnotify_add_mark() ->
                                fsnotify_add_inode_mark() ->  (add mark to inode->i_fsnotify_marks)
                              ...
                              get_tree(); (each inode will get one)
                      ...
                      lock audit_filter_mutex
      
      The work flow for deleting an inode:
      
          __destroy_inode() ->
           fsnotify_inode_delete() ->
             __fsnotify_inode_delete() ->
              fsnotify_clear_marks_by_inode() ->  (get mark from inode->i_fsnotify_marks)
                fsnotify_destroy_mark() ->
                 fsnotify_destroy_mark_locked() ->
                   audit_tree_freeing_mark() ->
                     evict_chunk() ->
                       ...
                       tree->goner = 1
                       ...
                       kill_rules() ->   (assume current->audit_context == NULL)
                         call_rcu() ->   (rule->tree != NULL)
                           audit_free_rule_rcu() ->
                             audit_free_rule()
                       ...
                       audit_schedule_prune() ->  (assume current->audit_context == NULL)
                         kthread_run() ->    (need audit_cmd_mutex and audit_filter_mutex lock)
                            prune_one() ->    (delete it from prune_list)
                             put_tree(); (match the original get_tree above)
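
The two work flows above can be condensed into a single-threaded user-space model. This is a rough sketch and not the kernel code: every name prefixed with `model_` is hypothetical, the locking is left out, and "freeing" a rule is reduced to setting a flag. The one property it keeps is that the eviction path only frees rules whose `tree` pointer is non-NULL, so clearing the pointer before the lock is dropped keeps a half-installed rule alive.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_RULES 4

struct model_tree;

struct model_rule {
    struct model_tree *tree; /* NULL marks a half-installed rule */
    bool freed;              /* stands in for audit_free_rule() having run */
};

struct model_tree {
    struct model_rule *rules[MODEL_MAX_RULES];
    size_t nr_rules;
    bool goner;
};

/*
 * Model of the first half of adding a rule: link it to the tree.
 * With the fix, rule->tree is cleared before the (omitted) unlock.
 */
int model_link_rule(struct model_tree *t, struct model_rule *r, bool with_fix)
{
    if (t->nr_rules == MODEL_MAX_RULES)
        return -1;
    r->tree = with_fix ? NULL : t;
    r->freed = false;
    t->rules[t->nr_rules++] = r;
    return 0;
}

/*
 * Model of the eviction path: mark the tree as gone and free every
 * fully installed rule. Half-installed rules are only unlinked.
 */
void model_evict(struct model_tree *t)
{
    size_t i;

    t->goner = true;
    for (i = 0; i < t->nr_rules; i++) {
        struct model_rule *r = t->rules[i];

        if (r->tree) {
            r->tree = NULL;
            r->freed = true;
        }
    }
    t->nr_rules = 0;
}
```

Calling `model_link_rule()` with `with_fix` set to `false` and then `model_evict()` leaves the rule flagged as freed while the adding task would still be using it; with `with_fix` set to `true` the rule survives and the adder can notice `goner` and clean up itself.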
      Signed-off-by: Chen Gang <gang.chen@asianux.com>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 30 Apr, 2013 1 commit
  4. 12 Jan, 2013 1 commit
  5. 12 Dec, 2012 1 commit
  6. 15 Aug, 2012 3 commits
  7. 14 Jul, 2012 1 commit
    • VFS: Make clone_mnt()/copy_tree()/collect_mounts() return errors · be34d1a3
      Committed by David Howells
      copy_tree() can theoretically fail in a case other than ENOMEM, but always
      returns NULL which is interpreted by callers as -ENOMEM.  Change it to return
      an explicit error.
      
      Also change clone_mnt() for consistency and because union mounts will add new
      error cases.
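
The kernel's usual way to return "a pointer or an explicit error" from one function is the `ERR_PTR()` / `IS_ERR()` / `PTR_ERR()` convention. The block below is a user-space re-creation of that convention for illustration only; `struct mount_copy` and `copy_tree_sketch()` are hypothetical stand-ins and do not reflect the real `copy_tree()` signature.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *ERR_PTR(long error)
{
    return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long PTR_ERR(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* Error pointers occupy the top MAX_ERRNO addresses. */
static inline int IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for the object a tree copy would return. */
struct mount_copy {
    int id;
};

/*
 * Returns a new object on success or an ERR_PTR() on failure, so the
 * caller can tell -ENOMEM apart from any other error.
 * simulate_error is a positive errno value, or 0 for success.
 */
struct mount_copy *copy_tree_sketch(int id, int simulate_error)
{
    struct mount_copy *m;

    if (simulate_error)
        return ERR_PTR(-simulate_error);

    m = malloc(sizeof(*m));
    if (!m)
        return ERR_PTR(-ENOMEM);

    m->id = id;
    return m;
}
```

A caller checks `IS_ERR(m)` and propagates `PTR_ERR(m)` instead of mapping every NULL to -ENOMEM, which is the ambiguity the commit removes.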
      
      Thanks to Andreas Gruenbacher <agruen@suse.de> for a bug fix.
      [AV: folded braino fix by Dan Carpenter]
      
      Original-author: Valerie Aurora <vaurora@redhat.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Cc: Valerie Aurora <valerie.aurora@gmail.com>
      Cc: Andreas Gruenbacher <agruen@suse.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  8. 21 Jul, 2011 1 commit
  9. 31 Mar, 2011 1 commit
  10. 30 Oct, 2010 1 commit
  11. 28 Jul, 2010 16 commits
  12. 30 Mar, 2010 1 commit
    • include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Committed by Tejun Heo
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities to include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch a large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the following.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and tries to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have a fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
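
The first rule in the list above (gfp.h if only gfp is used, slab.h if slab is used) can be sketched as a tiny classifier. This is not the referenced slabh-sweep.py script, which is written in Python and whose matching rules are not shown here. The identifier lists and all function names below are assumptions chosen for illustration, and plain substring matching is far cruder than what a real sweep would need.

```c
#include <string.h>

enum needed_header {
    NEED_NONE,   /* neither facility is used */
    NEED_GFP_H,  /* only gfp facilities are used */
    NEED_SLAB_H  /* slab is used; slab.h pulls in gfp.h itself */
};

/* Return 1 if src contains any string from the NULL-terminated list. */
static int mentions_any(const char *src, const char *const *names)
{
    for (; *names; names++)
        if (strstr(src, *names))
            return 1;
    return 0;
}

/* Decide which single header a source text needs under the rule above. */
enum needed_header classify_includes(const char *src)
{
    /* Assumed sample of slab users; not an exhaustive list. */
    static const char *const slab_names[] = {
        "kmalloc(", "kzalloc(", "kfree(", "kmem_cache_", NULL
    };
    /* Assumed sample of gfp users; not an exhaustive list. */
    static const char *const gfp_names[] = {
        "GFP_KERNEL", "GFP_ATOMIC", "alloc_pages(", "__get_free_pages(", NULL
    };

    if (mentions_any(src, slab_names))
        return NEED_SLAB_H;
    if (mentions_any(src, gfp_names))
        return NEED_GFP_H;
    return NEED_NONE;
}
```

Slab usage is checked first because a file calling `kmalloc(n, GFP_KERNEL)` uses both facilities and, per the rule above, only needs slab.h.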
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         widely available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build tests were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable on most builds of
      the specific arch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
  13. 04 Mar, 2010 2 commits
  14. 20 Dec, 2009 2 commits
    • fix more leaks in audit_tree.c tag_chunk() · b4c30aad
      Committed by Al Viro
      Several leaks in audit_tree didn't get caught by commit
      318b6d3d, including the leak on normal
      exit in case of multiple rules referring to the same chunk.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fix braindamage in audit_tree.c untag_chunk() · 6f5d5114
      Committed by Al Viro
      ... aka "Al had badly fscked up when writing that thing and nobody
      noticed until Eric had fixed leaks that used to mask the breakage".
      
      The function essentially creates a copy of old array sans one element
      and replaces the references to elements of original (they are on cyclic
      lists) with those to corresponding elements of new one.  After that the
      old one is fair game for freeing.
      
      First of all, there's a dumb braino: when we get to list_replace_init we
      use indices for wrong arrays - position in new one with the old array
      and vice versa.
      
      Another bug is more subtle - termination condition is wrong if the
      element to be excluded happens to be the last one.  We shouldn't go
      until we fill the new array, we should go until we'd finished the old
      one.  Otherwise the element we are trying to kill will remain on the
      cyclic lists...
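
The two fixes described above come down to how the copy loop is written. The sketch below copies a plain `int` array rather than list-linked chunk nodes, so it is only an illustration of the loop shape and not the `untag_chunk()` code; the function name and parameters are hypothetical. It keeps a separate index for each array and runs until the old array is exhausted.

```c
#include <stddef.h>

/*
 * Copy old_arr[0..old_count) into new_arr, leaving out index `victim`.
 * new_arr must have room for old_count - 1 elements.
 * Returns the number of elements copied and sets *victim_visited when
 * the loop actually reached the excluded element.
 */
size_t copy_without(const int *old_arr, size_t old_count, size_t victim,
                    int *new_arr, int *victim_visited)
{
    size_t i; /* position in the old array only */
    size_t j; /* position in the new array only */

    *victim_visited = 0;

    /* Terminate on the old array, not on the new one being full. */
    for (i = 0, j = 0; i < old_count; i++) {
        if (i == victim) {
            /* This is where the kernel would unlink the dying element. */
            *victim_visited = 1;
            continue;
        }
        new_arr[j] = old_arr[i];
        j++;
    }
    return j;
}
```

If the loop instead stopped once `j` reached `old_count - 1`, a victim in the last slot would never be visited, which corresponds to the second bug described above.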
      
      That crap used to be masked by several leaks, so it was not quite
      trivial to hit.  Eric had fixed some of those leaks a while ago and the
      shit had hit the fan...
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  15. 24 Jun, 2009 2 commits
    • Fix rule eviction order for AUDIT_DIR · 916d7576
      Committed by Al Viro
      If a syscall removes the root of a subtree being watched, we
      definitely do not want the rules referring to that subtree
      to be destroyed without the syscall in question having
      a chance to match them.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • Audit: clean up all op= output to include string quoting · 9d960985
      Committed by Eric Paris
      In a number of places in the audit system we send an op= followed by a string
      that includes spaces.  Somehow this works but it's just wrong.  This patch
      moves all of those that I could find to be quoted.
      
      Example:
      
      Change From: type=CONFIG_CHANGE msg=audit(1244666690.117:31): auid=0 ses=1
      subj=unconfined_u:unconfined_r:auditctl_t:s0-s0:c0.c1023 op=remove rule
      key="number2" list=4 res=0
      
      Change To: type=CONFIG_CHANGE msg=audit(1244666690.117:31): auid=0 ses=1
      subj=unconfined_u:unconfined_r:auditctl_t:s0-s0:c0.c1023 op="remove rule"
      key="number2" list=4 res=0
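
The before/after records above differ only in how the op= value is emitted. A minimal user-space sketch of the quoted form follows. It uses `snprintf()` and is not the kernel's audit logging API; the function name is hypothetical, and it assumes the operation string contains no double quotes that would need escaping.

```c
#include <stdio.h>

/*
 * Write an op= field with the value wrapped in double quotes, so a
 * value containing spaces stays a single field for parsers.
 * Returns what snprintf() returns: the length the full field needs,
 * excluding the terminating NUL.
 */
int format_op_field(char *buf, size_t len, const char *op)
{
    /* Assumption: op holds no embedded '"' characters. */
    return snprintf(buf, len, "op=\"%s\"", op);
}
```

Without the quotes, a parser that splits on whitespace would read `op=remove` and then a stray token `rule`; with them the whole value stays in one field.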
      Signed-off-by: Eric Paris <eparis@redhat.com>
  16. 12 Jun, 2009 1 commit
  17. 21 Apr, 2009 1 commit
  18. 06 Apr, 2009 1 commit
  19. 05 Jan, 2009 2 commits