  1. 28 Feb 2013 · 1 commit
  2. 23 Feb 2013 · 1 commit
  3. 21 Dec 2012 · 1 commit
  4. 19 Nov 2012 · 1 commit
  5. 09 Oct 2012 · 2 commits
    • rbtree: fix incorrect rbtree node insertion in fs/proc/proc_sysctl.c · ea5272f5
      Committed by Michel Lespinasse
      The recently added code to use rbtrees in sysctl did not follow the proper
      rbtree interface on insertion - it was calling rb_link_node() which
      inserts a new node into the binary tree, but missed the call to
      rb_insert_color() which properly balances the rbtree and establishes all
      expected rbtree invariants.
      
      I found out about this only because the faulty commit also used
      rb_init_node(), which I am removing within this patchset.  But I think
      it's an easy mistake to make, and it makes me wonder if we should change
      the rbtree API so that insertions would be done with a single rb_insert()
      call (even if its implementation could still inline the rb_link_node()
      part and call a private __rb_insert_color function to do the rebalancing).
      Signed-off-by: Michel Lespinasse <walken@google.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Acked-by: David Woodhouse <David.Woodhouse@intel.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Daniel Santos <daniel.santos@pobox.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • rbtree: empty nodes have no color · 4c199a93
      Committed by Michel Lespinasse
      Empty nodes have no color.  We can make use of this property to simplify
      the code emitted by the RB_EMPTY_NODE and RB_CLEAR_NODE macros.  Also,
      we can get rid of the rb_init_node function which had been introduced by
      commit 88d19cf3 ("timers: Add rb_init_node() to allow for stack
      allocated rb nodes") to avoid some issue with the empty node's color not
      being initialized.
      
      I'm not sure what the RB_EMPTY_NODE checks in rb_prev() / rb_next() are
      doing there, though.  axboe introduced them in commit 10fd48f2
      ("rbtree: fixed reversed RB_EMPTY_NODE and rb_next/prev").  The way I
      see it, the 'empty node' abstraction is only used by rbtree users to
      flag nodes that they haven't inserted in any rbtree, so asking for the
      predecessor or successor of such nodes doesn't make any sense.
      
      One final rb_init_node() caller was recently added in sysctl code to
      implement faster sysctl name lookups.  This code doesn't make use of
      RB_EMPTY_NODE at all, and from what I could see it only called
      rb_init_node() under the mistaken assumption that such initialization was
      required before node insertion.
      
      [sfr@canb.auug.org.au: fix net/ceph/osd_client.c build]
      Signed-off-by: Michel Lespinasse <walken@google.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Acked-by: David Woodhouse <David.Woodhouse@intel.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Daniel Santos <daniel.santos@pobox.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
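The "empty nodes have no color" idea in 4c199a93 can be illustrated with a small userspace sketch. It assumes the convention that an empty node's parent word points back at the node itself, so the check is a single comparison and no color bit needs initializing. The `toy_` names and the struct layout are hypothetical stand-ins, not the kernel's definitions.

```c
#include <stdint.h>
#include <stddef.h>

/* Userspace stand-in for an rbtree node (illustrative only). */
struct toy_rb_node {
    uintptr_t parent_color;            /* parent pointer, color kept in bit 0 */
    struct toy_rb_node *left, *right;
};

/* A node counts as "empty" when its parent word points back at itself. */
#define TOY_RB_EMPTY_NODE(n) ((n)->parent_color == (uintptr_t)(n))
#define TOY_RB_CLEAR_NODE(n) ((n)->parent_color = (uintptr_t)(n))

/* Hypothetical helper: attach a node under a parent, without rebalancing. */
static void toy_link(struct toy_rb_node *n, struct toy_rb_node *parent)
{
    n->parent_color = (uintptr_t)parent;   /* bit 0 clear: treated as red */
    n->left = n->right = NULL;
}
```

A caller would clear a node once when it is created and test it later to decide whether it is currently in a tree.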
  6. 06 Oct 2012 · 1 commit
  7. 18 Sep 2012 · 1 commit
    • fs/proc: fix potential unregister_sysctl_table hang · 6bf61045
      Committed by Francesco Ruggeri
      The unregister_sysctl_table() function hangs if all references to its
      ctl_table_header structure are not dropped.
      
      This can happen sometimes because of a leak in proc_sys_lookup():
      proc_sys_lookup() gets a reference to the table via lookup_entry(), but
      it does not release it when a subsequent call to sysctl_follow_link()
      fails.
      
      This patch fixes this leak by making sure the reference is always
      dropped on return.
      
      See also commit 076c3eed ("sysctl: Rewrite proc_sys_lookup
      introducing find_entry and lookup_entry") which reorganized this code in
      3.4.
      
      Tested in Linux 3.4.4.
      Signed-off-by: Francesco Ruggeri <fruggeri@aristanetworks.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
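The leak described in 6bf61045 follows a common pattern: a reference is taken early and one error path returns without releasing it. Below is a minimal userspace sketch of the "always drop on return" shape, assuming a plain integer use count. All `toy_` names are hypothetical and only mimic the roles of lookup_entry(), sysctl_follow_link() and the release step.

```c
#include <stddef.h>

/* Hypothetical stand-in for a header with a use count. */
struct toy_header {
    int used;
};

/* Takes a reference, like the lookup step in the commit message. */
static struct toy_header *toy_lookup_entry(struct toy_header *h)
{
    h->used++;
    return h;
}

/* Drops a reference taken by toy_lookup_entry(). */
static void toy_head_finish(struct toy_header *h)
{
    if (h)
        h->used--;
}

/* follow_ok simulates whether the link-following step succeeds. */
static int toy_lookup(struct toy_header *table, int follow_ok)
{
    struct toy_header *h = toy_lookup_entry(table);
    int err = 0;

    if (!follow_ok) {
        err = -1;
        goto out;          /* the failure path still reaches the release */
    }
    /* ... the successful lookup would build its result here ... */
out:
    toy_head_finish(h);
    return err;
}
```

Routing every exit through one label keeps the use count balanced no matter which step fails.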
  8. 14 Jul 2012 · 2 commits
  9. 16 May 2012 · 1 commit
  10. 23 Mar 2012 · 1 commit
    • sysctl: protect poll() in entries that may go away · 4e474a00
      Committed by Lucas De Marchi
      Protect code accessing ctl_table by grabbing the header with grab_header()
      and afterwards releasing it with sysctl_head_finish().  This is needed if
      poll() is called on entries created by modules: currently only hostname and
      domainname support poll(), but this bug may be triggered when/if modules
      use it and if a user calls poll() on a file that doesn't support it.
      
      Dave Jones reported the following when using a syscall fuzzer while
      hibernating/resuming:
      
      RIP: 0010:[<ffffffff81233e3e>]  [<ffffffff81233e3e>] proc_sys_poll+0x4e/0x90
      RAX: 0000000000000145 RBX: ffff88020cab6940 RCX: 0000000000000000
      RDX: ffffffff81233df0 RSI: 6b6b6b6b6b6b6b6b RDI: ffff88020cab6940
      [ ... ]
      Code: 00 48 89 fb 48 89 f1 48 8b 40 30 4c 8b 60 e8 b8 45 01 00 00 49 83
      7c 24 28 00 74 2e 49 8b 74 24 30 48 85 f6 74 24 48 85 c9 75 32 <8b> 16
      b8 45 01 00 00 48 63 d2 49 39 d5 74 10 8b 06 48 98 48 89
      
      If an entry goes away while we are poll()ing it, ctl_table may not exist
      anymore.
      Reported-by: Dave Jones <davej@redhat.com>
      Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
  11. 14 Feb 2012 · 1 commit
  12. 02 Feb 2012 · 4 commits
  13. 31 Jan 2012 · 2 commits
  14. 25 Jan 2012 · 21 commits
    • sysctl: Add register_sysctl for normal sysctl users · fea478d4
      Committed by Eric W. Biederman
      The plan is to convert all callers of register_sysctl_table
      and register_sysctl_paths to register_sysctl.  The interface
      to register_sysctl is enough nicer that this should make the callers
      a bit more readable.  Additionally, after the conversion the
      230 lines of backwards compatibility can be removed.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    • sysctl: Index sysctl directories with rbtrees. · ac13ac6f
      Committed by Eric W. Biederman
      One of the most important jobs of sysctl is to export network stack
      tunables.  Several of those tunables are per network device.  In
      several instances people are running with 1000+ network devices in
      their network stacks, which makes the simple per directory linked list
      in sysctl a scaling bottleneck.  Replace O(N^2) sysctl insertion and
      lookup times with O(NlogN) by using an rbtree to index the sysctl
      directories.
      
      Benchmark before:
          make-dummies 0 999 -> 0.32s
          rmmod dummy        -> 0.12s
          make-dummies 0 9999 -> 1m17s
          rmmod dummy         -> 17s
      
      Benchmark after:
          make-dummies 0 999 -> 0.074s
          rmmod dummy        -> 0.070s
          make-dummies 0 9999 -> 3.4s
          rmmod dummy         -> 0.44s
      
      Benchmark after (without dev_snmp6):
          make-dummies 0 9999 -> 0.75s
          rmmod dummy         -> 0.44s
          make-dummies 0 99999 -> 11s
          rmmod dummy          -> 4.3s
      
      At 10,000 dummy devices the bottleneck becomes the time to add and
      remove the files under /proc/sys/net/dev_snmp6.  I have commented
      out the code that adds and removes files under /proc/sys/net/dev_snmp6
      and taken measurements of creating and destroying 100,000 dummies to
      verify that sysctl continues to scale.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
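The scaling argument in ac13ac6f rests on replacing a linear scan per lookup with a logarithmic search over ordered names. The sketch below is not an rbtree: it uses the C library's bsearch() over a sorted array as a simpler stand-in that shows the same O(log N) lookup cost. The function names and the use of strcmp() ordering are assumptions made for the illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Comparison callback for bsearch(): key is a C string, elem points at
 * one slot of the sorted name array. */
static int toy_cmp(const void *key, const void *elem)
{
    return strcmp((const char *)key, *(const char *const *)elem);
}

/* names[0..n) must already be sorted in strcmp() order.
 * Returns the matching entry, or NULL when the name is absent. */
static const char *toy_lookup_name(const char *const *names, size_t n,
                                   const char *name)
{
    const char *const *hit = bsearch(name, names, n, sizeof(*names), toy_cmp);

    return hit ? *hit : NULL;
}
```

A balanced tree gives the same lookup bound while also keeping insertion and removal logarithmic, which a sorted array does not.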
    • sysctl: Make the header lists per directory. · 9e3d47df
      Committed by Eric W. Biederman
      Slightly enhance efficiency and clarity of the code by making the
      header list per directory instead of per set.
      
      Benchmark before:
          make-dummies 0 999 -> 0.63s
          rmmod dummy        -> 0.12s
          make-dummies 0 9999 -> 2m35s
          rmmod dummy         -> 18s
      
      Benchmark after:
          make-dummies 0 999 -> 0.32s
          rmmod dummy        -> 0.12s
          make-dummies 0 9999 -> 1m17s
          rmmod dummy         -> 17s
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    • sysctl: Move sysctl_check_dups into insert_header · e54012ce
      Committed by Eric W. Biederman
      Simplify the callers of insert_header by removing explicit calls to check
      for duplicates and instead have insert_header do the work.
      
      This makes the code slightly more maintainable by enabling changes to
      data structures where the insertion of new entries without duplicate
      suppression is not possible.
      
      There is not always a convenient path string where insert_header
      is called so modify sysctl_check_dups to use sysctl_print_dir
      when printing the full path when a duplicate is discovered.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    • sysctl: Modify __register_sysctl_paths to take a set instead of a root and an nsproxy · 60a47a2e
      Committed by Eric W. Biederman
      An nsproxy argument here has always been awkward and now the nsproxy argument
      is completely unnecessary, so remove it, replacing it with the set we want
      the registered tables to show up in.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    • sysctl: Replace root_list with links between sysctl_table_sets. · 0e47c99d
      Committed by Eric W. Biederman
      Piecing together directories by looking first in one directory
      tree, then in another directory tree and finally in a third
      directory tree makes it hard to verify that some directory
      entries are not multiply defined and makes it hard to create
      efficient implementations of the sysctl filesystem.
      
      Replace the sysctl wide list of roots with autogenerated
      links from the core sysctl directory tree to the other
      sysctl directory trees.
      
      This simplifies sysctl directory reading and lookups as now
      only entries in a single sysctl directory tree need to be
      considered.
      
      Benchmark before:
          make-dummies 0 999 -> 0.44s
          rmmod dummy        -> 0.065s
          make-dummies 0 9999 -> 1m36s
          rmmod dummy         -> 0.4s
      
      Benchmark after:
          make-dummies 0 999 -> 0.63s
          rmmod dummy        -> 0.12s
          make-dummies 0 9999 -> 2m35s
          rmmod dummy         -> 18s
      
      The slowdown is caused by the lookups used in insert_headers
      and put_links to see if we need to add links or remove links.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    • sysctl: Add sysctl_print_dir and use it in get_subdir · 6980128f
      Committed by Eric W. Biederman
      When there are errors it is very nice to know the full sysctl path.
      Add a simple function that computes the sysctl path and prints it
      out.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    • sysctl: Stop requiring explicit management of sysctl directories · 7ec66d06
      Committed by Eric W. Biederman
      Simplify the code and the sysctl semantics by autogenerating
      sysctl directories when a sysctl table is registered that needs
      the directories and autodeleting the directories when there are
      no more sysctl tables registered that need them.
      
      Autogenerating directories keeps sysctl tables from depending
      on each other, removing all of the arcane register/unregister
      ordering constraints and makes it impossible to get the order
      wrong when registering and unregistering sysctl tables.
      
      Autogenerating directories yields one unique entity that dentries
      can point to, retaining the current effective use of the dcache.
      
      Add struct ctl_dir as the type of these new autogenerated
      directories.
      
      The attached_by and attached_to fields in ctl_table_header are
      removed as they are no longer needed.
      
      The child field in ctl_table is no longer needed by the core of
      the sysctl code.  ctl_table.child can be removed once all of the
      existing users have been updated.
      
      Benchmark before:
          make-dummies 0 999 -> 0.7s
          rmmod dummy        -> 0.07s
          make-dummies 0 9999 -> 1m10s
          rmmod dummy         -> 0.4s
      
      Benchmark after:
          make-dummies 0 999 -> 0.44s
          rmmod dummy        -> 0.065s
          make-dummies 0 9999 -> 1m36s
          rmmod dummy         -> 0.4s
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    • sysctl: Add a root pointer to ctl_table_set · 9eb47c26
      Committed by Eric W. Biederman
      Add a ctl_table_root pointer to ctl_table_set so it is easy to
      go from a ctl_table_set to a ctl_table_root.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    • sysctl: Rewrite proc_sys_readdir in terms of first_entry and next_entry · 6a75ce16
      Committed by Eric W. Biederman
      Replace sysctl_head_next with first_entry and next_entry.  These new
      iterators operate at the level of sysctl table entries and filter
      out any sysctl tables that should not be shown.
      
      Utilizing two specialized functions instead of a single function removes
      conditionals for handling awkward special cases that only come up
      at the beginning of iteration, making the iterators easier to read
      and understand.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    • sysctl: Rewrite proc_sys_lookup introducing find_entry and lookup_entry. · 076c3eed
      Committed by Eric W. Biederman
      Replace the helpers that proc_sys_lookup uses with helpers that work
      in terms of an entire sysctl directory.  This is worse for sysctl_lock
      hold times but it is much better for code clarity and the code cleanups
      to come.
      
      find_in_table is no longer needed so it is removed.
      
      find_entry, a general helper to find entries in a directory, is added.
      
      lookup_entry is a simple wrapper around find_entry that takes the
      sysctl_lock, increases the use count if an entry is found, and drops
      the sysctl_lock.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    • sysctl: Normalize the root_table data structure. · a194558e
      Committed by Eric W. Biederman
      Every other directory has a .child member and we look at the .child
      for our entries.  Do the same for the root_table.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    • 8425d6aa
    • sysctl: Factor out init_header from __register_sysctl_paths · e0d04529
      Committed by Eric W. Biederman
      Factor out a routine to initialize the sysctl_table_header.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    • sysctl: Initial support for auto-unregistering sysctl tables. · 938aaa4f
      Committed by Eric W. Biederman
      Add nreg to ctl_table_header.  When nreg drops to 0 the ctl_table_header
      will be unregistered.
      
      Factor out drop_sysctl_table from unregister_sysctl_table, and add
      the logic for decrementing nreg.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
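The nreg counter described in 938aaa4f behaves like a registration count: the header stays registered until the last user drops it. A minimal userspace sketch of that rule follows, assuming a plain integer counter and a flag standing in for the real unregistration work. The `toy_` names are hypothetical.

```c
/* Hypothetical stand-in for a table header with a registration count. */
struct toy_table_header {
    int nreg;         /* registrations that still need this header */
    int registered;   /* 1 while visible, 0 once unregistered */
};

/* A freshly registered header starts with one registration. */
static void toy_init_header(struct toy_table_header *h)
{
    h->nreg = 1;
    h->registered = 1;
}

/* Drop one registration; only the final drop unregisters the header. */
static void toy_drop_table(struct toy_table_header *h)
{
    if (--h->nreg)
        return;            /* other registrations still hold it */
    h->registered = 0;     /* the real code would tear the header down here */
}
```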
    • sysctl: A more obvious version of grab_header. · 3cc3e046
      Committed by Eric W. Biederman
      Instead of relying on sysctl_head_next(NULL) to magically
      return the right header for the root directory, explicitly
      transform NULL into the root directory's header.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    • sysctl: Remove the now unused ctl_table parent field. · 8d6ecfcc
      Committed by Eric W. Biederman
      While useful at one time for selinux and the sysctl sanity
      checks, those users no longer use the parent field and we can
      safely remove it.
      Inspired-by: Lucian Adrian Grijincu <lucian.grijincu@gmail.com>
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    • sysctl: Improve the sysctl sanity checks · 7c60c48f
      Committed by Eric W. Biederman
      - Stop validating subdirectories now that we only register leaf tables
      
      - Cleanup and improve the duplicate filename check.
        * Run the duplicate filename check under the sysctl_lock to guarantee
          we never add duplicate names.
        * Reduce the duplicate filename check to nearly O(M*N) where M is the
          number of entries in the table we are registering and N is the
          number of entries in the directory before we got there.
      
      - Move the duplicate filename check into its own function and call
        it directly from __register_sysctl_table
      
      - Kill the config option as the sanity checks are now cheap enough
        the config option is unnecessary. The original reason for the config
        option was because we had a huge table used to verify the proc filename
        to binary sysctl mapping.  That table has now evolved into the binary_sysctl
        translation layer and is no longer part of the sysctl_check code.
      
      - Tighten up the permission checks, guaranteeing that files only have read
        or write permissions.
      
      - Remove the redundant check for parents having a procname as now everything
        has a procname.
      
      - Generalize the backtrace logic so that we print a backtrace from
        any failure of __register_sysctl_table that was not caused by
        a memory allocation failure.  The backtrace allows us to track
        down who erroneously registered a sysctl table.
      
      Benchmark before (CONFIG_SYSCTL_CHECK=y):
          make-dummies 0 999 -> 12s
          rmmod dummy        -> 0.08s
      
      Benchmark before (CONFIG_SYSCTL_CHECK=n):
          make-dummies 0 999 -> 0.7s
          rmmod dummy        -> 0.06s
          make-dummies 0 99999 -> 1m13s
          rmmod dummy          -> 0.38s
      
      Benchmark after:
          make-dummies 0 999 -> 0.65s
          rmmod dummy        -> 0.055s
          make-dummies 0 9999 -> 1m10s
          rmmod dummy         -> 0.39s
      
      The sysctl sanity checks now impose no measurable cost.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
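The "nearly O(M*N)" duplicate filename check in 7c60c48f amounts to comparing each of the M new names against each of the N names already in the directory. A minimal userspace sketch under that reading is shown here. The function name and the use of plain string arrays are assumptions; the sketch also leaves out the sysctl_lock that the commit says the real check runs under.

```c
#include <string.h>
#include <stddef.h>

/* Returns the first name in new_names[0..m) that already exists in
 * dir_names[0..n), or NULL when no name clashes.  Cost is O(M*N)
 * string comparisons. */
static const char *toy_find_dup(const char *const *dir_names, size_t n,
                                const char *const *new_names, size_t m)
{
    size_t i, j;

    for (i = 0; i < m; i++)
        for (j = 0; j < n; j++)
            if (strcmp(new_names[i], dir_names[j]) == 0)
                return new_names[i];
    return NULL;
}
```

Returning the offending name lets the caller report which entry was rejected.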
    • sysctl: register only tables of sysctl files · f728019b
      Committed by Eric W. Biederman
      Split the registration of a complex ctl_table array which may have
      arbitrary numbers of directories (->child != NULL) and tables of files
      into a series of simpler registrations that only register tables of files.
      
      Graphically:
      
         register('dir', { + file-a
                           + file-b
                           + subdir1
                             + file-c
                           + subdir2
                             + file-d
                             + file-e })
      
      is transformed into:
         wrapper->subheaders[0] = register('dir', {file-a, file-b})
         wrapper->subheaders[1] = register('dir/subdir1', {file-c})
         wrapper->subheaders[2] = register('dir/subdir2', {file-d, file-e})
         return wrapper
      
      This guarantees that __register_sysctl_table will only see a simple
      ctl_table array with all entries having (->child == NULL).
      
      Care was taken to pass the original simple ctl_table arrays to
      __register_sysctl_table whenever possible.
      
      This change is derived from a similar patch written
      by Lucian Grijincu.
      Inspired-by: Lucian Adrian Grijincu <lucian.grijincu@gmail.com>
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
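The transformation pictured in f728019b can be sketched as a recursive walk: register the plain files of a directory under its path, then recurse into each child directory with an extended path. The userspace sketch below only records what would be registered. The struct layout, the fixed-size log and all `toy_` names are hypothetical, chosen to mirror the diagram rather than the kernel code.

```c
#include <stdio.h>
#include <string.h>
#include <stddef.h>

/* Hypothetical nested table: a NULL procname ends an array and a
 * non-NULL child marks a directory. */
struct toy_table {
    const char *procname;
    const struct toy_table *child;
};

#define TOY_MAX_REGS 16
struct toy_reg {
    char path[128];   /* directory the files were registered under */
    int nfiles;       /* number of plain files in that registration */
};

static struct toy_reg toy_regs[TOY_MAX_REGS];   /* log of registrations */
static int toy_nregs;

/* Record one registration per directory that contains plain files,
 * then descend into each subdirectory. */
static void toy_register_leaves(const char *path, const struct toy_table *t)
{
    const struct toy_table *e;
    int nfiles = 0;

    for (e = t; e->procname; e++)
        if (!e->child)
            nfiles++;
    if (nfiles && toy_nregs < TOY_MAX_REGS) {
        struct toy_reg *r = &toy_regs[toy_nregs++];

        snprintf(r->path, sizeof(r->path), "%s", path);
        r->nfiles = nfiles;
    }
    for (e = t; e->procname; e++) {
        char sub[128];

        if (!e->child)
            continue;
        snprintf(sub, sizeof(sub), "%s/%s", path, e->procname);
        toy_register_leaves(sub, e->child);
    }
}
```

For the table in the diagram this records three registrations: 'dir' with two files, 'dir/subdir1' with one, and 'dir/subdir2' with two.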
    • sysctl: Add ctl_table chains into cstring paths · ec6a5266
      Committed by Eric W. Biederman
      For any component of the table passed to __register_sysctl_paths
      that actually serves as a path, add that to the cstring path
      that is passed to __register_sysctl_table.
      
      The result is that for most calls to __register_sysctl_paths
      we only pass a table to __register_sysctl_table that contains
      no child directories.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    • sysctl: Add support for register sysctl tables with a normal cstring path. · 6e9d5164
      Committed by Eric W. Biederman
      Make __register_sysctl_table the core sysctl registration operation and
      make it take a char * string as path.
      
      Now that binary paths have been banished into the realm of backwards
      compatibility in kernel/binary_sysctl.c where they can be safely
      ignored there is no longer a need to use struct ctl_path to represent
      path names when registering ctl_tables.
      
      Start the transition to using normal char * strings to represent
      pathnames when registering sysctl tables.  Normal strings are easier
      to deal with both in the internal sysctl implementation and for
      programmers registering sysctl tables.
      
      __register_sysctl_paths is turned into a backwards compatibility wrapper
      that converts a ctl_path array into a normal char * string.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
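The compatibility wrapper described in 6e9d5164 turns an array of path components into one ordinary string. A minimal userspace sketch of that conversion follows, assuming a NULL-terminated component array joined with '/'. The struct and function names are hypothetical, and the caller-supplied buffer is a simplification of whatever allocation the real code performs.

```c
#include <string.h>
#include <stddef.h>

/* Hypothetical stand-in for a path component; a NULL procname ends the array. */
struct toy_ctl_path {
    const char *procname;
};

/* Join the components with '/' into buf.
 * Returns 0 on success, -1 if buf (of size len) is too small. */
static int toy_path_to_string(const struct toy_ctl_path *path,
                              char *buf, size_t len)
{
    const struct toy_ctl_path *c;
    size_t pos = 0;

    if (len == 0)
        return -1;
    buf[0] = '\0';
    for (c = path; c->procname; c++) {
        size_t n = strlen(c->procname);
        size_t sep = (pos > 0) ? 1 : 0;    /* no separator before the first */

        if (pos + sep + n + 1 > len)
            return -1;
        if (sep)
            buf[pos++] = '/';
        memcpy(buf + pos, c->procname, n);
        pos += n;
        buf[pos] = '\0';
    }
    return 0;
}
```

With components "net", "ipv4" and "conf" the buffer ends up holding "net/ipv4/conf".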