1. 27 Aug, 2016 1 commit
    • sysctl: handle error writing UINT_MAX to u32 fields · e7d316a0
      Committed by Subash Abhinov Kasiviswanathan
      We have scripts which write to certain fields on 3.18 kernels but this
      seems to be failing on 4.4 kernels.  An entry which we write to here is
      xfrm_aevent_rseqth which is u32.
      
        echo 4294967295  > /proc/sys/net/core/xfrm_aevent_rseqth
      
      Commit 230633d1 ("kernel/sysctl.c: detect overflows when converting
      to int") prevented writing to sysctl entries when integer overflow
      occurs.  However, this does not apply to unsigned integers.
      
      Heinrich suggested that we introduce a new option to handle 64 bit
      limits and set min as 0 and max as UINT_MAX.  This might not work as it
      leads to issues similar to __do_proc_doulongvec_minmax.  Alternatively,
      we would need to change the datatype of the entry to 64 bit.
      
        static int __do_proc_doulongvec_minmax(void *data, struct ctl_table
        {
            i = (unsigned long *) data;   // This cast causes reads beyond the size of data (u32).
            vleft = table->maxlen / sizeof(unsigned long); // vleft is 0 because maxlen is sizeof(u32), which is less than sizeof(unsigned long) on x86_64.
      
      Introduce a new proc handler proc_douintvec.  Individual proc entries
      will need to be updated to use the new handler.
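The failure mode described in this message can be reproduced in user space. The sketch below is not the kernel's proc_douintvec; `parse_decimal`, `store_as_int`, `store_as_u32`, and `vec_slots` are hypothetical helpers written only for illustration. It shows why a range check against INT_MAX rejects 4294967295 while a check against UINT32_MAX accepts it, and why dividing a u32-sized maxlen by an 8-byte element size leaves zero slots to process.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a non-negative decimal string; returns 0 or -EINVAL. */
static int parse_decimal(const char *s, unsigned long long *out)
{
    char *end;

    if (*s < '0' || *s > '9')       /* reject signs, blanks, empty input */
        return -EINVAL;
    errno = 0;
    *out = strtoull(s, &end, 10);
    if (errno != 0 || (*end != '\0' && *end != '\n'))
        return -EINVAL;
    return 0;
}

/* Signed-int style handler: anything above INT_MAX is an overflow. */
int store_as_int(const char *s, int *out)
{
    unsigned long long v;

    if (parse_decimal(s, &v) != 0 || v > (unsigned long long)INT_MAX)
        return -EINVAL;
    *out = (int)v;
    return 0;
}

/* Unsigned 32-bit style handler: the full 0..UINT32_MAX range is valid. */
int store_as_u32(const char *s, uint32_t *out)
{
    unsigned long long v;

    if (parse_decimal(s, &v) != 0 || v > UINT32_MAX)
        return -EINVAL;
    *out = (uint32_t)v;
    return 0;
}

/* Number of elements a vector handler would process for a given maxlen. */
size_t vec_slots(size_t maxlen, size_t elem_size)
{
    return maxlen / elem_size;
}
```

With these helpers, `store_as_int("4294967295", &i)` fails while `store_as_u32("4294967295", &u)` succeeds, and `vec_slots(sizeof(uint32_t), 8)` evaluates to 0, which mirrors the vleft problem quoted in the message.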
      
      [akpm@linux-foundation.org: coding-style fixes]
      Fixes: 230633d1 ("kernel/sysctl.c: detect overflows when converting to int")
      Link: http://lkml.kernel.org/r/1471479806-5252-1-git-send-email-subashab@codeaurora.org
      Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
      Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Ingo Molnar <mingo@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 03 Aug, 2016 1 commit
  3. 01 Jul, 2015 1 commit
  4. 17 Apr, 2015 1 commit
  5. 09 Aug, 2014 1 commit
  6. 19 Nov, 2012 1 commit
  7. 13 Oct, 2012 1 commit
  8. 23 Jun, 2012 2 commits
  9. 25 Jan, 2012 16 commits
    • sysctl: Add register_sysctl for normal sysctl users · fea478d4
      Committed by Eric W. Biederman
      The plan is to convert all callers of register_sysctl_table
      and register_sysctl_paths to register_sysctl.  The register_sysctl
      interface is enough nicer that this should make the callers
      a bit more readable.  Additionally, after the conversion the
      230 lines of backwards compatibility code can be removed.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    • sysctl: Index sysctl directories with rbtrees. · ac13ac6f
      Committed by Eric W. Biederman
      One of the most important jobs of sysctl is to export network stack
      tunables.  Several of those tunables are per network device.  In
      several instances people are running with 1000+ network devices in
      their network stacks, which makes the simple per-directory linked list
      in sysctl a scaling bottleneck.  Replace O(N^2) sysctl insertion and
      lookup times with O(N log N) by using an rbtree to index the sysctl
      directories.
      
      Benchmark before:
          make-dummies 0 999 -> 0.32s
          rmmod dummy        -> 0.12s
          make-dummies 0 9999 -> 1m17s
          rmmod dummy         -> 17s
      
      Benchmark after:
          make-dummies 0 999 -> 0.074s
          rmmod dummy        -> 0.070s
          make-dummies 0 9999 -> 3.4s
          rmmod dummy         -> 0.44s
      
      Benchmark after (without dev_snmp6):
          make-dummies 0 9999 -> 0.75s
          rmmod dummy         -> 0.44s
          make-dummies 0 99999 -> 11s
          rmmod dummy          -> 4.3s
      
      At 10,000 dummy devices the bottleneck becomes the time to add and
      remove the files under /proc/sys/net/dev_snmp6.  I have commented
      out the code that adds and removes files under /proc/sys/net/dev_snmp6
      and taken measurements of creating and destroying 100,000 dummies to
      verify that sysctl continues to scale.
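The scaling argument in this message (a per-directory list that must be scanned on every insertion versus a balanced tree) can be illustrated in user space by counting comparisons. This is only a sketch: it uses POSIX `tsearch`/`tdelete` from `<search.h>` rather than the kernel's rbtree API, and it assumes the C library implements `tsearch` with a balanced tree (glibc and musl do, but POSIX does not require it). `list_insert_cost` and `tree_insert_cost` are hypothetical names.

```c
#include <search.h>
#include <stddef.h>

static unsigned long cmp_count;   /* comparisons performed so far */

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;

    cmp_count++;
    return (x > y) - (x < y);
}

/*
 * List-style insertion: before adding key i, scan every key already
 * present to check for a duplicate.  N distinct keys cost N*(N-1)/2
 * comparisons in total.
 */
unsigned long list_insert_cost(const int *keys, int n)
{
    cmp_count = 0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < i; j++)
            if (cmp_int(&keys[j], &keys[i]) == 0)
                break;
    return cmp_count;
}

/*
 * Tree-style insertion: tsearch() finds or inserts the key in one
 * descent, so each insertion costs roughly the height of the tree.
 */
unsigned long tree_insert_cost(const int *keys, int n)
{
    void *root = NULL;
    unsigned long cost;

    cmp_count = 0;
    for (int i = 0; i < n; i++)
        tsearch(&keys[i], &root, cmp_int);
    cost = cmp_count;

    /* Release the tree nodes again; not counted in the result. */
    for (int i = 0; i < n; i++)
        tdelete(&keys[i], &root, cmp_int);
    return cost;
}
```

Calling both functions on the same array of distinct keys shows the list cost growing quadratically while the tree cost grows far more slowly, which is the effect the benchmarks in this commit measure on real devices.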
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    • sysctl: Make the header lists per directory. · 9e3d47df
      Committed by Eric W. Biederman
      Slightly enhance efficiency and clarity of the code by making the
      header list per directory instead of per set.
      
      Benchmark before:
          make-dummies 0 999 -> 0.63s
          rmmod dummy        -> 0.12s
          make-dummies 0 9999 -> 2m35s
          rmmod dummy         -> 18s
      
      Benchmark after:
          make-dummies 0 999 -> 0.32s
          rmmod dummy        -> 0.12s
          make-dummies 0 9999 -> 1m17s
          rmmod dummy         -> 17s
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    • sysctl: Modify __register_sysctl_paths to take a set instead of a root and an nsproxy · 60a47a2e
      Committed by Eric W. Biederman
      An nsproxy argument here has always been awkward, and now the nsproxy argument
      is completely unnecessary, so remove it, replacing it with the set we want
      the registered tables to show up in.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    • sysctl: Replace root_list with links between sysctl_table_sets. · 0e47c99d
      Committed by Eric W. Biederman
      Piecing together directories by looking first in one directory
      tree, then in another directory tree and finally in a third
      directory tree makes it hard to verify that some directory
      entries are not multiply defined and makes it hard to create
      efficient implementations of the sysctl filesystem.
      
      Replace the sysctl wide list of roots with autogenerated
      links from the core sysctl directory tree to the other
      sysctl directory trees.
      
      This simplifies sysctl directory reading and lookups as now
      only entries in a single sysctl directory tree need to be
      considered.
      
      Benchmark before:
          make-dummies 0 999 -> 0.44s
          rmmod dummy        -> 0.065s
          make-dummies 0 9999 -> 1m36s
          rmmod dummy         -> 0.4s
      
      Benchmark after:
          make-dummies 0 999 -> 0.63s
          rmmod dummy        -> 0.12s
          make-dummies 0 9999 -> 2m35s
          rmmod dummy         -> 18s
      
      The slowdown is caused by the lookups used in insert_headers
      and put_links to see if we need to add links or remove links.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    • sysctl: Stop requiring explicit management of sysctl directories · 7ec66d06
      Committed by Eric W. Biederman
      Simplify the code and the sysctl semantics by autogenerating
      sysctl directories when a sysctl table is registered that needs
      the directories and autodeleting the directories when there are
      no more sysctl tables registered that need them.
      
      Autogenerating directories keeps sysctl tables from depending
      on each other, removing all of the arcane register/unregister
      ordering constraints and making it impossible to get the order
      wrong when registering and unregistering sysctl tables.
      
      Autogenerating directories yields one unique entity that dentries
      can point to, retaining the current effective use of the dcache.
      
      Add struct ctl_dir as the type of these new autogenerated
      directories.
      
      The attached_by and attached_to fields in ctl_table_header are
      removed as they are no longer needed.
      
      The child field in ctl_table is no longer needed by the core of
      the sysctl code.  ctl_table.child can be removed once all of the
      existing users have been updated.
      
      Benchmark before:
          make-dummies 0 999 -> 0.7s
          rmmod dummy        -> 0.07s
          make-dummies 0 9999 -> 1m10s
          rmmod dummy         -> 0.4s
      
      Benchmark after:
          make-dummies 0 999 -> 0.44s
          rmmod dummy        -> 0.065s
          make-dummies 0 9999 -> 1m36s
          rmmod dummy         -> 0.4s
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    • sysctl: Add a root pointer to ctl_table_set · 9eb47c26
      Committed by Eric W. Biederman
      Add a ctl_table_root pointer to ctl_table_set so it is easy to
      go from a ctl_table_set to a ctl_table_root.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    • sysctl: Initial support for auto-unregistering sysctl tables. · 938aaa4f
      Committed by Eric W. Biederman
      Add nreg to ctl_table_header.  When nreg drops to 0 the ctl_table_header
      will be unregistered.
      
      Factor out drop_sysctl_table from unregister_sysctl_table, and add
      the logic for decrementing nreg.
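The counting scheme described here can be sketched in a few lines. This is a user-space illustration, not the kernel code: `struct demo_header`, `hdr_init`, `hdr_get`, and `hdr_drop` are hypothetical names, and starting the count at one on registration is an assumption of the sketch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the bookkeeping part of ctl_table_header. */
struct demo_header {
    int  nreg;        /* number of outstanding registrations */
    bool registered;  /* still visible to lookups? */
};

/* Registering starts the count at one (assumption of this sketch). */
void hdr_init(struct demo_header *h)
{
    h->nreg = 1;
    h->registered = true;
}

/* Another user starts depending on the header. */
void hdr_get(struct demo_header *h)
{
    h->nreg++;
}

/* Drop one registration; the last drop unregisters the header. */
void hdr_drop(struct demo_header *h)
{
    if (--h->nreg == 0)
        h->registered = false;
}
```

The point of the design is that callers only ever drop their own registration; the header disappears on its own once nobody needs it.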
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    • sysctl: Remove the now unused ctl_table parent field. · 8d6ecfcc
      Committed by Eric W. Biederman
      While useful at one time for selinux and the sysctl sanity
      checks, those users no longer use the parent field, so we can
      safely remove it.
      Inspired-by: Lucian Adrian Grijincu <lucian.grijincu@gmil.com>
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    • sysctl: register only tables of sysctl files · f728019b
      Committed by Eric W. Biederman
      Split the registration of a complex ctl_table array which may have
      arbitrary numbers of directories (->child != NULL) and tables of files
      into a series of simpler registrations that only register tables of files.
      
      Graphically:
      
         register('dir', { + file-a
                           + file-b
                           + subdir1
                             + file-c
                           + subdir2
                             + file-d
                             + file-e })
      
      is transformed into:
         wrapper->subheaders[0] = register('dir', {file-a, file-b})
         wrapper->subheaders[1] = register('dir/subdir1', {file-c})
         wrapper->subheaders[2] = register('dir/subdir2', {file-d, file-e})
         return wrapper
      
      This guarantees that __register_sysctl_table will only see a simple
      ctl_table array with all entries having (->child == NULL).
      
      Care was taken to pass the original simple ctl_table arrays to
      __register_sysctl_table whenever possible.
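The transformation pictured in this message can be sketched as a recursive walk over a nested table. This is a user-space illustration with hypothetical types (`struct entry`, `struct flat_regs`) and a hypothetical function `flatten`; it records each flat registration as a text line instead of calling __register_sysctl_table. In this sketch a directory with no files of its own produces no registration, which is an assumption rather than something the message states.

```c
#include <stdio.h>
#include <string.h>

#define MAX_REGS 16
#define MAX_LINE 128

/* Simplified table entry: child != NULL marks a directory. */
struct entry {
    const char *name;             /* NULL name terminates a table */
    const struct entry *child;
};

/* Collected flat registrations, one "path: file file ..." line each. */
struct flat_regs {
    int  count;
    char line[MAX_REGS][MAX_LINE];
};

static void append(char *dst, const char *s)
{
    strncat(dst, s, MAX_LINE - strlen(dst) - 1);
}

/* Register the files of one directory, then recurse into its subdirectories. */
void flatten(const char *path, const struct entry *table,
             struct flat_regs *regs)
{
    const struct entry *e;
    char *line;
    int nfiles = 0;

    if (regs->count >= MAX_REGS)
        return;

    /* First pass: gather the plain files of this directory. */
    line = regs->line[regs->count];
    snprintf(line, MAX_LINE, "%s:", path);
    for (e = table; e->name; e++) {
        if (!e->child) {
            append(line, " ");
            append(line, e->name);
            nfiles++;
        }
    }
    if (nfiles)
        regs->count++;            /* keep the line only if it has files */

    /* Second pass: each subdirectory becomes its own registration. */
    for (e = table; e->name; e++) {
        if (e->child) {
            char sub[MAX_LINE];

            snprintf(sub, sizeof(sub), "%s/%s", path, e->name);
            flatten(sub, e->child, regs);
        }
    }
}
```

Feeding it the 'dir' example from the diagram yields three entries, for `dir`, `dir/subdir1`, and `dir/subdir2`, each listing only files.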
      
      This change is derived from a similar patch written
      by Lucian Grijincu.
      Inspired-by: Lucian Adrian Grijincu <lucian.grijincu@gmail.com>
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    • sysctl: Add support for register sysctl tables with a normal cstring path. · 6e9d5164
      Committed by Eric W. Biederman
      Make __register_sysctl_table the core sysctl registration operation and
      make it take a char * string as path.
      
      Now that binary paths have been banished into the realm of backwards
      compatibility in kernel/binary_sysctl.c, where they can be safely
      ignored, there is no longer a need to use struct ctl_path to represent
      path names when registering ctl_tables.
      
      Start the transition to using normal char * strings to represent
      pathnames when registering sysctl tables.  Normal strings are easier
      to deal with both in the internal sysctl implementation and for
      programmers registering sysctl tables.
      
      __register_sysctl_paths is turned into a backwards compatibility wrapper
      that converts a ctl_path array into a normal char * string.
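A compatibility wrapper of the kind described here has to join the path components into one string. The sketch below shows one way to do that in user space; `struct path_elem` is a hypothetical stand-in for struct ctl_path (keeping only a `procname` member), and `join_path` is a hypothetical helper, not the kernel function.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for struct ctl_path; NULL procname ends the array. */
struct path_elem {
    const char *procname;
};

/*
 * Join the components with '/' into buf.
 * Returns 0 on success, -1 if the result does not fit in size bytes.
 */
int join_path(const struct path_elem *path, char *buf, size_t size)
{
    size_t used = 0;

    if (size == 0)
        return -1;
    buf[0] = '\0';

    for (; path->procname; path++) {
        int n = snprintf(buf + used, size - used, "%s%s",
                         used ? "/" : "", path->procname);

        if (n < 0 || (size_t)n >= size - used)
            return -1;            /* truncated or encoding error */
        used += (size_t)n;
    }
    return 0;
}
```

For the components `net`, `ipv4`, `conf` this produces the string `net/ipv4/conf`, which is the form a plain char * registration interface would take directly.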
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    • sysctl: Remove the unnecessary sysctl_set parent concept. · bd295b56
      Committed by Eric W. Biederman
      In sysctl_net register the two networking roots in the proper order.
      
      In register_sysctl walk the sysctl sets in the reverse order of the
      sysctl roots.
      
      Remove parent from ctl_table_set and setup_sysctl_set as it is no
      longer needed.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    • sysctl: Implement retire_sysctl_set · 97324cd8
      Committed by Eric W. Biederman
      This adds a small helper, retire_sysctl_set, to remove the intimate knowledge of
      how a sysctl_set is implemented from net/sysctl_net.c.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    • sysctl: Move the implementation into fs/proc/proc_sysctl.c · 1f87f0b5
      Committed by Eric W. Biederman
      Move the core sysctl code from kernel/sysctl.c and kernel/sysctl_check.c
      into fs/proc/proc_sysctl.c.
      
      Currently sysctl maintenance is hampered by the sysctl implementation
      being split across 3 files with artificial layering between them.
      Consolidate the entire sysctl implementation into 1 file so that
      it is easier to see what is going on and hopefully allowing for
      simpler maintenance.
      
      For functions that are now only used in fs/proc/proc_sysctl.c, remove
      their declarations from sysctl.h and make them static in fs/proc/proc_sysctl.c.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    • sysctl: Register the base sysctl table like any other sysctl table. · de4e83bd
      Committed by Eric W. Biederman
      Simplify the code by treating the base sysctl table like any other
      sysctl table and register it with register_sysctl_table.
      
      To ensure this table is registered early enough to avoid problems,
      call sysctl_init from proc_sys_init.
      
      Rename sysctl_net.c:sysctl_init() to net_sysctl_init() to avoid
      name conflicts now that kernel/sysctl.c:sysctl_init() is no longer
      static.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    • sysctl: Consolidate !CONFIG_SYSCTL handling · 0ce8974d
      Committed by Eric W. Biederman
      - In sysctl.h move functions only available if CONFIG_SYSCTL
        is defined inside of #ifdef CONFIG_SYSCTL
      
      - Move the stub function definitions for !CONFIG_SYSCTL
        into sysctl.h and make them static inlines.
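The header layout these two points describe follows a common pattern: real declarations inside the #ifdef, and static inline stubs in the #else branch so callers compile unchanged when the feature is off. A minimal sketch, using a hypothetical `DEMO_CONFIG_SYSCTL` macro and hypothetical `demo_*` names rather than the real sysctl.h contents:

```c
#include <stddef.h>

struct demo_header {
    int id;
};

#ifdef DEMO_CONFIG_SYSCTL

/* Feature enabled: the real implementations live in a .c file. */
struct demo_header *demo_register(const char *path);
void demo_unregister(struct demo_header *header);

#else /* !DEMO_CONFIG_SYSCTL */

/* Feature disabled: stubs the compiler can inline away entirely. */
static inline struct demo_header *demo_register(const char *path)
{
    (void)path;
    return NULL;
}

static inline void demo_unregister(struct demo_header *header)
{
    (void)header;
}

#endif /* DEMO_CONFIG_SYSCTL */
```

With the macro undefined, a caller such as `demo_unregister(demo_register("kernel"))` still compiles and simply does nothing, so call sites need no #ifdef of their own.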
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
  10. 04 Jan, 2012 1 commit
  11. 03 Nov, 2011 1 commit
  12. 04 Oct, 2011 1 commit
  13. 10 Mar, 2011 2 commits
  14. 08 Mar, 2011 1 commit
    • unfuck proc_sysctl ->d_compare() · dfef6dcd
      Committed by Al Viro
      a) struct inode is not going to be freed under ->d_compare();
      however, the thing PROC_I(inode)->sysctl points to just might.
      Fortunately, it's enough to make freeing that sucker delayed,
      provided that we don't step on its ->unregistering, clear
      the pointer to it in PROC_I(inode) before dropping the reference
      and check if it's NULL in ->d_compare().
      
      b) I'm not sure that we *can* walk into NULL inode here (we recheck
      dentry->seq between verifying that it's still hashed / fetching
      dentry->d_inode and passing it to ->d_compare() and there's no
      negative hashed dentries in /proc/sys/*), but if we can walk into
      that, we really should not have ->d_compare() return 0 on it!
      Said that, I really suspect that this check can be simply killed.
      Nick?
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  15. 16 May, 2010 1 commit
  16. 17 Feb, 2010 2 commits
  17. 07 Jan, 2010 1 commit
    • net: RFC3069, private VLAN proxy arp support · 65324144
      Committed by Jesper Dangaard Brouer
      This is to be used together with switch technologies, like RFC3069,
      where the individual ports are not allowed to communicate with
      each other, but they are allowed to talk to the upstream router.  As
      described in RFC 3069, it is possible to allow these hosts to
      communicate through the upstream router by proxy_arp'ing.
      
      This patch basically allows proxy arp replies back to the same
      interface (from which the ARP request/solicitation was received).
      
      Tunable per device via proc "proxy_arp_pvlan":
        /proc/sys/net/ipv4/conf/*/proxy_arp_pvlan
      
      This switch technology is known by different vendor names:
       - In RFC 3069 it is called VLAN Aggregation.
       - Cisco and Allied Telesyn call it Private VLAN.
       - Hewlett-Packard call it Source-Port filtering or port-isolation.
       - Ericsson call it MAC-Forced Forwarding (RFC Draft).
      Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  18. 26 Dec, 2009 1 commit
    • net: restore ip source validation · 28f6aeea
      Committed by Jamal Hadi Salim
      When using policy routing and the skb mark,
      there are cases where a back path validation requires us
      to use a different routing table for src ip validation than
      the one used for mapping ingress dst ip.
      One such case is transparent proxying, where we pretend to be
      the destination system and therefore the local table
      is used for incoming packets but possibly a main table would
      be used on outbound.
      Make the default behavior allow the above; users who need
      the symmetry can turn it on via the sysctl src_valid_mark.
      Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  19. 04 Dec, 2009 2 commits
  20. 19 Nov, 2009 1 commit
  21. 18 Nov, 2009 1 commit