1. 25 1月, 2012 13 次提交
    • E
      sysctl: A more obvious version of grab_header. · 3cc3e046
      Eric W. Biederman 提交于
      Instead of relying on sysct_head_next(NULL) to magically
      return the right header for the root directory instead
      explicitly transform NULL into the root directories header.
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      3cc3e046
    • E
      sysctl: Remove the now unused ctl_table parent field. · 8d6ecfcc
      Eric W. Biederman 提交于
      While useful at one time for selinux and the sysctl sanity
      checks those users no longer use the parent field and we can
      safely remove it.
      Inspired-by: NLucian Adrian Grijincu <lucian.grijincu@gmil.com>
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      8d6ecfcc
    • E
      sysctl: Improve the sysctl sanity checks · 7c60c48f
      Eric W. Biederman 提交于
      - Stop validating subdirectories now that we only register leaf tables
      
      - Cleanup and improve the duplicate filename check.
        * Run the duplicate filename check under the sysctl_lock to guarantee
          we never add duplicate names.
        * Reduce the duplicate filename check to nearly O(M*N) where M is the
          number of entries in tthe table we are registering and N is the
          number of entries in the directory before we got there.
      
      - Move the duplicate filename check into it's own function and call
        it directtly from __register_sysctl_table
      
      - Kill the config option as the sanity checks are now cheap enough
        the config option is unnecessary. The original reason for the config
        option was because we had a huge table used to verify the proc filename
        to binary sysctl mapping.  That table has now evolved into the binary_sysctl
        translation layer and is no longer part of the sysctl_check code.
      
      - Tighten up the permission checks.  Guarnateeing that files only have read
        or write permissions.
      
      - Removed redudant check for parents having a procname as now everything has
        a procname.
      
      - Generalize the backtrace logic so that we print a backtrace from
        any failure of __register_sysctl_table that was not caused by
        a memmory allocation failure.  The backtrace allows us to track
        down who erroneously registered a sysctl table.
      
      Bechmark before (CONFIG_SYSCTL_CHECK=y):
          make-dummies 0 999 -> 12s
          rmmod dummy        -> 0.08s
      
      Bechmark before (CONFIG_SYSCTL_CHECK=n):
          make-dummies 0 999 -> 0.7s
          rmmod dummy        -> 0.06s
          make-dummies 0 99999 -> 1m13s
          rmmod dummy          -> 0.38s
      
      Benchmark after:
          make-dummies 0 999 -> 0.65s
          rmmod dummy        -> 0.055s
          make-dummies 0 9999 -> 1m10s
          rmmod dummy         -> 0.39s
      
      The sysctl sanity checks now impose no measurable cost.
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      7c60c48f
    • E
      sysctl: register only tables of sysctl files · f728019b
      Eric W. Biederman 提交于
      Split the registration of a complex ctl_table array which may have
      arbitrary numbers of directories (->child != NULL) and tables of files
      into a series of simpler registrations that only register tables of files.
      
      Graphically:
      
         register('dir', { + file-a
                           + file-b
                           + subdir1
                             + file-c
                           + subdir2
                             + file-d
                             + file-e })
      
      is transformed into:
         wrapper->subheaders[0] = register('dir', {file1-a, file1-b})
         wrapper->subheaders[1] = register('dir/subdir1', {file-c})
         wrapper->subheaders[2] = register('dir/subdir2', {file-d, file-e})
         return wrapper
      
      This guarantees that __register_sysctl_table will only see a simple
      ctl_table array with all entries having (->child == NULL).
      
      Care was taken to pass the original simple ctl_table arrays to
      __register_sysctl_table whenever possible.
      
      This change is derived from a similar patch written
      by Lucrian Grijincu.
      Inspired-by: NLucian Adrian Grijincu <lucian.grijincu@gmail.com>
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      f728019b
    • E
      sysctl: Add ctl_table chains into cstring paths · ec6a5266
      Eric W. Biederman 提交于
      For any component of table passed to __register_sysctl_paths
      that actually serves as a path, add that to the cstring path
      that is passed to __register_sysctl_table.
      
      The result is that for most calls to __register_sysctl_paths
      we only pass a table to __register_sysctl_table that contains
      no child directories.
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      ec6a5266
    • E
      sysctl: Add support for register sysctl tables with a normal cstring path. · 6e9d5164
      Eric W. Biederman 提交于
      Make __register_sysctl_table the core sysctl registration operation and
      make it take a char * string as path.
      
      Now that binary paths have been banished into the real of backwards
      compatibility in kernel/binary_sysctl.c where they can be safely
      ignored there is no longer a need to use struct ctl_path to represent
      path names when registering ctl_tables.
      
      Start the transition to using normal char * strings to represent
      pathnames when registering sysctl tables.  Normal strings are easier
      to deal with both in the internal sysctl implementation and for
      programmers registering sysctl tables.
      
      __register_sysctl_paths is turned into a backwards compatibility wrapper
      that converts a ctl_path array into a normal char * string.
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      6e9d5164
    • E
      sysctl: Create local copies of directory names used in paths · f05e53a7
      Eric W. Biederman 提交于
      Creating local copies of directory names is a good idea for
      two reasons.
      - The dynamic names used by callers must be copied into new
        strings by the callers today to ensure the strings do not
        change between register and unregister of the sysctl table.
      
      - Sysctl directories have a potentially different lifetime
        than the time between register and unregister of any
        particular sysctl table.
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      f05e53a7
    • E
      sysctl: Remove the unnecessary sysctl_set parent concept. · bd295b56
      Eric W. Biederman 提交于
      In sysctl_net register the two networking roots in the proper order.
      
      In register_sysctl walk the sysctl sets in the reverse order of the
      sysctl roots.
      
      Remove parent from ctl_table_set and setup_sysctl_set as it is no
      longer needed.
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      bd295b56
    • E
      sysctl: Implement retire_sysctl_set · 97324cd8
      Eric W. Biederman 提交于
      This adds a small helper retire_sysctl_set to remove the intimate knowledge about
      the how a sysctl_set is implemented from net/sysct_net.c
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      97324cd8
    • E
      sysctl: Make the directories have nlink == 1 · a15e2098
      Eric W. Biederman 提交于
      I goofed when I made sysctl directories have nlink == 0.
      nlink == 0 means the directory has been deleted.
      nlink == 1 meands a directory does not count subdirectories.
      
      Use the default nlink == 1 for sysctl directories.
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      a15e2098
    • E
      sysctl: Move the implementation into fs/proc/proc_sysctl.c · 1f87f0b5
      Eric W. Biederman 提交于
      Move the core sysctl code from kernel/sysctl.c and kernel/sysctl_check.c
      into fs/proc/proc_sysctl.c.
      
      Currently sysctl maintenance is hampered by the sysctl implementation
      being split across 3 files with artificial layering between them.
      Consolidate the entire sysctl implementation into 1 file so that
      it is easier to see what is going on and hopefully allowing for
      simpler maintenance.
      
      For functions that are now only used in fs/proc/proc_sysctl.c remove
      their declarations from sysctl.h and make them static in fs/proc/proc_sysctl.c
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      1f87f0b5
    • E
      sysctl: Register the base sysctl table like any other sysctl table. · de4e83bd
      Eric W. Biederman 提交于
      Simplify the code by treating the base sysctl table like any other
      sysctl table and register it with register_sysctl_table.
      
      To ensure this table is registered early enough to avoid problems
      call sysctl_init from proc_sys_init.
      
      Rename sysctl_net.c:sysctl_init() to net_sysctl_init() to avoid
      name conflicts now that kernel/sysctl.c:sysctl_init() is no longer
      static.
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      de4e83bd
    • L
      sysctl: remove impossible condition check · 36885d7b
      Lucas De Marchi 提交于
      Remove checks for conditions that will never happen. If procname is NULL
      the loop would already had bailed out, so there's no need to check it
      again.
      
      At the same time this also compacts the function find_in_table() by
      refactoring it to be easier to read.
      Signed-off-by: NLucas De Marchi <lucas.demarchi@profusion.mobi>
      Reviewed-by: NJesper Juhl <jj@chaosbits.net>
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      36885d7b
  2. 18 1月, 2012 3 次提交
    • L
      proc: clean up and fix /proc/<pid>/mem handling · e268337d
      Linus Torvalds 提交于
      Jüri Aedla reported that the /proc/<pid>/mem handling really isn't very
      robust, and it also doesn't match the permission checking of any of the
      other related files.
      
      This changes it to do the permission checks at open time, and instead of
      tracking the process, it tracks the VM at the time of the open.  That
      simplifies the code a lot, but does mean that if you hold the file
      descriptor open over an execve(), you'll continue to read from the _old_
      VM.
      
      That is different from our previous behavior, but much simpler.  If
      somebody actually finds a load where this matters, we'll need to revert
      this commit.
      
      I suspect that nobody will ever notice - because the process mapping
      addresses will also have changed as part of the execve.  So you cannot
      actually usefully access the fd across a VM change simply because all
      the offsets for IO would have changed too.
      Reported-by: NJüri Aedla <asd@ut.ee>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e268337d
    • E
      audit: only allow tasks to set their loginuid if it is -1 · 633b4545
      Eric Paris 提交于
      At the moment we allow tasks to set their loginuid if they have
      CAP_AUDIT_CONTROL.  In reality we want tasks to set the loginuid when they
      log in and it be impossible to ever reset.  We had to make it mutable even
      after it was once set (with the CAP) because on update and admin might have
      to restart sshd.  Now sshd would get his loginuid and the next user which
      logged in using ssh would not be able to set his loginuid.
      
      Systemd has changed how userspace works and allowed us to make the kernel
      work the way it should.  With systemd users (even admins) are not supposed
      to restart services directly.  The system will restart the service for
      them.  Thus since systemd is going to loginuid==-1, sshd would get -1, and
      sshd would be allowed to set a new loginuid without special permissions.
      
      If an admin in this system were to manually start an sshd he is inserting
      himself into the system chain of trust and thus, logically, it's his
      loginuid that should be used!  Since we have old systems I make this a
      Kconfig option.
      Signed-off-by: NEric Paris <eparis@redhat.com>
      633b4545
    • E
      audit: remove task argument to audit_set_loginuid · 0a300be6
      Eric Paris 提交于
      The function always deals with current.  Don't expose an option
      pretending one can use it for something.  You can't.
      Signed-off-by: NEric Paris <eparis@redhat.com>
      0a300be6
  3. 16 1月, 2012 1 次提交
  4. 13 1月, 2012 2 次提交
  5. 11 1月, 2012 5 次提交
    • V
      procfs: add hidepid= and gid= mount options · 0499680a
      Vasiliy Kulikov 提交于
      Add support for mount options to restrict access to /proc/PID/
      directories.  The default backward-compatible "relaxed" behaviour is left
      untouched.
      
      The first mount option is called "hidepid" and its value defines how much
      info about processes we want to be available for non-owners:
      
      hidepid=0 (default) means the old behavior - anybody may read all
      world-readable /proc/PID/* files.
      
      hidepid=1 means users may not access any /proc/<pid>/ directories, but
      their own.  Sensitive files like cmdline, sched*, status are now protected
      against other users.  As permission checking done in proc_pid_permission()
      and files' permissions are left untouched, programs expecting specific
      files' modes are not confused.
      
      hidepid=2 means hidepid=1 plus all /proc/PID/ will be invisible to other
      users.  It doesn't mean that it hides whether a process exists (it can be
      learned by other means, e.g.  by kill -0 $PID), but it hides process' euid
      and egid.  It compicates intruder's task of gathering info about running
      processes, whether some daemon runs with elevated privileges, whether
      another user runs some sensitive program, whether other users run any
      program at all, etc.
      
      gid=XXX defines a group that will be able to gather all processes' info
      (as in hidepid=0 mode).  This group should be used instead of putting
      nonroot user in sudoers file or something.  However, untrusted users (like
      daemons, etc.) which are not supposed to monitor the tasks in the whole
      system should not be added to the group.
      
      hidepid=1 or higher is designed to restrict access to procfs files, which
      might reveal some sensitive private information like precise keystrokes
      timings:
      
      http://www.openwall.com/lists/oss-security/2011/11/05/3
      
      hidepid=1/2 doesn't break monitoring userspace tools.  ps, top, pgrep, and
      conky gracefully handle EPERM/ENOENT and behave as if the current user is
      the only user running processes.  pstree shows the process subtree which
      contains "pstree" process.
      
      Note: the patch doesn't deal with setuid/setgid issues of keeping
      preopened descriptors of procfs files (like
      https://lkml.org/lkml/2011/2/7/368).  We rely on that the leaked
      information like the scheduling counters of setuid apps doesn't threaten
      anybody's privacy - only the user started the setuid program may read the
      counters.
      Signed-off-by: NVasiliy Kulikov <segoon@openwall.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Randy Dunlap <rdunlap@xenotime.net>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Greg KH <greg@kroah.com>
      Cc: Theodore Tso <tytso@MIT.EDU>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: James Morris <jmorris@namei.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0499680a
    • V
      procfs: parse mount options · 97412950
      Vasiliy Kulikov 提交于
      Add support for procfs mount options.  Actual mount options are coming in
      the next patches.
      Signed-off-by: NVasiliy Kulikov <segoon@openwall.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Randy Dunlap <rdunlap@xenotime.net>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Greg KH <greg@kroah.com>
      Cc: Theodore Tso <tytso@MIT.EDU>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: James Morris <jmorris@namei.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      97412950
    • P
      procfs: introduce the /proc/<pid>/map_files/ directory · 640708a2
      Pavel Emelyanov 提交于
      This one behaves similarly to the /proc/<pid>/fd/ one - it contains
      symlinks one for each mapping with file, the name of a symlink is
      "vma->vm_start-vma->vm_end", the target is the file.  Opening a symlink
      results in a file that point exactly to the same inode as them vma's one.
      
      For example the ls -l of some arbitrary /proc/<pid>/map_files/
      
       | lr-x------ 1 root root 64 Aug 26 06:40 7f8f80403000-7f8f80404000 -> /lib64/libc-2.5.so
       | lr-x------ 1 root root 64 Aug 26 06:40 7f8f8061e000-7f8f80620000 -> /lib64/libselinux.so.1
       | lr-x------ 1 root root 64 Aug 26 06:40 7f8f80826000-7f8f80827000 -> /lib64/libacl.so.1.1.0
       | lr-x------ 1 root root 64 Aug 26 06:40 7f8f80a2f000-7f8f80a30000 -> /lib64/librt-2.5.so
       | lr-x------ 1 root root 64 Aug 26 06:40 7f8f80a30000-7f8f80a4c000 -> /lib64/ld-2.5.so
      
      This *helps* checkpointing process in three ways:
      
      1. When dumping a task mappings we do know exact file that is mapped
         by particular region.  We do this by opening
         /proc/$pid/map_files/$address symlink the way we do with file
         descriptors.
      
      2. This also helps in determining which anonymous shared mappings are
         shared with each other by comparing the inodes of them.
      
      3. When restoring a set of processes in case two of them has a mapping
         shared, we map the memory by the 1st one and then open its
         /proc/$pid/map_files/$address file and map it by the 2nd task.
      
      Using /proc/$pid/maps for this is quite inconvenient since it brings
      repeatable re-reading and reparsing for this text file which slows down
      restore procedure significantly.  Also as being pointed in (3) it is a way
      easier to use top level shared mapping in children as
      /proc/$pid/map_files/$address when needed.
      
      [akpm@linux-foundation.org: coding-style fixes]
      [gorcunov@openvz.org: make map_files depend on CHECKPOINT_RESTORE]
      Signed-off-by: NPavel Emelyanov <xemul@parallels.com>
      Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Reviewed-by: NVasiliy Kulikov <segoon@openwall.com>
      Reviewed-by: N"Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: Pavel Machek <pavel@ucw.cz>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      640708a2
    • C
      procfs: make proc_get_link to use dentry instead of inode · 7773fbc5
      Cyrill Gorcunov 提交于
      Prepare the ground for the next "map_files" patch which needs a name of a
      link file to analyse.
      Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Vasiliy Kulikov <segoon@openwall.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7773fbc5
    • K
      tracepoint: add tracepoints for debugging oom_score_adj · 43d2b113
      KAMEZAWA Hiroyuki 提交于
      oom_score_adj is used for guarding processes from OOM-Killer.  One of
      problem is that it's inherited at fork().  When a daemon set oom_score_adj
      and make children, it's hard to know where the value is set.
      
      This patch adds some tracepoints useful for debugging. This patch adds
      3 trace points.
        - creating new task
        - renaming a task (exec)
        - set oom_score_adj
      
      To debug, users need to enable some trace pointer. Maybe filtering is useful as
      
      # EVENT=/sys/kernel/debug/tracing/events/task/
      # echo "oom_score_adj != 0" > $EVENT/task_newtask/filter
      # echo "oom_score_adj != 0" > $EVENT/task_rename/filter
      # echo 1 > $EVENT/enable
      # EVENT=/sys/kernel/debug/tracing/events/oom/
      # echo 1 > $EVENT/enable
      
      output will be like this.
      # grep oom /sys/kernel/debug/tracing/trace
      bash-7699  [007] d..3  5140.744510: oom_score_adj_update: pid=7699 comm=bash oom_score_adj=-1000
      bash-7699  [007] ...1  5151.818022: task_newtask: pid=7729 comm=bash clone_flags=1200011 oom_score_adj=-1000
      ls-7729  [003] ...2  5151.818504: task_rename: pid=7729 oldcomm=bash newcomm=ls oom_score_adj=-1000
      bash-7699  [002] ...1  5175.701468: task_newtask: pid=7730 comm=bash clone_flags=1200011 oom_score_adj=-1000
      grep-7730  [007] ...2  5175.701993: task_rename: pid=7730 oldcomm=bash newcomm=grep oom_score_adj=-1000
      Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Acked-by: NDavid Rientjes <rientjes@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      43d2b113
  6. 06 1月, 2012 1 次提交
    • E
      ptrace: do not audit capability check when outputing /proc/pid/stat · 69f594a3
      Eric Paris 提交于
      Reading /proc/pid/stat of another process checks if one has ptrace permissions
      on that process.  If one does have permissions it outputs some data about the
      process which might have security and attack implications.  If the current
      task does not have ptrace permissions the read still works, but those fields
      are filled with inocuous (0) values.  Since this check and a subsequent denial
      is not a violation of the security policy we should not audit such denials.
      
      This can be quite useful to removing ptrace broadly across a system without
      flooding the logs when ps is run or something which harmlessly walks proc.
      Signed-off-by: NEric Paris <eparis@redhat.com>
      Acked-by: NSerge E. Hallyn <serge.hallyn@canonical.com>
      69f594a3
  7. 04 1月, 2012 4 次提交
  8. 30 12月, 2011 1 次提交
    • A
      procfs: do not confuse jiffies with cputime64_t · 34845636
      Andreas Schwab 提交于
      Commit 2a95ea6c ("procfs: do not overflow get_{idle,iowait}_time
      for nohz") did not take into account that one some architectures jiffies
      and cputime use different units.
      
      This causes get_idle_time() to return numbers in the wrong units, making
      the idle time fields in /proc/stat wrong.
      
      Instead of converting the usec value returned by
      get_cpu_{idle,iowait}_time_us to units of jiffies, use the new function
      usecs_to_cputime64 to convert it to the correct unit of cputime64_t.
      Signed-off-by: NAndreas Schwab <schwab@linux-m68k.org>
      Acked-by: NMichal Hocko <mhocko@suse.cz>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: "Artem S. Tashkinov" <t.artem@mailcity.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      34845636
  9. 15 12月, 2011 2 次提交
  10. 09 12月, 2011 3 次提交
    • M
      procfs: do not overflow get_{idle,iowait}_time for nohz · 2a95ea6c
      Michal Hocko 提交于
      Since commit a25cac51 ("proc: Consider NO_HZ when printing idle and
      iowait times") we are reporting idle/io_wait time also while a CPU is
      tickless.  We rely on get_{idle,iowait}_time functions to retrieve
      proper data.
      
      These functions, however, use usecs_to_cputime to translate micro
      seconds time to cputime64_t.  This is just an alias to usecs_to_jiffies
      which reduces the data type from u64 to unsigned int and also checks
      whether the given parameter overflows jiffies_to_usecs(MAX_JIFFY_OFFSET)
      and returns MAX_JIFFY_OFFSET in that case.
      
      When we overflow depends on CONFIG_HZ but especially for CONFIG_HZ_300
      it is quite low (1431649781) so we are getting MAX_JIFFY_OFFSET for
      >3000s! until we overflow unsigned int.  Just for reference
      CONFIG_HZ_100 has an overflow window around 20s, CONFIG_HZ_250 ~8s and
      CONFIG_HZ_1000 ~2s.
      
      This results in a bug when people saw [h]top going mad reporting 100%
      CPU usage even though there was basically no CPU load.  The reason was
      simply that /proc/stat stopped reporting idle/io_wait changes (and
      reported MAX_JIFFY_OFFSET) and so the only change happening was for user
      system time.
      
      Let's use nsecs_to_jiffies64 instead which doesn't reduce the precision
      to 32b type and it is much more appropriate for cumulative time values
      (unlike usecs_to_jiffies which intended for timeout calculations).
      Signed-off-by: NMichal Hocko <mhocko@suse.cz>
      Tested-by: NArtem S. Tashkinov <t.artem@mailcity.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2a95ea6c
    • C
      fs/proc/meminfo.c: fix compilation error · b53fc7c2
      Claudio Scordino 提交于
      Fix the error message "directives may not be used inside a macro argument"
      which appears when the kernel is compiled for the cris architecture.
      Signed-off-by: NClaudio Scordino <claudio@evidence.eu.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b53fc7c2
    • A
      procfs: fix a vfsmount longterm reference leak · 905ad269
      Al Viro 提交于
      kern_mount() doesn't pair with plain mntput()...
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      905ad269
  11. 06 12月, 2011 1 次提交
  12. 10 11月, 2011 1 次提交
  13. 03 11月, 2011 3 次提交