1. 02 2月, 2008 1 次提交
    • E
      [AUDIT] break large execve argument logging into smaller messages · de6bbd1d
      Eric Paris 提交于
      execve arguments can be quite large.  There is no limit on the number of
      arguments and a 4G limit on the size of an argument.
      
      this patch prints those aruguments in bite sized pieces.  a userspace size
      limitation of 8k was discovered so this keeps messages around 7.5k
      
      single arguments larger than 7.5k in length are split into multiple records
      and can be identified as aX[Y]=
      Signed-off-by: NEric Paris <eparis@redhat.com>
      de6bbd1d
  2. 30 1月, 2008 1 次提交
  3. 29 1月, 2008 4 次提交
  4. 26 1月, 2008 5 次提交
    • I
      softlockup: fix signedness · 90739081
      Ingo Molnar 提交于
      fix softlockup tunables signedness.
      
      mark tunables read-mostly.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      90739081
    • A
      sched: latencytop support · 9745512c
      Arjan van de Ven 提交于
      LatencyTOP kernel infrastructure; it measures latencies in the
      scheduler and tracks it system wide and per process.
      Signed-off-by: NArjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9745512c
    • P
      sched: rt time limit · fa85ae24
      Peter Zijlstra 提交于
      Very simple time limit on the realtime scheduling classes.
      Allow the rq's realtime class to consume sched_rt_ratio of every
      sched_rt_period slice. If the class exceeds this quota the fair class
      will preempt the realtime class.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      fa85ae24
    • I
      softlockup: automatically detect hung TASK_UNINTERRUPTIBLE tasks · 82a1fcb9
      Ingo Molnar 提交于
      this patch extends the soft-lockup detector to automatically
      detect hung TASK_UNINTERRUPTIBLE tasks. Such hung tasks are
      printed the following way:
      
       ------------------>
       INFO: task prctl:3042 blocked for more than 120 seconds.
       "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message
       prctl         D fd5e3793     0  3042   2997
              f6050f38 00000046 00000001 fd5e3793 00000009 c06d8264 c06dae80 00000286
              f6050f40 f6050f00 f7d34d90 f7d34fc8 c1e1be80 00000001 f6050000 00000000
              f7e92d00 00000286 f6050f18 c0489d1a f6050f40 00006605 00000000 c0133a5b
       Call Trace:
        [<c04883a5>] schedule_timeout+0x6d/0x8b
        [<c04883d8>] schedule_timeout_uninterruptible+0x15/0x17
        [<c0133a76>] msleep+0x10/0x16
        [<c0138974>] sys_prctl+0x30/0x1e2
        [<c0104c52>] sysenter_past_esp+0x5f/0xa5
        =======================
       2 locks held by prctl/3042:
       #0:  (&sb->s_type->i_mutex_key#5){--..}, at: [<c0197d11>] do_fsync+0x38/0x7a
       #1:  (jbd_handle){--..}, at: [<c01ca3d2>] journal_start+0xc7/0xe9
       <------------------
      
      the current default timeout is 120 seconds. Such messages are printed
      up to 10 times per bootup. If the system has crashed already then the
      messages are not printed.
      
      if lockdep is enabled then all held locks are printed as well.
      
      this feature is a natural extension to the softlockup-detector (kernel
      locked up without scheduling) and to the NMI watchdog (kernel locked up
      with IRQs disabled).
      
      [ Gautham R Shenoy <ego@in.ibm.com>: CPU hotplug fixes. ]
      [ Andrew Morton <akpm@linux-foundation.org>: build warning fix. ]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NArjan van de Ven <arjan@linux.intel.com>
      82a1fcb9
    • S
      sched: group scheduler, fix fairness of cpu bandwidth allocation for task groups · 6b2d7700
      Srivatsa Vaddagiri 提交于
      The current load balancing scheme isn't good enough for precise
      group fairness.
      
      For example: on a 8-cpu system, I created 3 groups as under:
      
      	a = 8 tasks (cpu.shares = 1024)
      	b = 4 tasks (cpu.shares = 1024)
      	c = 3 tasks (cpu.shares = 1024)
      
      a, b and c are task groups that have equal weight. We would expect each
      of the groups to receive 33.33% of cpu bandwidth under a fair scheduler.
      
      This is what I get with the latest scheduler git tree:
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      --------------------------------------------------------------------------------
      Col1  | Col2    | Col3  |  Col4
      ------|---------|-------|-------------------------------------------------------
      a     | 277.676 | 57.8% | 54.1%  54.1%  54.1%  54.2%  56.7%  62.2%  62.8% 64.5%
      b     | 116.108 | 24.2% | 47.4%  48.1%  48.7%  49.3%
      c     |  86.326 | 18.0% | 47.5%  47.9%  48.5%
      --------------------------------------------------------------------------------
      
      Explanation of o/p:
      
      Col1 -> Group name
      Col2 -> Cumulative execution time (in seconds) received by all tasks of that
      	group in a 60sec window across 8 cpus
      Col3 -> CPU bandwidth received by the group in the 60sec window, expressed in
              percentage. Col3 data is derived as:
      		Col3 = 100 * Col2 / (NR_CPUS * 60)
      Col4 -> CPU bandwidth received by each individual task of the group.
      		Col4 = 100 * cpu_time_recd_by_task / 60
      
      [I can share the test case that produces a similar o/p if reqd]
      
      The deviation from desired group fairness is as below:
      
      	a = +24.47%
      	b = -9.13%
      	c = -15.33%
      
      which is quite high.
      
      After the patch below is applied, here are the results:
      
      --------------------------------------------------------------------------------
      Col1  | Col2    | Col3  |  Col4
      ------|---------|-------|-------------------------------------------------------
      a     | 163.112 | 34.0% | 33.2%  33.4%  33.5%  33.5%  33.7%  34.4%  34.8% 35.3%
      b     | 156.220 | 32.5% | 63.3%  64.5%  66.1%  66.5%
      c     | 160.653 | 33.5% | 85.8%  90.6%  91.4%
      --------------------------------------------------------------------------------
      
      Deviation from desired group fairness is as below:
      
      	a = +0.67%
      	b = -0.83%
      	c = +0.17%
      
      which is far better IMO. Most of other runs have yielded a deviation within
      +-2% at the most, which is good.
      
      Why do we see bad (group) fairness with current scheuler?
      =========================================================
      
      Currently cpu's weight is just the summation of individual task weights.
      This can yield incorrect results. For ex: consider three groups as below
      on a 2-cpu system:
      
      	CPU0	CPU1
      ---------------------------
      	A (10)  B(5)
      		C(5)
      ---------------------------
      
      Group A has 10 tasks, all on CPU0, Group B and C have 5 tasks each all
      of which are on CPU1. Each task has the same weight (NICE_0_LOAD =
      1024).
      
      The current scheme would yield a cpu weight of 10240 (10*1024) for each cpu and
      the load balancer will think both CPUs are perfectly balanced and won't
      move around any tasks. This, however, would yield this bandwidth:
      
      	A = 50%
      	B = 25%
      	C = 25%
      
      which is not the desired result.
      
      What's changing in the patch?
      =============================
      
      	- How cpu weights are calculated when CONFIF_FAIR_GROUP_SCHED is
      	  defined (see below)
      	- API Change
      		- Two tunables introduced in sysfs (under SCHED_DEBUG) to
      		  control the frequency at which the load balance monitor
      		  thread runs.
      
      The basic change made in this patch is how cpu weight (rq->load.weight) is
      calculated. Its now calculated as the summation of group weights on a cpu,
      rather than summation of task weights. Weight exerted by a group on a
      cpu is dependent on the shares allocated to it and also the number of
      tasks the group has on that cpu compared to the total number of
      (runnable) tasks the group has in the system.
      
      Let,
      	W(K,i)  = Weight of group K on cpu i
      	T(K,i)  = Task load present in group K's cfs_rq on cpu i
      	T(K)    = Total task load of group K across various cpus
      	S(K) 	= Shares allocated to group K
      	NRCPUS	= Number of online cpus in the scheduler domain to
      	 	  which group K is assigned.
      
      Then,
      	W(K,i) = S(K) * NRCPUS * T(K,i) / T(K)
      
      A load balance monitor thread is created at bootup, which periodically
      runs and adjusts group's weight on each cpu. To avoid its overhead, two
      min/max tunables are introduced (under SCHED_DEBUG) to control the rate
      at which it runs.
      
      Fixes from: Peter Zijlstra <a.p.zijlstra@chello.nl>
      
      - don't start the load_balance_monitor when there is only a single cpu.
      - rename the kthread because its currently longer than TASK_COMM_LEN
      Signed-off-by: NSrivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      6b2d7700
  5. 18 12月, 2007 3 次提交
    • E
      sched: sysctl, proc_dointvec_minmax() expects int values for · 73c4efd2
      Eric Dumazet 提交于
      min_sched_granularity_ns, max_sched_granularity_ns,
      min_wakeup_granularity_ns and max_wakeup_granularity_ns are declared
      "unsigned long".
      
      This is incorrect since proc_dointvec_minmax() expects plain "int" guard
      values.
      
      This bug only triggers on big endian 64 bit arches.
      Signed-off-by: NEric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      73c4efd2
    • N
      Revert "hugetlb: Add hugetlb_dynamic_pool sysctl" · 368d2c63
      Nishanth Aravamudan 提交于
      This reverts commit 54f9f80d ("hugetlb:
      Add hugetlb_dynamic_pool sysctl")
      
      Given the new sysctl nr_overcommit_hugepages, the boolean dynamic pool
      sysctl is not needed, as its semantics can be expressed by 0 in the
      overcommit sysctl (no dynamic pool) and non-0 in the overcommit sysctl
      (pool enabled).
      
      (Needed in 2.6.24 since it reverts a post-2.6.23 userspace-visible change)
      Signed-off-by: NNishanth Aravamudan <nacc@us.ibm.com>
      Acked-by: NAdam Litke <agl@us.ibm.com>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Cc: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      368d2c63
    • N
      hugetlb: introduce nr_overcommit_hugepages sysctl · d1c3fb1f
      Nishanth Aravamudan 提交于
      hugetlb: introduce nr_overcommit_hugepages sysctl
      
      While examining the code to support /proc/sys/vm/hugetlb_dynamic_pool, I
      became convinced that having a boolean sysctl was insufficient:
      
      1) To support per-node control of hugepages, I have previously submitted
      patches to add a sysfs attribute related to nr_hugepages. However, with
      a boolean global value and per-mount quota enforcement constraining the
      dynamic pool, adding corresponding control of the dynamic pool on a
      per-node basis seems inconsistent to me.
      
      2) Administration of the hugetlb dynamic pool with multiple hugetlbfs
      mount points is, arguably, more arduous than it needs to be. Each quota
      would need to be set separately, and the sum would need to be monitored.
      
      To ease the administration, and to help make the way for per-node
      control of the static & dynamic hugepage pool, I added a separate
      sysctl, nr_overcommit_hugepages. This value serves as a high watermark
      for the overall hugepage pool, while nr_hugepages serves as a low
      watermark. The boolean sysctl can then be removed, as the condition
      
      	nr_overcommit_hugepages > 0
      
      indicates the same administrative setting as
      
      	hugetlb_dynamic_pool == 1
      
      Quotas still serve as local enforcement of the size of the pool on a
      per-mount basis.
      
      A few caveats:
      
      1) There is a race whereby the global surplus huge page counter is
      incremented before a hugepage has allocated. Another process could then
      try grow the pool, and fail to convert a surplus huge page to a normal
      huge page and instead allocate a fresh huge page. I believe this is
      benign, as no memory is leaked (the actual pages are still tracked
      correctly) and the counters won't go out of sync.
      
      2) Shrinking the static pool while a surplus is in effect will allow the
      number of surplus huge pages to exceed the overcommit value. As long as
      this condition holds, however, no more surplus huge pages will be
      allowed on the system until one of the two sysctls are increased
      sufficiently, or the surplus huge pages go out of use and are freed.
      
      Successfully tested on x86_64 with the current libhugetlbfs snapshot,
      modified to use the new sysctl.
      Signed-off-by: NNishanth Aravamudan <nacc@us.ibm.com>
      Acked-by: NAdam Litke <agl@us.ibm.com>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Cc: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d1c3fb1f
  6. 06 12月, 2007 1 次提交
    • P
      Avoid potential NULL dereference in unregister_sysctl_table · f1dad166
      Pavel Emelyanov 提交于
      register_sysctl_table() can return NULL sometimes, e.g.  when kmalloc()
      returns NULL or when sysctl check fails.
      
      I've also noticed, that many (most?) code in the kernel doesn't check for
      the return value from register_sysctl_table() and later simply calls the
      unregister_sysctl_table() with potentially NULL argument.
      
      This is unlikely on a common kernel configuration, but in case we're
      dealing with modules and/or fault-injection support, there's a slight
      possibility of an OOPS.
      
      Changing all the users to check for return code from the registering does
      not look like a good solution - there are too many code doing this and
      failure in sysctl tables registration is not a good reason to abort module
      loading (in most of the cases).
      
      So I think, that we can just have this check in unregister_sysctl_table
      just to avoid accidental OOPS-es (actually, the unregister_sysctl_table()
      did exactly this, before the start_unregistering() appeared).
      Signed-off-by: NPavel Emelyanov <xemul@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f1dad166
  7. 15 11月, 2007 1 次提交
  8. 10 11月, 2007 3 次提交
  9. 20 10月, 2007 2 次提交
    • P
      pid namespaces: changes to show virtual ids to user · b488893a
      Pavel Emelyanov 提交于
      This is the largest patch in the set. Make all (I hope) the places where
      the pid is shown to or get from user operate on the virtual pids.
      
      The idea is:
       - all in-kernel data structures must store either struct pid itself
         or the pid's global nr, obtained with pid_nr() call;
       - when seeking the task from kernel code with the stored id one
         should use find_task_by_pid() call that works with global pids;
       - when showing pid's numerical value to the user the virtual one
         should be used, but however when one shows task's pid outside this
         task's namespace the global one is to be used;
       - when getting the pid from userspace one need to consider this as
         the virtual one and use appropriate task/pid-searching functions.
      
      [akpm@linux-foundation.org: build fix]
      [akpm@linux-foundation.org: nuther build fix]
      [akpm@linux-foundation.org: yet nuther build fix]
      [akpm@linux-foundation.org: remove unneeded casts]
      Signed-off-by: NPavel Emelyanov <xemul@openvz.org>
      Signed-off-by: NAlexey Dobriyan <adobriyan@openvz.org>
      Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Paul Menage <menage@google.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b488893a
    • S
      pid namespaces: define is_global_init() and is_container_init() · b460cbc5
      Serge E. Hallyn 提交于
      is_init() is an ambiguous name for the pid==1 check.  Split it into
      is_global_init() and is_container_init().
      
      A cgroup init has it's tsk->pid == 1.
      
      A global init also has it's tsk->pid == 1 and it's active pid namespace
      is the init_pid_ns.  But rather than check the active pid namespace,
      compare the task structure with 'init_pid_ns.child_reaper', which is
      initialized during boot to the /sbin/init process and never changes.
      
      Changelog:
      
      	2.6.22-rc4-mm2-pidns1:
      	- Use 'init_pid_ns.child_reaper' to determine if a given task is the
      	  global init (/sbin/init) process. This would improve performance
      	  and remove dependence on the task_pid().
      
      	2.6.21-mm2-pidns2:
      
      	- [Sukadev Bhattiprolu] Changed is_container_init() calls in {powerpc,
      	  ppc,avr32}/traps.c for the _exception() call to is_global_init().
      	  This way, we kill only the cgroup if the cgroup's init has a
      	  bug rather than force a kernel panic.
      
      [akpm@linux-foundation.org: fix comment]
      [sukadev@us.ibm.com: Use is_global_init() in arch/m32r/mm/fault.c]
      [bunk@stusta.de: kernel/pid.c: remove unused exports]
      [sukadev@us.ibm.com: Fix capability.c to work with threaded init]
      Signed-off-by: NSerge E. Hallyn <serue@us.ibm.com>
      Signed-off-by: NSukadev Bhattiprolu <sukadev@us.ibm.com>
      Acked-by: NPavel Emelianov <xemul@openvz.org>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Cedric Le Goater <clg@fr.ibm.com>
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Cc: Herbert Poetzel <herbert@13thfloor.at>
      Cc: Kirill Korotaev <dev@sw.ru>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b460cbc5
  10. 19 10月, 2007 9 次提交
    • A
      V3 file capabilities: alter behavior of cap_setpcap · 72c2d582
      Andrew Morgan 提交于
      The non-filesystem capability meaning of CAP_SETPCAP is that a process, p1,
      can change the capabilities of another process, p2.  This is not the
      meaning that was intended for this capability at all, and this
      implementation came about purely because, without filesystem capabilities,
      there was no way to use capabilities without one process bestowing them on
      another.
      
      Since we now have a filesystem support for capabilities we can fix the
      implementation of CAP_SETPCAP.
      
      The most significant thing about this change is that, with it in effect, no
      process can set the capabilities of another process.
      
      The capabilities of a program are set via the capability convolution
      rules:
      
         pI(post-exec) = pI(pre-exec)
         pP(post-exec) = (X(aka cap_bset) & fP) | (pI(post-exec) & fI)
         pE(post-exec) = fE ? pP(post-exec) : 0
      
      at exec() time.  As such, the only influence the pre-exec() program can
      have on the post-exec() program's capabilities are through the pI
      capability set.
      
      The correct implementation for CAP_SETPCAP (and that enabled by this patch)
      is that it can be used to add extra pI capabilities to the current process
      - to be picked up by subsequent exec()s when the above convolution rules
      are applied.
      
      Here is how it works:
      
      Let's say we have a process, p. It has capability sets, pE, pP and pI.
      Generally, p, can change the value of its own pI to pI' where
      
         (pI' & ~pI) & ~pP = 0.
      
      That is, the only new things in pI' that were not present in pI need to
      be present in pP.
      
      The role of CAP_SETPCAP is basically to permit changes to pI beyond
      the above:
      
         if (pE & CAP_SETPCAP) {
            pI' = anything; /* ie., even (pI' & ~pI) & ~pP != 0  */
         }
      
      This capability is useful for things like login, which (say, via
      pam_cap) might want to raise certain inheritable capabilities for use
      by the children of the logged-in user's shell, but those capabilities
      are not useful to or needed by the login program itself.
      
      One such use might be to limit who can run ping. You set the
      capabilities of the 'ping' program to be "= cap_net_raw+i", and then
      only shells that have (pI & CAP_NET_RAW) will be able to run
      it. Without CAP_SETPCAP implemented as described above, login(pam_cap)
      would have to also have (pP & CAP_NET_RAW) in order to raise this
      capability and pass it on through the inheritable set.
      Signed-off-by: NAndrew Morgan <morgan@kernel.org>
      Signed-off-by: NSerge E. Hallyn <serue@us.ibm.com>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Cc: James Morris <jmorris@namei.org>
      Cc: Casey Schaufler <casey@schaufler-ca.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      72c2d582
    • E
      sysctl: deprecate sys_sysctl in a user space visible fashion. · 7058cb02
      Eric W. Biederman 提交于
      After adding checking to register_sysctl_table and finding a whole new set
      of bugs.  Missed by countless code reviews and testers I have finally lost
      patience with the binary sysctl interface.
      
      The binary sysctl interface has been sort of deprecated for years and
      finding a user space program that uses the syscall is more difficult then
      finding a needle in a haystack.  Problems continue to crop up, with the in
      kernel implementation.  So since supporting something that no one uses is
      silly, deprecate sys_sysctl with a sufficient grace period and notice that
      the handful of user space applications that care can be fixed or replaced.
      
      The /proc/sys sysctl interface that people use will continue to be
      supported indefinitely.
      
      This patch moves the tested warning about sysctls from the path where
      sys_sysctl to a separate path called from both implementations of
      sys_sysctl, and it adds a proper entry into
      Documentation/feature-removal-schedule.
      
      Allowing us to revisit this in a couple years time and actually kill
      sys_sysctl.
      
      [lethal@linux-sh.org: sysctl: Fix syscall disabled build]
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NPaul Mundt <lethal@linux-sh.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7058cb02
    • E
      sysctl: Error on bad sysctl tables · fc6cd25b
      Eric W. Biederman 提交于
      After going through the kernels sysctl tables several times it has become
      clear that code review and testing is just not effective in prevent
      problematic sysctl tables from being used in the stable kernel.  I certainly
      can't seem to fix the problems as fast as they are introduced.
      
      Therefore this patch adds sysctl_check_table which is called when a sysctl
      table is registered and checks to see if we have a problematic sysctl table.
      
      The biggest part of the code is the table of valid binary sysctl entries, but
      since we have frozen our set of binary sysctls this table should not need to
      change, and it makes it much easier to detect when someone unintentionally
      adds a new binary sysctl value.
      
      As best as I can determine all of the several hundred errors spewed on boot up
      now are legitimate.
      
      [bunk@kernel.org: kernel/sysctl_check.c must #include <linux/string.h>]
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Cc: Alexey Dobriyan <adobriyan@sw.ru>
      Signed-off-by: NAdrian Bunk <bunk@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      fc6cd25b
    • E
      sysctl: remove the cad_pid binary sysctl path · c65f9239
      Eric W. Biederman 提交于
      It looks like we inadvertently killed the cad_pid binary sysctl support when
      cap_pid was changed to be a struct pid.  Since no one has complained just
      remove the binary path.
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c65f9239
    • E
      sysctl: simplify the pty sysctl logic · 35834ca1
      Eric W. Biederman 提交于
      Instead of having a bunch of ifdefs in sysctl.c move all of the pty sysctl
      logic into drivers/char/pty.c
      
      As well as cleaning up the logic this prevents sysctl_check_table from
      complaining that the root table has a NULL data pointer on something with
      generic methods.
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      35834ca1
    • E
      sysctl: remove the binary interface for aio-nr, aio-max-nr, acpi_video_flags · 0d135a4a
      Eric W. Biederman 提交于
      aio-nr, aio-max-nr, acpi_video_flags are unsigned long values which sysctl
      does not handle properly with a 64bit kernel and a 32bit user space.
      
      Since no one is likely to be using the binary sysctl values and the ascii
      interface still works, this patch just removes support for the binary sysctl
      interface from the kernel.
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Cc: Alexey Dobriyan <adobriyan@sw.ru>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Cc: Zach Brown <zach.brown@oracle.com>
      Cc: Badari Pulavarty <pbadari@us.ibm.com>
      Cc: Len Brown <lenb@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0d135a4a
    • E
      sysctl: remove binary sysctl support where it clearly doesn't work · f5ead5ce
      Eric W. Biederman 提交于
      These functions are all wrapper functions for the proc interface that are
      needed for them to work correctly.
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Cc: Alexey Dobriyan <adobriyan@sw.ru>
      Acked-by: NAndrew Morgan <morgan@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f5ead5ce
    • E
      sysctl: Factor out sysctl_data. · 49a0c458
      Eric W. Biederman 提交于
      There as been no easy way to wrap the default sysctl strategy routine except
      for returning 0.  Which is not always what we want.  The few instances I have
      seen that want different behaviour have written their own version of
      sysctl_data.  While not too hard it is unnecessary code and has the potential
      for extra bugs.
      
      So to make these situations easier and make that part of sysctl more symetric
      I have factord sysctl_data out of do_sysctl_strategy and exported as a
      function everyone can use.
      
      Further having sysctl_data be an explicit function makes checking for badly
      formed sysctl tables much easier.
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Cc: Alexey Dobriyan <adobriyan@sw.ru>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      49a0c458
    • E
      sysctl core: Stop using the unnecessary ctl_table typedef · d8217f07
      Eric W. Biederman 提交于
      In sysctl.h the typedef struct ctl_table ctl_table violates coding style isn't
      needed and is a bit of a nuisance because it makes it harder to recognize
      ctl_table is a type name.
      
      So this patch removes it from the generic sysctl code.  Hopefully I will have
      enough energy to send the rest of my patches will follow and to remove it from
      the rest of the kernel.
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Cc: Alexey Dobriyan <adobriyan@sw.ru>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d8217f07
  11. 17 10月, 2007 4 次提交
  12. 15 10月, 2007 4 次提交
  13. 14 10月, 2007 1 次提交
    • A
      minimal build fixes for uml (fallout from x86 merge) · 2b8232ce
      Al Viro 提交于
       a) include/asm-um/arch can't just point to include/asm-$(SUBARCH) now
       b) arch/{i386,x86_64}/crypto are merged now
       c) subarch-obj needed changes
       d) cpufeature_64.h should pull "cpufeature_32.h", not <asm/cpufeature_32.h>
          since it can be included from asm-um/cpufeature.h
       e) in case of uml-i386 we need CONFIG_X86_32 for make and gcc, but not
          for Kconfig
       f) sysctl.c shouldn't do vdso_enabled for uml-i386 (actually, that one
          should be registered from corresponding arch/*/kernel/*, with ifdef
          going away; that's a separate patch, though).
      
      With that and with Stephen's patch ("[PATCH net-2.6] uml: hard_header fix")
      we have uml allmodconfig building both on i386 and amd64.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2b8232ce
  14. 12 10月, 2007 1 次提交