1. 09 Feb, 2008 14 commits
    • fix group stop with exit race · d12619b5
      Committed by Oleg Nesterov
      do_signal_stop() counts all sub-threads and sets ->group_stop_count
      accordingly.  Every thread should decrement ->group_stop_count and stop,
      the last one should notify the parent.
      
      However a sub-thread can exit before it notices the signal_pending(), or it
      may be somewhere in do_exit() already.  In that case the group stop never
      finishes properly.
      
      Note: this is a minimal fix, we can add some optimizations later.  Say we
      can return quickly if thread_group_empty().  Also, we can move some signal
      related code from exit_notify() to exit_signals().
      Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
      Acked-by: Davide Libenzi <davidel@xmailserver.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Roland McGrath <roland@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
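The counting problem described in the group stop commit can be modelled in user space. The following is a minimal sketch, not the kernel code: `struct thread_group`, `begin_group_stop()`, `participate()`, `thread_stops()` and `thread_exits()` are hypothetical names invented for illustration, and the `account_on_exit` flag merely toggles between the racy behaviour and the idea behind the fix (an exiting thread also checks in).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the per-process signal state. */
struct thread_group {
    int live_threads;      /* threads that have not exited yet */
    int group_stop_count;  /* threads that still have to take part in the stop */
    bool parent_notified;  /* set once the last participant has checked in */
};

/* Models the start of a group stop: every live thread is expected to check in. */
static void begin_group_stop(struct thread_group *g)
{
    g->group_stop_count = g->live_threads;
    g->parent_notified = false;
}

/* One thread takes part in the stop; the last one notifies the parent. */
static void participate(struct thread_group *g)
{
    if (g->group_stop_count > 0 && --g->group_stop_count == 0)
        g->parent_notified = true;
}

/* A thread that notices the pending stop and stops normally. */
static void thread_stops(struct thread_group *g)
{
    participate(g);
}

/*
 * A thread that exits while a group stop is in progress.
 * account_on_exit == false models the race (the thread never checks in);
 * account_on_exit == true models the idea of the fix.
 */
static void thread_exits(struct thread_group *g, bool account_on_exit)
{
    assert(g->live_threads > 0);
    if (account_on_exit)
        participate(g);
    g->live_threads--;
}
```

In this model, a group of three threads where two stop and one exits without checking in is left with a non-zero count and no parent notification, which is the "group stop never finishes" symptom from the message.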
    • teach set_special_pids() to use struct pid · 8520d7c7
      Committed by Oleg Nesterov
      Change set_special_pids() to work with struct pid, not a pid_t from the
      global namespace. This again speeds up and, IMHO, cleans up the code, and
      it also prepares for the next patch.
      Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
      Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Acked-by: Pavel Emelyanov <xemul@openvz.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kill PT_ATTACHED · 6b39c7bf
      Committed by Oleg Nesterov
      Since the patch
      
      	"Fix ptrace_attach()/ptrace_traceme()/de_thread() race"
      	commit f5b40e36
      
      we set PT_ATTACHED and change child->parent "atomically" with respect to
      tasklist_lock.
      
      This means we can remove the checks like "PT_ATTACHED && ->parent != ptracer"
      which were needed to catch the "ptrace attach is in progress" case.  We can
      also remove the flag itself since nobody else uses it.
      Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
      Acked-by: Roland McGrath <roland@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • IPC: consolidate sem_exit_ns(), msg_exit_ns() and shm_exit_ns() · 01b8b07a
      Committed by Pierre Peiffer
      sem_exit_ns(), msg_exit_ns() and shm_exit_ns() are all called when an
      ipc_namespace is released to free all ipcs of each type.  But in fact, they
      do the same thing: they loop around all ipcs to free them individually by
      calling a specific routine.
      
      This patch proposes to consolidate this by introducing a common function,
      free_ipcs(), that does the job.  The specific routine to call on each
      individual ipc is passed as a parameter.  For this, these ipc-specific
      'free' routines are reworked to take a generic 'struct ipc_perm' as a
      parameter.
      Signed-off-by: Pierre Peiffer <pierre.peiffer@bull.net>
      Cc: Cedric Le Goater <clg@fr.ibm.com>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Cc: Nadia Derbey <Nadia.Derbey@bull.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
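The consolidation described in the free_ipcs() commit, one shared loop that takes the type-specific 'free' routine as a parameter, might look roughly like this in user space. This is a sketch under stated assumptions rather than the kernel implementation: the fixed-size table replaces the kernel's idr, the `free_ipcs()` signature shown here is simplified, and `add_ipc()`, `MAX_IPCS` and the counters are hypothetical helpers added for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Generic header embedded as the first member of every IPC object (model). */
struct ipc_perm {
    int id;
};

struct msg_queue { struct ipc_perm q_perm;   int q_qnum; };
struct sem_array { struct ipc_perm sem_perm; int sem_nsems; };

#define MAX_IPCS 8  /* hypothetical limit; the kernel uses an idr instead */

/* Simplified stand-in for struct ipc_ids. */
struct ipc_ids {
    int in_use;
    struct ipc_perm *entries[MAX_IPCS];
};

/* Hypothetical helper: store an object in the first free slot. */
static int add_ipc(struct ipc_ids *ids, struct ipc_perm *perm)
{
    assert(perm != NULL);
    for (int i = 0; i < MAX_IPCS; i++) {
        if (ids->entries[i] == NULL) {
            perm->id = i;
            ids->entries[i] = perm;
            ids->in_use++;
            return i;
        }
    }
    return -1;  /* table full */
}

/* The common loop: walk every slot and hand each object to free_one(). */
static void free_ipcs(struct ipc_ids *ids, void (*free_one)(struct ipc_perm *))
{
    for (int i = 0; i < MAX_IPCS && ids->in_use > 0; i++) {
        struct ipc_perm *perm = ids->entries[i];
        if (perm == NULL)
            continue;
        ids->entries[i] = NULL;
        ids->in_use--;
        free_one(perm);
    }
}

static int freed_queues, freed_sems;

/* Type-specific routines take the generic header and recover their object. */
static void free_msg_queue(struct ipc_perm *perm)
{
    /* q_perm is the first member, so this cast yields the containing struct */
    struct msg_queue *q = (struct msg_queue *)perm;
    freed_queues++;
    free(q);
}

static void free_sem_array(struct ipc_perm *perm)
{
    struct sem_array *s = (struct sem_array *)perm;
    freed_sems++;
    free(s);
}
```

A caller would then release each table with the matching routine, for example `free_ipcs(&msg_ids, free_msg_queue)`, instead of repeating the loop once per IPC type.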
    • IPC: make struct ipc_ids static in ipc_namespace · ed2ddbf8
      Committed by Pierre Peiffer
      Each ipc_namespace contains a table of 3 pointers to struct ipc_ids (one
      each for msg, sem and shm; this structure stores all ipcs).  These
      'struct ipc_ids' are dynamically allocated for each ipc_namespace, as is
      the ipc_namespace itself (for the init namespace, they are initialized
      with pointers to static variables instead).
      
      This is so for historical reasons: before idr was used to store the
      ipcs, they were stored in tables of variable length, depending on the
      maximum number of ipcs allowed.  Now these 'struct ipc_ids' have a fixed
      size.  As they are allocated in any case for each new ipc_namespace,
      nothing is gained in memory by allocating them separately from the
      struct ipc_namespace.
      
      This patch proposes to make this table static in the struct ipc_namespace.
      Thus, we can allocate everything at once and get rid of all the code
      needed to allocate and free these ipc_ids separately.
      Signed-off-by: Pierre Peiffer <pierre.peiffer@bull.net>
      Acked-by: Cedric Le Goater <clg@fr.ibm.com>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Cc: Nadia Derbey <Nadia.Derbey@bull.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fix "modules: make module_address_lookup() safe" · 92dfc9dc
      Committed by Andrew Morton
      Get the constness right, avoid nasty cast.
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • intel-iommu: fault_reason index cleanup · d94afc6c
      Committed by mark gross
      Fix an off-by-one bug in the fault reason string reporting function, and
      clean up some of the code around this buglet.
      
      [akpm@linux-foundation.org: cleanup]
      Signed-off-by: mark gross <mgross@linux.intel.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • proc: fix ->open'less usage due to ->proc_fops flip · 2d3a4e36
      Committed by Alexey Dobriyan
      Typical PDE creation code looks like:
      
      	pde = create_proc_entry("foo", 0, NULL);
      	if (pde)
      		pde->proc_fops = &foo_proc_fops;
      
      Notice that the PDE is created first, and only then is ->proc_fops set up
      to its final value. This is a problem because right after creation
      a) the PDE is fully visible in /proc, and
      b) ->proc_fops are proc_file_operations, which do not have an ->open
         callback. So, it's possible to ->read without ->open (see one class of
         oopses below).
      
      The fix is a new API called proc_create(), which makes sure ->proc_fops
      are set up before gluing the PDE to the main tree. Typical new code looks like:
      
      	pde = proc_create("foo", 0, NULL, &foo_proc_fops);
      	if (!pde)
      		return -ENOMEM;
      
      Fix most networking users for a start.
      
      In the long run, create_proc_entry() for regular files will go.
      
      BUG: unable to handle kernel NULL pointer dereference at virtual address 00000024
      printing eip: c1188c1b *pdpt = 000000002929e001 *pde = 0000000000000000
      Oops: 0002 [#1] PREEMPT SMP DEBUG_PAGEALLOC
      last sysfs file: /sys/block/sda/sda1/dev
      Modules linked in: foo af_packet ipv6 cpufreq_ondemand loop serio_raw psmouse k8temp hwmon sr_mod cdrom
      
      Pid: 24679, comm: cat Not tainted (2.6.24-rc3-mm1 #2)
      EIP: 0060:[<c1188c1b>] EFLAGS: 00210002 CPU: 0
      EIP is at mutex_lock_nested+0x75/0x25d
      EAX: 000006fe EBX: fffffffb ECX: 00001000 EDX: e9340570
      ESI: 00000020 EDI: 00200246 EBP: e9340570 ESP: e8ea1ef8
       DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
      Process cat (pid: 24679, ti=E8EA1000 task=E9340570 task.ti=E8EA1000)
      Stack: 00000000 c106f7ce e8ee05b4 00000000 00000001 458003d0 f6fb6f20 fffffffb
             00000000 c106f7aa 00001000 c106f7ce 08ae9000 f6db53f0 00000020 00200246
             00000000 00000002 00000000 00200246 00200246 e8ee05a0 fffffffb e8ee0550
      Call Trace:
       [<c106f7ce>] seq_read+0x24/0x28a
       [<c106f7aa>] seq_read+0x0/0x28a
       [<c106f7ce>] seq_read+0x24/0x28a
       [<c106f7aa>] seq_read+0x0/0x28a
       [<c10818b8>] proc_reg_read+0x60/0x73
       [<c1081858>] proc_reg_read+0x0/0x73
       [<c105a34f>] vfs_read+0x6c/0x8b
       [<c105a6f3>] sys_read+0x3c/0x63
       [<c10025f2>] sysenter_past_esp+0x5f/0xa5
       [<c10697a7>] destroy_inode+0x24/0x33
       =======================
      INFO: lockdep is turned off.
      Code: 75 21 68 e1 1a 19 c1 68 87 00 00 00 68 b8 e8 1f c1 68 25 73 1f c1 e8 84 06 e9 ff e8 52 b8 e7 ff 83 c4 10 9c 5f fa e8 28 89 ea ff <f0> fe 4e 04 79 0a f3 90 80 7e 04 00 7e f8 eb f0 39 76 34 74 33
      EIP: [<c1188c1b>] mutex_lock_nested+0x75/0x25d SS:ESP 0068:e8ea1ef8
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
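The ordering problem behind proc_create() can be illustrated with a small user-space model. This is only a sketch under assumptions: `struct entry`, `publish()`, `create_entry()`, `create_entry_with_fops()` and the `on_publish` hook are hypothetical names, not procfs APIs, and the hook stands in for a reader that races with entry creation.

```c
#include <assert.h>
#include <stddef.h>

struct file_ops {
    int (*open)(void);
    int (*read)(void);
};

struct entry {
    const char *name;
    const struct file_ops *fops;
};

static int default_read(void) { return -1; }

/* Placeholder ops, like proc_file_operations in the message: no ->open. */
static const struct file_ops default_fops = { .open = NULL, .read = default_read };

#define MAX_ENTRIES 16
static struct entry *tree[MAX_ENTRIES];
static int tree_len;

/* Hypothetical hook run at the instant an entry becomes visible;
 * it plays the role of a concurrent reader. */
static void (*on_publish)(const struct entry *);

static int publish(struct entry *e)
{
    if (tree_len == MAX_ENTRIES)
        return -1;
    tree[tree_len++] = e;
    if (on_publish)
        on_publish(e);
    return 0;
}

/* Old pattern: the entry is linked with placeholder ops, and the caller
 * assigns the real ops afterwards. */
static int create_entry(struct entry *e, const char *name)
{
    e->name = name;
    e->fops = &default_fops;
    return publish(e);
}

/* New pattern: the final ops are in place before the entry is visible. */
static int create_entry_with_fops(struct entry *e, const char *name,
                                  const struct file_ops *fops)
{
    assert(fops != NULL);
    e->name = name;
    e->fops = fops;
    return publish(e);
}

/* Sample ops and an observer that counts entries published without ->open. */
static int foo_open(void) { return 0; }
static int foo_read(void) { return 0; }
static const struct file_ops foo_fops = { .open = foo_open, .read = foo_read };

static int saw_missing_open;

static void record_missing_open(const struct entry *e)
{
    if (e->fops->open == NULL)
        saw_missing_open++;
}
```

With `on_publish` set to `record_missing_open`, the old pattern exposes an entry whose ops lack `->open` for a moment, while the second pattern never does; that window is what the oops in the commit message hits.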
    • proc: seqfile convert proc_pid_status to properly handle pid namespaces · df5f8314
      Committed by Eric W. Biederman
      Currently we may look up the pid in the wrong pid namespace.  So convert
      proc_pid_status to seq_file, which ensures the proper pid namespace is
      passed in.
      
      [akpm@linux-foundation.org: coding-style fixes]
      [akpm@linux-foundation.org: build fix]
      [akpm@linux-foundation.org: another build fix]
      [akpm@linux-foundation.org: s390 build fix]
      [akpm@linux-foundation.org: fix task_name() output]
      [akpm@linux-foundation.org: fix nommu build]
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Andrew Morgan <morgan@kernel.org>
      Cc: Serge Hallyn <serue@us.ibm.com>
      Cc: Cedric Le Goater <clg@fr.ibm.com>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Paul Menage <menage@google.com>
      Cc: Paul Jackson <pj@sgi.com>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • proc: implement proc_single_file_operations · be614086
      Committed by Eric W. Biederman
      Currently many /proc/pid files use a crufty precursor to the current seq_file
      API, and they don't have direct access to the pid_namespace or the pid for
      which they are displaying data.
      
      So implement proc_single_file_operations to make the seq_file routines easy to
      use, and to give access to the full state of the pid we are displaying data
      for.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • namespaces: cleanup the code managed with PID_NS option · 74bd59bb
      Committed by Pavel Emelyanov
      Just like with the user namespaces, move the namespace management code into
      a separate .c file and mark the (already existing) PID_NS option as "depends
      on NAMESPACES".
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Acked-by: Serge Hallyn <serue@us.ibm.com>
      Cc: Cedric Le Goater <clg@fr.ibm.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Kirill Korotaev <dev@sw.ru>
      Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • namespaces: move the IPC namespace under IPC_NS option · ae5e1b22
      Committed by Pavel Emelyanov
      Currently the IPC namespace management code is spread over the ipc/*.c files.
      I moved this code into ipc/namespace.c file which is compiled out when needed.
      
      The linux/ipc_namespace.h file is used to store the prototypes of the
      functions in namespace.c and the stubs for the NAMESPACES=n case.  This is
      done because the stub for copy_ipc_namespace requires knowledge of the
      CLONE_NEWIPC flag, which is in sched.h.  But the linux/ipc.h file itself is
      included in a great many .c files via the sys.h->sem.h sequence, so adding
      sched.h to it would make all these .c files depend on sched.h, which is not
      that good.  On the other hand, knowledge of the namespaces stuff is required
      in only 4 .c files.
      
      Besides, this patch compiles out some auxiliary functions from the ipc/sem.c,
      msg.c and shm.c files.  It turned out that moving these functions into
      namespace.c is not that easy because they use many other calls and macros
      from the original files.  Moving them would make this patch complicated.  On
      the other hand, all these functions can be consolidated, so I will send a
      separate patch doing this a bit later.
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Acked-by: Serge Hallyn <serue@us.ibm.com>
      Cc: Cedric Le Goater <clg@fr.ibm.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Kirill Korotaev <dev@sw.ru>
      Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • namespaces: move the UTS namespace under UTS_NS option · 58bfdd6d
      Committed by Pavel Emelyanov
      Currently all the namespace management code is in the kernel/utsname.c file,
      so just compile it out and make stubs in the appropriate header.
      
      The init namespace itself is in init/version.c and is in the kernel all the
      time.
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Acked-by: Serge Hallyn <serue@us.ibm.com>
      Cc: Cedric Le Goater <clg@fr.ibm.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Kirill Korotaev <dev@sw.ru>
      Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hugetlb: add locking for overcommit sysctl · a3d0c6aa
      Committed by Nishanth Aravamudan
      When I replaced hugetlb_dynamic_pool with nr_overcommit_hugepages I used
      proc_doulongvec_minmax() directly.  However, hugetlb.c's locking rules
      require that all counter modifications occur under the hugetlb_lock.  Add a
      callback into the hugetlb code similar to the one for nr_hugepages.  Grab
      the lock around the manipulation of nr_overcommit_hugepages in
      proc_doulongvec_minmax().
      Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
      Acked-by: Adam Litke <agl@us.ibm.com>
      Cc: David Gibson <david@gibson.dropbear.id.au>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 08 Feb, 2008 26 commits