1. 26 2月, 2009 2 次提交
  2. 20 2月, 2009 2 次提交
    • A
      [IA64] Remove redundant cpu_clear() in __cpu_disable path · c0acdea2
      Alex Chiang 提交于
      The second call to cpu_clear() is redundant, as we've already removed
      the CPU from cpu_online_map before calling migrate_platform_irqs().
      Signed-off-by: NAlex Chiang <achiang@hp.com>
      Signed-off-by: NTony Luck <aegl@agluck-desktop.(none)>
      c0acdea2
    • A
      [IA64] Revert "prevent ia64 from invoking irq handlers on offline CPUs" · 66db2e63
      Alex Chiang 提交于
      This reverts commit e7b14036.
      
      Commit e7b14036 removes the targetted disabled CPU from the
      cpu_online_map after calls to migrate_platform_irqs and fixup_irqs.
      
      Paul McKenney states that the reasoning behind the patch was to
      prevent irq handlers from running on CPUs marked offline because:
      
      	RCU happily ignores CPUs that don't have their bits set in
      	cpu_online_map, so if there are RCU read-side critical sections
      	in the irq handlers being run, RCU will ignore them.  If the
      	other CPUs were running, they might sequence through the RCU
      	state machine, which could result in data structures being
      	yanked out from under those irq handlers, which in turn could
      	result in oopses or worse.
      
      Unfortunately, both ia64 functions above look at cpu_online_map to find
      a new CPU to migrate interrupts onto. This means we can potentially
      migrate an interrupt off ourself back to... ourself. Uh oh.
      
      This causes an oops when we finally try to process pending interrupts on
      the CPU we want to disable. The oops results from calling __do_IRQ with
      a NULL pt_regs:
      
      Unable to handle kernel NULL pointer dereference (address 0000000000000040)
      Call Trace:
       [<a000000100016930>] show_stack+0x50/0xa0
                                      sp=e0000009c922fa00 bsp=e0000009c92214d0
       [<a0000001000171a0>] show_regs+0x820/0x860
                                      sp=e0000009c922fbd0 bsp=e0000009c9221478
       [<a00000010003c700>] die+0x1a0/0x2e0
                                      sp=e0000009c922fbd0 bsp=e0000009c9221438
       [<a0000001006e92f0>] ia64_do_page_fault+0x950/0xa80
                                      sp=e0000009c922fbd0 bsp=e0000009c92213d8
       [<a00000010000c7a0>] ia64_native_leave_kernel+0x0/0x270
                                      sp=e0000009c922fc60 bsp=e0000009c92213d8
       [<a0000001000ecdb0>] profile_tick+0xd0/0x1c0
                                      sp=e0000009c922fe30 bsp=e0000009c9221398
       [<a00000010003bb90>] timer_interrupt+0x170/0x3e0
                                      sp=e0000009c922fe30 bsp=e0000009c9221330
       [<a00000010013a800>] handle_IRQ_event+0x80/0x120
                                      sp=e0000009c922fe30 bsp=e0000009c92212f8
       [<a00000010013aa00>] __do_IRQ+0x160/0x4a0
                                      sp=e0000009c922fe30 bsp=e0000009c9221290
       [<a000000100012290>] ia64_process_pending_intr+0x2b0/0x360
                                      sp=e0000009c922fe30 bsp=e0000009c9221208
       [<a0000001000112d0>] fixup_irqs+0xf0/0x2a0
                                      sp=e0000009c922fe30 bsp=e0000009c92211a8
       [<a00000010005bd80>] __cpu_disable+0x140/0x240
                                      sp=e0000009c922fe30 bsp=e0000009c9221168
       [<a0000001006c5870>] take_cpu_down+0x50/0xa0
                                      sp=e0000009c922fe30 bsp=e0000009c9221148
       [<a000000100122610>] stop_cpu+0xd0/0x200
                                      sp=e0000009c922fe30 bsp=e0000009c92210f0
       [<a0000001000e0440>] kthread+0xc0/0x140
                                      sp=e0000009c922fe30 bsp=e0000009c92210c8
       [<a000000100014ab0>] kernel_thread_helper+0xd0/0x100
                                      sp=e0000009c922fe30 bsp=e0000009c92210a0
       [<a00000010000a4c0>] start_kernel_thread+0x20/0x40
                                      sp=e0000009c922fe30 bsp=e0000009c92210a0
      
      I don't like this revert because it is fragile. ia64 is getting lucky
      because we seem to only ever process timer interrupts in this path, but
      if we ever race with an IPI here, we definitely use RCU and have the
      potential of hitting an oops that Paul describes above.
      
      Patching ia64's timer_interrupt() to check for NULL pt_regs is
      insufficient though, as we still hit the above oops.
      
      As a short term solution, I do think that this revert is the right
      answer. The revert hold up under repeated testing (24+ hour test runs)
      with this setup:
      
      	- 8-way rx6600
      	- randomly toggling CPU online/offline state every 2 seconds
      	- running CPU exercisers, memory hog, disk exercisers, and
      	  network stressors
      	- average system load around ~160
      
      In the long term, we really need to figure out why we set pt_regs = NULL
      in ia64_process_pending_intr(). If it turns out that it is unnecessary
      to do so, then we could safely re-introduce e7b14036 (along with some
      other logic to be smarter about migrating interrupts).
      
      One final note: x86 also removes the disabled CPU from cpu_online_map
      and then re-enables interrupts for 1ms, presumably to handle any pending
      interrupts:
      
      arch/x86/kernel/irq_32.c (and irq_64.c):
      cpu_disable_common:
      	[remove cpu from cpu_online_map]
      
      	fixup_irqs():
      		for_each_irq:
      			[break CPU affinities]
      
      		local_irq_enable();
      		mdelay(1);
      		local_irq_disable();
      
      So they are doing implicitly what ia64 is doing explicitly.
      Signed-off-by: NAlex Chiang <achiang@hp.com>
      Signed-off-by: NTony Luck <aegl@agluck-desktop.(none)>
      66db2e63
  3. 17 1月, 2009 1 次提交
  4. 16 1月, 2009 1 次提交
  5. 14 1月, 2009 1 次提交
  6. 09 1月, 2009 1 次提交
  7. 07 1月, 2009 2 次提交
  8. 06 1月, 2009 1 次提交
  9. 04 1月, 2009 2 次提交
    • M
      ia64: cpumask fix for is_affinity_mask_valid() · d3b66bf2
      Mike Travis 提交于
      Impact: cleanup
      
      The function prototype should use 'struct cpumask *' to declare
      cpumask arguments (instead of cpumask_var_t).
      
      Note: arch/ia64/kernel/irq.c still had the following "old cpumask_t" usages:
      
      105:	cpumask_t mask = CPU_MASK_NONE;
      107:	cpu_set(cpu_logical_id(hwid), mask);
      110:                 irq_desc[irq].affinity = mask;
      
      	... replaced with a simple "cpumask_of(cpu_logical_id(hwid))".
      
      161:			new_cpu = any_online_cpu(cpu_online_map);
      194:		time_keeper_id = first_cpu(cpu_online_map);
      
      	... replaced with cpu_online_mask refs.
      Signed-off-by: NMike Travis <travis@sgi.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      d3b66bf2
    • I
      ia64: cpumask fix for is_affinity_mask_valid() · 6bdf197b
      Ingo Molnar 提交于
      Impact: build fix on ia64
      
      ia64's default_affinity_write() still had old cpumask_t usage:
      
       /home/mingo/tip/kernel/irq/proc.c: In function `default_affinity_write':
       /home/mingo/tip/kernel/irq/proc.c:114: error: incompatible type for argument 1 of `is_affinity_mask_valid'
       make[3]: *** [kernel/irq/proc.o] Error 1
       make[3]: *** Waiting for unfinished jobs....
      
      update it to cpumask_var_t.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      6bdf197b
  10. 01 1月, 2009 2 次提交
  11. 31 12月, 2008 2 次提交
    • M
      [PATCH] idle cputime accounting · 79741dd3
      Martin Schwidefsky 提交于
      The cpu time spent by the idle process actually doing something is
      currently accounted as idle time. This is plain wrong, the architectures
      that support VIRT_CPU_ACCOUNTING=y can do better: distinguish between the
      time spent doing nothing and the time spent by idle doing work. The first
      is accounted with account_idle_time and the second with account_system_time.
      The architectures that use the account_xxx_time interface directly and not
      the account_xxx_ticks interface now need to do the check for the idle
      process in their arch code. In particular to improve the system vs true
      idle time accounting the arch code needs to measure the true idle time
      instead of just testing for the idle process.
      To improve the tick based accounting as well we would need an architecture
      primitive that can tell us if the pt_regs of the interrupted context
      points to the magic instruction that halts the cpu.
      
      In addition idle time is no more added to the stime of the idle process.
      This field now contains the system time of the idle process as it should
      be. On systems without VIRT_CPU_ACCOUNTING this will always be zero as
      every tick that occurs while idle is running will be accounted as idle
      time.
      
      This patch contains the necessary common code changes to be able to
      distinguish idle system time and true idle time. The architectures with
      support for VIRT_CPU_ACCOUNTING need some changes to exploit this.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      79741dd3
    • M
      [PATCH] fix scaled & unscaled cputime accounting · 457533a7
      Martin Schwidefsky 提交于
      The utimescaled / stimescaled fields in the task structure and the
      global cpustat should be set on all architectures. On s390 the calls
      to account_user_time_scaled and account_system_time_scaled never have
      been added. In addition system time that is accounted as guest time
      to the user time of a process is accounted to the scaled system time
      instead of the scaled user time.
      To fix the bugs and to prevent future forgetfulness this patch merges
      account_system_time_scaled into account_system_time and
      account_user_time_scaled into account_user_time.
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Cc: Michael Neuling <mikey@neuling.org>
      Acked-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      457533a7
  12. 26 12月, 2008 1 次提交
  13. 13 12月, 2008 3 次提交
    • R
      cpumask: make irq_set_affinity() take a const struct cpumask · 0de26520
      Rusty Russell 提交于
      Impact: change existing irq_chip API
      
      Not much point with gentle transition here: the struct irq_chip's
      setaffinity method signature needs to change.
      
      Fortunately, not widely used code, but hits a few architectures.
      
      Note: In irq_select_affinity() I save a temporary in by mangling
      irq_desc[irq].affinity directly.  Ingo, does this break anything?
      
      (Folded in fix from KOSAKI Motohiro)
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: NMike Travis <travis@sgi.com>
      Reviewed-by: NGrant Grundler <grundler@parisc-linux.org>
      Acked-by: NIngo Molnar <mingo@redhat.com>
      Cc: ralf@linux-mips.org
      Cc: grundler@parisc-linux.org
      Cc: jeremy@xensource.com
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      0de26520
    • R
      cpumask: change cpumask_scnprintf, cpumask_parse_user, cpulist_parse, and... · 29c0177e
      Rusty Russell 提交于
      cpumask: change cpumask_scnprintf, cpumask_parse_user, cpulist_parse, and cpulist_scnprintf to take pointers.
      
      Impact: change calling convention of existing cpumask APIs
      
      Most cpumask functions started with cpus_: these have been replaced by
      cpumask_ ones which take struct cpumask pointers as expected.
      
      These four functions don't have good replacement names; fortunately
      they're rarely used, so we just change them over.
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: NMike Travis <travis@sgi.com>
      Acked-by: NIngo Molnar <mingo@elte.hu>
      Cc: paulus@samba.org
      Cc: mingo@redhat.com
      Cc: tony.luck@intel.com
      Cc: ralf@linux-mips.org
      Cc: Greg Kroah-Hartman <gregkh@suse.de>
      Cc: cl@linux-foundation.org
      Cc: srostedt@redhat.com
      29c0177e
    • R
      cpumask: centralize cpu_online_map and cpu_possible_map · 98a79d6a
      Rusty Russell 提交于
      Impact: cleanup
      
      Each SMP arch defines these themselves.  Move them to a central
      location.
      
      Twists:
      1) Some archs (m32, parisc, s390) set possible_map to all 1, so we add a
         CONFIG_INIT_ALL_POSSIBLE for this rather than break them.
      
      2) mips and sparc32 '#define cpu_possible_map phys_cpu_present_map'.
         Those archs simply have phys_cpu_present_map replaced everywhere.
      
      3) Alpha defined cpu_possible_map to cpu_present_map; this is tricky
         so I just manipulate them both in sync.
      
      4) IA64, cris and m32r have gratuitous 'extern cpumask_t cpu_possible_map'
         declarations.
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Reviewed-by: NGrant Grundler <grundler@parisc-linux.org>
      Tested-by: NTony Luck <tony.luck@intel.com>
      Acked-by: NIngo Molnar <mingo@elte.hu>
      Cc: Mike Travis <travis@sgi.com>
      Cc: ink@jurassic.park.msu.ru
      Cc: rmk@arm.linux.org.uk
      Cc: starvik@axis.com
      Cc: tony.luck@intel.com
      Cc: takata@linux-m32r.org
      Cc: ralf@linux-mips.org
      Cc: grundler@parisc-linux.org
      Cc: paulus@samba.org
      Cc: schwidefsky@de.ibm.com
      Cc: lethal@linux-sh.org
      Cc: wli@holomorphy.com
      Cc: davem@davemloft.net
      Cc: jdike@addtoit.com
      Cc: mingo@redhat.com
      98a79d6a
  14. 10 12月, 2008 1 次提交
  15. 21 11月, 2008 4 次提交
  16. 14 11月, 2008 2 次提交
  17. 08 11月, 2008 1 次提交
  18. 07 11月, 2008 1 次提交
    • D
      [IA64] fix boot panic caused by offline CPUs · 62ee0540
      Doug Chapman 提交于
      This fixes a regression introduced by 2c6e6db4
      "Minimize per_cpu reservations."  That patch incorrectly used information about
      what CPUs are possible that was not yet initialized by ACPI.  The end result
      was that per_cpu structures for offline CPUs were not initialized causing a
      NULL pointer reference.
      
      Since we cannot do the full acpi_boot_init() call any earlier, the simplest
      fix is to just parse the MADT for SAPIC entries early to find the CPU
      info.  This should also allow for some cleanup of the code added by the
      "Minimize per_cpu reservations".  This patch just fixes the regressions, the
      cleanup will come in a later patch.
      Signed-off-by: NDoug Chapman <doug.chapman@hp.com>
      Signed-off-by: NAlex Chiang <achiang@hp.com>
      CC: Robin Holt <holt@sgi.com>
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      62ee0540
  19. 05 11月, 2008 1 次提交
  20. 02 11月, 2008 1 次提交
    • A
      saner FASYNC handling on file close · 233e70f4
      Al Viro 提交于
      As it is, all instances of ->release() for files that have ->fasync()
      need to remember to evict file from fasync lists; forgetting that
      creates a hole and we actually have a bunch that *does* forget.
      
      So let's keep our lives simple - let __fput() check FASYNC in
      file->f_flags and call ->fasync() there if it's been set.  And lose that
      crap in ->release() instances - leaving it there is still valid, but we
      don't have to bother anymore.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      233e70f4
  21. 25 10月, 2008 1 次提交
  22. 20 10月, 2008 3 次提交
    • S
      always reserve elfcore header memory in crash kernel · d9a9855d
      Simon Horman 提交于
      elfcore header memory needs to be reserved in a crash kernel.  This means
      that the relevant code should be protected by CONFIG_CRASH_DUMP rather
      than CONFIG_PROC_VMCORE.
      Signed-off-by: NSimon Horman <horms@verge.net.au>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d9a9855d
    • S
      kdump: add is_vmcore_usable() and vmcore_unusable() · 85a0ee34
      Simon Horman 提交于
      The usage of elfcorehdr_addr has changed recently such that being set to
      ELFCORE_ADDR_MAX is used by is_kdump_kernel() to indicate if the code is
      executing in a kernel executed as a crash kernel.
      
      However, arch/ia64/kernel/setup.c:reserve_elfcorehdr will rest
      elfcorehdr_addr to ELFCORE_ADDR_MAX on error, which means any subsequent
      calls to is_kdump_kernel() will return 0, even though they should return
      1.
      
      Ok, at this point in time there are no subsequent calls, but I think its
      fair to say that there is ample scope for error or at the very least
      confusion.
      
      This patch add an extra state, ELFCORE_ADDR_ERR, which indicates that
      elfcorehdr_addr was passed on the command line, and thus execution is
      taking place in a crashdump kernel, but vmcore can't be used for some
      reason.  This is tested for using is_vmcore_usable() and set using
      vmcore_unusable().  A subsequent patch makes use of this new code.
      
      To summarise, the states that elfcorehdr_addr can now be in are as follows:
      
      ELFCORE_ADDR_MAX: not a crashdump kernel
      ELFCORE_ADDR_ERR: crashdump kernel but vmcore is unusable
      any other value:  crash dump kernel and vmcore is usable
      Signed-off-by: NSimon Horman <horms@verge.net.au>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      85a0ee34
    • V
      kdump: make elfcorehdr_addr independent of CONFIG_PROC_VMCORE · 57cac4d1
      Vivek Goyal 提交于
      o elfcorehdr_addr is used by not only the code under CONFIG_PROC_VMCORE
        but also by the code which is not inside CONFIG_PROC_VMCORE.  For
        example, is_kdump_kernel() is used by powerpc code to determine if
        kernel is booting after a panic then use previous kernel's TCE table.
        So even if CONFIG_PROC_VMCORE is not set in second kernel, one should be
        able to correctly determine that we are booting after a panic and setup
        calgary iommu accordingly.
      
      o So remove the assumption that elfcorehdr_addr is under
        CONFIG_PROC_VMCORE.
      
      o Move definition of elfcorehdr_addr to arch dependent crash files.
        (Unfortunately crash dump does not have an arch independent file
        otherwise that would have been the best place).
      
      o kexec.c is not the right place as one can Have CRASH_DUMP enabled in
        second kernel without KEXEC being enabled.
      
      o I don't see sh setup code parsing the command line for
        elfcorehdr_addr.  I am wondering how does vmcore interface work on sh.
        Anyway, I am atleast defining elfcoredhr_addr so that compilation is not
        broken on sh.
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Acked-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Acked-by: NSimon Horman <horms@verge.net.au>
      Acked-by: NPaul Mundt <lethal@linux-sh.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      57cac4d1
  23. 18 10月, 2008 4 次提交