提交 · 2b7d0390a6d6d595f43ea3806639664afe5b9ebe · openanolis / cloud-kernel

12 11月, 2008 2 次提交

tracing: branch tracer, fix vdso crash · 2b7d0390

由 Ingo Molnar 提交于 11月 12, 2008

Impact: fix bootup crash

the branch tracer missed arch/x86/vdso/vclock_gettime.c from
disabling tracing, which caused such bootup crashes:

  [  201.840097] init[1]: segfault at 7fffed3fe7c0 ip 00007fffed3fea2e sp 000077

also clean up the ugly ifdefs in arch/x86/kernel/vsyscall_64.c by
creating DISABLE_UNLIKELY_PROFILE facility for code to turn off
instrumentation on a per file basis.
Signed-off-by: NIngo Molnar <mingo@elte.hu>

2b7d0390

tracing: profile likely and unlikely annotations · 1f0d69a9

由 Steven Rostedt 提交于 11月 12, 2008

Impact: new unlikely/likely profiler

Andrew Morton recently suggested having an in-kernel way to profile
likely and unlikely macros. This patch achieves that goal.

When configured, every(*) likely and unlikely macro gets a counter attached
to it. When the condition is hit, the hit and misses of that condition
are recorded. These numbers can later be retrieved by:

  /debugfs/tracing/profile_likely    - All likely markers
  /debugfs/tracing/profile_unlikely  - All unlikely markers.

# cat /debug/tracing/profile_unlikely | head
 correct incorrect  %        Function                  File              Line
 ------- ---------  -        --------                  ----              ----
    2167        0   0 do_arch_prctl                  process_64.c         832
       0        0   0 do_arch_prctl                  process_64.c         804
    2670        0   0 IS_ERR                         err.h                34
   71230     5693   7 __switch_to                    process_64.c         673
   76919        0   0 __switch_to                    process_64.c         639
   43184    33743  43 __switch_to                    process_64.c         624
   12740    64181  83 __switch_to                    process_64.c         594
   12740    64174  83 __switch_to                    process_64.c         590

# cat /debug/tracing/profile_unlikely | \
  awk '{ if ($3 > 25) print $0; }' |head -20
   44963    35259  43 __switch_to                    process_64.c         624
   12762    67454  84 __switch_to                    process_64.c         594
   12762    67447  84 __switch_to                    process_64.c         590
    1478      595  28 syscall_get_error              syscall.h            51
       0     2821 100 syscall_trace_leave            ptrace.c             1567
       0        1 100 native_smp_prepare_cpus        smpboot.c            1237
   86338   265881  75 calc_delta_fair                sched_fair.c         408
  210410   108540  34 calc_delta_mine                sched.c              1267
       0    54550 100 sched_info_queued              sched_stats.h        222
   51899    66435  56 pick_next_task_fair            sched_fair.c         1422
       6       10  62 yield_task_fair                sched_fair.c         982
    7325     2692  26 rt_policy                      sched.c              144
       0     1270 100 pre_schedule_rt                sched_rt.c           1261
    1268    48073  97 pick_next_task_rt              sched_rt.c           884
       0    45181 100 sched_info_dequeued            sched_stats.h        177
       0       15 100 sched_move_task                sched.c              8700
       0       15 100 sched_move_task                sched.c              8690
   53167    33217  38 schedule                       sched.c              4457
       0    80208 100 sched_info_switch              sched_stats.h        270
   30585    49631  61 context_switch                 sched.c              2619

# cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }'
   39900    36577  47 pick_next_task                 sched.c              4397
   20824    15233  42 switch_mm                      mmu_context_64.h     18
       0        7 100 __cancel_work_timer            workqueue.c          560
     617    66484  99 clocksource_adjust             timekeeping.c        456
       0   346340 100 audit_syscall_exit             auditsc.c            1570
      38   347350  99 audit_get_context              auditsc.c            732
       0   345244 100 audit_syscall_entry            auditsc.c            1541
      38     1017  96 audit_free                     auditsc.c            1446
       0     1090 100 audit_alloc                    auditsc.c            862
    2618     1090  29 audit_alloc                    auditsc.c            858
       0        6 100 move_masked_irq                migration.c          9
       1      198  99 probe_sched_wakeup             trace_sched_switch.c 58
       2        2  50 probe_wakeup                   trace_sched_wakeup.c 227
       0        2 100 probe_wakeup_sched_switch      trace_sched_wakeup.c 144
    4514     2090  31 __grab_cache_page              filemap.c            2149
   12882   228786  94 mapping_unevictable            pagemap.h            50
       4       11  73 __flush_cpu_slab               slub.c               1466
  627757   330451  34 slab_free                      slub.c               1731
    2959    61245  95 dentry_lru_del_init            dcache.c             153
     946     1217  56 load_elf_binary                binfmt_elf.c         904
     102       82  44 disk_put_part                  genhd.h              206
       1        1  50 dst_gc_task                    dst.c                82
       0       19 100 tcp_mss_split_point            tcp_output.c         1126

As you can see by the above, there's a bit of work to do in rethinking
the use of some unlikelys and likelys. Note: the unlikely case had 71 hits
that were more than 25%.

Note:  After submitting my first version of this patch, Andrew Morton
  showed me a version written by Daniel Walker, where I picked up
  the following ideas from:

  1)  Using __builtin_constant_p to avoid profiling fixed values.
  2)  Using __FILE__ instead of instruction pointers.
  3)  Using the preprocessor to stop all profiling of likely
       annotations from vsyscall_64.c.

Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar
for their feed back on this patch.

(*) Not ever unlikely is recorded, those that are used by vsyscalls
 (a few of them) had to have profiling disabled.
Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Theodore Tso <tytso@mit.edu>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

1f0d69a9

11 11月, 2008 12 次提交

tracing: function return tracer, build fix · 19b3e967

由 Ingo Molnar 提交于 11月 11, 2008

fix:

 arch/x86/kernel/ftrace.c: In function 'ftrace_return_to_handler':
 arch/x86/kernel/ftrace.c:112: error: implicit declaration of function 'cpu_clock'

cpu_clock() is implicitly included via a number of ways, but its real
location is sched.h. (Build failure is triggerable if enough other
kernel components are turned off.)
Signed-off-by: NIngo Molnar <mingo@elte.hu>

19b3e967

tracing, x86: function return tracer, fix assembly constraints · 867f7fb3

由 Ingo Molnar 提交于 11月 11, 2008

fix:

 arch/x86/kernel/ftrace.c: Assembler messages:
 arch/x86/kernel/ftrace.c:140: Error: missing ')'
 arch/x86/kernel/ftrace.c:140: Error: junk `(%ebp))' after expression
 arch/x86/kernel/ftrace.c:141: Error: missing ')'
 arch/x86/kernel/ftrace.c:141: Error: junk `(%ebp))' after expression

the [parent_replaced] is used in an =rm fashion, so that constraint
is correct in isolation - but [parent_old] aliases register %0 and uses
it in an addressing mode that is only valid with registers - so change
the constraint from =rm to =r.

This fixes the build failure.
Signed-off-by: NIngo Molnar <mingo@elte.hu>

867f7fb3

tracing, x86: clean up FUNCTION_RET_TRACER Kconfig · f1c4be5e

由 Ingo Molnar 提交于 11月 11, 2008

Impact: cleanup

move FUNCTION_RET_TRACER to the X86 select section, where we have all the
other options.
Signed-off-by: NIngo Molnar <mingo@elte.hu>

f1c4be5e

tracing, x86: add low level support for ftrace return tracing · caf4b323

由 Frederic Weisbecker 提交于 11月 11, 2008

Impact: add infrastructure for function-return tracing

Add low level support for ftrace return tracing.

This plug-in stores return addresses on the thread_info structure of
the current task.

The index of the current return address is initialized when the task
is the first one (init) and when a process forks (the child). It is
not needed when a task does a sys_execve because after this syscall,
it still needs to return on the kernel functions it called.

Note that the code of return_to_handler has been suggested by Steven
Rostedt as almost all of the ideas of improvements in this V3.

For purpose of security, arch/x86/kernel/process_32.c is not traced
because __switch_to() changes the current task during its execution.
That could cause inconsistency in the stored return address of this
function even if I didn't have any crash after testing with tracing on
this function enabled.
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

caf4b323

powerpc: Update desktop/server defconfigs · cb8fdc69

由 Paul Mackerras 提交于 11月 11, 2008

Turned off CONFIG_PCI_LEGACY and turned on EXT4, and otherwise mostly
took the defaults.  This also updates ppc6xx_defconfig, which covers
the 6xx/7xx/7xxx-based embedded boards.
Signed-off-by: NPaul Mackerras <paulus@samba.org>

cb8fdc69

powerpc: Fix msr check in compat_sys_swapcontext · 77eb50ae

由 Andreas Schwab 提交于 11月 06, 2008

The new context may not be 16-byte aligned, so the real address of the
mcontext structure should be read from the uc_regs pointer instead of
directly using the (unaligned) uc_mcontext field.
Signed-off-by: NAndreas Schwab <schwab@suse.de>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

77eb50ae

x86: Make NUMA on 32-bit depend on BROKEN · 4694516d

由 Rafael J. Wysocki 提交于 11月 10, 2008

While investigating the failure of hibernation on 32-bit x86 with
CONFIG_NUMA set, as described in this message
http://marc.info/?l=linux-kernel&m=122634118116226&w=4
I asked some people for help and I was told that it wasn't really
worth the effort, because CONFIG_NUMA was generally broken on 32-bit
x86 systems and it shouldn't be used in such configs.  For this
reason, make CONFIG_NUMA depend on BROKEN instead of EXPERIMENTAL on
x86-32.
Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Pavel Machek <pavel@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4694516d

D
sparc64: Update defconfig. · 12de512a
由 David S. Miller 提交于 11月 10, 2008
```
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
12de512a

Revert "sparc: correct section of current_pc()" · 29b14328

由 David S. Miller 提交于 11月 07, 2008

This reverts commit 8dd94537.

This fixes a boot failure reported by Robert Reif.

The code above the section change expects to fallthrough, so
we can't make such a section change here.

29b14328

x86: HPET: enter hpet_interrupt_handler with interrupts disabled · 5ceb1a04

由 Matt Fleming 提交于 11月 02, 2008

Some functions that may be called from this handler require that
interrupts are disabled. Also, combining IRQF_DISABLED and
IRQF_SHARED does not reliably disable interrupts in a handler, so
remove IRQF_SHARED from the irq flags (this irq is not shared anyway).
Signed-off-by: NMatt Fleming <mjf@gentoo.org>
Cc: mingo@elte.hu
Cc: venkatesh.pallipadi@intel.com
Cc: "Will Newton" <will.newton@gmail.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

5ceb1a04

x86: HPET: read from HPET_Tn_CMP() not HPET_T0_CMP · 89d77a1e

由 Matt Fleming 提交于 11月 02, 2008

In hpet_next_event() we check that the value we just wrote to
HPET_Tn_CMP(timer) has reached the chip. Currently, we're checking that
the value we wrote to HPET_Tn_CMP(timer) is in HPET_T0_CMP, which, if
timer is anything other than timer 0, is likely to fail.
Signed-off-by: NMatt Fleming <mjf@gentoo.org>
Cc: mingo@elte.hu
Cc: venkatesh.pallipadi@intel.com
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

89d77a1e

x86: HPET: convert WARN_ON to WARN_ON_ONCE · 1de5b085

由 Matt Fleming 提交于 11月 02, 2008

It is possible to flood the console with call traces if the WARN_ON
condition is true because of the frequency with which this function is
called.
Signed-off-by: NMatt Fleming <mjf@gentoo.org>
Cc: mingo@elte.hu
Cc: venkatesh.pallipadi@intel.com
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

1de5b085

09 11月, 2008 5 次提交

powerpc: Updated Freescale PPC related defconfigs · ea37194d

由 Kumar Gala 提交于 11月 08, 2008

unset CONFIG_PCI_LEGACY in the defconfigs as none of them enable
ISDN drivers which seem to be the only place we are using pci_find_device
Signed-off-by: NKumar Gala <galak@kernel.crashing.org>

ea37194d

powerpc: Update QE/CPM2 usb_ctlr structures for USB support · 2b487065

由 Li Yang 提交于 11月 08, 2008

Fixes following build error:

CC drivers/usb/gadget/fsl_qe_udc.o
drivers/usb/gadget/fsl_qe_udc.c: In function 'qe_eprx_stall_change':
drivers/usb/gadget/fsl_qe_udc.c:156: error: 'struct usb_ctlr' has no member named 'usb_usep'
drivers/usb/gadget/fsl_qe_udc.c:163: error: 'struct usb_ctlr' has no member named 'usb_usep'
drivers/usb/gadget/fsl_qe_udc.c: In function 'qe_eptx_stall_change':
drivers/usb/gadget/fsl_qe_udc.c:173: error: 'struct usb_ctlr' has no member named 'usb_usep'
drivers/usb/gadget/fsl_qe_udc.c:180: error: 'struct usb_ctlr' has no member named 'usb_usep'
drivers/usb/gadget/fsl_qe_udc.c: In function 'qe_eprx_nack':
drivers/usb/gadget/fsl_qe_udc.c:201: error: 'struct usb_ctlr' has no member named 'usb_usep'
drivers/usb/gadget/fsl_qe_udc.c:201: error: 'struct usb_ctlr' has no member named 'usb_usep'
drivers/usb/gadget/fsl_qe_udc.c: In function 'qe_eprx_normal':
drivers/usb/gadget/fsl_qe_udc.c:218: error: 'struct usb_ctlr' has no member named 'usb_usep'
drivers/usb/gadget/fsl_qe_udc.c:218: error: 'struct usb_ctlr' has no member named 'usb_usep'
drivers/usb/gadget/fsl_qe_udc.c: In function 'qe_ep_reset':
drivers/usb/gadget/fsl_qe_udc.c:325: error: 'struct usb_ctlr' has no member named 'usb_usep'
drivers/usb/gadget/fsl_qe_udc.c:342: error: 'struct usb_ctlr' has no member named 'usb_usep'
drivers/usb/gadget/fsl_qe_udc.c: In function 'qe_ep_register_init':
drivers/usb/gadget/fsl_qe_udc.c:515: error: 'struct usb_ctlr' has no member named 'usb_usep'
drivers/usb/gadget/fsl_qe_udc.c: In function 'ch9getstatus':
drivers/usb/gadget/fsl_qe_udc.c:1981: error: 'struct usb_ctlr' has no member named 'usb_usep'
make[2]: *** [drivers/usb/gadget/fsl_qe_udc.o] Error 1
Signed-off-by: NLi Yang <leoli@freescale.com>
Signed-off-by: NAnton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: NKumar Gala <galak@kernel.crashing.org>

2b487065

powerpc/86xx: Correct SOC bus-frequency in GE Fanuc SBC610 DTS · 33d2d78b

由 Martyn Welch 提交于 11月 07, 2008

This patch corrects the bus-frequency value provided in the SBC610's dts.
Signed-off-by: NMartyn Welch <martyn.welch@gefanuc.com>
Signed-off-by: NKumar Gala <galak@kernel.crashing.org>

33d2d78b

powerpc/fsl-booke: Fix synchronization bug w/local tlb invalidates · b41d6fee

由 Kumar Gala 提交于 11月 04, 2008

The implemetation of _tlbil_pid() on Freescale Book-E cores needs
an msync & isync after we flash invalidate the TLBs.  This was causing
the following oops reported by Sebastian Andrzej Siewior:

  VFS: Mounted root (nfs filesystem) readonly.
  Freeing unused kernel memory: 148k init
  BUG: sleeping function called from invalid context at /home/bigeasy/git/linux-2.6-powerpc/mm/mmap.c:234
  in_atomic():1, irqs_disabled():0
  Call Trace:
  [df189df0] [c0007160] show_stack+0x48/0x148 (unreliable)
  [df189e30] [c0029480] __might_sleep+0xf0/0x100
  [df189e40] [c0070ac0] remove_vma+0x28/0x98
  [df189e50] [c0070c1c] exit_mmap+0xec/0x128
  [df189e80] [c002d2f4] mmput+0x54/0xec
  [df189ea0] [c0030b6c] exit_mm+0x10c/0x120
  [df189ed0] [c003288c] do_exit+0x1ac/0x6e8
  [df189f20] [c0032e48] do_group_exit+0x80/0xac
  [df189f40] [c000e9dc] ret_from_syscall+0x0/0x3c
  BUG: scheduling while atomic: udevd/956/0x10000002
  Modules linked in:
  Call Trace:
  [df189df0] [c0007160] show_stack+0x48/0x148 (unreliable)
  [df189e30] [c002ac88] __schedule_bug+0x58/0x6c
  [df189e40] [c023e6cc] schedule+0xa8/0x4a8
  [df189e90] [c002ad6c] __cond_resched+0x38/0x64
  [df189ea0] [c023ebc8] _cond_resched+0x3c/0x58
  [df189eb0] [c0030e70] put_files_struct+0x90/0xec
  [df189ed0] [c00328a8] do_exit+0x1c8/0x6e8
  [df189f20] [c0032e48] do_group_exit+0x80/0xac
  [df189f40] [c000e9dc] ret_from_syscall+0x0/0x3c
Signed-off-by: NKumar Gala <galak@kernel.crashing.org>

b41d6fee

sched: optimize sched_clock() a bit · 7cbaef9c

由 Ingo Molnar 提交于 11月 08, 2008

sched_clock() uses cycles_2_ns() needlessly - which is an irq-disabling
variant of __cycles_2_ns().

Most of the time sched_clock() is called with irqs disabled already.
The few places that call it with irqs enabled need to be updated.
Signed-off-by: NIngo Molnar <mingo@elte.hu>

7cbaef9c

08 11月, 2008 3 次提交

sched: improve sched_clock() performance · 0d12cdd5

由 Ingo Molnar 提交于 11月 08, 2008

in scheduler-intense workloads native_read_tsc() overhead accounts for
20% of the system overhead:

 659567 system_call                              41222.9375
 686796 schedule                                 435.7843
 718382 __switch_to                              665.1685
 823875 switch_mm                                4526.7857
 1883122 native_read_tsc                          55385.9412
 9761990 total                                      2.8468

this is large part due to the rdtsc_barrier() that is done before
and after reading the TSC.

But sched_clock() is not a precise clock in the GTOD sense, using such
barriers is completely pointless. So remove the barriers and only use
them in vget_cycles().

This improves lat_ctx performance by about 5%.
Signed-off-by: NIngo Molnar <mingo@elte.hu>

0d12cdd5

[IA64] Reserve elfcorehdr memory in CONFIG_CRASH_DUMP · 17c1f07e

由 Jay Lan 提交于 11月 07, 2008

IA64 kdump kernel failed to initialize /proc/vmcore in 2.6.28-rc2.
A bug was introduced in this patch commit:

  d9a9855d
  always reserve elfcore header memory in crash kernel

The problem was that the call to reserve_elfcorehdr() should be placed
in CONFIG_CRASH_DUMP rather than in CONFIG_CRASH_KERNEL, which does
not exist.
Signed-off-by: NJay Lan <jlan@sgi.com>
Acked-by: NSimon Hormon <horms@verge.net.au>
Signed-off-by: NTony Luck <tony.luck@intel.com>

17c1f07e

oprofile: Fix p6 counter overflow check · 7c64ade5

由 Andi Kleen 提交于 11月 07, 2008

Fix the counter overflow check for CPUs with counter width > 32

I had a similar change in a different patch that I didn't submit
and I didn't notice the problem earlier because it was always
tested together.
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Signed-off-by: NRobert Richter <robert.richter@amd.com>

7c64ade5

07 11月, 2008 7 次提交

xen: make sure stray alias mappings are gone before pinning · d05fdf31

由 Jeremy Fitzhardinge 提交于 10月 28, 2008

Xen requires that all mappings of pagetable pages are read-only, so
that they can't be updated illegally.  As a result, if a page is being
turned into a pagetable page, we need to make sure all its mappings
are RO.

If the page had been used for ioremap or vmalloc, it may still have
left over mappings as a result of not having been lazily unmapped.
This change makes sure we explicitly mop them all up before pinning
the page.

Unlike aliases created by kmap, the there can be vmalloc aliases even
for non-high pages, so we must do the flush unconditionally.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Linux Memory Management List <linux-mm@kvack.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

d05fdf31

x86, xen: fix use of pgd_page now that it really does return a page · 47cb2ed9

由 Jeremy Fitzhardinge 提交于 11月 06, 2008

Impact: fix 32-bit Xen guest boot crash

On 32-bit PAE, pud_page, for no good reason, didn't really return a
struct page *.  Since Jan Beulich's fix "i386/PAE: fix pud_page()",
pud_page does return a struct page *.

Because PAE has 3 pagetable levels, the pud level is folded into the
pgd level, so pgd_page() is the same as pud_page(), and now returns
a struct page *.  Update the xen/mmu.c code which uses pgd_page()
accordingly.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

47cb2ed9

[ARM] xsc3: fix xsc3_l2_inv_range · c7cf72dc

由 Dan Williams 提交于 11月 06, 2008

When 'start' and 'end' are less than a cacheline apart and 'start' is
unaligned we are done after cleaning and invalidating the first
cacheline.  So check for (start < end) which will not walk off into
invalid address ranges when (start > end).

This issue was caught by drivers/dma/dmatest.

2.6.27 is susceptible.

Cc: <stable@kernel.org>
Cc: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Cc: Lothar WaÃ<9f>mann <LW@KARO-electronics.de>
Cc: Lennert Buytenhek <buytenh@marvell.com>
Cc: Eric Miao <eric.miao@marvell.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

c7cf72dc

[ARM] mm: fix page table initialization · b1cce6b1

由 Russell King 提交于 11月 04, 2008

As a result of the ptebits changes, we ended up marking device mappings
as normal memory on ARMv7 CPUs, resulting in undesirable behaviour with
serial ports and the like.  While reviewing the section mapping table
entries, other errors in the memory type settings for devices were
detected and confirmed to prevent Xscale3 platforms booting.

Tested on:
	OMAP34xx (ARMv7),
	OMAP24xx (ARMv6),
	OMAP16xx (ARM926T, ARMv5),
	PXA311 (Xscale3),
	PXA272 (Xscale),
	PXA255 (Xscale),
	IXP42x (Xscale),
	S3C2410 (ARM920T, ARMv4T),
	ARM720T (ARMv4T)
	StrongARM-110 (ARMv4)
Acked-by: NTony Lindgren <tony@atomide.com>
Tested-by: NRobert Jarzmik <robert.jarzmik@free.fr>
Tested-by: NMike Rapoport <mike@compulab.co.il>
Tested-by: NBen Dooks <ben-linux@fluff.org>
Tested-by: NAnders Grafström <grfstrm@users.sourceforge.net>
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

b1cce6b1

[IA64] fix boot panic caused by offline CPUs · 62ee0540

由 Doug Chapman 提交于 11月 05, 2008

This fixes a regression introduced by 2c6e6db4
"Minimize per_cpu reservations."  That patch incorrectly used information about
what CPUs are possible that was not yet initialized by ACPI.  The end result
was that per_cpu structures for offline CPUs were not initialized causing a
NULL pointer reference.

Since we cannot do the full acpi_boot_init() call any earlier, the simplest
fix is to just parse the MADT for SAPIC entries early to find the CPU
info.  This should also allow for some cleanup of the code added by the
"Minimize per_cpu reservations".  This patch just fixes the regressions, the
cleanup will come in a later patch.
Signed-off-by: NDoug Chapman <doug.chapman@hp.com>
Signed-off-by: NAlex Chiang <achiang@hp.com>
CC: Robin Holt <holt@sgi.com>
Signed-off-by: NTony Luck <tony.luck@intel.com>

62ee0540

[IA64] reorder Kconfig options to match x86 · 1547a012

由 Bjorn Helgaas 提交于 11月 06, 2008

No functional change, just reorder some config options and update
the "Power management and ACPI" label to match the defacto x86
standard.
Signed-off-by: NBjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: NTony Luck <tony.luck@intel.com>

1547a012

[ARM] fix naming of MODULE_START / MODULE_END · ab4f2ee1

由 Russell King 提交于 11月 06, 2008

As of 73bdf0a6, the kernel needs
to know where modules are located in the virtual address space.
On ARM, we located this region between MODULE_START and MODULE_END.
Unfortunately, everyone else calls it MODULES_VADDR and MODULES_END.
Update ARM to use the same naming, so is_vmalloc_or_module_addr()
can work properly.  Also update the comment on mm/vmalloc.c to
reflect that ARM also places modules in a separate region from the
vmalloc space.
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

ab4f2ee1

06 11月, 2008 10 次提交

Revert "x86: default to reboot via ACPI" · 8d00450d

由 Eduardo Habkost 提交于 11月 04, 2008

This reverts commit c7ffa6c2.

the assumptio of this change was that this would not break
any existing machine. Andrey Borzenkov reported troubles with
the ACPI reboot method: the system would hang on reboot, necessiating
a power cycle. Probably more systems are affected as well.

Also, there are patches queued up for v2.6.29 to disable virtualization
on emergency_restart() - which was the original motivation of
this change.
Reported-by: NAndrey Borzenkov <arvidjaar@mail.ru>
Bisected-by: NAndrey Borzenkov <arvidjaar@mail.ru>
Signed-off-by: NEduardo Habkost <ehabkost@redhat.com>
Acked-by: NAvi Kivity <avi@redhat.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

8d00450d

x86: align DirectMap in /proc/meminfo · b9c3bfc2

由 Hugh Dickins 提交于 11月 06, 2008

Impact: right-align /proc/meminfo consistent with other fields

When the split-LRU patches added Inactive(anon) and Inactive(file) lines
to /proc/meminfo, all counts were moved two columns rightwards to fit in.
Now move x86's DirectMap lines two columns rightwards to line up.
Signed-off-by: NHugh Dickins <hugh@veritas.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

b9c3bfc2

AMD IOMMU: fix lazy IO/TLB flushing in unmap path · 80be308d

由 Joerg Roedel 提交于 11月 06, 2008

Lazy flushing needs to take care of the unmap path too which is not yet
implemented and leads to stale IO/TLB entries. This is fixed by this
patch.
Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com>

80be308d

x86: add smp_mb() before sending INVALIDATE_TLB_VECTOR · d6f0f39b

由 Suresh Siddha 提交于 11月 04, 2008

Impact: fix rare x2apic hang

On x86, x2apic mode accesses for sending IPI's don't have serializing
semantics. If the IPI receivner refers(in lock-free fashion) to some
memory setup by the sender, the need for smp_mb() before sending the
IPI becomes critical in x2apic mode.

Add the smp_mb() in native_flush_tlb_others() before sending the IPI.
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

d6f0f39b

x86: remove VISWS and PARAVIRT around NR_IRQS puzzle · 7db282fa

由 Yinghai Lu 提交于 11月 05, 2008

Impact: fix warning message when PARAVIRT is set in config

Remove stale #ifdef components from our IRQ sizing logic.
x86/Voyager is the only holdout.
Signed-off-by: NYinghai Lu <yinghai@kernel.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

7db282fa

x86: mention ACPI in top-level Kconfig menu · da85f865

由 Bjorn Helgaas 提交于 11月 05, 2008

Impact: clarify menuconfig text

Mention ACPI in the top-level menu to give a clue as to where
it lives. This matches what ia64 does.
Signed-off-by: NBjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

da85f865

ftrace: add quick function trace stop · 60a7ecf4

由 Steven Rostedt 提交于 11月 05, 2008

Impact: quick start and stop of function tracer

This patch adds a way to disable the function tracer quickly without
the need to run kstop_machine. It adds a new variable called
function_trace_stop which will stop the calls to functions from mcount
when set.  This is just an on/off switch and does not handle recursion
like preempt_disable().

It's main purpose is to help other tracers/debuggers start and stop tracing
fuctions without the need to call kstop_machine.

The config option HAVE_FUNCTION_TRACE_MCOUNT_TEST is added for archs
that implement the testing of the function_trace_stop in the mcount
arch dependent code. Otherwise, the test is done in the C code.

x86 is the only arch at the moment that supports this.
Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

60a7ecf4

x86: size NR_IRQS on 32-bit systems the same way as 64-bit · 1b489768

由 Yinghai Lu 提交于 11月 04, 2008

Impact: make NR_IRQS big enough for system with lots of apic/pins

If lots of IO_APIC's are there (or can be there), size the same way
as 64-bit, depending on MAX_IO_APICS and NR_CPUS.

This fixes the boot problem reported by Ben Hutchings on a 32-bit
server with 5 IO-APICs and 240 IO-APIC pins.
Signed-off-by: NYinghai <yinghai@kernel.org>
Tested-by: NBen Hutchings <bhutchings@solarflare.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

1b489768

x86: don't allow nr_irqs > NR_IRQS · c78d0cf2

由 Ben Hutchings 提交于 11月 05, 2008

Impact: fix boot hang on 32-bit systems with more than 224 IO-APIC pins

On some 32-bit systems with a lot of IO-APICs probe_nr_irqs() can
return a value larger than NR_IRQS. This will lead to probe_irq_on()
overrunning the irq_desc array.

I hit this when running net-next-2.6 (close to 2.6.28-rc3) on a
Supermicro dual Xeon system. NR_IRQS is 224 but probe_nr_irqs() detects
5 IOAPICs and returns 240. Here are the log messages:

Tue Nov 4 16:53:47 2008 ACPI: IOAPIC (id[0x01] address[0xfec00000] gsi_base[0])
Tue Nov 4 16:53:47 2008 IOAPIC[0]: apic_id 1, version 32, address 0xfec00000, GSI 0-23
Tue Nov 4 16:53:47 2008 ACPI: IOAPIC (id[0x02] address[0xfec81000] gsi_base[24])
Tue Nov 4 16:53:47 2008 IOAPIC[1]: apic_id 2, version 32, address 0xfec81000, GSI 24-47
Tue Nov 4 16:53:47 2008 ACPI: IOAPIC (id[0x03] address[0xfec81400] gsi_base[48])
Tue Nov 4 16:53:47 2008 IOAPIC[2]: apic_id 3, version 32, address 0xfec81400, GSI 48-71
Tue Nov 4 16:53:47 2008 ACPI: IOAPIC (id[0x04] address[0xfec82000] gsi_base[72])
Tue Nov 4 16:53:47 2008 IOAPIC[3]: apic_id 4, version 32, address 0xfec82000, GSI 72-95
Tue Nov 4 16:53:47 2008 ACPI: IOAPIC (id[0x05] address[0xfec82400] gsi_base[96])
Tue Nov 4 16:53:47 2008 IOAPIC[4]: apic_id 5, version 32, address 0xfec82400, GSI 96-119
Tue Nov 4 16:53:47 2008 ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 high edge)
Tue Nov 4 16:53:47 2008 ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
Tue Nov 4 16:53:47 2008 Enabling APIC mode: Flat. Using 5 I/O APICs
Signed-off-by: NBen Hutchings <bhutchings@solarflare.com>
Acked-by: NYinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

c78d0cf2

sched: re-tune balancing · 9fcd18c9

由 Ingo Molnar 提交于 11月 05, 2008

Impact: improve wakeup affinity on NUMA systems, tweak SMP systems

Given the fixes+tweaks to the wakeup-buddy code, re-tweak the domain
balancing defaults on NUMA and SMP systems.

Turn on SD_WAKE_AFFINE which was off on x86 NUMA - there's no reason
why we would not want to have wakeup affinity across nodes as well.
(we already do this in the standard NUMA template.)

lat_ctx on a NUMA box is particularly happy about this change:

before:

 |   phoenix:~/l> ./lat_ctx -s 0 2
 |   "size=0k ovr=2.60
 |   2 5.70

after:

 |   phoenix:~/l> ./lat_ctx -s 0 2
 |   "size=0k ovr=2.65
 |   2 2.07

a 2.75x speedup.

pipe-test is similarly happy about it too:

 |  phoenix:~/sched-tests> ./pipe-test
 |   18.26 usecs/loop.
 |   14.70 usecs/loop.
 |   14.38 usecs/loop.
 |   10.55 usecs/loop.              # +WAKE_AFFINE on domain0+domain1
 |   8.63 usecs/loop.
 |   8.59 usecs/loop.
 |   9.03 usecs/loop.
 |   8.94 usecs/loop.
 |   8.96 usecs/loop.
 |   8.63 usecs/loop.

Also:

 - disable SD_BALANCE_NEWIDLE on NUMA and SMP domains (keep it for siblings)
 - enable SD_WAKE_BALANCE on SMP domains

Sysbench+postgresql improves all around the board, quite significantly:

           .28-rc3-11474e2c  .28-rc3-11474e2c-tune
-------------------------------------------------
    1:             571              688    +17.08%
    2:            1236             1206    -2.55%
    4:            2381             2642    +9.89%
    8:            4958             5164    +3.99%
   16:            9580             9574    -0.07%
   32:            7128             8118    +12.20%
   64:            7342             8266    +11.18%
  128:            7342             8064    +8.95%
  256:            7519             7884    +4.62%
  512:            7350             7731    +4.93%
-------------------------------------------------
  SUM:           55412            59341    +6.62%

So it's a win both for the runup portion, the peak area and the tail.
Signed-off-by: NIngo Molnar <mingo@elte.hu>

9fcd18c9

05 11月, 2008 1 次提交

powerpc: Fix "unused variable" warning in pci_dlpar.c · 454666eb

由 Stephen Rothwell 提交于 11月 02, 2008

This gets rid of this build warning:

arch/powerpc/platforms/pseries/pci_dlpar.c: In function 'init_phb_dynamic':
arch/powerpc/platforms/pseries/pci_dlpar.c:192: warning: unused variable 'b'

This is one of the very few warnings left in a ppc64_defconfig build and
getting rid of it will make it easier to see future introduced ones (in
fact this was introduced very recently).
Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

454666eb

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功