1. 09 10月, 2012 8 次提交
    • Y
      memory-hotplug: suppress "Trying to free nonexistent resource... · d760afd4
      Yasuaki Ishimatsu 提交于
      memory-hotplug: suppress "Trying to free nonexistent resource <XXXXXXXXXXXXXXXX-YYYYYYYYYYYYYYYY>" warning
      
      When our x86 box calls __remove_pages(), release_mem_region() shows many
      warnings.  And x86 box cannot unregister iomem_resource.
      
        "Trying to free nonexistent resource <XXXXXXXXXXXXXXXX-YYYYYYYYYYYYYYYY>"
      
      release_mem_region() has been changed to be called in each
      PAGES_PER_SECTION by commit de7f0cba ("memory hotplug: release
      memory regions in PAGES_PER_SECTION chunks").  Because powerpc registers
      iomem_resource in each PAGES_PER_SECTION chunk.  But when I hot add
      memory on x86 box, iomem_resource is register in each _CRS not
      PAGES_PER_SECTION chunk.  So x86 box unregisters iomem_resource.
      
      The patch fixes the problem.
      Signed-off-by: NYasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Jiang Liu <liuj97@gmail.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Dave Hansen <dave@linux.vnet.ibm.com>
      Cc: Nathan Fontenot <nfont@austin.ibm.com>
      Cc: Badari Pulavarty <pbadari@us.ibm.com>
      Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d760afd4
    • S
      readahead: fault retry breaks mmap file read random detection · 45cac65b
      Shaohua Li 提交于
      .fault now can retry.  The retry can break state machine of .fault.  In
      filemap_fault, if page is miss, ra->mmap_miss is increased.  In the second
      try, since the page is in page cache now, ra->mmap_miss is decreased.  And
      these are done in one fault, so we can't detect random mmap file access.
      
      Add a new flag to indicate .fault is tried once.  In the second try, skip
      ra->mmap_miss decreasing.  The filemap_fault state machine is ok with it.
      
      I only tested x86, didn't test other archs, but looks the change for other
      archs is obvious, but who knows :)
      Signed-off-by: NShaohua Li <shaohua.li@fusionio.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      45cac65b
    • S
      atomic: implement generic atomic_dec_if_positive() · e79bee24
      Shaohua Li 提交于
      The x86 implementation of atomic_dec_if_positive is quite generic, so make
      it available to all architectures.
      
      This is needed for "swap: add a simple detector for inappropriate swapin
      readahead".
      
      [akpm@linux-foundation.org: do the "#define foo foo" trick in the conventional manner]
      Signed-off-by: NShaohua Li <shli@fusionio.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Michal Simek <monstr@monstr.eu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e79bee24
    • W
      mm: hugetlb: add arch hook for clearing page flags before entering pool · 5d3a551c
      Will Deacon 提交于
      The core page allocator ensures that page flags are zeroed when freeing
      pages via free_pages_check.  A number of architectures (ARM, PPC, MIPS)
      rely on this property to treat new pages as dirty with respect to the data
      cache and perform the appropriate flushing before mapping the pages into
      userspace.
      
      This can lead to cache synchronisation problems when using hugepages,
      since the allocator keeps its own pool of pages above the usual page
      allocator and does not reset the page flags when freeing a page into the
      pool.
      
      This patch adds a new architecture hook, arch_clear_hugepage_flags, so
      that architectures which rely on the page flags being in a particular
      state for fresh allocations can adjust the flags accordingly when a page
      is freed into the pool.
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Reviewed-by: NMichal Hocko <mhocko@suse.cz>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5d3a551c
    • K
      mm: kill vma flag VM_RESERVED and mm->reserved_vm counter · 314e51b9
      Konstantin Khlebnikov 提交于
      A long time ago, in v2.4, VM_RESERVED kept swapout process off VMA,
      currently it lost original meaning but still has some effects:
      
       | effect                 | alternative flags
      -+------------------------+---------------------------------------------
      1| account as reserved_vm | VM_IO
      2| skip in core dump      | VM_IO, VM_DONTDUMP
      3| do not merge or expand | VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP
      4| do not mlock           | VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP
      
      This patch removes reserved_vm counter from mm_struct.  Seems like nobody
      cares about it, it does not exported into userspace directly, it only
      reduces total_vm showed in proc.
      
      Thus VM_RESERVED can be replaced with VM_IO or pair VM_DONTEXPAND | VM_DONTDUMP.
      
      remap_pfn_range() and io_remap_pfn_range() set VM_IO|VM_DONTEXPAND|VM_DONTDUMP.
      remap_vmalloc_range() set VM_DONTEXPAND | VM_DONTDUMP.
      
      [akpm@linux-foundation.org: drivers/vfio/pci/vfio_pci.c fixup]
      Signed-off-by: NKonstantin Khlebnikov <khlebnikov@openvz.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Carsten Otte <cotte@de.ibm.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Venkatesh Pallipadi <venki@google.com>
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      314e51b9
    • K
      mm: use mm->exe_file instead of first VM_EXECUTABLE vma->vm_file · 2dd8ad81
      Konstantin Khlebnikov 提交于
      Some security modules and oprofile still uses VM_EXECUTABLE for retrieving
      a task's executable file.  After this patch they will use mm->exe_file
      directly.  mm->exe_file is protected with mm->mmap_sem, so locking stays
      the same.
      Signed-off-by: NKonstantin Khlebnikov <khlebnikov@openvz.org>
      Acked-by: Chris Metcalf <cmetcalf@tilera.com>			[arch/tile]
      Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>	[tomoyo]
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Carsten Otte <cotte@de.ibm.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Acked-by: NJames Morris <james.l.morris@oracle.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Venkatesh Pallipadi <venki@google.com>
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2dd8ad81
    • C
      Kconfig: clean up the "#if defined(arch)" list for exception-trace sysctl entry · 7ac57a89
      Catalin Marinas 提交于
      Introduce SYSCTL_EXCEPTION_TRACE config option and selec it in the
      architectures requiring support for the "exception-trace" debug_table
      entry in kernel/sysctl.c.
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7ac57a89
    • C
      Kconfig: clean up the long arch list for the DEBUG_KMEMLEAK config option · b69ec42b
      Catalin Marinas 提交于
      Introduce HAVE_DEBUG_KMEMLEAK config option and select it in corresponding
      architecture Kconfig files.  DEBUG_KMEMLEAK now only depends on
      HAVE_DEBUG_KMEMLEAK.
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b69ec42b
  2. 06 10月, 2012 3 次提交
  3. 04 10月, 2012 3 次提交
    • A
      powerpc/iommu: Fix multiple issues with IOMMU pools code · d900bd73
      Anton Blanchard 提交于
      There are a number of issues in the recent IOMMU pools code:
      
      - On a preempt kernel we might switch CPUs in the middle of building
        a scatter gather list. When this happens the handle hint passed in
        no longer falls within the local CPU's pool. Check for this and
        fall back to the pool hint.
      
      - We were missing a spin_unlock/spin_lock in one spot where we
        switch pools.
      
      - We need to provide locking around dart_tlb_invalidate_all and
        dart_tlb_invalidate_one now that the global lock is gone.
      Reported-by: NAlexander Graf <agraf@suse.de>
      Signed-off-by: NAnton Blanchard <anton@samba.org>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      CC: <stable@kernel.org> [v3.6]
      d900bd73
    • N
      powerpc: Fix VMX fix for memcpy case · c8adfecc
      Nishanth Aravamudan 提交于
      In 2fae7cdb ("powerpc: Fix VMX in
      interrupt check in POWER7 copy loops"), Anton inadvertently
      introduced a regression for memcpy on POWER7 machines. copyuser and
      memcpy diverge slightly in their use of cr1 (copyuser doesn't use it,
      but memcpy does) and you end up clobbering that register with your fix.
      That results in (taken from an FC18 kernel):
      
      [   18.824604] Unrecoverable VMX/Altivec Unavailable Exception f20 at c000000000052f40
      [   18.824618] Oops: Unrecoverable VMX/Altivec Unavailable Exception, sig: 6 [#1]
      [   18.824623] SMP NR_CPUS=1024 NUMA pSeries
      [   18.824633] Modules linked in: tg3(+) be2net(+) cxgb4(+) ipr(+) sunrpc xts lrw gf128mul dm_crypt dm_round_robin dm_multipath linear raid10 raid456 async_raid6_recov async_memcpy async_pq raid6_pq async_xor xor async_tx raid1 raid0 scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc scsi_dh_alua squashfs cramfs
      [   18.824705] NIP: c000000000052f40 LR: c00000000020b874 CTR: 0000000000000512
      [   18.824709] REGS: c000001f1fef7790 TRAP: 0f20   Not tainted  (3.6.0-0.rc6.git0.2.fc18.ppc64)
      [   18.824713] MSR: 8000000000009032 <SF,EE,ME,IR,DR,RI>  CR: 4802802e  XER: 20000010
      [   18.824726] SOFTE: 0
      [   18.824728] CFAR: 0000000000000f20
      [   18.824731] TASK = c000000fa7128400[0] 'swapper/24' THREAD: c000000fa7480000 CPU: 24
      GPR00: 00000000ffffffc0 c000001f1fef7a10 c00000000164edc0 c000000f9b9a8120
      GPR04: c000000f9b9a8124 0000000000001438 0000000000000060 03ffffff064657ee
      GPR08: 0000000080000000 0000000000000010 0000000000000020 0000000000000030
      GPR12: 0000000028028022 c00000000ff25400 0000000000000001 0000000000000000
      GPR16: 0000000000000000 7fffffffffffffff c0000000016b2180 c00000000156a500
      GPR20: c000000f968c7a90 c0000000131c31d8 c000001f1fef4000 c000000001561d00
      GPR24: 000000000000000a 0000000000000000 0000000000000001 0000000000000012
      GPR28: c000000fa5c04f80 00000000000008bc c0000000015c0a28 000000000000022e
      [   18.824792] NIP [c000000000052f40] .memcpy_power7+0x5a0/0x7c4
      [   18.824797] LR [c00000000020b874] .pcpu_free_area+0x174/0x2d0
      [   18.824800] Call Trace:
      [   18.824803] [c000001f1fef7a10] [c000000000052c14] .memcpy_power7+0x274/0x7c4 (unreliable)
      [   18.824809] [c000001f1fef7b10] [c00000000020b874] .pcpu_free_area+0x174/0x2d0
      [   18.824813] [c000001f1fef7bb0] [c00000000020ba88] .free_percpu+0xb8/0x1b0
      [   18.824819] [c000001f1fef7c50] [c00000000043d144] .throtl_pd_exit+0x94/0xd0
      [   18.824824] [c000001f1fef7cf0] [c00000000043acf8] .blkg_free+0x88/0xe0
      [   18.824829] [c000001f1fef7d90] [c00000000018c048] .rcu_process_callbacks+0x2e8/0x8a0
      [   18.824835] [c000001f1fef7e90] [c0000000000a8ce8] .__do_softirq+0x158/0x4d0
      [   18.824840] [c000001f1fef7f90] [c000000000025ecc] .call_do_softirq+0x14/0x24
      [   18.824845] [c000000fa7483650] [c000000000010e80] .do_softirq+0x160/0x1a0
      [   18.824850] [c000000fa74836f0] [c0000000000a94a4] .irq_exit+0xf4/0x120
      [   18.824854] [c000000fa7483780] [c000000000020c44] .timer_interrupt+0x154/0x4d0
      [   18.824859] [c000000fa7483830] [c000000000003be0] decrementer_common+0x160/0x180
      [   18.824866] --- Exception: 901 at .plpar_hcall_norets+0x84/0xd4
      [   18.824866]     LR = .check_and_cede_processor+0x48/0x80
      [   18.824871] [c000000fa7483b20] [c00000000007f018] .check_and_cede_processor+0x18/0x80 (unreliable)
      [   18.824877] [c000000fa7483b90] [c00000000007f104] .dedicated_cede_loop+0x84/0x150
      [   18.824883] [c000000fa7483c50] [c0000000006bc030] .cpuidle_enter+0x30/0x50
      [   18.824887] [c000000fa7483cc0] [c0000000006bc9f4] .cpuidle_idle_call+0x104/0x720
      [   18.824892] [c000000fa7483d80] [c000000000070af8] .pSeries_idle+0x18/0x40
      [   18.824897] [c000000fa7483df0] [c000000000019084] .cpu_idle+0x1a4/0x380
      [   18.824902] [c000000fa7483ec0] [c0000000008a4c18] .start_secondary+0x520/0x528
      [   18.824907] [c000000fa7483f90] [c0000000000093f0] .start_secondary_prolog+0x10/0x14
      [   18.824911] Instruction dump:
      [   18.824914] 38840008 90030000 90e30004 38630008 7ca62850 7cc300d0 78c7e102 7cf01120
      [   18.824923] 78c60660 39200010 39400020 39600030 <7e00200c> 7c0020ce 38840010 409f001c
      [   18.824935] ---[ end trace 0bb95124affaaa45 ]---
      [   18.825046] Unrecoverable VMX/Altivec Unavailable Exception f20 at c000000000052d08
      
      I believe the right fix is to make memcpy match usercopy and not use
      cr1.
      Signed-off-by: NNishanth Aravamudan <nacc@us.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      CC: <stable@kernel.org> [v3.6]
      c8adfecc
    • M
      asm-generic: Add default clkdev.h · e7a570ff
      Mark Brown 提交于
      Ease the deployment of clkdev by providing a default asm/clkdev.h for
      use if the arch does not have an include/asm/clkdev.h.
      
      Due to limitations in Kbuild we manually add clkdev.h to all
      architectures that don't have one rather than having the header appear
      by default.
      Signed-off-by: NMark Brown <broonie@opensource.wolfsonmicro.com>
      Reviewed-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      e7a570ff
  4. 03 10月, 2012 3 次提交
  5. 27 9月, 2012 7 次提交
  6. 25 9月, 2012 2 次提交
    • F
      vtime: Consolidate system/idle context detection · a7e1a9e3
      Frederic Weisbecker 提交于
      Move the code that finds out to which context we account the
      cputime into generic layer.
      
      Archs that consider the whole time spent in the idle task as idle
      time (ia64, powerpc) can rely on the generic vtime_account()
      and implement vtime_account_system() and vtime_account_idle(),
      letting the generic code to decide when to call which API.
      
      Archs that have their own meaning of idle time, such as s390
      that only considers the time spent in CPU low power mode as idle
      time, can just override vtime_account().
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      a7e1a9e3
    • F
      cputime: Use a proper subsystem naming for vtime related APIs · bf9fae9f
      Frederic Weisbecker 提交于
      Use a naming based on vtime as a prefix for virtual based
      cputime accounting APIs:
      
      - account_system_vtime() -> vtime_account()
      - account_switch_vtime() -> vtime_task_switch()
      
      It makes it easier to allow for further declension such
      as vtime_account_system(), vtime_account_idle(), ... if we
      want to find out the context we account to from generic code.
      
      This also make it better to know on which subsystem these APIs
      refer to.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      bf9fae9f
  7. 21 9月, 2012 1 次提交
  8. 19 9月, 2012 4 次提交
  9. 18 9月, 2012 9 次提交
    • G
      powerpc/eeh: Fix crash on converting OF node to edev · 1e38b714
      Gavin Shan 提交于
      The kernel crash was reported by Alexy. He was testing some feature
      with private kernel, in which Alexy added some code in pci_pm_reset()
      to read the CSR after writting it. The bug could be reproduced on
      Fiber Channel card (Fibre Channel: Emulex Corporation Saturn-X:
      LightPulse Fibre Channel Host Adapter (rev 03)) by the following
      commands.
      
      	# echo 1 > /sys/devices/pci0004:01/0004:01:00.0/reset
      	# rmmod lpfc
      	# modprobe lpfc
      
      The history behind the test case is that those additional config
      space reading operations in pci_pm_reset() would cause EEH error,
      but we didn't detect EEH error until "modprobe lpfc". For the case,
      all the PCI devices on PCI bus (0004:01) were removed and added after
      PE reset. Then the EEH devices would be figured out again based on
      the OF nodes. Unfortunately, there were some child OF nodes under
      PCI device (0004:01:00.0), but they didn't have attached PCI_DN since
      they're invisible from PCI domain. However, we were still trying to
      convert OF node to EEH device without checking on the attached PCI_DN.
      Eventually, it caused the kernel crash as follows:
      
      Unable to handle kernel paging request for data at address 0x00000030
      Faulting instruction address: 0xc00000000004d888
      cpu 0x0: Vector: 300 (Data Access) at [c000000fc797b950]
          pc: c00000000004d888: .eeh_add_device_tree_early+0x78/0x140
          lr: c00000000004d880: .eeh_add_device_tree_early+0x70/0x140
          sp: c000000fc797bbd0
         msr: 8000000000009032
         dar: 30
       dsisr: 40000000
        current = 0xc000000fc78d9f70
        paca    = 0xc00000000edb0000   softe: 0        irq_happened: 0x00
          pid   = 2951, comm = eehd
      enter ? for help
      [c000000fc797bc50] c00000000004d848 .eeh_add_device_tree_early+0x38/0x140
      [c000000fc797bcd0] c00000000004d848 .eeh_add_device_tree_early+0x38/0x140
      [c000000fc797bd50] c000000000051b54 .pcibios_add_pci_devices+0x34/0x190
      [c000000fc797bde0] c00000000004fb10 .eeh_reset_device+0x100/0x160
      [c000000fc797be70] c0000000000502dc .eeh_handle_event+0x19c/0x300
      [c000000fc797bf00] c000000000050570 .eeh_event_handler+0x130/0x1a0
      [c000000fc797bf90] c000000000020138 .kernel_thread+0x54/0x70
      
      The patch changes of_node_to_eeh_dev() and just returns NULL if the
      passed OF node doesn't have attached PCI_DN.
      
      Cc: stable@vger.kernel.org
      Reported-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      1e38b714
    • G
      powerpc/eeh: Lock module while handling EEH event · feadf7c0
      Gavin Shan 提交于
      The EEH core is talking with the PCI device driver to determine the
      action (purely reset, or PCI device removal). During the period, the
      driver might be unloaded and in turn causes kernel crash as follows:
      
      EEH: Detected PCI bus error on PHB#4-PE#10000
      EEH: This PCI device has failed 3 times in the last hour
      lpfc 0004:01:00.0: 0:2710 PCI channel disable preparing for reset
      Unable to handle kernel paging request for data at address 0x00000490
      Faulting instruction address: 0xd00000000e682c90
      cpu 0x1: Vector: 300 (Data Access) at [c000000fc75ffa20]
          pc: d00000000e682c90: .lpfc_io_error_detected+0x30/0x240 [lpfc]
          lr: d00000000e682c8c: .lpfc_io_error_detected+0x2c/0x240 [lpfc]
          sp: c000000fc75ffca0
         msr: 8000000000009032
         dar: 490
       dsisr: 40000000
        current = 0xc000000fc79b88b0
        paca    = 0xc00000000edb0380	 softe: 0	 irq_happened: 0x00
          pid   = 3386, comm = eehd
      enter ? for help
      [c000000fc75ffca0] c000000fc75ffd30 (unreliable)
      [c000000fc75ffd30] c00000000004fd3c .eeh_report_error+0x7c/0xf0
      [c000000fc75ffdc0] c00000000004ee00 .eeh_pe_dev_traverse+0xa0/0x180
      [c000000fc75ffe70] c00000000004ffd8 .eeh_handle_event+0x68/0x300
      [c000000fc75fff00] c0000000000503a0 .eeh_event_handler+0x130/0x1a0
      [c000000fc75fff90] c000000000020138 .kernel_thread+0x54/0x70
      1:mon>
      
      The patch increases the reference of the corresponding driver modules
      while EEH core does the negotiation with PCI device driver so that the
      corresponding driver modules can't be unloaded during the period and
      we're safe to refer the callbacks.
      
      Cc: stable@vger.kernel.org
      Reported-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      feadf7c0
    • T
      powerpc/kprobe: Don't emulate store when kprobe stwu r1 · 8e9f6937
      Tiejun Chen 提交于
      We don't do the real store operation for kprobing 'stwu Rx,(y)R1'
      since this may corrupt the exception frame, now we will do this
      operation safely in exception return code after migrate current
      exception frame below the kprobed function stack.
      
      So we only update gpr[1] here and trigger a thread flag to mask
      this.
      
      Note we should make sure if we trigger kernel stack over flow.
      Signed-off-by: NTiejun Chen <tiejun.chen@windriver.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      8e9f6937
    • T
      powerpc/kprobe: Complete kprobe and migrate exception frame · a9c4e541
      Tiejun Chen 提交于
      We can't emulate stwu since that may corrupt current exception stack.
      So we will have to do real store operation in the exception return code.
      
      Firstly we'll allocate a trampoline exception frame below the kprobed
      function stack and copy the current exception frame to the trampoline.
      Then we can do this real store operation to implement 'stwu', and reroute
      the trampoline frame to r1 to complete this exception migration.
      Signed-off-by: NTiejun Chen <tiejun.chen@windriver.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      a9c4e541
    • T
      powerpc/kprobe: Introduce a new thread flag · f0d1128f
      Tiejun Chen 提交于
      We need to add a new thread flag, TIF_EMULATE_STACK_STORE,
      for emulating stack store operation while exiting exception.
      Signed-off-by: NTiejun Chen <tiejun.chen@windriver.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      f0d1128f
    • B
      powerpc: Remove unused __get_user64() and __put_user64() · 52ab3b2b
      Bharat Bhushan 提交于
      __get_user64()  and __put_user64() are not used.
      Signed-off-by: NBharat Bhushan <bharat.bhushan@freescale.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      52ab3b2b
    • G
      powerpc/eeh: Global mutex to protect PE tree · ea81245c
      Gavin Shan 提交于
      We have missed lots of situations where the PE hierarchy tree need
      protection through the EEH global mutex. The patch fixes that for
      those public APIs implemented in eeh_pe.c. The only exception is
      eeh_pe_restore_bars() because it calls eeh_pe_dev_traverse(), which
      has been protected by the mutex.
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      ea81245c
    • G
      powerpc/eeh: Remove EEH PE for normal PCI hotplug · 20ee6a97
      Gavin Shan 提交于
      Function eeh_rmv_from_parent_pe() could be called by the path of
      either normal PCI hotplug, or EEH recovery. For the former case,
      we need purge the corresponding PE on removal of the associated
      PE bus.
      
      The patch tries to cover that by passing more information to function
      pcibios_remove_pci_devices() so that we know if the corresponding PE
      needs to be purged or be marked as "invalid".
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      20ee6a97
    • G
      powerpc/eeh: Introduce EEH_PE_INVALID type PE · 5efc3ad7
      Gavin Shan 提交于
      When EEH error happens on the PE whose PCI devices don't have
      attached drivers. In function eeh_handle_event(), the default
      value PCI_ERS_RESULT_NONE will be returned after iterating all
      drivers of those PCI devices belonging to the PE. Actually, we
      don't have installed drivers for the PCI devices. Under the
      circumstance, we will remove the corresponding PCI bus of the PE,
      including the associated EEH devices and PE instance. However,
      we still need the information stored in the PE instance to do PE
      reset after that. So it's unsafe to free the PE instance.
      
      The patch introduces EEH_PE_INVALID type PE to address the issue.
      When the PCI bus and the corresponding attached EEH devices are
      removed, we will mark the PE as EEH_PE_INVALID. At later point,
      the PE will be changed to EEH_PE_DEVICE or EEH_PE_BUS when the
      corresponding EEH devices are attached again.
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      5efc3ad7
新手
引导
客服 返回
顶部