1. 09 Oct 2012 (1 commit)
  2. 25 Jul 2012 (1 commit)
    • sh: Fix up recursive fault in oops with unset TTB. · 90eed7d8
      Paul Mundt authored
      Presently the oops code looks for the pgd either from the mm context or
      the cached TTB value. There are presently cases where the TTB can be
      unset or otherwise cleared by hardware, which we weren't handling,
      resulting in recursive faults on the NULL pgd. In these cases we can
      simply reload from swapper_pg_dir and continue on as normal.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
      90eed7d8
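
      A minimal user-space sketch of the fallback order described above (mm pgd,
      then cached TTB, then swapper_pg_dir); the names and toy types here are
      illustrative, not the kernel's actual declarations:

        #include <stdio.h>
        #include <stddef.h>

        typedef unsigned long pgd_entry;

        static pgd_entry swapper_pg_dir[4];     /* stand-in for the reference kernel pgd */

        /* Pick a page directory for the oops dump, never returning NULL. */
        static pgd_entry *oops_pgd(pgd_entry *mm_pgd, pgd_entry *ttb_pgd)
        {
                if (mm_pgd)
                        return mm_pgd;          /* normal case: take it from the mm context */
                if (ttb_pgd)
                        return ttb_pgd;         /* else fall back to the cached TTB value */
                return swapper_pg_dir;          /* TTB unset/cleared: use the reference pgd */
        }

        int main(void)
        {
                printf("using %s\n",
                       oops_pgd(NULL, NULL) == swapper_pg_dir ? "swapper_pg_dir" : "other");
                return 0;
        }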
  3. 18 May 2012 (2 commits)
  4. 14 May 2012 (4 commits)
  5. 19 Apr 2012 (2 commits)
    • sh: Improve oops error reporting · 45c0e0e2
      Stuart Menefy authored
      In some cases the oops error reporting doesn't give enough information
      to diagnose the problem, only printing information if it is thought
      to be valid. Replace the current code with more detailed output.
      
      This code is based on the ARM reporting, with minor changes for the SH.
      
      [lethal@linux-sh.org: fixed up for 64-bit PTEs and pte_offset_kernel()]
      Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
      45c0e0e2
    • sh: Fix error synchronising kernel page tables · 8d9a784d
      Stuart Menefy authored
      The problem is caused by the interaction of two features in the Linux
      memory management code.
      
      A process's address space is described by a struct mm_struct, and
      every thread has a pointer to the mm it should run in. The exception
      to this is kernel threads, which don't have an mm, and so borrow
      the mm from the last thread which ran. The system is bootstrapped
      by the initial kernel thread using init's mm (even though init hasn't
      been created yet, its mm is the static init_mm).
      
      The other feature is how the kernel handles the page table which
      describes the portion of the address space which is only visible when
      executing inside the kernel, and which is shared by all threads. On
      the SH4 the only portion of the kernel's address space which is described
      using the page table is called P3, from 0xc0000000 to 0xdfffffff. This
      portion of the address space is divided into three:
        - mappings for dma_alloc_coherent()
        - mappings for vmalloc() and ioremap()
        - fixmap mappings, primarily used in copy_user_pages() to create
          kernel mappings of user pages with the correct cache colour.
      
      To optimise the TLB miss handler we don't want to add an additional
      condition which checks whether the faulting address is in the user or
      the kernel portion of the address space, and so all page tables have a
      common portion which describes the kernel part of the address
      space. As the SH4 uses a two-level page table, only the kernel portion
      of the first level page table (the pgd entries) is duplicated. These all
      point to the same second level entries (the PTEs), and so no memory
      is wasted.
      
      The reference page table for the kernel is called the swapper_pg_dir,
      and when a new page table is created for a new process the kernel
      portion of the page table is copied from swapper_pg_dir. This works
      fine when changes only occur in the second level of the kernel's page
      table, or when the first level entries are created before any new user
      processes. However if a change occurs to the first level of the page
      table, and there are existing processes which don't have this entry in
      their page table, this new entry needs to be added. This is done on
      demand: when the kernel accesses a P3 address which isn't mapped using
      the current page table, the code in vmalloc_fault() copies the entry
      from the reference page table (swapper_pg_dir) into the current
      process's page table.
      
      The bug which this patch addresses is that the code in vmalloc_fault()
      was not copying addresses which fell in the dma_alloc_coherent()
      portion of the address space, and it should have been copying any P3
      address.
      
      Why we hadn't seen this before, and what made this hard to reproduce,
      is that normally the kernel will have called dma_alloc_coherent(), and
      accessed the memory mapping created, before any user process
      runs. Typically drivers such as USB or SATA will have created and used
      mappings of this type during the kernel initialisation, when probing
      for the attached devices, before init runs. Ethernet is slightly
      different, as it normally only creates and accesses
      dma_alloc_coherent() mappings when the network is brought up, but if
      kernel level IP configuration is used this will also occur before any
      user space process runs. So the first reproduction of this problem
      we saw occurred when USB and SATA were removed from the kernel, and
      Ethernet was then brought up from user space using ifconfig.
      I'd like to thank Joseph Bormolini, who did the hard work reducing the
      problem to this simple-to-reproduce case.
      
      In your case the situation is slightly different, and turns out to
      depend on the exact kernel configuration (which we had) and your
      ramdisk contents (which we didn't - hence the need for some assumptions).
      
      In this case the problem is a side effect of kernel level module
      loading. Kernel subsystems sometimes trigger the load of kernel
      modules directly, for example the crypto subsystem tries to load the
      cryptomgr and MTD tries to load modules for Flash partitioning if
      these are not built into the kernel. This is done by the kernel
      creating a user process which runs insmod to try and load the
      appropriate module.
      
      In order for this to cause problems the system must be running with an
      initrd or initramfs, which contains an insmod executable - if the
      kernel can't find an insmod to run, no user process is created, and
      the problem doesn't occur.  If an insmod is found, a process is
      created to run it, which will inherit the kernel portion of the
      swapper_pg_dir first level page table. It doesn't matter whether the
      insmod is successful or not, but when the kernel scheduler context
      switches back to the kernel initialisation thread, the insmod's mm is
      'borrowed' by the kernel thread, as it doesn't have an address space
      of its own. (Reference counting is used to ensure this mm is not
      destroyed, even though the user process which caused its creation may no
      longer exist.) If this address space doesn't have a first level page
      table entry for the consistent mappings, and a driver tries to access
      such a mapping, we are in the same situation as described above,
      except this time in a kernel thread rather than a user thread
      executing inside the kernel.
      
      See bugzilla: 15425, 15836, 15862, 16106, 16793
      Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
      8d9a784d
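
      A toy user-space model of the on-demand sync described above: every
      process pgd shares the kernel's second-level tables, and a missing
      first-level entry for the kernel (P3) range is copied from
      swapper_pg_dir. Sizes, names and the fault path are illustrative only:

        #include <stdio.h>

        #define PGD_ENTRIES  16     /* toy directory size */
        #define KERNEL_FIRST 12     /* toy index where the kernel (P3) range starts */

        typedef struct { void *pmd; } pgd_t;

        static pgd_t swapper_pg_dir[PGD_ENTRIES];    /* reference kernel page table */

        /* Copy a kernel-range entry from the reference table, as vmalloc_fault()
         * is described as doing; returns 0 on success, -1 if nothing to copy. */
        static int sync_kernel_entry(pgd_t *task_pgd, unsigned index)
        {
                if (index < KERNEL_FIRST)
                        return -1;                        /* user address: not our job */
                if (!swapper_pg_dir[index].pmd)
                        return -1;                        /* reference entry missing too */
                task_pgd[index] = swapper_pg_dir[index];  /* share the same lower-level table */
                return 0;
        }

        int main(void)
        {
                static int second_level;                  /* stands in for a pte table */
                pgd_t task_pgd[PGD_ENTRIES] = {{ 0 }};

                /* e.g. a dma_alloc_coherent() mapping created after the pgd was copied */
                swapper_pg_dir[14].pmd = &second_level;
                if (sync_kernel_entry(task_pgd, 14) == 0)
                        printf("entry 14 copied from swapper_pg_dir\n");
                return 0;
        }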
  6. 11 Apr 2012 (1 commit)
  7. 29 Mar 2012 (1 commit)
  8. 01 Jul 2011 (1 commit)
    • perf: Remove the nmi parameter from the swevent and overflow interface · a8b0ca17
      Peter Zijlstra authored
      The nmi parameter indicated whether we could do wakeups from the current
      context; if not, we would set some state and self-IPI and let the
      resulting interrupt do the wakeup.
      
      For the various event classes:
      
        - hardware: nmi=0; PMI is in fact an NMI or we run irq_work_run from
          the PMI-tail (ARM etc.)
        - tracepoint: nmi=0; since tracepoint could be from NMI context.
        - software: nmi=[0,1]; some, like the schedule thing, cannot
          perform wakeups and hence need 0.
      
      As one can see, there is very little nmi=1 usage, and the down-side of
      not using it is that on some platforms some software events can have a
      jiffy delay in wakeup (when arch_irq_work_raise isn't implemented).
      
      The up-side however is that we can remove the nmi parameter and save a
      bunch of conditionals in fast paths.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Michael Cree <mcree@orcon.net.nz>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Eric B Munson <emunson@mgebm.net>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Link: http://lkml.kernel.org/n/tip-agjev8eu666tvknpb3iaj0fg@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      a8b0ca17
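
      An illustrative, compilable stub (not the kernel API itself) of the call
      shape change this describes: the third "nmi" flag argument simply drops
      out of software-event reporting calls such as the page-fault hook:

        #include <stdio.h>

        struct pt_regs { int dummy; };
        enum { PERF_COUNT_SW_PAGE_FAULTS };

        /* old shape (sketch): perf_sw_event(id, nr, nmi, regs, addr) */
        /* new shape (sketch): perf_sw_event(id, nr, regs, addr)      */
        static void perf_sw_event(int id, unsigned nr, struct pt_regs *regs,
                                  unsigned long addr)
        {
                (void)regs;
                printf("sw event %d: nr=%u addr=%#lx\n", id, nr, addr);
        }

        int main(void)
        {
                struct pt_regs regs = { 0 };
                perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, &regs, 0x1000UL);
                return 0;
        }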
  9. 26 Apr 2010 (2 commits)
  10. 21 Feb 2010 (1 commit)
    • MM: Pass a PTE pointer to update_mmu_cache() rather than the PTE itself · 4b3073e1
      Russell King authored
      On VIVT ARM, when we have multiple shared mappings of the same file
      in the same MM, we need to ensure that we have coherency across all
      copies.  We do this via make_coherent() by making the pages
      uncacheable.
      
      This used to work fine, until we allowed highmem with highpte - we
      now have a page table which is mapped as required, and is not available
      for modification via update_mmu_cache().
      
      Ralf Baechle suggested getting rid of the PTE value passed to
      update_mmu_cache():
      
        On MIPS update_mmu_cache() calls __update_tlb() which walks pagetables
        to construct a pointer to the pte again.  Passing a pte_t * is much
        more elegant.  Maybe we might even replace the pte argument with the
        pte_t?
      
      Ben Herrenschmidt would also like the pte pointer for PowerPC:
      
        Passing the ptep in there is exactly what I want.  I want that
        -instead- of the PTE value, because I have issue on some ppc cases,
        for I$/D$ coherency, where set_pte_at() may decide to mask out the
        _PAGE_EXEC.
      
      So, pass in the mapped page table pointer into update_mmu_cache(), and
      remove the PTE value, updating all implementations and call sites to
      suit.
      
      Includes a fix from Stephen Rothwell:
      
        sparc: fix fallout from update_mmu_cache API change
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      4b3073e1
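
      A compilable toy illustration of the interface change above (names and
      types are stand-ins, not the real architecture hook): the hook now
      receives a pointer to the mapped page-table entry instead of a copied
      value, so it can re-read or locate the entry in place:

        #include <stdio.h>

        typedef unsigned long pte_t;

        /* before: the hook only saw a copy of the PTE value */
        static void update_mmu_cache_value(unsigned long address, pte_t pte)
        {
                printf("value form  : addr=%#lx pte=%#lx\n", address, pte);
        }

        /* after: the hook gets the pointer to the entry itself */
        static void update_mmu_cache_ptr(unsigned long address, pte_t *ptep)
        {
                printf("pointer form: addr=%#lx pte=%#lx at %p\n",
                       address, *ptep, (void *)ptep);
        }

        int main(void)
        {
                pte_t entry = 0x1234;
                update_mmu_cache_value(0x40000000UL, entry);
                update_mmu_cache_ptr(0x40000000UL, &entry);
                return 0;
        }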
  11. 17 Dec 2009 (1 commit)
    • sh: Definitions for 3-level page table layout · 5d9b4b19
      Matt Fleming authored
      If using 64-bit PTEs and 4K pages then each page table has 512 entries
      (as opposed to 1024 entries with 32-bit PTEs). Unlike MIPS, SH follows
      the convention that all structures in the page table (pgd_t, pmd_t,
      pgprot_t, etc) must be the same size. Therefore, 64-bit PTEs require
      64-bit PGD entries, etc. Using two levels of page tables and 64-bit PTEs,
      it is only possible to map 1GB of virtual address space.
      
      In order to map all 4GB of virtual address space we need to adopt a
      3-level page table layout. This actually works out better for
      CONFIG_SUPERH32 because we only waste 2 PGD entries on the P1 and P2
      areas (which are untranslated) instead of 256.
      Signed-off-by: Matt Fleming <matt@console-pimps.org>
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
      5d9b4b19
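
      The arithmetic behind the commit above, checked with a small program
      (assuming 4K pages and 8-byte entries for the 64-bit PTEs, as stated):

        #include <stdio.h>

        int main(void)
        {
                const unsigned long page_size = 4096;                    /* 4K pages */
                const unsigned long pte_size  = 8;                       /* 64-bit PTEs */
                const unsigned long entries   = page_size / pte_size;    /* 512 per table */

                /* two levels: 512 pgd entries x 512 pte entries x 4K pages = 1 GiB */
                unsigned long long coverage =
                        (unsigned long long)entries * entries * page_size;

                printf("entries per table: %lu\n", entries);
                printf("2-level coverage : %llu GiB\n", coverage >> 30);
                return 0;
        }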
  12. 21 Sep 2009 (1 commit)
    • perf: Do the big rename: Performance Counters -> Performance Events · cdd6c482
      Ingo Molnar authored
      Bye-bye Performance Counters, welcome Performance Events!
      
      In the past few months the perfcounters subsystem has grown out of its
      initial role of counting hardware events, and has become (and is
      becoming) a much broader generic event enumeration, reporting, logging,
      monitoring, analysis facility.
      
      Naming its core object 'perf_counter' and naming the subsystem
      'perfcounters' has become more and more of a misnomer. With pending
      code like hw-breakpoints support the 'counter' name is less and
      less appropriate.
      
      All in all, we've decided to rename the subsystem to 'performance
      events' and to propagate this rename through all fields, variables
      and API names. (in an ABI compatible fashion)
      
      The word 'event' is also a bit shorter than 'counter' - which makes
      it slightly more convenient to write/handle as well.
      
      Thanks goes to Stephane Eranian who first observed this misnomer and
      suggested a rename.
      
      User-space tooling and ABI compatibility is not affected - this patch
      should be function-invariant. (Also, defconfigs were not touched to
      keep the size down.)
      
      This patch has been generated via the following script:
      
        FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')
      
        sed -i \
          -e 's/PERF_EVENT_/PERF_RECORD_/g' \
          -e 's/PERF_COUNTER/PERF_EVENT/g' \
          -e 's/perf_counter/perf_event/g' \
          -e 's/nb_counters/nb_events/g' \
          -e 's/swcounter/swevent/g' \
          -e 's/tpcounter_event/tp_event/g' \
          $FILES
      
        for N in $(find . -name perf_counter.[ch]); do
          M=$(echo $N | sed 's/perf_counter/perf_event/g')
          mv $N $M
        done
      
        FILES=$(find . -name perf_event.*)
      
        sed -i \
          -e 's/COUNTER_MASK/REG_MASK/g' \
          -e 's/COUNTER/EVENT/g' \
          -e 's/\<event\>/event_id/g' \
          -e 's/counter/event/g' \
          -e 's/Counter/Event/g' \
          $FILES
      
      ... to keep it as correct as possible. This script can also be
      used by anyone who has pending perfcounters patches - it converts
      a Linux kernel tree over to the new naming. We tried to time this
      change to the point in time where the amount of pending patches
      is the smallest: the end of the merge window.
      
      Namespace clashes were fixed up in a preparatory patch - and some
      stylistic fallout will be fixed up in a subsequent patch.
      
      ( NOTE: 'counters' are still the proper terminology when we deal
        with hardware registers - and these sed scripts are a bit
        over-eager in renaming them. I've undone some of that, but
        in case there's something left where 'counter' would be
        better than 'event' we can undo that on an individual basis
        instead of touching an otherwise nicely automated patch. )
      Suggested-by: Stephane Eranian <eranian@google.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Paul Mackerras <paulus@samba.org>
      Reviewed-by: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: <linux-arch@vger.kernel.org>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      cdd6c482
  13. 03 Sep 2009 (1 commit)
    • sh: Fix up and optimize the kmap_coherent() interface. · 0906a3ad
      Paul Mundt authored
      This fixes up the kmap_coherent/kunmap_coherent() interface for recent
      changes both in the page fault path and the shared cache flushers, as
      well as adding in some optimizations.
      
      One of the key things to note here is that the TLB flush itself is
      deferred until the unmap, and the call in to update_mmu_cache() itself
      goes away, relying on the regular page fault path to handle the lazy
      dcache writeback if necessary.
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
      0906a3ad
  14. 15 Aug 2009 (2 commits)
  15. 14 Jul 2009 (1 commit)
    • sh: Restore previous behaviour on kernel fault · 05dd2cd3
      Matt Fleming authored
      The last commit changed the behaviour on kernel faults when we were
      doing something other than syncing the page tables. vmalloc_sync_one()
      needs to return NULL if the page tables are up to date, because the
      reason for the fault was not a missing/inconsistent page table entry. By
      returning NULL if the page tables are sync'd we signal to the calling
      function that further work must be done to resolve this fault.
      
      Also, remove the superfluous __va() around the first argument to
      vmalloc_sync_one(). The value of pgd_k is already a virtual address and
      using it with __va() causes a NULL dereference.
      Signed-off-by: Matt Fleming <matt@console-pimps.org>
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
      05dd2cd3
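
      A toy sketch of the return-value contract described above (illustrative
      names only): NULL signals "the tables were already in sync, so the fault
      has another cause", while a non-NULL result means an entry was copied:

        #include <stdio.h>

        typedef struct { void *pmd; } pgd_t;

        static pgd_t *vmalloc_sync_one_model(pgd_t *task_entry, const pgd_t *ref_entry)
        {
                if (task_entry->pmd == ref_entry->pmd)
                        return NULL;                  /* already up to date: keep handling the fault */
                task_entry->pmd = ref_entry->pmd;     /* copy the missing first-level entry */
                return task_entry;
        }

        int main(void)
        {
                static int second_level;
                pgd_t ref = { &second_level }, task = { NULL };

                printf("first call : %s\n",
                       vmalloc_sync_one_model(&task, &ref) ? "synced" : "NULL");
                printf("second call: %s\n",
                       vmalloc_sync_one_model(&task, &ref) ? "synced" : "NULL");
                return 0;
        }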
  16. 05 Jul 2009 (2 commits)
  17. 25 Jun 2009 (1 commit)
  18. 22 Jun 2009 (1 commit)
  19. 18 Jun 2009 (1 commit)
  20. 22 Dec 2008 (2 commits)
    • sh: Generic kgdb stub support. · ab6e570b
      Paul Mundt authored
      This migrates from the old bitrotted kgdb stub implementation and moves
      to the generic stub. In the process support for SH-2/SH-2A is also added,
      which the old stub never provided.
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
      ab6e570b
    • sh: P4 ioremap pass-through · 716777db
      Magnus Damm authored
      This patch adds a pass-through case when ioremapping P4 addresses.
      
      Addresses passed to ioremap() should be physical addresses, so the
      best option is usually to convert the virtual address to a physical
      address before calling ioremap. This will give you a virtual address
      in P2 which matches the physical address and this works well for
      most internal hardware blocks on the SuperH architecture.
      
      However, some hardware blocks must be accessed through P4. Converting
      the P4 address to a physical and then back to a P2 does not work. One
      example of this is the sh7722 TMU block; it must be accessed through P4.
      
      Without this patch P4 addresses will be mapped using PTEs which
      requires the page allocator to be up and running.
      Signed-off-by: Magnus Damm <damm@igel.co.jp>
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
      716777db
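
      A hedged sketch of the pass-through idea above, assuming the conventional
      SH-4 segment layout (P4 control space at 0xe0000000 and up); the function
      and constant names are illustrative, not the kernel's:

        #include <stdio.h>

        #define P4SEG 0xe0000000UL    /* assumed start of the P4 control segment */

        /* Return a usable address for I/O: P4 addresses pass straight through,
         * everything else would need a real page-table mapping (elided here). */
        static unsigned long ioremap_sketch(unsigned long addr)
        {
                if (addr >= P4SEG)
                        return addr;          /* no PTEs, so usable before the page allocator is up */
                return 0;                     /* placeholder: real mapping path not modelled */
        }

        int main(void)
        {
                printf("%#lx\n", ioremap_sketch(0xfffffe00UL));   /* some P4 register block */
                return 0;
        }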
  21. 21 Sep 2008 (3 commits)
    • sh: Trivial trace_mark() instrumentation for core events. · 3d58695e
      Paul Mundt authored
      This implements a few trace points across events that are deemed
      interesting:
      
      	- The page fault handler / TLB miss
      	- IPC calls
      	- Kernel thread creation
      
      The original LTTng patch had the slow-path instrumented, which
      fails to account for the vast majority of events. In general
      placing this in the fast-path is not a huge performance hit, as
      we don't take page faults for kernel addresses.
      
      The other bits of interest are some of the other trap handlers, as
      well as the syscall entry/exit (which is better off being handled
      through the tracehook API). Most of the other trap handlers are corner
      cases where alternate means of notification exist, so there is little
      value in placing extra trace points in these locations.
      
      Based on top of the points provided both by the LTTng instrumentation
      patch as well as the patch shipping in the ST-Linux tree, albeit in a
      stripped down form.
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
      3d58695e
    • sh: Kill off duplicate page fault notifiers in slow path. · 8f2baee2
      Paul Mundt authored
      We already have hooks in place in the __do_page_fault() fast-path,
      so kill them off in the slow path.
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
      8f2baee2
    • 887f1ae3
  22. 08 Sep 2008 (2 commits)
  23. 28 Jul 2008 (1 commit)
  24. 14 Feb 2008 (2 commits)
    • sh: Fix multiple UTLB hit on UP SH-4. · a602cc05
      Hideo Saito authored
      This acts as a reversion of 1c6b2ca5 in
      the case of UP SH-4, where we still have the risk of a multiple hit
      between the slow and fast paths. As seen on SH7780.
      Signed-off-by: Hideo Saito <saito@densan.co.jp>
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
      a602cc05
    • sh: trapped io support V2 · e7cc9a73
      Magnus Damm authored
      The idea is that we want to get rid of the in/out/readb/writeb callbacks from
      the machvec and replace them with simple inline read and write operations to
      memory. Fast and simple for most hardware devices (think pci).
      
      Some devices require special treatment though - like 16-bit only CF devices -
      so we need to have some method to hook in callbacks.
      
      This patch makes it possible to add a per-device trap generating filter. This
      way we can get maximum performance of sane hardware - which doesn't need this
      filter - and crappy hardware works but gets punished by a performance hit.
      
      V2 changes things around a bit and replaces io access callbacks with a
      simple minimum_bus_width value. In the future we can add stride as well.
      Signed-off-by: Magnus Damm <damm@igel.co.jp>
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
      e7cc9a73
  25. 28 Jan 2008 (2 commits)
  26. 19 Nov 2007 (1 commit)
    • sh: lockless UTLB miss fast-path. · 0f1a394b
      Paul Mundt authored
      With the refactored update_mmu_cache() introduced in older kernels,
      there's no longer any need to take the page_table_lock in this path,
      so simply drop it completely.
      
      Without this, performance degradation is seen on SMP on heavily
      threaded workloads that don't use the split ptlock, and ultimately
      we have no reason to contend for the lock in the first place.
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
      0f1a394b