1. 14 6月, 2013 6 次提交
    • S
      ARM64: mm: THP support. · af074848
      Steve Capper 提交于
      Bring Transparent HugePage support to ARM. The size of a
      transparent huge page depends on the normal page size. A
      transparent huge page is always represented as a pmd.
      
      If PAGE_SIZE is 4KB, THPs are 2MB.
      If PAGE_SIZE is 64KB, THPs are 512MB.
      Signed-off-by: NSteve Capper <steve.capper@linaro.org>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      af074848
    • S
      ARM64: mm: Raise MAX_ORDER for 64KB pages and THP. · d03bb145
      Steve Capper 提交于
      The buddy allocator has a default MAX_ORDER of 11, which is too
      low to allocate enough memory for 512MB Transparent HugePages if
      our base page size is 64KB.
      
      This patch introduces MAX_ZONE_ORDER and sets it to 14 when 64KB
      pages are used in conjuction with THP, otherwise the default value
      of 11 is used.
      Signed-off-by: NSteve Capper <steve.capper@linaro.org>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      d03bb145
    • S
      ARM64: mm: HugeTLB support. · 084bd298
      Steve Capper 提交于
      Add huge page support to ARM64, different huge page sizes are
      supported depending on the size of normal pages:
      
      PAGE_SIZE is 4KB:
         2MB - (pmds) these can be allocated at any time.
      1024MB - (puds) usually allocated on bootup with the command line
               with something like: hugepagesz=1G hugepages=6
      
      PAGE_SIZE is 64KB:
       512MB - (pmds) usually allocated on bootup via command line.
      Signed-off-by: NSteve Capper <steve.capper@linaro.org>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      084bd298
    • S
      ARM64: mm: Move PTE_PROT_NONE bit. · 59911ca4
      Steve Capper 提交于
      Under ARM64, PTEs can be broadly categorised as follows:
         - Present and valid: Bit #0 is set. The PTE is valid and memory
           access to the region may fault.
      
         - Present and invalid: Bit #0 is clear and bit #1 is set.
           Represents present memory with PROT_NONE protection. The PTE
           is an invalid entry, and the user fault handler will raise a
           SIGSEGV.
      
         - Not present (file or swap): Bits #0 and #1 are clear.
           Memory represented has been paged out. The PTE is an invalid
           entry, and the fault handler will try and re-populate the
           memory where necessary.
      
      Huge PTEs are block descriptors that have bit #1 clear. If we wish
      to represent PROT_NONE huge PTEs we then run into a problem as
      there is no way to distinguish between regular and huge PTEs if we
      set bit #1.
      
      To resolve this ambiguity this patch moves PTE_PROT_NONE from
      bit #1 to bit #2 and moves PTE_FILE from bit #2 to bit #3. The
      number of swap/file bits is reduced by 1 as a consequence, leaving
      60 bits for file and swap entries.
      Signed-off-by: NSteve Capper <steve.capper@linaro.org>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      59911ca4
    • S
      ARM64: mm: Make PAGE_NONE pages read only and no-execute. · 072b1b62
      Steve Capper 提交于
      If we consider the following code sequence:
      
      	my_pte = pte_modify(entry, myprot);
      	x = pte_write(my_pte);
      	y = pte_exec(my_pte);
      
      If myprot comes from a PROT_NONE page, then x and y will both be
      true which is undesireable behaviour.
      
      This patch sets the no-execute and read-only bits for PAGE_NONE
      such that the code above will return false for both x and y.
      Signed-off-by: NSteve Capper <steve.capper@linaro.org>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      072b1b62
    • S
      ARM64: mm: Restore memblock limit when map_mem finished. · f6bc87c3
      Steve Capper 提交于
      In paging_init the memblock limit is set to restrict any addresses
      returned by early_alloc to fit within the initial direct kernel
      mapping in swapper_pg_dir. This allows map_mem to allocate puds,
      pmds and ptes from the initial direct kernel mapping.
      
      The limit stays low after paging_init() though, meaning any
      bootmem allocations will be from a restricted subset of memory.
      Gigabyte huge pages, for instance, are normally allocated from
      bootmem as their order (18) is too large for the default buddy
      allocator (MAX_ORDER = 11).
      
      This patch restores the memblock limit when map_mem has finished,
      allowing gigabyte huge pages (and other objects) to be allocated
      from all of bootmem.
      Signed-off-by: NSteve Capper <steve.capper@linaro.org>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      f6bc87c3
  2. 31 5月, 2013 2 次提交
  3. 25 5月, 2013 2 次提交
  4. 18 5月, 2013 1 次提交
  5. 14 5月, 2013 3 次提交
  6. 13 5月, 2013 2 次提交
  7. 10 5月, 2013 1 次提交
  8. 08 5月, 2013 5 次提交
  9. 01 5月, 2013 2 次提交
    • T
      dump_stack: unify debug information printed by show_regs() · a43cb95d
      Tejun Heo 提交于
      show_regs() is inherently arch-dependent but it does make sense to print
      generic debug information and some archs already do albeit in slightly
      different forms.  This patch introduces a generic function to print debug
      information from show_regs() so that different archs print out the same
      information and it's much easier to modify what's printed.
      
      show_regs_print_info() prints out the same debug info as dump_stack()
      does plus task and thread_info pointers.
      
      * Archs which didn't print debug info now do.
      
        alpha, arc, blackfin, c6x, cris, frv, h8300, hexagon, ia64, m32r,
        metag, microblaze, mn10300, openrisc, parisc, score, sh64, sparc,
        um, xtensa
      
      * Already prints debug info.  Replaced with show_regs_print_info().
        The printed information is superset of what used to be there.
      
        arm, arm64, avr32, mips, powerpc, sh32, tile, unicore32, x86
      
      * s390 is special in that it used to print arch-specific information
        along with generic debug info.  Heiko and Martin think that the
        arch-specific extra isn't worth keeping s390 specfic implementation.
        Converted to use the generic version.
      
      Note that now all archs print the debug info before actual register
      dumps.
      
      An example BUG() dump follows.
      
       kernel BUG at /work/os/work/kernel/workqueue.c:4841!
       invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
       Modules linked in:
       CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #7
       Hardware name: empty empty/S3992, BIOS 080011  10/26/2007
       task: ffff88007c85e040 ti: ffff88007c860000 task.ti: ffff88007c860000
       RIP: 0010:[<ffffffff8234a07e>]  [<ffffffff8234a07e>] init_workqueues+0x4/0x6
       RSP: 0000:ffff88007c861ec8  EFLAGS: 00010246
       RAX: ffff88007c861fd8 RBX: ffffffff824466a8 RCX: 0000000000000001
       RDX: 0000000000000046 RSI: 0000000000000001 RDI: ffffffff8234a07a
       RBP: ffff88007c861ec8 R08: 0000000000000000 R09: 0000000000000000
       R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff8234a07a
       R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
       FS:  0000000000000000(0000) GS:ffff88007dc00000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
       CR2: ffff88015f7ff000 CR3: 00000000021f1000 CR4: 00000000000007f0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
       Stack:
        ffff88007c861ef8 ffffffff81000312 ffffffff824466a8 ffff88007c85e650
        0000000000000003 0000000000000000 ffff88007c861f38 ffffffff82335e5d
        ffff88007c862080 ffffffff8223d8c0 ffff88007c862080 ffffffff81c47760
       Call Trace:
        [<ffffffff81000312>] do_one_initcall+0x122/0x170
        [<ffffffff82335e5d>] kernel_init_freeable+0x9b/0x1c8
        [<ffffffff81c47760>] ? rest_init+0x140/0x140
        [<ffffffff81c4776e>] kernel_init+0xe/0xf0
        [<ffffffff81c6be9c>] ret_from_fork+0x7c/0xb0
        [<ffffffff81c47760>] ? rest_init+0x140/0x140
        ...
      
      v2: Typo fix in x86-32.
      
      v3: CPU number dropped from show_regs_print_info() as
          dump_stack_print_info() has been updated to print it.  s390
          specific implementation dropped as requested by s390 maintainers.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Acked-by: NJesper Nilsson <jesper.nilsson@axis.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Acked-by: Chris Metcalf <cmetcalf@tilera.com>		[tile bits]
      Acked-by: Richard Kuo <rkuo@codeaurora.org>		[hexagon bits]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a43cb95d
    • T
      dump_stack: consolidate dump_stack() implementations and unify their behaviors · 196779b9
      Tejun Heo 提交于
      Both dump_stack() and show_stack() are currently implemented by each
      architecture.  show_stack(NULL, NULL) dumps the backtrace for the
      current task as does dump_stack().  On some archs, dump_stack() prints
      extra information - pid, utsname and so on - in addition to the
      backtrace while the two are identical on other archs.
      
      The usages in arch-independent code of the two functions indicate
      show_stack(NULL, NULL) should print out bare backtrace while
      dump_stack() is used for debugging purposes when something went wrong,
      so it does make sense to print additional information on the task which
      triggered dump_stack().
      
      There's no reason to require archs to implement two separate but mostly
      identical functions.  It leads to unnecessary subtle information.
      
      This patch expands the dummy fallback dump_stack() implementation in
      lib/dump_stack.c such that it prints out debug information (taken from
      x86) and invokes show_stack(NULL, NULL) and drops arch-specific
      dump_stack() implementations in all archs except blackfin.  Blackfin's
      dump_stack() does something wonky that I don't understand.
      
      Debug information can be printed separately by calling
      dump_stack_print_info() so that arch-specific dump_stack()
      implementation can still emit the same debug information.  This is used
      in blackfin.
      
      This patch brings the following behavior changes.
      
      * On some archs, an extra level in backtrace for show_stack() could be
        printed.  This is because the top frame was determined in
        dump_stack() on those archs while generic dump_stack() can't do that
        reliably.  It can be compensated by inlining dump_stack() but not
        sure whether that'd be necessary.
      
      * Most archs didn't use to print debug info on dump_stack().  They do
        now.
      
      An example WARN dump follows.
      
       WARNING: at kernel/workqueue.c:4841 init_workqueues+0x35/0x505()
       Hardware name: empty
       Modules linked in:
       CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #9
        0000000000000009 ffff88007c861e08 ffffffff81c614dc ffff88007c861e48
        ffffffff8108f50f ffffffff82228240 0000000000000040 ffffffff8234a03c
        0000000000000000 0000000000000000 0000000000000000 ffff88007c861e58
       Call Trace:
        [<ffffffff81c614dc>] dump_stack+0x19/0x1b
        [<ffffffff8108f50f>] warn_slowpath_common+0x7f/0xc0
        [<ffffffff8108f56a>] warn_slowpath_null+0x1a/0x20
        [<ffffffff8234a071>] init_workqueues+0x35/0x505
        ...
      
      v2: CPU number added to the generic debug info as requested by s390
          folks and dropped the s390 specific dump_stack().  This loses %ksp
          from the debug message which the maintainers think isn't important
          enough to keep the s390-specific dump_stack() implementation.
      
          dump_stack_print_info() is moved to kernel/printk.c from
          lib/dump_stack.c.  Because linkage is per objecct file,
          dump_stack_print_info() living in the same lib file as generic
          dump_stack() means that archs which implement custom dump_stack()
          - at this point, only blackfin - can't use dump_stack_print_info()
          as that will bring in the generic version of dump_stack() too.  v1
          The v1 patch broke build on blackfin due to this issue.  The build
          breakage was reported by Fengguang Wu.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Acked-by: NVineet Gupta <vgupta@synopsys.com>
      Acked-by: NJesper Nilsson <jesper.nilsson@axis.com>
      Acked-by: NVineet Gupta <vgupta@synopsys.com>
      Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	[s390 bits]
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Acked-by: Richard Kuo <rkuo@codeaurora.org>		[hexagon bits]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      196779b9
  10. 30 4月, 2013 4 次提交
  11. 26 4月, 2013 8 次提交
  12. 23 4月, 2013 1 次提交
  13. 19 4月, 2013 1 次提交
  14. 17 4月, 2013 2 次提交