1. 10 Mar 2009, 1 commit
    • percpu: make x86 addr <-> pcpu ptr conversion macros generic · e0100983
      Committed by Tejun Heo
      Impact: generic addr <-> pcpu ptr conversion macros
      
      There's nothing arch specific about x86 __addr_to_pcpu_ptr() and
      __pcpu_ptr_to_addr().  With proper __per_cpu_load and __per_cpu_start
      defined, they'll do the right thing regardless of actual layout.
      
      Move these macros from arch/x86/include/asm/percpu.h to mm/percpu.c
      and allow archs to override them as necessary.
      Signed-off-by: Tejun Heo <tj@kernel.org>
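
      A minimal sketch of the generic defaults this moves into mm/percpu.c,
      assuming the pcpu_base_addr/__per_cpu_start layout described above
      (paraphrased, not the verbatim patch):

        /* Generic fallbacks; an arch with a different percpu layout can
         * define these in its asm/percpu.h and they will be used instead. */
        #ifndef __addr_to_pcpu_ptr
        #define __addr_to_pcpu_ptr(addr)                                   \
                (void *)((unsigned long)(addr)                             \
                         - (unsigned long)pcpu_base_addr                   \
                         + (unsigned long)__per_cpu_start)
        #define __pcpu_ptr_to_addr(ptr)                                    \
                (void *)((unsigned long)(ptr)                              \
                         + (unsigned long)pcpu_base_addr                   \
                         - (unsigned long)__per_cpu_start)
        #endif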
  2. 08 Mar 2009, 1 commit
    • x86: UV: remove uv_flush_tlb_others() WARN_ON · 3a450de1
      Committed by Cliff Wickman
      In uv_flush_tlb_others() (arch/x86/kernel/tlb_uv.c),
      the "WARN_ON(!in_atomic())" fails if CONFIG_PREEMPT is not enabled.
      
      And CONFIG_PREEMPT is not enabled by default in the distribution that
      most UV owners will use.
      
      We could #ifdef CONFIG_PREEMPT the warning, but that is not good form.
      And there seems to be no suitable fix to in_atomic() when CONFIG_PREEMPT
      is not on.
      
      As Ingo commented:
      
        > and we have no proper primitive to test for atomicity. (mainly
        > because we dont know about atomicity on a non-preempt kernel)
      
      So we drop the WARN_ON.
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
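
      A hedged sketch of the false positive (spin_lock() and in_atomic()
      are real kernel APIs; the scenario is illustrative): with
      CONFIG_PREEMPT=n, spinlocks don't touch the preempt count, so the
      test misfires in process context.

        spin_lock(&lock);          /* atomic context either way */
        WARN_ON(!in_atomic());     /* in_atomic() tests preempt_count(),
                                    * which CONFIG_PREEMPT=n spinlocks
                                    * never raise - spurious warning   */
        spin_unlock(&lock);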
  3. 06 Mar 2009, 4 commits
    • x86, percpu: setup reserved percpu area for x86_64 · 6b19b0c2
      Committed by Tejun Heo
      Impact: fix relocation overflow during module load
      
      x86_64 uses 32-bit relocations for symbol access, so static percpu
      symbols, whether in core or modules, must be within 2GB of the percpu
      segment base, which the dynamic percpu allocator doesn't guarantee.
      This patch makes x86_64 reserve PERCPU_MODULE_RESERVE bytes in the
      first chunk so that module percpu areas are always allocated from the
      first chunk which is always inside the relocatable range.
      
      This problem exists for any percpu allocator but is easily triggered
      when using the embedding allocator because the second chunk is located
      beyond 2GB on it.
      
      This patch also changes the meaning of PERCPU_DYNAMIC_RESERVE so
      that it only indicates the size of the area to reserve for dynamic
      allocation, as the static and dynamic areas can now be separate.
      PERCPU_DYNAMIC_RESERVE is increased by 4k for both 32 and 64 bits,
      since the reserved-area separation eats away some allocatable space
      and slightly more headroom (currently between 4k and 8k after a
      minimal boot, sans module area) makes sense for common-case
      performance.
      
      x86_32 can address any location from anywhere and doesn't need a
      reserved area.
      
      Mike Galbraith first reported the problem and bisected it to the
      embedding percpu allocator commit.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Mike Galbraith <efault@gmx.de>
      Reported-by: Jaswinder Singh Rajput <jaswinder@kernel.org>
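
      To make the 2GB constraint concrete, a small self-contained sketch
      (userspace C, purely illustrative; reachable_with_s32_reloc is a
      made-up helper) of the check a signed 32-bit relocation implies:

        #include <stdint.h>
        #include <stdbool.h>

        /* x86_64 reaches a percpu symbol as base + signed 32-bit offset;
         * anything farther than 2GB from the segment base overflows the
         * relocation when the module is loaded. */
        static bool reachable_with_s32_reloc(uintptr_t base, uintptr_t sym)
        {
                intptr_t off = (intptr_t)(sym - base);
                return off >= INT32_MIN && off <= INT32_MAX;
        }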
    • percpu, module: implement reserved allocation and use it for module percpu variables · edcb4639
      Committed by Tejun Heo
      Impact: add reserved allocation functionality and use it for module
      	percpu variables
      
      This patch implements reserved allocation from the first chunk.  When
      setting up the first chunk, an arch can ask to set aside a certain
      number of bytes right after the core static area, available only
      through a separate reserved allocator.  This will be used primarily
      for module static percpu variables on architectures with a limited
      relocation range, to ensure that module percpu symbols stay inside
      the relocatable range.
      
      If a reserved area is requested, the first chunk becomes reserved and
      isn't available for regular allocation.  If the first chunk also
      includes a piggy-backed dynamic allocation area, a separate chunk
      mapping the same region is created to serve dynamic allocation.  The
      first is called the static first chunk and the second the dynamic
      first chunk.  Although they share the page map, their different area
      map initializations guarantee they serve disjoint areas according to
      their purposes.
      
      If an arch doesn't set up a reserved area, reserved allocation is
      handled like any other allocation.
      Signed-off-by: Tejun Heo <tj@kernel.org>
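
      A sketch of the interface this adds (prototype as described above;
      the percpu_modalloc() wiring is a plausible reconstruction, not the
      verbatim patch):

        /* mm/percpu.c: allocate from the first chunk's reserved region;
         * if the arch set up no reserved area, this behaves like a
         * regular percpu allocation. */
        void *__alloc_reserved_percpu(size_t size, size_t align);

        /* kernel/module.c (conceptual): module static percpu variables
         * go through the reserved allocator so they always land in the
         * first chunk, inside the relocatable range. */
        static void *percpu_modalloc(unsigned long size, unsigned long align)
        {
                return __alloc_reserved_percpu(size, align);
        }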
    • x86: make embedding percpu allocator return excessive free space · 9a4f8a87
      Committed by Tejun Heo
      Impact: reduce unnecessary memory usage on certain configurations
      
      The embedding percpu allocator allocates unit_size *
      smp_num_possible_cpus() bytes consecutively and uses it for the first
      chunk.  However, if the static area is small, this can result in
      excessive preallocated free space in the first chunk due to the
      PCPU_MIN_UNIT_SIZE restriction.
      
      This patch makes the embedding percpu allocator preallocate only
      what's necessary, as described by PERCPU_DYNAMIC_RESERVE, and return
      the leftover to the bootmem allocator.
      Signed-off-by: Tejun Heo <tj@kernel.org>
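
      A toy model of the sizing decision (the constants are hypothetical
      placeholders, not the kernel's values):

        #include <stddef.h>

        #define PCPU_MIN_UNIT_SIZE      (32 * 1024)  /* hypothetical */
        #define PERCPU_DYNAMIC_RESERVE  (20 * 1024)  /* hypothetical */

        /* Per-unit surplus once static + dynamic needs are covered.
         * Previously the whole PCPU_MIN_UNIT_SIZE-rounded unit stayed in
         * the first chunk; with this patch the surplus is returned to
         * the bootmem allocator. */
        static size_t pcpu_embed_leftover(size_t static_size)
        {
                size_t need = static_size + PERCPU_DYNAMIC_RESERVE;
                return need < PCPU_MIN_UNIT_SIZE
                        ? PCPU_MIN_UNIT_SIZE - need : 0;
        }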
    • percpu: use negative for auto for pcpu_setup_first_chunk() arguments · cafe8816
      Committed by Tejun Heo
      Impact: argument semantic cleanup
      
      In pcpu_setup_first_chunk(), zero @unit_size and @dyn_size meant
      auto-sizing.  That's okay for @unit_size, as 0 doesn't make sense
      there, but a dynamic reserve size of 0 is valid.  Also, if an arch's
      @dyn_size is calculated from other parameters, it might end up
      passing in 0 @dyn_size and malfunction when the size is automatically
      adjusted.
      
      This patch makes both @unit_size and @dyn_size ssize_t and uses -1
      for auto-sizing.
      Signed-off-by: Tejun Heo <tj@kernel.org>
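
      The resulting signature looks roughly like this (paraphrased from
      the description; the callback typedef names are assumptions):

        /* -1 for @dyn_size or @unit_size now means "size automatically";
         * 0 becomes a legitimate, explicit dynamic reserve size. */
        size_t pcpu_setup_first_chunk(pcpu_get_page_fn_t get_page_fn,
                                      size_t static_size,
                                      size_t reserved_size,
                                      ssize_t dyn_size,
                                      ssize_t unit_size,
                                      void *base_addr,
                                      pcpu_populate_pte_fn_t populate_pte_fn);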
  4. 04 Mar 2009, 2 commits
  5. 03 Mar 2009, 12 commits
  6. 02 Mar 2009, 12 commits
    • xen: deal with virtually mapped percpu data · 9976b39b
      Committed by Jeremy Fitzhardinge
      The virtually mapped percpu space causes us two problems:
      
       - for hypercalls which take an mfn, we need to do a full pagetable
         walk to convert the percpu va into an mfn, and
      
       - when a hypercall requires a page to be mapped RO via all its aliases,
         we need to make sure it's RO in both the percpu mapping and in the
         linear mapping
      
      This primarily affects the gdt and the vcpu info structure.
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Xen-devel <xen-devel@lists.xensource.com>
      Cc: Gerd Hoffmann <kraxel@redhat.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Tejun Heo <htejun@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
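
      For the first problem, a hedged sketch of the difference
      (virt_to_mfn() and arbitrary_virt_to_machine() are real Xen helpers;
      the usage shown is illustrative):

        /* Linear-map data: va -> mfn is plain arithmetic. */
        unsigned long mfn = virt_to_mfn(linear_ptr);

        /* Virtually mapped percpu data is not in the linear map, so
         * resolving it to a machine address takes a pagetable walk. */
        xmaddr_t maddr = arbitrary_virt_to_machine(percpu_ptr);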
    • x86: add forward decl for tss_struct · 2fb6b2a0
      Committed by Jeremy Fitzhardinge
      It's the correct thing to do before using the struct in a prototype.
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
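
      A minimal illustration (the prototype is hypothetical): without the
      forward declaration, a struct first named inside a prototype gets
      prototype-local scope and the compiler warns about it.

        struct tss_struct;                            /* forward declaration */
        void io_bitmap_copy(struct tss_struct *tss);  /* now unambiguous     */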
    • x86: unify chunks of kernel/process*.c · 389d1fb1
      Committed by Jeremy Fitzhardinge
      With x86-32 and -64 using the same mechanism for managing the
      tss io permissions bitmap, large chunks of process*.c are
      trivially unifiable, including:
      
       - exit_thread
       - flush_thread
       - __switch_to_xtra (along with tsc enable/disable)
      
      and as bonus pickups:
      
       - sys_fork
       - sys_vfork
      
      (Note: asmlinkage expands to empty on x86-64)
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86-32: use non-lazy io bitmap context switching · db949bba
      Committed by Jeremy Fitzhardinge
      Impact: remove 32-bit optimization to prepare for unification
      
      x86-32 and -64 differ in the way they context-switch tasks
      with io permission bitmaps.  x86-64 simply copies the next
      task's io bitmap into place (if any) on context switch.  x86-32
      invalidates the bitmap on context switch, so that the next
      IO instruction will fault; at that point it installs the
      appropriate IO bitmap.
      
      This makes context-switching IO-bitmap-using tasks a bit less
      expensive, at the cost of making the next IO instruction slower
      due to the extra fault.  This tradeoff only makes sense if
      IO-bitmap-using processes are relatively common, but they don't
      actually use IO instructions very often.
      
      However, in a typical desktop system, the only process likely
      to be using IO bitmaps is the X server, and nothing at all on
      a server.  Therefore the lazy context switch doesn't really win
      all that much, and it's just a gratuitous difference from the
      64-bit code.
      
      This patch removes the lazy context switch, with a view to
      unifying this code in a later change.
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
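
      A hedged paraphrase of the eager path being adopted (modeled on the
      x86-64 __switch_to_xtra() behavior; field names follow the kernel of
      that era, but this is a sketch, not the actual diff):

        /* On every context switch, copy the incoming task's IO bitmap
         * into the TSS up to the longer of the two tasks' high-water
         * marks, instead of invalidating it and faulting it in lazily. */
        if (test_tsk_thread_flag(next_p, TIF_IO_BITMAP))
                memcpy(tss->io_bitmap, next->io_bitmap_ptr,
                       max(prev->io_bitmap_max, next->io_bitmap_max));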
    • x86_32: apic/numaq_32, fix section mismatch · b6122b38
      Committed by Jiri Slaby
      Remove the __cpuinitdata section placement for the translation_table
      structure, since it is referenced from functions within .text.
      Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
      Cc: Jiri Slaby <jirislaby@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Cc: H. Peter Anvin <hpa@zytor.com>
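
      A hedged illustration of the mismatch class fixed here and in the
      next two commits (identifiers simplified):

        /* before (section mismatch): __cpuinitdata is discarded once the
         * CPUs are up, yet code in plain .text may still dereference it:
         *
         *     static int translation_table[16] __cpuinitdata;
         *
         * after (fixed): plain .data, resident for the kernel's lifetime */
        static int translation_table[16];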
    • x86_32: apic/summit_32, fix section mismatch · 2fcb1f1f
      Committed by Jiri Slaby
      Remove __init section placement for some functions/data, so that
      we don't get section mismatch warnings.

      Also make setup_summit an inline function instead of an empty macro.
      
      [v2]
      One of them was not caught by
      DEBUG_SECTION_MISMATCH=y
      magic. Fix it.
      Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
      Cc: Jiri Slaby <jirislaby@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Cc: H. Peter Anvin <hpa@zytor.com>
    • x86_32: apic/es7000_32, fix section mismatch · 871d78c6
      Committed by Jiri Slaby
      Remove __init section placement for some functions, so that we don't
      get section mismatch warnings.
      
      [v2]:
      Two of them were not caught by the DEBUG_SECTION_MISMATCH=y
      magic.  Fix them.
      Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
      Cc: Jiri Slaby <jirislaby@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Cc: H. Peter Anvin <hpa@zytor.com>
    • x86_32: apic/summit_32, fix cpu_mask_to_apicid · fae176d6
      Committed by Jiri Slaby
      Perform same-cluster checking even for masks with all (nr_cpu_ids)
      bits set, and report the correct apicid on success instead.

      While at it, convert it to for_each_cpu and the newer cpumask API.
      Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
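
      A hedged sketch of the shape of the converted check (for_each_cpu,
      per_cpu, x86_cpu_to_apicid and BAD_APICID are real kernel APIs;
      APIC_CLUSTER() and the accumulation are simplifications of the
      summit/es7000 logic, not the actual code):

        /* Every cpu in @mask must fall in one APIC cluster; otherwise the
         * result can't be expressed as a single cluster apicid. */
        static unsigned int cluster_mask_to_apicid(const struct cpumask *mask)
        {
                unsigned int apicid = 0, cpu;

                for_each_cpu(cpu, mask) {
                        unsigned int a = per_cpu(x86_cpu_to_apicid, cpu);

                        if (apicid && APIC_CLUSTER(a) != APIC_CLUSTER(apicid))
                                return BAD_APICID;      /* spans clusters */
                        apicid |= a;
                }
                return apicid;
        }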
    • x86_32: apic/es7000_32, fix cpu_mask_to_apicid · 0edc0b32
      Committed by Jiri Slaby
      Perform same-cluster checking even for masks with all (nr_cpu_ids)
      bits set and report BAD_APICID on failure.
      
      While at it, convert it to for_each_cpu.
      Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86_32: apic/es7000_32, cpu_mask_to_apicid cleanup · c2b20cbd
      Committed by Jiri Slaby
      Remove es7000_cpu_mask_to_apicid_cluster completely, because it's
      almost the same as es7000_cpu_mask_to_apicid except for two code
      paths.  One of them is about to be removed soon; the other should
      return BAD_APICID (it's a failure path).
      
      The _cluster one was not invoked on apic->cpu_mask_to_apicid_and
      anyway, since there was no _cluster_and variant.
      
      Also use newer cpumask functions.
      Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86_32: apic/bigsmp_32, de-inline functions · 9694cd6c
      Committed by Jiri Slaby
      The ones which go only into struct apic are de-inlined by the
      compiler anyway, so remove the inline specifier from them.

      Afterwards, remove bigsmp_setup_portio_remap completely, as it
      is unused.
      Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, mm: dont use non-temporal stores in pagecache accesses · f1800536
      Committed by Ingo Molnar
      Impact: standardize IO on cached ops
      
      On modern CPUs it is almost always a bad idea to use non-temporal
      stores, as the regression addressed in this commit has shown:

        30d697fa: x86: fix performance regression in write() syscall
      
      The kernel simply has no good information about whether using non-temporal
      stores is a good idea or not - and trying to add heuristics only increases
      complexity and inserts fragility.
      
      The regression on cached write()s took very long to be found - over
      two years.  So don't take any chances and let the hardware decide
      how it makes use of its caches.
      
      The only exception is drivers/gpu/drm/i915/i915_gem.c: there we are
      absolutely sure that another entity (the GPU) will pick up the dirty
      data immediately and that the CPU will not touch that data before
      the GPU does.
      
      Also, keep the _nocache() primitives to make it easier for people to
      experiment with these details. There may be more clear-cut cases where
      non-cached copies can be used, outside of filemap.c.
      
      Cc: Salman Qazi <sqazi@google.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
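
      The two primitives in question, for reference (both are real kernel
      APIs; the surrounding variables are illustrative):

        /* Cached copy: what pagecache writes use after this commit. */
        left = __copy_from_user_inatomic(kaddr, buf, bytes);

        /* Non-temporal copy, kept for explicit opt-in users like i915:
         * stores bypass the cache, which only pays off when the CPU is
         * guaranteed not to read the data back soon. */
        left = __copy_from_user_inatomic_nocache(kaddr, buf, bytes);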
  7. 01 Mar 2009, 3 commits
    • Revert "gpu/drm, x86, PAT: PAT support for io_mapping_*" · f5c1aa15
      Committed by Ingo Molnar
      This reverts commit 17581ad8.
      
      Sitsofe Wheeler reported that /dev/dri/card0 is MIA on his EeePC 900
      and bisected it to this commit.
      
      Graphics card is an i915 in an EeePC 900:
      
       00:02.0 VGA compatible controller [0300]:
         Intel Corporation Mobile 915GM/GMS/910GML
           Express Graphics Controller [8086:2592] (rev 04)
      
      ( Most likely the ioremap() of the driver failed and hence the card
        did not initialize. )
      Reported-by: Sitsofe Wheeler <sitsofe@yahoo.com>
      Bisected-by: Sitsofe Wheeler <sitsofe@yahoo.com>
      Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • bootmem, x86: further fixes for arch-specific bootmem wrapping · d0c4f570
      Committed by Tejun Heo
      Impact: fix new breakages introduced by the previous fix

      Commit c1329375 tried to clean up the bootmem arch wrapper, but it
      wasn't quite correct.  Before the commit, the following were broken:
      
      * Low-level interface functions prefixed with __ ignored the arch
        preference.

      * reserve_bootmem(...) can't be mapped into
        reserve_bootmem_node(NODE_DATA(0)->bdata, ...) because the node is
        not a preference here.  The region specified MUST fall into the
        specified node; otherwise, it will panic.
      
      After the commit,
      
      * If allocation fails for the arch-preferred node, it should fall
        back to whatever is available.  Instead, it simply failed the
        allocation.
      
      There are too many internal details to allow generic wrapping and
      still keep things simple for archs.  Plus, all that an arch wants is
      a way to prefer a certain node over another.
      
      This patch drops the generic wrapping around alloc_bootmem_core()
      and adds alloc_bootmem_core() instead.  If necessary, an arch can
      define a bootmem_arch_preferred_node() macro or function which takes
      all the allocation information and returns the preferred node.  The
      generic bootmem code will always try the preferred node first and
      then fall back to other nodes as usual.
      
      Breakages noted and changes reviewed by Johannes Weiner.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
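
      A hedged sketch of the arch hook as described (the guard macro and
      parameter list are paraphrased from mm/bootmem.c of that era, not
      quoted):

        #ifdef CONFIG_HAVE_ARCH_BOOTMEM
        /* Arch-provided: given the allocation parameters, return the
         * bootmem data of the node to try first, or NULL for no
         * preference; generic code then falls back to other nodes. */
        bootmem_data_t *bootmem_arch_preferred_node(bootmem_data_t *bdata,
                                                    unsigned long size,
                                                    unsigned long align,
                                                    unsigned long goal,
                                                    unsigned long limit);
        #endif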
    • x86: remove double copy of show_cpuinfo_core for 32 and 64 bit · 327f4387
      Committed by Jaswinder Singh Rajput
      Impact: unification
      
      show_cpuinfo_core is identical for 32 and 64 bit and can be unified,
      and CONFIG_X86_HT inherently depends on CONFIG_X86_SMP.
      Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  8. 28 Feb 2009, 5 commits