1. 24 9月, 2009 2 次提交
    • R
      x86: Reduce verbosity of "PAT enabled" kernel message · e23a8b6a
      Roland Dreier 提交于
      On modern systems, the kernel prints the message
      
          x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
      
      once for every CPU.
      
      This gets kind of ridiculous on huge systems; for example, on a
      64-thread system I was lucky enough to get:
      
          dmesg| grep 'PAT enabled' | wc
               64     704    5174
      
      There is already a BUG() if non-boot CPUs have PAT capabilities
      that don't match the boot CPU, so just print the message on the
      boot CPU. (I kept the print after the wrmsrl() that enables PAT,
      so that the log output continues to mean that the system survived
      enabling PAT on the boot CPU)
      Signed-off-by: NRoland Dreier <rolandd@cisco.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      LKML-Reference: <adavdj92sso.fsf@cisco.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e23a8b6a
    • R
      cpumask: use mm_cpumask() wrapper: x86 · 78f1c4d6
      Rusty Russell 提交于
      Makes code futureproof against the impending change to mm->cpu_vm_mask (to be a pointer).
      
      It's also a chance to use the new cpumask_ ops which take a pointer
      (the older ones are deprecated, but there's no hurry for arch code).
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      78f1c4d6
  2. 23 9月, 2009 5 次提交
  3. 22 9月, 2009 4 次提交
  4. 21 9月, 2009 1 次提交
    • I
      perf: Do the big rename: Performance Counters -> Performance Events · cdd6c482
      Ingo Molnar 提交于
      Bye-bye Performance Counters, welcome Performance Events!
      
      In the past few months the perfcounters subsystem has grown out its
      initial role of counting hardware events, and has become (and is
      becoming) a much broader generic event enumeration, reporting, logging,
      monitoring, analysis facility.
      
      Naming its core object 'perf_counter' and naming the subsystem
      'perfcounters' has become more and more of a misnomer. With pending
      code like hw-breakpoints support the 'counter' name is less and
      less appropriate.
      
      All in one, we've decided to rename the subsystem to 'performance
      events' and to propagate this rename through all fields, variables
      and API names. (in an ABI compatible fashion)
      
      The word 'event' is also a bit shorter than 'counter' - which makes
      it slightly more convenient to write/handle as well.
      
      Thanks goes to Stephane Eranian who first observed this misnomer and
      suggested a rename.
      
      User-space tooling and ABI compatibility is not affected - this patch
      should be function-invariant. (Also, defconfigs were not touched to
      keep the size down.)
      
      This patch has been generated via the following script:
      
        FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')
      
        sed -i \
          -e 's/PERF_EVENT_/PERF_RECORD_/g' \
          -e 's/PERF_COUNTER/PERF_EVENT/g' \
          -e 's/perf_counter/perf_event/g' \
          -e 's/nb_counters/nb_events/g' \
          -e 's/swcounter/swevent/g' \
          -e 's/tpcounter_event/tp_event/g' \
          $FILES
      
        for N in $(find . -name perf_counter.[ch]); do
          M=$(echo $N | sed 's/perf_counter/perf_event/g')
          mv $N $M
        done
      
        FILES=$(find . -name perf_event.*)
      
        sed -i \
          -e 's/COUNTER_MASK/REG_MASK/g' \
          -e 's/COUNTER/EVENT/g' \
          -e 's/\<event\>/event_id/g' \
          -e 's/counter/event/g' \
          -e 's/Counter/Event/g' \
          $FILES
      
      ... to keep it as correct as possible. This script can also be
      used by anyone who has pending perfcounters patches - it converts
      a Linux kernel tree over to the new naming. We tried to time this
      change to the point in time where the amount of pending patches
      is the smallest: the end of the merge window.
      
      Namespace clashes were fixed up in a preparatory patch - and some
      stylistic fallout will be fixed up in a subsequent patch.
      
      ( NOTE: 'counters' are still the proper terminology when we deal
        with hardware registers - and these sed scripts are a bit
        over-eager in renaming them. I've undone some of that, but
        in case there's something left where 'counter' would be
        better than 'event' we can undo that on an individual basis
        instead of touching an otherwise nicely automated patch. )
      Suggested-by: NStephane Eranian <eranian@google.com>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NPaul Mackerras <paulus@samba.org>
      Reviewed-by: NArjan van de Ven <arjan@linux.intel.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: <linux-arch@vger.kernel.org>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cdd6c482
  5. 20 9月, 2009 1 次提交
  6. 18 9月, 2009 1 次提交
    • S
      x86, pat: don't use rb-tree based lookup in reserve_memtype() · dcb73bf4
      Suresh Siddha 提交于
      Recent enhancement of rb-tree based lookup exposed a  bug with the lookup
      mechanism in the reserve_memtype() which ensures that there are no conflicting
      memtype requests for the memory range.
      
      memtype_rb_search() returns an entry which has a start address <= new start
      address. And from here we traverse the linear linked list to check if there
      any conflicts with the existing mappings. As the rbtree is based on the
      start address of the memory range, it is quite possible that we have several
      overlapped mappings whose start address is much less than new requested start
      but the end is >= new requested end. This results in conflicting memtype
      mappings.
      
      Same bug exists with the old code which uses cached_entry from where
      we traverse the linear linked list. But the new rb-tree code exposes this
      bug fairly easily.
      
      For now, don't use the memtype_rb_search() and always start the search from
      the head of linear linked list in reserve_memtype(). Linear linked list
      for most of the systems grow's to few 10's of entries(as we track memory type
      of RAM pages using struct page). So we should be ok for now.
      
      We still retain the rbtree and use it to speed up free_memtype() which
      doesn't have the same bug(as we know what exactly we are searching for
      in free_memtype).
      
      Also use list_for_each_entry_from() in free_memtype() so that we start
      the search from rb-tree lookup result.
      Reported-by: NMarkus Trippelsdorf <markus@trippelsdorf.de>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      LKML-Reference: <1253136483.4119.12.camel@sbs-t61.sc.intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      dcb73bf4
  7. 16 9月, 2009 1 次提交
  8. 12 9月, 2009 1 次提交
    • E
      agp/intel: Fix the pre-9xx chipset flush. · e517a5e9
      Eric Anholt 提交于
      Ever since we enabled GEM, the pre-9xx chipsets (particularly 865) have had
      serious stability issues.  Back in May a wbinvd was added to the DRM to
      work around much of the problem.  Some failure remained -- easily visible
      by dragging a window around on an X -retro desktop, or by looking at bugzilla.
      
      The chipset flush was on the right track -- hitting the right amount of
      memory, and it appears to be the only way to flush on these chipsets, but the
      flush page was mapped uncached.  As a result, the writes trying to clear the
      writeback cache ended up bypassing the cache, and not flushing anything!  The
      wbinvd would flush out other writeback data and often cause the data we wanted
      to get flushed, but not always.  By removing the setting of the page to UC
      and instead just clflushing the data we write to try to flush it, we get the
      desired behavior with no wbinvd.
      
      This exports clflush_cache_range(), which was laying around and happened to
      basically match the code I was otherwise going to copy from the DRM.
      Signed-off-by: NEric Anholt <eric@anholt.net>
      Signed-off-by: NBrice Goglin <Brice.Goglin@ens-lyon.org>
      Cc: stable@kernel.org
      e517a5e9
  9. 11 9月, 2009 2 次提交
  10. 10 9月, 2009 3 次提交
    • A
      x86: Export kmap_atomic_to_page() · 256cd2ef
      Avi Kivity 提交于
      Needed by KVM.
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      256cd2ef
    • J
      xen: make -fstack-protector work under Xen · 577eebea
      Jeremy Fitzhardinge 提交于
      -fstack-protector uses a special per-cpu "stack canary" value.
      gcc generates special code in each function to test the canary to make
      sure that the function's stack hasn't been overrun.
      
      On x86-64, this is simply an offset of %gs, which is the usual per-cpu
      base segment register, so setting it up simply requires loading %gs's
      base as normal.
      
      On i386, the stack protector segment is %gs (rather than the usual kernel
      percpu %fs segment register).  This requires setting up the full kernel
      GDT and then loading %gs accordingly.  We also need to make sure %gs is
      initialized when bringing up secondary cpus too.
      
      To keep things consistent, we do the full GDT/segment register setup on
      both architectures.
      
      Because we need to avoid -fstack-protected code before setting up the GDT
      and because there's no way to disable it on a per-function basis, several
      files need to have stack-protector inhibited.
      
      [ Impact: allow Xen booting with stack-protector enabled ]
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      577eebea
    • J
      x86, pat: Fix cacheflush address in change_page_attr_set_clr() · fa526d0d
      Jack Steiner 提交于
      Fix address passed to cpa_flush_range() when changing page
      attributes from WB to UC. The address (*addr) is
      modified by __change_page_attr_set_clr(). The result is that
      the pages being flushed start at the _end_ of the changed range
      instead of the beginning.
      
      This should be considered for 2.6.30-stable and 2.6.31-stable.
      Signed-off-by: NJack Steiner <steiner@sgi.com>
      Acked-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: Stable team <stable@kernel.org>
      fa526d0d
  11. 06 9月, 2009 2 次提交
  12. 04 9月, 2009 1 次提交
    • P
      kmemleak: Don't scan uninitialized memory when kmemcheck is enabled · 8e019366
      Pekka Enberg 提交于
      Ingo Molnar reported the following kmemcheck warning when running both
      kmemleak and kmemcheck enabled:
      
        PM: Adding info for No Bus:vcsa7
        WARNING: kmemcheck: Caught 32-bit read from uninitialized memory
        (f6f6e1a4)
        d873f9f600000000c42ae4c1005c87f70000000070665f666978656400000000
         i i i i u u u u i i i i i i i i i i i i i i i i i i i i i u u u
                 ^
      
        Pid: 3091, comm: kmemleak Not tainted (2.6.31-rc7-tip #1303) P4DC6
        EIP: 0060:[<c110301f>] EFLAGS: 00010006 CPU: 0
        EIP is at scan_block+0x3f/0xe0
        EAX: f40bd700 EBX: f40bd780 ECX: f16b46c0 EDX: 00000001
        ESI: f6f6e1a4 EDI: 00000000 EBP: f10f3f4c ESP: c2605fcc
         DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
        CR0: 8005003b CR2: e89a4844 CR3: 30ff1000 CR4: 000006f0
        DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
        DR6: ffff4ff0 DR7: 00000400
         [<c110313c>] scan_object+0x7c/0xf0
         [<c1103389>] kmemleak_scan+0x1d9/0x400
         [<c1103a3c>] kmemleak_scan_thread+0x4c/0xb0
         [<c10819d4>] kthread+0x74/0x80
         [<c10257db>] kernel_thread_helper+0x7/0x3c
         [<ffffffff>] 0xffffffff
        kmemleak: 515 new suspected memory leaks (see
        /sys/kernel/debug/kmemleak)
        kmemleak: 42 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
      
      The problem here is that kmemleak will scan partially initialized
      objects that makes kmemcheck complain. Fix that up by skipping
      uninitialized memory regions when kmemcheck is enabled.
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Acked-by: NIngo Molnar <mingo@elte.hu>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>
      8e019366
  13. 27 8月, 2009 9 次提交
  14. 25 8月, 2009 1 次提交
  15. 22 8月, 2009 1 次提交
    • L
      x86: don't call '->send_IPI_mask()' with an empty mask · b04e6373
      Linus Torvalds 提交于
      As noted in 83d349f3 ("x86: don't send
      an IPI to the empty set of CPU's"), some APIC's will be very unhappy
      with an empty destination mask.  That commit added a WARN_ON() for that
      case, and avoided the resulting problem, but didn't fix the underlying
      reason for why those empty mask cases happened.
      
      This fixes that, by checking the result of 'cpumask_andnot()' of the
      current CPU actually has any other CPU's left in the set of CPU's to be
      sent a TLB flush, and not calling down to the IPI code if the mask is
      empty.
      
      The reason this started happening at all is that we started passing just
      the CPU mask pointers around in commit 4595f962 ("x86: change
      flush_tlb_others to take a const struct cpumask"), and when we did that,
      the cpumask was no longer thread-local.
      
      Before that commit, flush_tlb_mm() used to create it's own copy of
      'mm->cpu_vm_mask' and pass that copy down to the low-level flush
      routines after having tested that it was not empty.  But after changing
      it to just pass down the CPU mask pointer, the lower level TLB flush
      routines would now get a pointer to that 'mm->cpu_vm_mask', and that
      could still change - and become empty - after the test due to other
      CPU's having flushed their own TLB's.
      
      See
      
      	http://bugzilla.kernel.org/show_bug.cgi?id=13933
      
      for details.
      Tested-by: NThomas Björnell <thomas.bjornell@gmail.com>
      Cc: stable@kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b04e6373
  16. 21 8月, 2009 1 次提交
  17. 18 8月, 2009 1 次提交
    • S
      x86, pat: Allow ISA memory range uncacheable mapping requests · 1adcaafe
      Suresh Siddha 提交于
      Max Vozeler reported:
      >  Bug 13877 -  bogl-term broken with CONFIG_X86_PAT=y, works with =n
      >
      >  strace of bogl-term:
      >  814   mmap2(NULL, 65536, PROT_READ|PROT_WRITE, MAP_SHARED, 4, 0)
      >				 = -1 EAGAIN (Resource temporarily unavailable)
      >  814   write(2, "bogl: mmaping /dev/fb0: Resource temporarily unavailable\n",
      >	       57) = 57
      
      PAT code maps the ISA memory range as WB in the PAT attribute, so that
      fixed range MTRR registers define the actual memory type (UC/WC/WT etc).
      
      But the upper level is_new_memtype_allowed() API checks are failing,
      as the request here is for UC and the return tracked type is WB (Tracked type is
      WB as MTRR type for this legacy range potentially will be different for each
      4k page).
      
      Fix is_new_memtype_allowed() by always succeeding the ISA address range
      checks, as the null PAT (WB) and def MTRR fixed range register settings
      satisfy the memory type needs of the applications that map the ISA address
      range.
      Reported-and-Tested-by: NMax Vozeler <xam@debian.org>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      1adcaafe
  18. 14 8月, 2009 1 次提交
  19. 04 8月, 2009 2 次提交