1. 19 12月, 2012 1 次提交
  2. 16 12月, 2012 1 次提交
  3. 15 11月, 2012 1 次提交
  4. 30 10月, 2012 4 次提交
    • J
      x86-64/efi: Use EFI to deal with platform wall clock (again) · bd52276f
      Jan Beulich 提交于
      Other than ix86, x86-64 on EFI so far didn't set the
      {g,s}et_wallclock accessors to the EFI routines, thus
      incorrectly using raw RTC accesses instead.
      
      Simply removing the #ifdef around the respective code isn't
      enough, however: While so far early get-time calls were done in
      physical mode, this doesn't work properly for x86-64, as virtual
      addresses would still need to be set up for all runtime regions
      (which wasn't the case on the system I have access to), so
      instead the patch moves the call to efi_enter_virtual_mode()
      ahead (which in turn allows to drop all code related to calling
      efi-get-time in physical mode).
      
      Additionally the earlier calling of efi_set_executable()
      requires the CPA code to cope, i.e. during early boot it must be
      avoided to call cpa_flush_array(), as the first thing this
      function does is a BUG_ON(irqs_disabled()).
      
      Also make the two EFI functions in question here static -
      they're not being referenced elsewhere.
      
      History:
      
          This commit was originally merged as bacef661 ("x86-64/efi:
          Use EFI to deal with platform wall clock") but it resulted in some
          ASUS machines no longer booting due to a firmware bug, and so was
          reverted in f026cfa8. A pre-emptive fix for the buggy ASUS
          firmware was merged in 03a1c254975e ("x86, efi: 1:1 pagetable
          mapping for virtual EFI calls") so now this patch can be
          reapplied.
      Signed-off-by: NJan Beulich <jbeulich@suse.com>
      Tested-by: NMatt Fleming <matt.fleming@intel.com>
      Acked-by: NMatthew Garrett <mjg@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: Matt Fleming <matt.fleming@intel.com> [added commit history]
      bd52276f
    • M
      x86, efi: 1:1 pagetable mapping for virtual EFI calls · 185034e7
      Matt Fleming 提交于
      Some firmware still needs a 1:1 (virt->phys) mapping even after we've
      called SetVirtualAddressMap(). So install the mapping alongside our
      existing kernel mapping whenever we make EFI calls in virtual mode.
      
      This bug was discovered on ASUS machines where the firmware
      implementation of GetTime() accesses the RTC device via physical
      addresses, even though that's bogus per the UEFI spec since we've
      informed the firmware via SetVirtualAddressMap() that the boottime
      memory map is no longer valid.
      
      This bug seems to be present in a lot of consumer devices, so there's
      not a lot we can do about this spec violation apart from workaround
      it.
      
      Cc: JérômeCarretero <cJ-ko@zougloub.eu>
      Cc: Vasco Dias <rafa.vasco@gmail.com>
      Acked-by: NJan Beulich <jbeulich@suse.com>
      Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
      185034e7
    • M
      x86/ce4100: Fix reboot by forcing the reboot method to be KBD · d7959916
      Maxime Bizon 提交于
      The default reboot is via ACPI for this platform, and the CEFDK
      bootloader actually supports this, but will issue a system power
      off instead of a real reboot. Setting the reboot method to be
      KBD instead of ACPI ensures proper system reboot.
      Acked-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: NMaxime Bizon <mbizon@freebox.fr>
      Signed-off-by: NFlorian Fainelli <ffainelli@freebox.fr>
      Cc: rui.zhang@intel.com
      Cc: alan@linux.intel.com
      Link: http://lkml.kernel.org/r/1351518020-25556-3-git-send-email-ffainelli@freebox.frSigned-off-by: NIngo Molnar <mingo@kernel.org>
      d7959916
    • F
      x86/ce4100: Fix pm_poweroff · f49f4ab9
      Florian Fainelli 提交于
      The CE4100 platform is currently missing a proper pm_poweroff
      implementation leading to poweroff making the CPU spin forever
      and the CE4100 platform does not enter a low-power mode where
      the external Power Management Unit can properly power off the
      system. Power off on this platform is implemented pretty much
      like reboot, by writing to the SoC built-in 8051 microcontroller
      mapped at I/O port 0xcf9, the value 0x4.
      Signed-off-by: NFlorian Fainelli <ffainelli@freebox.fr>
      Acked-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: rui.zhang@intel.com
      Cc: alan@linux.intel.com
      Link: http://lkml.kernel.org/r/1351518020-25556-2-git-send-email-ffainelli@freebox.frSigned-off-by: NIngo Molnar <mingo@kernel.org>
      f49f4ab9
  5. 26 10月, 2012 1 次提交
  6. 25 10月, 2012 1 次提交
  7. 24 10月, 2012 1 次提交
    • M
      x86/efi: Fix oops caused by incorrect set_memory_uc() usage · 3e8fa263
      Matt Fleming 提交于
      Calling __pa() with an ioremap'd address is invalid. If we
      encounter an efi_memory_desc_t without EFI_MEMORY_WB set in
      ->attribute we currently call set_memory_uc(), which in turn
      calls __pa() on a potentially ioremap'd address.
      
      On CONFIG_X86_32 this results in the following oops:
      
        BUG: unable to handle kernel paging request at f7f22280
        IP: [<c10257b9>] reserve_ram_pages_type+0x89/0x210
        *pdpt = 0000000001978001 *pde = 0000000001ffb067 *pte = 0000000000000000
        Oops: 0000 [#1] PREEMPT SMP
        Modules linked in:
      
        Pid: 0, comm: swapper Not tainted 3.0.0-acpi-efi-0805 #3
         EIP: 0060:[<c10257b9>] EFLAGS: 00010202 CPU: 0
         EIP is at reserve_ram_pages_type+0x89/0x210
         EAX: 0070e280 EBX: 38714000 ECX: f7814000 EDX: 00000000
         ESI: 00000000 EDI: 38715000 EBP: c189fef0 ESP: c189fea8
         DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
        Process swapper (pid: 0, ti=c189e000 task=c18bbe60 task.ti=c189e000)
        Stack:
         80000200 ff108000 00000000 c189ff00 00038714 00000000 00000000 c189fed0
         c104f8ca 00038714 00000000 00038715 00000000 00000000 00038715 00000000
         00000010 38715000 c189ff48 c1025aff 38715000 00000000 00000010 00000000
        Call Trace:
         [<c104f8ca>] ? page_is_ram+0x1a/0x40
         [<c1025aff>] reserve_memtype+0xdf/0x2f0
         [<c1024dc9>] set_memory_uc+0x49/0xa0
         [<c19334d0>] efi_enter_virtual_mode+0x1c2/0x3aa
         [<c19216d4>] start_kernel+0x291/0x2f2
         [<c19211c7>] ? loglevel+0x1b/0x1b
         [<c19210bf>] i386_start_kernel+0xbf/0xc8
      
      The only time we can call set_memory_uc() for a memory region is
      when it is part of the direct kernel mapping. For the case where
      we ioremap a memory region we must leave it alone.
      
      This patch reimplements the fix from e8c71062 ("x86, efi:
      Calling __pa() with an ioremap()ed address is invalid") which
      was reverted in e1ad783b because it caused a regression on
      some MacBooks (they hung at boot). The regression was caused
      because the commit only marked EFI_RUNTIME_SERVICES_DATA as
      E820_RESERVED_EFI, when it should have marked all regions that
      have the EFI_MEMORY_RUNTIME attribute.
      
      Despite first impressions, it's not possible to use
      ioremap_cache() to map all cached memory regions on
      CONFIG_X86_64 because of the way that the memory map might be
      configured as detailed in the following bug report,
      
      	https://bugzilla.redhat.com/show_bug.cgi?id=748516
      
      e.g. some of the EFI memory regions *need* to be mapped as part
      of the direct kernel mapping.
      Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
      Cc: Matthew Garrett <mjg@redhat.com>
      Cc: Zhang Rui <rui.zhang@intel.com>
      Cc: Huang Ying <huang.ying.caritas@gmail.com>
      Cc: Keith Packard <keithp@keithp.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1350649546-23541-1-git-send-email-matt@console-pimps.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      3e8fa263
  8. 30 9月, 2012 3 次提交
  9. 17 9月, 2012 1 次提交
  10. 15 8月, 2012 1 次提交
  11. 01 8月, 2012 5 次提交
  12. 13 7月, 2012 1 次提交
  13. 28 6月, 2012 1 次提交
    • A
      x86/flush_tlb: try flush_tlb_single one by one in flush_tlb_range · e7b52ffd
      Alex Shi 提交于
      x86 has no flush_tlb_range support in instruction level. Currently the
      flush_tlb_range just implemented by flushing all page table. That is not
      the best solution for all scenarios. In fact, if we just use 'invlpg' to
      flush few lines from TLB, we can get the performance gain from later
      remain TLB lines accessing.
      
      But the 'invlpg' instruction costs much of time. Its execution time can
      compete with cr3 rewriting, and even a bit more on SNB CPU.
      
      So, on a 512 4KB TLB entries CPU, the balance points is at:
      	(512 - X) * 100ns(assumed TLB refill cost) =
      		X(TLB flush entries) * 100ns(assumed invlpg cost)
      
      Here, X is 256, that is 1/2 of 512 entries.
      
      But with the mysterious CPU pre-fetcher and page miss handler Unit, the
      assumed TLB refill cost is far lower then 100ns in sequential access. And
      2 HT siblings in one core makes the memory access more faster if they are
      accessing the same memory. So, in the patch, I just do the change when
      the target entries is less than 1/16 of whole active tlb entries.
      Actually, I have no data support for the percentage '1/16', so any
      suggestions are welcomed.
      
      As to hugetlb, guess due to smaller page table, and smaller active TLB
      entries, I didn't see benefit via my benchmark, so no optimizing now.
      
      My micro benchmark show in ideal scenarios, the performance improves 70
      percent in reading. And in worst scenario, the reading/writing
      performance is similar with unpatched 3.4-rc4 kernel.
      
      Here is the reading data on my 2P * 4cores *HT NHM EP machine, with THP
      'always':
      
      multi thread testing, '-t' paramter is thread number:
      	       	        with patch   unpatched 3.4-rc4
      ./mprotect -t 1           14ns		24ns
      ./mprotect -t 2           13ns		22ns
      ./mprotect -t 4           12ns		19ns
      ./mprotect -t 8           14ns		16ns
      ./mprotect -t 16          28ns		26ns
      ./mprotect -t 32          54ns		51ns
      ./mprotect -t 128         200ns		199ns
      
      Single process with sequencial flushing and memory accessing:
      
      		       	with patch   unpatched 3.4-rc4
      ./mprotect		    7ns			11ns
      ./mprotect -p 4096  -l 8 -n 10240
      			    21ns		21ns
      
      [ hpa: http://lkml.kernel.org/r/1B4B44D9196EFF41AE41FDA404FC0A100BFF94@SHSMSX101.ccr.corp.intel.com
        has additional performance numbers. ]
      Signed-off-by: NAlex Shi <alex.shi@intel.com>
      Link: http://lkml.kernel.org/r/1340845344-27557-3-git-send-email-alex.shi@intel.comSigned-off-by: NH. Peter Anvin <hpa@zytor.com>
      e7b52ffd
  14. 25 6月, 2012 3 次提交
    • C
      x86/uv: Work around UV2 BAU hangs · 8b6e511e
      Cliff Wickman 提交于
      On SGI's UV2 the BAU (Broadcast Assist Unit) driver can hang
      under a heavy load. To cure this:
      
      - Disable the UV2 extended status mode (see UV2_EXT_SHFT), as
        this mode changes BAU behavior in more ways then just delivering
        an extra bit of status.  Revert status to just two meaningful bits,
        like UV1.
      
      - Use no IPI-style resets on UV2.  Just give up the request for
        whatever the reason it failed and let it be accomplished with
        the legacy IPI method.
      
      - Use no alternate sending descriptor (the former UV2 workaround
        bcp->using_desc and handle_uv2_busy() stuff).  Just disable the
        use of the BAU for a period of time in favor of the legacy IPI
        method when the h/w bug leaves a descriptor busy.
      
        -- new tunable: giveup_limit determines the threshold at which a hub is
           so plugged that it should do all requests with the legacy IPI method for a
           period of time
        -- generalize disable_for_congestion() (renamed disable_for_period()) for
           use whenever a hub should avoid using the BAU for a period of time
      
      Also:
      
       - Fix find_another_by_swack(), which is part of the UV2 bug workaround
      
       - Correct and clarify the statistics (new stats s_overipilimit, s_giveuplimit,
         s_enters, s_ipifordisabled, s_plugged, s_congested)
      Signed-off-by: NCliff Wickman <cpw@sgi.com>
      Link: http://lkml.kernel.org/r/20120622131459.GC31884@sgi.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      8b6e511e
    • C
      x86/uv: Implement UV BAU runtime enable and disable control via /proc/sgi_uv/ · 26ef8577
      Cliff Wickman 提交于
      This patch enables the BAU to be turned on or off dynamically.
      
        echo "on"  > /proc/sgi_uv/ptc_statistics
        echo "off" > /proc/sgi_uv/ptc_statistics
      
      The system may be booted with or without the nobau option.
      
      Whether the system currently has the BAU off can be seen in
      the /proc file -- normally with the baustats script.
      Each cpu will have a 1 in the bauoff field if the BAU was turned
      off, so baustats will give a count of cpus that have it off.
      Signed-off-by: NCliff Wickman <cpw@sgi.com>
      Link: http://lkml.kernel.org/r/20120622131330.GB31884@sgi.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      26ef8577
    • C
      x86/uv: Fix the UV BAU destination timeout period · 11cab711
      Cliff Wickman 提交于
      Correct the calculation of a destination timeout period, which
      is used to distinguish between a destination timeout and the
      situation where all the target software ack resources are full
      and a request is returned immediately.
      
      The problem is that integer arithmetic was overflowing, yielding
      a very large result.
      
      Without this fix destination timeouts are identified as resource
      'plugged' events and an ipi method of resource releasing is
      unnecessarily employed.
      Signed-off-by: NCliff Wickman <cpw@sgi.com>
      Link: http://lkml.kernel.org/r/20120622131212.GA31884@sgi.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      11cab711
  15. 16 6月, 2012 1 次提交
  16. 14 6月, 2012 1 次提交
  17. 08 6月, 2012 2 次提交
    • C
      x86/uv: Fix UV2 BAU legacy mode · d5d2d2ee
      Cliff Wickman 提交于
      The SGI Altix UV2 BAU (Broadcast Assist Unit) as used for
      tlb-shootdown (selective broadcast mode) always uses UV2
      broadcast descriptor format. There is no need to clear the
      'legacy' (UV1) mode, because the hardware always uses UV2 mode
      for selective broadcast.
      
      But the BIOS uses general broadcast and legacy mode, and the
      hardware pays attention to the legacy mode bit for general
      broadcast. So the kernel must not clear that mode bit.
      Signed-off-by: NCliff Wickman <cpw@sgi.com>
      Cc: <stable@kernel.org>
      Link: http://lkml.kernel.org/r/E1SccoO-0002Lh-Cb@eag09.americas.sgi.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      d5d2d2ee
    • A
      x86/apic: Make cpu_mask_to_apicid() operations return error code · ff164324
      Alexander Gordeev 提交于
      Current cpu_mask_to_apicid() and cpu_mask_to_apicid_and()
      implementations have few shortcomings:
      
      1. A value returned by cpu_mask_to_apicid() is written to
      hardware registers unconditionally. Should BAD_APICID get ever
      returned it will be written to a hardware too. But the value of
      BAD_APICID is not universal across all hardware in all modes and
      might cause unexpected results, i.e. interrupts might get routed
      to CPUs that are not configured to receive it.
      
      2. Because the value of BAD_APICID is not universal it is
      counter- intuitive to return it for a hardware where it does not
      make sense (i.e. x2apic).
      
      3. cpu_mask_to_apicid_and() operation is thought as an
      complement to cpu_mask_to_apicid() that only applies a AND mask
      on top of a cpumask being passed. Yet, as consequence of 18374d89
      commit the two operations are inconsistent in that of:
        cpu_mask_to_apicid() should not get a offline CPU with the cpumask
        cpu_mask_to_apicid_and() should not fail and return BAD_APICID
      These limitations are impossible to realize just from looking at
      the operations prototypes.
      
      Most of these shortcomings are resolved by returning a error
      code instead of BAD_APICID. As the result, faults are reported
      back early rather than possibilities to cause a unexpected
      behaviour exist (in case of [1]).
      
      The only exception is setup_timer_IRQ0_pin() routine. Although
      obviously controversial to this fix, its existing behaviour is
      preserved to not break the fragile check_timer() and would
      better addressed in a separate fix.
      Signed-off-by: NAlexander Gordeev <agordeev@redhat.com>
      Acked-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Link: http://lkml.kernel.org/r/20120607131559.GF4759@dhcp-26-207.brq.redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      ff164324
  18. 06 6月, 2012 2 次提交
  19. 25 5月, 2012 1 次提交
  20. 18 5月, 2012 1 次提交
  21. 07 5月, 2012 2 次提交
  22. 05 5月, 2012 1 次提交
  23. 25 4月, 2012 1 次提交
  24. 28 3月, 2012 1 次提交
  25. 21 3月, 2012 2 次提交