1. 17 4月, 2013 10 次提交
    • K
      xen/smp: Unifiy some of the PVs and PVHVM offline CPU path · b12abaa1
      Konrad Rzeszutek Wilk 提交于
      The "xen_cpu_die" and "xen_hvm_cpu_die" are very similar.
      Lets coalesce them.
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      b12abaa1
    • K
      xen/smp/pvhvm: Don't initialize IRQ_WORKER as we are using the native one. · 27d8b207
      Konrad Rzeszutek Wilk 提交于
      There is no need to use the PV version of the IRQ_WORKER mechanism
      as under PVHVM we are using the native version. The native
      version is using the SMP API.
      
      They just sit around unused:
      
        69:          0          0  xen-percpu-ipi       irqwork0
        83:          0          0  xen-percpu-ipi       irqwork1
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      27d8b207
    • K
      xen/spinlock: Disable IRQ spinlock (PV) allocation on PVHVM · 70dd4998
      Konrad Rzeszutek Wilk 提交于
      See git commit f10cd522
      (xen: disable PV spinlocks on HVM) for details.
      
      But we did not disable it everywhere - which means that when
      we boot as PVHVM we end up allocating per-CPU irq line for
      spinlock. This fixes that.
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      70dd4998
    • K
      xen/spinlock: Check against default value of -1 for IRQ line. · cb9c6f15
      Konrad Rzeszutek Wilk 提交于
      The default (uninitialized) value of the IRQ line is -1.
      Check if we already have allocated an spinlock interrupt line
      and if somebody is trying to do it again. Also set it to -1
      when we offline the CPU.
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      cb9c6f15
    • K
      xen/time: Add default value of -1 for IRQ and check for that. · ef35a4e6
      Konrad Rzeszutek Wilk 提交于
      If the timer interrupt has been de-init or is just now being
      initialized, the default value of -1 should be preset as
      interrupt line. Check for that and if something is odd
      WARN us.
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      ef35a4e6
    • K
      xen/time: Fix kasprintf splat when allocating timer%d IRQ line. · 7918c92a
      Konrad Rzeszutek Wilk 提交于
      When we online the CPU, we get this splat:
      
      smpboot: Booting Node 0 Processor 1 APIC 0x2
      installing Xen timer for CPU 1
      BUG: sleeping function called from invalid context at /home/konrad/ssd/konrad/linux/mm/slab.c:3179
      in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/1
      Pid: 0, comm: swapper/1 Not tainted 3.9.0-rc6upstream-00001-g3884fad #1
      Call Trace:
       [<ffffffff810c1fea>] __might_sleep+0xda/0x100
       [<ffffffff81194617>] __kmalloc_track_caller+0x1e7/0x2c0
       [<ffffffff81303758>] ? kasprintf+0x38/0x40
       [<ffffffff813036eb>] kvasprintf+0x5b/0x90
       [<ffffffff81303758>] kasprintf+0x38/0x40
       [<ffffffff81044510>] xen_setup_timer+0x30/0xb0
       [<ffffffff810445af>] xen_hvm_setup_cpu_clockevents+0x1f/0x30
       [<ffffffff81666d0a>] start_secondary+0x19c/0x1a8
      
      The solution to that is use kasprintf in the CPU hotplug path
      that 'online's the CPU. That is, do it in in xen_hvm_cpu_notify,
      and remove the call to in xen_hvm_setup_cpu_clockevents.
      
      Unfortunatly the later is not a good idea as the bootup path
      does not use xen_hvm_cpu_notify so we would end up never allocating
      timer%d interrupt lines when booting. As such add the check for
      atomic() to continue.
      
      CC: stable@vger.kernel.org
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      7918c92a
    • K
      xen/smp/spinlock: Fix leakage of the spinlock interrupt line for every CPU online/offline · 66ff0fe9
      Konrad Rzeszutek Wilk 提交于
      While we don't use the spinlock interrupt line (see for details
      commit f10cd522 -
      xen: disable PV spinlocks on HVM) - we should still do the proper
      init / deinit sequence. We did not do that correctly and for the
      CPU init for PVHVM guest we would allocate an interrupt line - but
      failed to deallocate the old interrupt line.
      
      This resulted in leakage of an irq_desc but more importantly this splat
      as we online an offlined CPU:
      
      genirq: Flags mismatch irq 71. 0002cc20 (spinlock1) vs. 0002cc20 (spinlock1)
      Pid: 2542, comm: init.late Not tainted 3.9.0-rc6upstream #1
      Call Trace:
       [<ffffffff811156de>] __setup_irq+0x23e/0x4a0
       [<ffffffff81194191>] ? kmem_cache_alloc_trace+0x221/0x250
       [<ffffffff811161bb>] request_threaded_irq+0xfb/0x160
       [<ffffffff8104c6f0>] ? xen_spin_trylock+0x20/0x20
       [<ffffffff813a8423>] bind_ipi_to_irqhandler+0xa3/0x160
       [<ffffffff81303758>] ? kasprintf+0x38/0x40
       [<ffffffff8104c6f0>] ? xen_spin_trylock+0x20/0x20
       [<ffffffff810cad35>] ? update_max_interval+0x15/0x40
       [<ffffffff816605db>] xen_init_lock_cpu+0x3c/0x78
       [<ffffffff81660029>] xen_hvm_cpu_notify+0x29/0x33
       [<ffffffff81676bdd>] notifier_call_chain+0x4d/0x70
       [<ffffffff810bb2a9>] __raw_notifier_call_chain+0x9/0x10
       [<ffffffff8109402b>] __cpu_notify+0x1b/0x30
       [<ffffffff8166834a>] _cpu_up+0xa0/0x14b
       [<ffffffff816684ce>] cpu_up+0xd9/0xec
       [<ffffffff8165f754>] store_online+0x94/0xd0
       [<ffffffff8141d15b>] dev_attr_store+0x1b/0x20
       [<ffffffff81218f44>] sysfs_write_file+0xf4/0x170
       [<ffffffff811a2864>] vfs_write+0xb4/0x130
       [<ffffffff811a302a>] sys_write+0x5a/0xa0
       [<ffffffff8167ada9>] system_call_fastpath+0x16/0x1b
      cpu 1 spinlock event irq -16
      smpboot: Booting Node 0 Processor 1 APIC 0x2
      
      And if one looks at the /proc/interrupts right after
      offlining (CPU1):
      
        70:          0          0  xen-percpu-ipi       spinlock0
        71:          0          0  xen-percpu-ipi       spinlock1
        77:          0          0  xen-percpu-ipi       spinlock2
      
      There is the oddity of the 'spinlock1' still being present.
      
      CC: stable@vger.kernel.org
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      66ff0fe9
    • K
      xen/smp: Fix leakage of timer interrupt line for every CPU online/offline. · 888b65b4
      Konrad Rzeszutek Wilk 提交于
      In the PVHVM path when we do CPU online/offline path we would
      leak the timer%d IRQ line everytime we do a offline event. The
      online path (xen_hvm_setup_cpu_clockevents via
      x86_cpuinit.setup_percpu_clockev) would allocate a new interrupt
      line for the timer%d.
      
      But we would still use the old interrupt line leading to:
      
      kernel BUG at /home/konrad/ssd/konrad/linux/kernel/hrtimer.c:1261!
      invalid opcode: 0000 [#1] SMP
      RIP: 0010:[<ffffffff810b9e21>]  [<ffffffff810b9e21>] hrtimer_interrupt+0x261/0x270
      .. snip..
       <IRQ>
       [<ffffffff810445ef>] xen_timer_interrupt+0x2f/0x1b0
       [<ffffffff81104825>] ? stop_machine_cpu_stop+0xb5/0xf0
       [<ffffffff8111434c>] handle_irq_event_percpu+0x7c/0x240
       [<ffffffff811175b9>] handle_percpu_irq+0x49/0x70
       [<ffffffff813a74a3>] __xen_evtchn_do_upcall+0x1c3/0x2f0
       [<ffffffff813a760a>] xen_evtchn_do_upcall+0x2a/0x40
       [<ffffffff8167c26d>] xen_hvm_callback_vector+0x6d/0x80
       <EOI>
       [<ffffffff81666d01>] ? start_secondary+0x193/0x1a8
       [<ffffffff81666cfd>] ? start_secondary+0x18f/0x1a8
      
      There is also the oddity (timer1) in the /proc/interrupts after
      offlining CPU1:
      
        64:       1121          0  xen-percpu-virq      timer0
        78:          0          0  xen-percpu-virq      timer1
        84:          0       2483  xen-percpu-virq      timer2
      
      This patch fixes it.
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      CC: stable@vger.kernel.org
      888b65b4
    • J
      xen: drop tracking of IRQ vector · dec02dea
      Jan Beulich 提交于
      For quite a few Xen versions, this wasn't the IRQ vector anymore
      anyway, and it is not being used by the kernel for anything. Hence
      drop the field from struct irq_info, and respective function
      parameters.
      Signed-off-by: NJan Beulich <jbeulich@suse.com>
      Acked-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      dec02dea
    • D
      x86/xen: populate boot_params with EDD data · 96f28bc6
      David Vrabel 提交于
      During early setup of a dom0 kernel, populate boot_params with the
      Enhanced Disk Drive (EDD) and MBR signature data.  This makes
      information on the BIOS boot device available in /sys/firmware/edd/.
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      Acked-by: NJan Beulich <jbeulich@suse.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      96f28bc6
  2. 08 4月, 2013 4 次提交
  3. 07 4月, 2013 1 次提交
  4. 06 4月, 2013 1 次提交
    • J
      x86: Fix rebuild with EFI_STUB enabled · 91870824
      Jan Beulich 提交于
      eboot.o and efi_stub_$(BITS).o didn't get added to "targets", and hence
      their .cmd files don't get included by the build machinery, leading to
      the files always getting rebuilt.
      
      Rather than adding the two files individually, take the opportunity and
      add $(VMLINUX_OBJS) to "targets" instead, thus allowing the assignment
      at the top of the file to be shrunk quite a bit.
      
      At the same time, remove a pointless flags override line - the variable
      assigned to was misspelled anyway, and the options added are
      meaningless for assembly sources.
      
      [ hpa: the patch is not minimal, but I am taking it for -urgent anyway
        since the excess impact of the patch seems to be small enough. ]
      Signed-off-by: NJan Beulich <jbeulich@suse.com>
      Link: http://lkml.kernel.org/r/515C5D2502000078000CA6AD@nat28.tlf.novell.com
      Cc: Matthew Garrett <mjg@redhat.com>
      Cc: Matt Fleming <matt.fleming@intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      91870824
  5. 05 4月, 2013 6 次提交
  6. 03 4月, 2013 7 次提交
    • P
      ARM: 7690/1: mm: fix CONFIG_LPAE typos · 4e1db26a
      Paul Bolle 提交于
      CONFIG_LPAE doesn't exist: the correct option is CONFIG_ARM_LPAE, so fix
      up the two typos under arch/arm/.
      
      The fix to head.S is slightly scary, but this is just for setting up
      an early io-mapping for the serial port when running on a big-endian,
      LPAE system. Since these systems don't exist in the wild (at least, I
      have no access to one outside of kvmtool, which doesn't provide a serial
      port suitable for earlyprintk), then we can revisit the code later if it
      causes any problems.
      Signed-off-by: NPaul Bolle <pebolle@tiscali.nl>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      4e1db26a
    • R
      ARM: 7689/1: add unwind annotations to ftrace asm · b21e023b
      Rabin Vincent 提交于
      Add unwind annotations to the ftrace assembly code so that the function
      tracer's stacktracing options (func_stack_trace, etc.) work when
      CONFIG_ARM_UNWIND is enabled.
      Signed-off-by: NRabin Vincent <rabin@rab.in>
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      b21e023b
    • W
      ARM: 7685/1: delay: use private ticks_per_jiffy field for timer-based delay ops · 6f3d90e5
      Will Deacon 提交于
      Commit 70264367 ("ARM: 7653/2: do not scale loops_per_jiffy when
      using a constant delay clock") fixed a problem with our timer-based
      delay loop, where loops_per_jiffy is scaled by cpufreq yet used directly
      by the timer delay ops.
      
      This patch fixes the problem in a more elegant way by keeping a private
      ticks_per_jiffy field in the delay ops, independent of loops_per_jiffy
      and therefore not subject to scaling. The loop-based delay continues to
      use loops_per_jiffy directly, as it should.
      Acked-by: NNicolas Pitre <nico@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      6f3d90e5
    • C
      ARM: 7684/1: errata: Workaround for Cortex-A15 erratum 798181 (TLBI/DSB operations) · 93dc6887
      Catalin Marinas 提交于
      On Cortex-A15 (r0p0..r3p2) the TLBI/DSB are not adequately shooting down
      all use of the old entries. This patch implements the erratum workaround
      which consists of:
      
      1. Dummy TLBIMVAIS and DSB on the CPU doing the TLBI operation.
      2. Send IPI to the CPUs that are running the same mm (and ASID) as the
         one being invalidated (or all the online CPUs for global pages).
      3. CPU receiving the IPI executes a DMB and CLREX (part of the exception
         return code already).
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      93dc6887
    • R
      ARM: 7682/1: cache-l2x0: fix masking of RTL revision numbering and set_debug init · 6e7aceeb
      Rob Herring 提交于
      Commit b8db6b88 (ARM: 7547/4: cache-l2x0: add support for Aurora L2 cache
      ctrl) moved the masking of the part ID which caused the RTL version to be
      lost. Commit 6248d060 (ARM: 7545/1: cache-l2x0: make outer_cache_fns a
      field of l2x0_of_data) changed how .set_debug is initialized. Both commits
      break commit 74ddcdb8 (ARM: 7608/1: l2x0: Only set .set_debug
      on PL310 r3p0 and earlier) which uses the RTL version to conditionally set
      .set_debug function pointer. Commit b8db6b88 also caused the printed cache
      ID to be missing the version information.
      
      Fix this by reverting how the part number is masked so the RTL version
      info is maintained. The cache-id-part DT property does not set the RTL
      bits so masking them should have no effect. Also, re-arrange the order
      of the function pointer init so the .set_debug function can be overridden.
      Reported-by: NPaolo Pisati <paolo.pisati@canonical.com>
      Signed-off-by: NRob Herring <rob.herring@calxeda.com>
      Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
      Cc: Yehuda Yitschak <yehuday@marvell.com>
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      6e7aceeb
    • R
      ARM: iWMMXt: always enable iWMMXt support with PJ4 CPUs · 698613b6
      Russell King 提交于
      Jason Cooper reports these build errors:
      arch/arm/kernel/built-in.o: In function `iwmmxt_do':
      /.../arch/arm/kernel/pj4-cp0.c:36: undefined reference to `iwmmxt_task_release'
      /.../arch/arm/kernel/pj4-cp0.c:40: undefined reference to `iwmmxt_task_switch'
      make: *** [vmlinux] Error 1
      
      This is caused because the PJ4 code explicitly references the iWMMXt
      code, but doesn't require it to be built.  Fix this by ensuring that
      iWMMXt is always enabled with PJ4.
      Reported-by: NJason Cooper <jason@lakedaemon.net>
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      698613b6
    • P
      x86: remove the x32 syscall bitmask from syscall_get_nr() · 8b4b9f27
      Paul Moore 提交于
      Commit fca460f9 simplified the x32
      implementation by creating a syscall bitmask, equal to 0x40000000, that
      could be applied to x32 syscalls such that the masked syscall number
      would be the same as a x86_64 syscall.  While that patch was a nice
      way to simplify the code, it went a bit too far by adding the mask to
      syscall_get_nr(); returning the masked syscall numbers can cause
      confusion with callers that expect syscall numbers matching the x32
      ABI, e.g. unmasked syscall numbers.
      
      This patch fixes this by simply removing the mask from syscall_get_nr()
      while preserving the other changes from the original commit.  While
      there are several syscall_get_nr() callers in the kernel, most simply
      check that the syscall number is greater than zero, in this case this
      patch will have no effect.  Of those remaining callers, they appear
      to be few, seccomp and ftrace, and from my testing of seccomp without
      this patch the original commit definitely breaks things; the seccomp
      filter does not correctly filter the syscalls due to the difference in
      syscall numbers in the BPF filter and the value from syscall_get_nr().
      Applying this patch restores the seccomp BPF filter functionality on
      x32.
      
      I've tested this patch with the seccomp BPF filters as well as ftrace
      and everything looks reasonable to me; needless to say general usage
      seemed fine as well.
      Signed-off-by: NPaul Moore <pmoore@redhat.com>
      Link: http://lkml.kernel.org/r/20130215172143.12549.10292.stgit@localhost
      Cc: <stable@vger.kernel.org>
      Cc: Will Drewry <wad@chromium.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      8b4b9f27
  7. 02 4月, 2013 2 次提交
    • H
      s390/mm: provide emtpy check_pgt_cache() function · 765a0cac
      Heiko Carstens 提交于
      All architectures need to provide a check_pgt_cache() function. The s390 one
      got lost somewhere.
      So reintroduce it to prevent future compile errors e.g. if Thomas Gleixner's
      idle loop rework patches get merged.
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      765a0cac
    • H
      s390/uaccess: fix page table walk · ea81531d
      Heiko Carstens 提交于
      When translating user space addresses to kernel addresses the follow_table()
      function had two bugs:
      
      - PROT_NONE mappings could be read accessed via the kernel mapping. That is
        e.g. putting a filename into a user page, then protecting the page with
        PROT_NONE and afterwards issuing the "open" syscall with a pointer to
        the filename would incorrectly succeed.
      
      - when walking the page tables it used the pgd/pud/pmd/pte primitives which
        with dynamic page tables give no indication which real level of page tables
        is being walked (region2, region3, segment or page table). So in case of an
        exception the translation exception code passed to __handle_fault() is not
        necessarily correct.
        This is not really an issue since __handle_fault() doesn't evaluate the code.
        Only in case of e.g. a SIGBUS this code gets passed to user space. If user
        space can do something sane with the value is a different question though.
      
      To fix these issues don't use any Linux primitives. Only walk the page tables
      like the hardware would do it, however we leave quite some checks away since
      we know that we only have full size page tables and each index is within bounds.
      
      In theory this should fix all issues...
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Reviewed-by: NGerald Schaefer <gerald.schaefer@de.ibm.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      ea81531d
  8. 31 3月, 2013 1 次提交
  9. 30 3月, 2013 2 次提交
  10. 29 3月, 2013 6 次提交