1. 16 Jan, 2016 2 commits
    • x86/PCI: Add driver for Intel Volume Management Device (VMD) · 185a383a
      Keith Busch committed
      The Intel Volume Management Device (VMD) is a Root Complex Integrated
      Endpoint that acts as a host bridge to a secondary PCIe domain.  BIOS can
      reassign one or more Root Ports to appear within a VMD domain instead of
      the primary domain.  The immediate benefit is that additional PCIe domains
      allow more than 256 buses in a system by letting bus numbers be reused
      across different domains.
      
      VMD domains do not define ACPI _SEG, so to avoid domain clashing with host
      bridges defining this segment, VMD domains start at 0x10000, which is
      greater than the highest possible 16-bit ACPI defined _SEG.
      
      This driver enumerates and enables the domain using the root bus
      configuration interface provided by the PCI subsystem.  The driver provides
      configuration space accessor functions (pci_ops), bus and memory resources,
      an MSI IRQ domain with irq_chip implementation, and DMA operations
      necessary to use devices through the VMD endpoint's interface.
      
      VMD routes I/O as follows:
      
         1) Configuration Space: BAR 0 ("CFGBAR") of VMD provides the base
         address and size for configuration space register access to VMD-owned
         root ports.  It works similarly to MMCONFIG for extended configuration
         space.  Bus numbering is independent and does not conflict with the
         primary domain.
      
         2) MMIO Space: BARs 2 and 4 ("MEMBAR1" and "MEMBAR2") of VMD provide the
         base address, size, and type for MMIO register access.  These addresses
         are not translated by VMD hardware; they are simply reservations to be
         distributed to root ports' memory base/limit registers and subdivided
         among devices downstream.
      
         3) DMA: To interact appropriately with an IOMMU, the source ID of DMA
         read and write requests is translated to the bus-device-function of the
         VMD endpoint.  Otherwise, DMA operates normally without VMD-specific
         address translation.
      
         4) Interrupts: Part of VMD's BAR 4 is reserved for VMD's MSI-X Table and
         PBA.  MSIs from VMD domain devices and ports are remapped to appear as
         if they were issued using one of VMD's MSI-X table entries.  Each MSI
         and MSI-X address of VMD-owned devices and ports has a special format
         where the address refers to specific entries in the VMD's MSI-X table.
         As with DMA, the interrupt source ID is translated to VMD's
         bus-device-function.
      
         The driver provides its own MSI and MSI-X configuration functions
         specific to how MSI messages are used within the VMD domain, and
         provides an irq_chip for independent IRQ allocation to relay interrupts
         from VMD's interrupt handler to the appropriate device driver's handler.
      
         5) Errors: PCIe error messages are intercepted by the root ports as
         usual (e.g., AER), except that with VMD, system errors (i.e., firmware
         first) are disabled by default.  AER and hotplug interrupts are
         translated in the same way as endpoint interrupts.
      
         6) VMD does not support INTx interrupts or IO ports.  Devices or drivers
         requiring these features should either not be placed below VMD-owned
         root ports, or VMD should be disabled by BIOS for such endpoints.
      
      [bhelgaas: add VMD BAR #defines, factor out vmd_cfg_addr(), rework VMD
      resource setup, whitespace, changelog]
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de> (IRQ-related parts)
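The VMD driver commit above says CFGBAR works similarly to MMCONFIG and that VMD domains start at 0x10000. A minimal sketch of that style of addressing, assuming the usual ECAM layout (1 MB per bus, 4 KB per function). The helper name `cfg_offset`, the constants, and the bounds checks are illustrative assumptions, not code from the driver:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: ECAM-style layout, 1 MB per bus and 4 KB per function. */
#define CFG_BUS_SHIFT   20
#define CFG_DEVFN_SHIFT 12
#define CFG_FUNC_SIZE   4096u

/* First VMD domain number: one past the largest 16-bit ACPI _SEG value. */
#define VMD_FIRST_DOMAIN 0x10000u

/*
 * Hypothetical helper: byte offset of a config register inside CFGBAR.
 * Returns -1 if the access would fall outside the function's 4 KB window
 * or outside the BAR itself.
 */
static int64_t cfg_offset(uint64_t cfgbar_size, uint8_t bus, uint8_t devfn,
                          uint16_t reg, uint8_t len)
{
    uint64_t off;

    assert(len == 1 || len == 2 || len == 4);
    if ((uint32_t)reg + len > CFG_FUNC_SIZE)
        return -1;

    off = ((uint64_t)bus << CFG_BUS_SHIFT) |
          ((uint64_t)devfn << CFG_DEVFN_SHIFT) | reg;
    if (off + len > cfgbar_size)
        return -1;

    return (int64_t)off;
}
```

With a 32 MB CFGBAR, bus 1, device/function 0, register 0 lands at offset 0x100000; the same request against a 1 MB CFGBAR is rejected because only bus 0 fits.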
    • x86/PCI: Allow DMA ops specific to a PCI domain · d9c3d6ff
      Keith Busch committed
      The Intel Volume Management Device (VMD) is a PCIe endpoint that acts as a
      host bridge to another PCI domain.  When devices below the VMD perform DMA,
      the VMD replaces their DMA source IDs with its own source ID.  Therefore,
      those devices require special DMA ops.
      
      Add interfaces to allow the VMD driver to set up dma_ops for the devices
      below it.
      
      [bhelgaas: remove "extern", add "static", changelog]
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
  2. 22 Nov, 2015 7 commits
    • parisc: Map kernel text and data on huge pages · 41b85a11
      Helge Deller committed
      Adjust the linker script and map_pages() to map kernel text and data on
      physical 1MB huge/large pages.
      Signed-off-by: Helge Deller <deller@gmx.de>
    • parisc: Add Huge Page and HUGETLBFS support · 736d2169
      Helge Deller committed
      This patch adds huge page support to allow userspace to allocate huge
      pages and to use hugetlbfs filesystem on 32- and 64-bit Linux kernels.
      A later patch will add kernel support to map kernel text and data on
      huge pages.
      
      The only requirement is that the kernel be compiled for a
      PA8X00 CPU (PA2.0 architecture). Older PA1.X CPUs do not support
      variable page sizes. 64bit kernels are compiled for PA2.0 by default.
      
      Technically on parisc multiple physical huge pages may be needed to
      emulate standard 2MB huge pages.
      Signed-off-by: Helge Deller <deller@gmx.de>
    • parisc: Use long branch to do_syscall_trace_exit · 337685e5
      Helge Deller committed
      Use the 22bit instead of the 17bit branch instruction on a 64bit kernel
      to reach the do_syscall_trace_exit function from the gateway page.
      A huge page enabled kernel may need the additional branch distance bits.
      Signed-off-by: Helge Deller <deller@gmx.de>
    • parisc: Increase initial kernel mapping to 32MB on 64bit kernel · 332b42e4
      Helge Deller committed
      For the 64bit kernel, the initial 16 MB kernel mapping might become too
      small if you build a kernel with many modules built in and with kernel
      text and data areas mapped on huge pages.
      
      This patch increases the initial mapping to 32MB for 64bit kernels and
      keeps 16MB for 32bit kernels.
      Signed-off-by: Helge Deller <deller@gmx.de>
    • parisc: Initialize the fault vector earlier in the boot process. · 4182d0cd
      Helge Deller committed
      A fault vector on parisc needs to be 2K aligned.  Furthermore the
      checksum of the fault vector needs to sum up to 0 which is being
      calculated and written at runtime.
      
      Up to now we aligned both PA20 and PA11 fault vectors on the same 4K
      page in order to easily write the checksum after having mapped the
      kernel read-only (by mapping this page only as read-write).
      But when we want to map the kernel text and data on huge pages this
      makes things harder.
      So, simplify it by aligning both fault vectors on 2K boundaries and writing
      the checksum before we map the page read-only.
      Signed-off-by: Helge Deller <deller@gmx.de>
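The fault vector commit above requires that the vector's checksum sum to 0 and that it be written at runtime. A minimal sketch of that kind of zero-sum checksum, assuming the checksum covers an array of 32-bit words with one slot reserved for the checksum value. The commit does not say which words are covered or where the slot lives, so `write_checksum`, `sum_words`, and the `slot` parameter are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum all words modulo 2^32. */
static uint32_t sum_words(const uint32_t *words, size_t n)
{
    uint32_t sum = 0;

    for (size_t i = 0; i < n; i++)
        sum += words[i];
    return sum;
}

/*
 * Hypothetical helper: store a value in words[slot] such that the sum of
 * all n words becomes 0 modulo 2^32. The table must still be writable
 * when this runs, which is why the ordering against the read-only
 * mapping matters.
 */
static void write_checksum(uint32_t *words, size_t n, size_t slot)
{
    assert(slot < n);
    words[slot] = 0;
    words[slot] = (uint32_t)(0u - sum_words(words, n));
}
```

The sketch only illustrates the arithmetic; the point of the commit is the ordering, that is, doing this write before the page is mapped read-only.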
    • parisc: Add defines for Huge page support · 1f25ad26
      Helge Deller committed
      Huge pages on parisc will have the same size as one pmd table: 2MB on
      a 64bit kernel with 4K kernel pages, and 4MB on a 32bit kernel with 4K
      kernel pages.
      
      Since parisc does not physically support a 2MB huge page size, emulate
      it with two consecutive 1MB pages instead. Keeping the same huge
      page size as one pmd will allow us to add transparent huge page support
      later on.
      
      Bit 21 in the pte flags was unused and will now be used to mark a page
      as huge page (_PAGE_HPAGE_BIT).
      Signed-off-by: Helge Deller <deller@gmx.de>
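The huge page defines commit above emulates a 2MB huge page with two consecutive 1MB pages. A minimal sketch of that split, assuming both the virtual and physical addresses are 2MB aligned. The `hw_mapping` structure and `split_huge_mapping` helper are hypothetical and stand in for whatever the kernel actually inserts into the TLB:

```c
#include <assert.h>
#include <stdint.h>

#define HW_HUGE_SHIFT 20u                          /* 1 MB hardware page */
#define HW_HUGE_SIZE  (UINT64_C(1) << HW_HUGE_SHIFT)
#define EMU_HUGE_SIZE (2 * HW_HUGE_SIZE)           /* 2 MB logical huge page */

/* Hypothetical description of one hardware-sized mapping. */
struct hw_mapping {
    uint64_t vaddr;
    uint64_t paddr;
};

/* Split one aligned 2 MB logical mapping into two consecutive 1 MB ones. */
static void split_huge_mapping(uint64_t vaddr, uint64_t paddr,
                               struct hw_mapping out[2])
{
    assert(vaddr % EMU_HUGE_SIZE == 0);
    assert(paddr % EMU_HUGE_SIZE == 0);

    for (int i = 0; i < 2; i++) {
        out[i].vaddr = vaddr + (uint64_t)i * HW_HUGE_SIZE;
        out[i].paddr = paddr + (uint64_t)i * HW_HUGE_SIZE;
    }
}
```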
    • parisc: Drop unused MADV_xxxK_PAGES flags from asm/mman.h · dcbf0d29
      Helge Deller committed
      Drop the MADV_xxK_PAGES flags, which were never used and were from a proposed
      API which was never integrated into the generic Linux kernel code.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Helge Deller <deller@gmx.de>
  3. 20 Nov, 2015 6 commits
  4. 19 Nov, 2015 3 commits
  5. 18 Nov, 2015 6 commits
    • arm64: Fix R/O permissions in mark_rodata_ro · 0b2aa5b8
      Laura Abbott committed
      The permissions in mark_rodata_ro trigger a build error
      with STRICT_MM_TYPECHECKS. Fix this by introducing
      PAGE_KERNEL_ROX for the same reasons as PAGE_KERNEL_RO.
      From Ard:
      
      "PAGE_KERNEL_EXEC has PTE_WRITE set as well, making the range
      writeable under the ARMv8.1 DBM feature, that manages the
      dirty bit in hardware (writing to a page with the PTE_RDONLY
      and PTE_WRITE bits both set will clear the PTE_RDONLY bit in that case)"
      Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
      Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: crypto: reduce priority of core AES cipher · 08c6781c
      Ard Biesheuvel committed
      The asynchronous, merged implementations of AES in CBC, CTR and XTS
      modes are preferred when available (i.e., when instantiating ablkciphers
      explicitly). However, the synchronous core AES cipher combined with the
      generic CBC mode implementation will produce a 'cbc(aes)' blkcipher that
      is callable asynchronously as well. To prevent this implementation from
      being used when the accelerated asynchronous implementation is also
      available, lower its priority to 250 (i.e., below the asynchronous
      module's priority of 300).
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
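The crypto commit above relies on priority to decide which 'cbc(aes)' implementation is used. A minimal sketch of priority-based selection, using the two priority values from the commit message (250 and 300). The structure, the `pick_impl` helper, and the driver labels in the usage note are hypothetical and are not the kernel crypto API:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record for one registered implementation of an algorithm. */
struct cipher_impl {
    const char *driver_name;
    int priority;
};

/* Return the candidate with the highest priority, or NULL if there is none. */
static const struct cipher_impl *pick_impl(const struct cipher_impl *c, size_t n)
{
    const struct cipher_impl *best = NULL;

    assert(c != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (best == NULL || c[i].priority > best->priority)
            best = &c[i];
    }
    return best;
}
```

Given a candidate list holding a generic CBC wrapper around the core cipher at priority 250 and a merged asynchronous implementation at priority 300, `pick_impl` returns the latter, which is the outcome the commit aims for.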
    • arm64: use non-global mappings for UEFI runtime regions · 65da0a8e
      Ard Biesheuvel committed
      As pointed out by Russell King in response to the proposed ARM version
      of this code, the sequence to switch between the UEFI runtime mapping
      and current's actual userland mapping (and vice versa) is potentially
      unsafe, since it leaves a time window between the switch to the new
      page tables and the TLB flush where speculative accesses may hit on
      stale global TLB entries.
      
      So instead, use non-global mappings, and perform the switch via the
      ordinary ASID-aware context switch routines.
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Reviewed-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • ARM: dts: imx27.dtsi: change the clock information for usb · facf47ee
      Peter Chen committed
      On imx27, the controller needs three clocks to work. The old code
      is wrong, and usbmisc no longer includes clock handling code.
      Without this patch, accessing usbmisc registers causes the data
      abort below.
      
      usbcore: registered new interface driver usb-storage
      Unhandled fault: external abort on non-linefetch (0x008) at 0xf4424600
      pgd = c0004000
      [f4424600] *pgd=10000452(bad)
      Internal error: : 8 [#1] PREEMPT ARM
      Modules linked in:
      CPU: 0 PID: 1 Comm: swapper Not tainted 4.1.0-next-20150701-dirty #3089
      Hardware name: Freescale i.MX27 (Device Tree Support)
      task: c7832b60 ti: c783e000 task.ti: c783e000
      PC is at usbmisc_imx27_init+0x4c/0xbc
      LR is at usbmisc_imx27_init+0x40/0xbc
      pc : [<c03cb5c0>]    lr : [<c03cb5b4>]    psr: 60000093
      sp : c783fe08  ip : 00000000  fp : 00000000
      r10: c0576434  r9 : 0000009c  r8 : c7a773a0
      r7 : 01000000  r6 : 60000013  r5 : c7a776f0  r4 : c7a773f0
      r3 : f4424600  r2 : 00000000  r1 : 00000001  r0 : 00000001
      Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
      Control: 0005317f  Table: a0004000  DAC: 00000017
      Process swapper (pid: 1, stack limit = 0xc783e190)
      Stack: (0xc783fe08 to 0xc7840000)
      Signed-off-by: Peter Chen <peter.chen@freescale.com>
      Reported-by: Fabio Estevam <fabio.estevam@freescale.com>
      Tested-by: Fabio Estevam <fabio.estevam@freescale.com>
      Cc: <stable@vger.kernel.org> #v4.1+
      Acked-by: Shawn Guo <shawnguo@kernel.org>
    • arm64: bpf: make BPF prologue and epilogue align with ARM64 AAPCS · ec0738db
      Yang Shi committed
      Save and restore FP/LR in the BPF prog prologue and epilogue, and save SP
      to FP in the prologue in order to get a correct stack backtrace.
      
      However, the ARM64 JIT used FP (x29) as the eBPF fp register. FP is subject
      to change during function calls, so the BPF prog stack base address may
      change too.
      
      Use x25 to replace FP as the BPF stack base register (fp). Since x25 is a
      callee-saved register, it stays intact across function calls.
      It is initialized in the BPF prog prologue every time the BPF prog starts
      to run. Save and restore x25/x26 in the BPF prologue and epilogue to keep
      them intact for code outside of BPF. Strictly, x26 is unnecessary, but SP
      requires 16-byte alignment.
      
      So, the BPF stack layout looks like:
      
                                       high
               original A64_SP =>   0:+-----+ BPF prologue
                                      |FP/LR|
               current A64_FP =>  -16:+-----+
                                      | ... | callee saved registers
                                      +-----+
                                      |     | x25/x26
               BPF fp register => -80:+-----+
                                      |     |
                                      | ... | BPF prog stack
                                      |     |
                                      |     |
               current A64_SP =>      +-----+
                                      |     |
                                      | ... | Function call stack
                                      |     |
                                      +-----+
                                        low
      
      CC: Zi Shen Lim <zlim.lnx@gmail.com>
      CC: Xi Wang <xi.wang@gmail.com>
      Signed-off-by: Yang Shi <yang.shi@linaro.org>
      Acked-by: Zi Shen Lim <zlim.lnx@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
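The offsets in the stack diagram of the BPF prologue commit above follow from 16-byte register pairs pushed below the original SP. A minimal sketch of that arithmetic: the count of four pairs between -16 and -80 is inferred from the diagram (64 bytes / 16), and the `bpf_frame` structure and `bpf_frame_layout` helper are hypothetical names, not JIT code:

```c
#include <assert.h>

#define PAIR_BYTES 16L  /* one pushed register pair, keeps SP 16-byte aligned */

/* Offsets relative to the original A64_SP (all negative, stack grows down). */
struct bpf_frame {
    long a64_fp;  /* after pushing FP/LR */
    long bpf_fp;  /* after pushing the callee-saved pairs, incl. x25/x26 */
    long a64_sp;  /* after reserving the BPF prog stack */
};

static struct bpf_frame bpf_frame_layout(long callee_saved_pairs,
                                         long bpf_stack_bytes)
{
    struct bpf_frame f;

    assert(callee_saved_pairs >= 0 && bpf_stack_bytes >= 0);
    f.a64_fp = -PAIR_BYTES;
    f.bpf_fp = f.a64_fp - callee_saved_pairs * PAIR_BYTES;
    /* Round the BPF stack up to 16 bytes so SP stays aligned. */
    f.a64_sp = f.bpf_fp - ((bpf_stack_bytes + 15) / 16) * 16;
    return f;
}
```

With four callee-saved pairs this reproduces the -16 and -80 offsets shown in the diagram.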
    • arm64: kernel: pause/unpause function graph tracer in cpu_suspend() · de818bd4
      Lorenzo Pieralisi committed
      The function graph tracer adds instrumentation that is required to trace
      both entry and exit of a function. In particular the function graph
      tracer updates the "return address" of a function in order to insert
      a trace callback on function exit.
      
      Kernel power management functions like cpu_suspend() are called
      upon power down entry with functions called "finishers" that are in turn
      called to trigger the power down sequence but they may not return to the
      kernel through the normal return path.
      
      When the core resumes from low-power it returns to the cpu_suspend()
      function through the cpu_resume path, which leaves the trace stack frame
      set-up by the function tracer in an inconsistent state upon return to the
      kernel when tracing is enabled.
      
      This patch fixes the issue by pausing/resuming the function graph
      tracer on the thread executing cpu_suspend() (ie the function call that
      subsequently triggers the "suspend finishers"), so that the function graph
      tracer state is kept consistent across functions that enter power down
      states and never return by effectively disabling graph tracer while they
      are executing.
      
      Fixes: 819e50e2 ("arm64: Add ftrace support")
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reported-by: Catalin Marinas <catalin.marinas@arm.com>
      Reported-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Suggested-by: Steven Rostedt <rostedt@goodmis.org>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: <stable@vger.kernel.org> # 3.16+
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
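The cpu_suspend() commit above pauses the function graph tracer on the calling thread while the suspend finisher runs. A minimal user-space sketch of that pattern, assuming a per-task pause counter. All names here (`task`, `pause_graph`, `unpause_graph`, `suspend_with_paused_tracer`, `sample_finisher`) are hypothetical stand-ins rather than kernel interfaces:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-task state. */
struct task {
    int graph_pause;          /* > 0 means return-address hooking is paused */
    bool traced_in_finisher;  /* set by the sample finisher below */
};

static void pause_graph(struct task *t)
{
    t->graph_pause++;
}

static void unpause_graph(struct task *t)
{
    assert(t->graph_pause > 0);
    t->graph_pause--;
}

static bool graph_tracing_active(const struct task *t)
{
    return t->graph_pause == 0;
}

/* A finisher triggers the power-down sequence; nonzero means failure. */
typedef int (*finisher_fn)(struct task *t);

/* Keep the tracer paused for the whole window in which the finisher runs. */
static int suspend_with_paused_tracer(struct task *t, finisher_fn fn)
{
    int ret;

    pause_graph(t);
    ret = fn(t);       /* may "return" via the resume path instead */
    unpause_graph(t);
    return ret;
}

/* Example finisher: records whether tracing was active while it ran. */
static int sample_finisher(struct task *t)
{
    t->traced_in_finisher = graph_tracing_active(t);
    return 0;
}
```

Because the counter is raised before the finisher is called, no traced return address is installed inside the window that the resume path would otherwise skip.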
  6. 17 Nov, 2015 6 commits
    • arm64: do not include ptrace.h from compat.h · adc235af
      Arnd Bergmann committed
      Including ptrace.h brings a definition of BITS_PER_PAGE into device
      drivers and causes a build warning in allmodconfig builds:
      
      drivers/block/drbd/drbd_bitmap.c:482:0: warning: "BITS_PER_PAGE" redefined
       #define BITS_PER_PAGE  (1UL << (PAGE_SHIFT + 3))
      
      This uses a slightly different way to express current_pt_regs()
      that avoids the use of the header and gets away with the already
      included asm/ptrace.h.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: simplify dma_get_ops · 1dccb598
      Arnd Bergmann committed
      Including linux/acpi.h from asm/dma-mapping.h causes tons of compile-time
      warnings, e.g.
      
       drivers/isdn/mISDN/dsp_ecdis.h:43:0: warning: "FALSE" redefined
       drivers/isdn/mISDN/dsp_ecdis.h:44:0: warning: "TRUE" redefined
       drivers/net/fddi/skfp/h/targetos.h:62:0: warning: "TRUE" redefined
       drivers/net/fddi/skfp/h/targetos.h:63:0: warning: "FALSE" redefined
      
      However, it looks like the dependency should not even be there as
      I do not see why __generic_dma_ops() cares about whether we have
      an ACPI based system or not.
      
      The current behavior is to fall back to the global dma_ops when
      a device has not set its own dma_ops, but only for DT based systems.
      This seems dangerous, as a random device might have different
      requirements regarding IOMMU or coherency, so we should really
      never have that fallback and just forbid DMA when we have not
      initialized DMA for a device.
      
      This removes the global dma_ops variable and the special-casing
      for ACPI, and just returns the dma ops that got set for the
      device, or the dummy_dma_ops if none were present.
      
      The original code has apparently been copied from arm32 where we
      rely on it for ISA devices like the floppy controller, but
      we should have no such devices on ARM64.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      [catalin.marinas@arm.com: removed acpi_disabled check in arch_setup_dma_ops()]
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
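The dma_get_ops commit above replaces the global fallback with a simple rule: return the ops set for the device, or dummy ops if none were set. A minimal sketch of that rule, with simplified stand-in types. `get_dma_ops_sketch` and the structures below are hypothetical and far smaller than the kernel's real definitions:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures. */
struct dma_ops {
    const char *name;
};

struct device {
    const struct dma_ops *dma_ops;  /* NULL until DMA is set up for the device */
};

/* Ops that would reject every DMA request. */
static const struct dma_ops dummy_ops = { "dummy" };

/* No global fallback: either the device has its own ops or it gets the dummy. */
static const struct dma_ops *get_dma_ops_sketch(const struct device *dev)
{
    if (dev != NULL && dev->dma_ops != NULL)
        return dev->dma_ops;
    return &dummy_ops;
}
```

The design choice is that a device with uninitialized DMA fails loudly instead of silently inheriting ops that may not match its IOMMU or coherency requirements.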
    • arm64: mm: use correct mapping granularity under DEBUG_RODATA · 4fee9f36
      Ard Biesheuvel committed
      When booting a 64k pages kernel that is built with CONFIG_DEBUG_RODATA
      and resides at an offset that is not a multiple of 512 MB, the rounding
      that occurs in __map_memblock() and fixup_executable() results in
      incorrect regions being mapped.
      
      The following snippet from /sys/kernel/debug/kernel_page_tables shows
      how, when the kernel is loaded 2 MB above the base of DRAM at 0x40000000,
      the first 2 MB of memory (which may be inaccessible from non-secure EL1
      or just reserved by the firmware) is inadvertently mapped into the end of
      the module region.
      
        ---[ Modules start ]---
        0xfffffdffffe00000-0xfffffe0000000000     2M RW NX ... UXN MEM/NORMAL
        ---[ Modules end ]---
        ---[ Kernel Mapping ]---
        0xfffffe0000000000-0xfffffe0000090000   576K RW NX ... UXN MEM/NORMAL
        0xfffffe0000090000-0xfffffe0000200000  1472K ro x  ... UXN MEM/NORMAL
        0xfffffe0000200000-0xfffffe0000800000     6M ro x  ... UXN MEM/NORMAL
        0xfffffe0000800000-0xfffffe0000810000    64K ro x  ... UXN MEM/NORMAL
        0xfffffe0000810000-0xfffffe0000a00000  1984K RW NX ... UXN MEM/NORMAL
        0xfffffe0000a00000-0xfffffe00ffe00000  4084M RW NX ... UXN MEM/NORMAL
      
      The same issue is likely to occur on 16k pages kernels whose load
      address is not a multiple of 32 MB (i.e., SECTION_SIZE). So round to
      SWAPPER_BLOCK_SIZE instead of SECTION_SIZE.
      
      Fixes: da141706 ("arm64: add better page protections to arm64")
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Laura Abbott <labbott@redhat.com>
      Cc: <stable@vger.kernel.org> # 4.0+
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
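The DEBUG_RODATA commit above comes down to rounding arithmetic. A minimal sketch using the addresses from the commit message (kernel loaded 2 MB above a DRAM base of 0x40000000, 512 MB sections on a 64k pages kernel). The 64 KB value used for the finer granularity is an assumption, since the commit does not state the value of SWAPPER_BLOCK_SIZE, and `round_down_pow2` is a hypothetical helper:

```c
#include <assert.h>
#include <stdint.h>

#define SZ_64K  (UINT64_C(64) << 10)
#define SZ_2M   (UINT64_C(2) << 20)
#define SZ_512M (UINT64_C(512) << 20)

/* Round addr down to a power-of-two alignment. */
static uint64_t round_down_pow2(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return addr & ~(align - 1);
}

/* Bytes below the kernel start that get swept into the mapping by rounding. */
static uint64_t extra_mapped_below(uint64_t kernel_start, uint64_t align)
{
    return kernel_start - round_down_pow2(kernel_start, align);
}
```

For a kernel start of 0x40200000, rounding to 512 MB sweeps in the 2 MB below the kernel, matching the stray 2M entry in the page table dump, while rounding to the assumed 64 KB granularity sweeps in nothing.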
    • bpf, arm64: start flushing icache range from header · c3d4c682
      Daniel Borkmann committed
      While recently going over ARM64's BPF code, I noticed that the icache
      range we're flushing should start at header already and not at ctx.image.
      
      Reason is that after b569c1c6 ("net: bpf: arm64: address randomize
      and write protect JIT code"), we also want to make sure to flush the
      random-sized trap in front of the start of the actual program (analogous
      to x86). No operational differences from user side.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Zi Shen Lim <zlim.lnx@gmail.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
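The icache commit above moves the start of the flushed range from ctx.image back to the header, so that the random-sized trap area in front of the program is covered too. A minimal sketch of the range computation only. The `flush_range` structure and `jit_flush_range` helper are hypothetical, and no actual cache maintenance is performed:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open address range [start, end) to hand to a cache flush routine. */
struct flush_range {
    uintptr_t start;
    uintptr_t end;
};

/*
 * header:      start of the JIT allocation (trap-filled padding begins here)
 * image:       first instruction of the generated program
 * image_bytes: size of the generated program
 */
static struct flush_range jit_flush_range(const void *header, const void *image,
                                          size_t image_bytes)
{
    struct flush_range r;

    assert((uintptr_t)header <= (uintptr_t)image);
    r.start = (uintptr_t)header;            /* not image: include the padding */
    r.end = (uintptr_t)image + image_bytes;
    return r;
}
```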
    • bpf, arm: start flushing icache range from header · ebaef649
      Daniel Borkmann committed
      During review I noticed that the icache range we're flushing should
      start at header already and not at ctx.image.
      
      Reason is that after 55309dd3 ("net: bpf: arm: address randomize
      and write protect JIT code"), we also want to make sure to flush the
      random-sized trap in front of the start of the actual program (analogous
      to x86). No operational differences from user side.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Tested-by: Nicolas Schichan <nschichan@freebox.fr>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • arm64: bpf: fix JIT frame pointer setup · 0fcd593b
      Yang Shi committed
      BPF fp should point to the top of the BPF prog stack. The original
      implementation incorrectly made it point to the bottom.
      Move A64_SP to fp before reserving BPF prog stack space.
      
      CC: Zi Shen Lim <zlim.lnx@gmail.com>
      CC: Xi Wang <xi.wang@gmail.com>
      Signed-off-by: Yang Shi <yang.shi@linaro.org>
      Reviewed-by: Zi Shen Lim <zlim.lnx@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 16 Nov, 2015 7 commits
  8. 14 Nov, 2015 3 commits