1. 18 Apr, 2015 1 commit
  2. 17 Apr, 2015 7 commits
    • D
      Merge branch 'generic-iommu-allocator' · a83f5d6a
      Committed by David S. Miller
      Sowmini Varadhan says:
      
      ====================
      Generic IOMMU pooled allocator
      
      Investigation of network performance on Sparc shows a high
      degree of locking contention in the IOMMU allocator, and it
      was noticed that the PowerPC code has a better locking model.
      
      This patch series tries to extract the generic parts of the
      PowerPC code so that it can be shared across multiple PCI
      devices and architectures.
      
      v10: resend patch v9 without the RFC tag, and with a new mail Message-Id
      (the previous non-RFC attempt did not show up on the patchwork queue?)
      
      Full revision history below:
      v2 changes:
        - incorporate David Miller editorial comments: sparc specific
          fields moved from iommu-common into sparc's iommu_64.h
        - make the npools value an input parameter, for the case when
          the iommu map size is not very large
        - cookie_to_index mapping, and optimizations for span-boundary
          check, for use case such as LDC.
      
      v3: eliminate iommu_sparc, rearrange the ->demap indirection to
          be invoked under the pool lock.
      
      v4: David Miller review changes:
        - s/IOMMU_ERROR_CODE/DMA_ERROR_CODE
        - page_table_map_base and page_table_shift are unsigned long, not u32.
      
      v5: removed ->cookie_to_index and ->demap indirection from the
          iommu_tbl_ops. The caller needs to call these functions as needed,
          before invoking the generic arena allocator functions.
          Added the "skip_span_boundary" argument to iommu_tbl_pool_init() for
          those callers like LDC which do not care about span boundary checks.
      
      v6: removed iommu_tbl_ops, and instead pass the ->flush_all as
          an indirection to iommu_tbl_pool_init(); only invoke ->flush_all
          when there is no large_pool, based on the assumption that large-pool
          usage is infrequently encountered.
      
      v7: moved pool_hash initialization to lib/iommu-common.c and cleaned up
          code duplication from sun4v/sun4u/ldc.
      
      v8: Addresses BenH comments with one exception: I've left the
          IOMMU_POOL_HASH as is, so that powerpc can tailor it to their
          convenience. Discarded trylock in favor of a simple spin_lock to
          acquire the pool.
      
      v9: Addresses latest BenH comments: need_flush checks, add support
          for dma mask and align_order.
      
      v10: resend without RFC tag, and new mail Message-Id.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • S
      sparc: Make LDC use common iommu poll management functions · 671d7732
      Committed by Sowmini Varadhan
      Note that this conversion is only being done to consolidate the
      code and ensure that the common code provides the sufficient
      abstraction. It is not expected to result in any noticeable
      performance improvement, as there is typically one ldc_iommu
      per vnet_port, and each one has 8k entries, with a typical
      request for 1-4 pages.  Thus LDC uses npools == 1.
      Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • S
      sparc: Make sparc64 use scalable lib/iommu-common.c functions · f1600e54
      Committed by Sowmini Varadhan
      In iperf experiments, running Linux as the Tx side (TCP client) with
      10 threads results in a severe performance drop when TSO is disabled,
      indicating a weakness in the software that can be avoided by using
      the scalable IOMMU arena DMA allocation.
      
      Baseline numbers before this patch:
         with default settings (TSO enabled):   9-9.5 Gbps
         with TSO disabled using ethtool:       2-3 Gbps (drops badly)

      After this patch, an iperf client with 10 threads can give a
      throughput of at least 8.5 Gbps, even when TSO is disabled.
      Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • S
      sparc: Break up monolithic iommu table/lock into finer graularity pools and lock · 10b88a4b
      Committed by Sowmini Varadhan
      Investigation of multithreaded iperf experiments on an ethernet
      interface shows the iommu->lock as the hottest lock identified by
      lockstat, with something on the order of 21M contentions out of
      27M acquisitions, and an average wait time of 26 us for the lock.
      This is not efficient. A more scalable design is to follow the ppc
      model, where the iommu_table has multiple pools, each stretching
      over a segment of the map, and with a separate lock for each pool.
      This model allows for better parallelization of the iommu map search.
      
      This patch adds the iommu range alloc/free function infrastructure.
      Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
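
The pooled design described in this commit message (one map split into segments, each with its own lock) can be illustrated with a minimal user-space sketch. This is not the code in lib/iommu-common.c: every name here (`pooled_map`, `pooled_alloc`, `pooled_free`, `MAP_ENTRIES`, `MAX_POOLS`) is hypothetical, a C11 `atomic_flag` spinlock stands in for the kernel spinlock, and the large pool, flush handling, DMA mask, and alignment support mentioned in the series are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MAP_ENTRIES 1024            /* hypothetical map size, in pages */
#define MAX_POOLS   16              /* hypothetical upper bound on npools */
#define ALLOC_FAIL  ((size_t)-1)    /* returned when no free range exists */

struct pool {
    atomic_flag lock;               /* per-pool spinlock (stand-in for spinlock_t) */
    size_t start, end;              /* this pool owns map entries [start, end) */
    size_t hint;                    /* where the next search begins */
};

struct pooled_map {
    bool used[MAP_ENTRIES];         /* one flag per map entry */
    unsigned npools;
    struct pool pools[MAX_POOLS];
};

static void pool_lock(struct pool *p)
{
    while (atomic_flag_test_and_set_explicit(&p->lock, memory_order_acquire))
        ;                           /* spin until the flag was previously clear */
}

static void pool_unlock(struct pool *p)
{
    atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

/* Split the map into npools equal segments; the last pool takes the remainder. */
static void pooled_map_init(struct pooled_map *m, unsigned npools)
{
    assert(npools >= 1 && npools <= MAX_POOLS);
    size_t per_pool = MAP_ENTRIES / npools;

    m->npools = npools;
    for (size_t i = 0; i < MAP_ENTRIES; i++)
        m->used[i] = false;
    for (unsigned i = 0; i < npools; i++) {
        struct pool *p = &m->pools[i];
        atomic_flag_clear(&p->lock);
        p->start = i * per_pool;
        p->end = (i == npools - 1) ? MAP_ENTRIES : p->start + per_pool;
        p->hint = p->start;
    }
}

/* First-fit search for npages consecutive free entries in [from, end). */
static size_t find_run(const bool *used, size_t from, size_t end, size_t npages)
{
    size_t run = 0;
    for (size_t i = from; i < end; i++) {
        run = used[i] ? 0 : run + 1;
        if (run == npages)
            return i + 1 - npages;
    }
    return ALLOC_FAIL;
}

/* Allocate from the pool selected by key (e.g. a CPU id); only that pool is locked. */
static size_t pooled_alloc(struct pooled_map *m, unsigned key, size_t npages)
{
    if (npages == 0)
        return ALLOC_FAIL;

    struct pool *p = &m->pools[key % m->npools];
    pool_lock(p);
    size_t idx = find_run(m->used, p->hint, p->end, npages);
    if (idx == ALLOC_FAIL)          /* wrap around once to the pool start */
        idx = find_run(m->used, p->start, p->end, npages);
    if (idx != ALLOC_FAIL) {
        for (size_t i = idx; i < idx + npages; i++)
            m->used[i] = true;
        p->hint = idx + npages;
    }
    pool_unlock(p);
    return idx;
}

/* Free a range; the owning pool is derived from the index, then locked. */
static void pooled_free(struct pooled_map *m, size_t idx, size_t npages)
{
    for (unsigned i = 0; i < m->npools; i++) {
        struct pool *p = &m->pools[i];
        if (idx >= p->start && idx < p->end) {
            pool_lock(p);
            for (size_t j = idx; j < idx + npages && j < p->end; j++)
                m->used[j] = false;
            pool_unlock(p);
            return;
        }
    }
}
```

In this sketch, callers that pass different keys usually land in different pools and therefore never contend on the same lock, which is the property the commit message attributes to the ppc model. With `npools == 1` it degenerates to a single lock over the whole map.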
    • L
      Merge tag 'stable/for-linus-4.1-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 497a5df7
      Committed by Linus Torvalds
      Pull xen features and fixes from David Vrabel:
      
       - use a single source list of hypercalls, generating other tables etc.
         at build time.
      
       - add a "Xen PV" APIC driver to support >255 VCPUs in PV guests.
      
       - significant performance improvements to guest save/restore/migration.
      
       - scsiback/front save/restore support.
      
       - infrastructure for multi-page xenbus rings.
      
       - misc fixes.
      
      * tag 'stable/for-linus-4.1-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen/pci: Try harder to get PXM information for Xen
        xenbus_client: Extend interface to support multi-page ring
        xen-pciback: also support disabling of bus-mastering and memory-write-invalidate
        xen: support suspend/resume in pvscsi frontend
        xen: scsiback: add LUN of restored domain
        xen-scsiback: define a pr_fmt macro with xen-pvscsi
        xen/mce: fix up xen_late_init_mcelog() error handling
        xen/privcmd: improve performance of MMAPBATCH_V2
        xen: unify foreign GFN map/unmap for auto-xlated physmap guests
        x86/xen/apic: WARN with details.
        x86/xen: Provide a "Xen PV" APIC driver to support >255 VCPUs
        xen/pciback: Don't print scary messages when unsupported by hypervisor.
        xen: use generated hypercall symbols in arch/x86/xen/xen-head.S
        xen: use generated hypervisor symbols in arch/x86/xen/trace.c
        xen: synchronize include/xen/interface/xen.h with xen
        xen: build infrastructure for generating hypercall depending symbols
        xen: balloon: Use static attribute groups for sysfs entries
        xen: pcpu: Use static attribute groups for sysfs entry
    • L
      Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 714d8e7e
      Committed by Linus Torvalds
      Pull arm64 updates from Will Deacon:
       "Here are the core arm64 updates for 4.1.
      
        Highlights include a significant rework to head.S (allowing us to boot
        on machines with physical memory at a really high address), an AES
        performance boost on Cortex-A57 and the ability to run a 32-bit
        userspace with 64k pages (although this requires said userspace to be
        built with a recent binutils).
      
        The head.S rework spilt over into KVM, so there are some changes under
        arch/arm/ which have been acked by Marc Zyngier (KVM co-maintainer).
        In particular, the linker script changes caused us some issues in
        -next, so there are a few merge commits where we had to apply fixes on
        top of a stable branch.
      
        Other changes include:
      
         - AES performance boost for Cortex-A57
         - AArch32 (compat) userspace with 64k pages
         - Cortex-A53 erratum workaround for #845719
         - defconfig updates (new platforms, PCI, ...)"
      
      * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (39 commits)
        arm64: fix midr range for Cortex-A57 erratum 832075
        arm64: errata: add workaround for cortex-a53 erratum #845719
        arm64: Use bool function return values of true/false not 1/0
        arm64: defconfig: updates for 4.1
        arm64: Extract feature parsing code from cpu_errata.c
        arm64: alternative: Allow immediate branch as alternative instruction
        arm64: insn: Add aarch64_insn_decode_immediate
        ARM: kvm: round HYP section to page size instead of log2 upper bound
        ARM: kvm: assert on HYP section boundaries not actual code size
        arm64: head.S: ensure idmap_t0sz is visible
        arm64: pmu: add support for interrupt-affinity property
        dt: pmu: extend ARM PMU binding to allow for explicit interrupt affinity
        arm64: head.S: ensure visibility of page tables
        arm64: KVM: use ID map with increased VA range if required
        arm64: mm: increase VA range of identity map
        ARM: kvm: implement replacement for ld's LOG2CEIL()
        arm64: proc: remove unused cpu_get_pgd macro
        arm64: enforce x1|x2|x3 == 0 upon kernel entry as per boot protocol
        arm64: remove __calc_phys_offset
        arm64: merge __enable_mmu and __turn_mmu_on
        ...
    • L
      Merge tag 'powerpc-4.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux · d19d5efd
      Committed by Linus Torvalds
      Pull powerpc updates from Michael Ellerman:
      
       - Numerous minor fixes, cleanups etc.
      
       - More EEH work from Gavin to remove its dependency on device_nodes.
      
       - Memory hotplug implemented entirely in the kernel from Nathan
         Fontenot.
      
       - Removal of redundant CONFIG_PPC_OF by Kevin Hao.
      
       - Rewrite of VPHN parsing logic & tests from Greg Kurz.
      
       - A fix from Nish Aravamudan to reduce memory usage by clamping
         nodes_possible_map.
      
       - Support for pstore on powernv from Hari Bathini.
      
       - Removal of old powerpc specific byte swap routines by David Gibson.
      
       - Fix from Vasant Hegde to prevent the flash driver telling you it was
         flashing your firmware when it wasn't.
      
       - Patch from Ben Herrenschmidt to add an OPAL heartbeat driver.
      
       - Fix for an oops causing get/put_cpu_var() imbalance in perf by Jan
         Stancek.
      
       - Some fixes for migration from Tyrel Datwyler.
      
       - A new syscall to switch the cpu endian by Michael Ellerman.
      
       - Large series from Wei Yang to implement SRIOV, reviewed and acked by
         Bjorn.
      
       - A fix for the OPAL sensor driver from Cédric Le Goater.
      
       - Fixes to get STRICT_MM_TYPECHECKS building again by Michael Ellerman.
      
       - Large series from Daniel Axtens to make our PCI hooks per PHB rather
         than per machine.
      
       - Small patch from Sam Bobroff to explicitly abort non-suspended
         transactions on syscalls, plus a test to exercise it.
      
       - Numerous reworks and fixes for the 24x7 PMU from Sukadev Bhattiprolu.
      
       - Small patch to enable the hard lockup detector from Anton Blanchard.
      
       - Fix from Dave Olson for missing L2 cache information on some CPUs.
      
       - Some fixes from Michael Ellerman to get Cell machines booting again.
      
       - Freescale updates from Scott: Highlights include BMan device tree
         nodes, an MSI erratum workaround, a couple minor performance
         improvements, config updates, and misc fixes/cleanup.
      
      * tag 'powerpc-4.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (196 commits)
        powerpc/powermac: Fix build error seen with powermac smp builds
        powerpc/pseries: Fix compile of memory hotplug without CONFIG_MEMORY_HOTREMOVE
        powerpc: Remove PPC32 code from pseries specific find_and_init_phbs()
        powerpc/cell: Fix iommu breakage caused by controller_ops change
        powerpc/eeh: Fix crash in eeh_add_device_early() on Cell
        powerpc/perf: Cap 64bit userspace backtraces to PERF_MAX_STACK_DEPTH
        powerpc/perf/hv-24x7: Fail 24x7 initcall if create_events_from_catalog() fails
        powerpc/pseries: Correct memory hotplug locking
        powerpc: Fix missing L2 cache size in /sys/devices/system/cpu
        powerpc: Add ppc64 hard lockup detector support
        oprofile: Disable oprofile NMI timer on ppc64
        powerpc/perf/hv-24x7: Add missing put_cpu_var()
        powerpc/perf/hv-24x7: Break up single_24x7_request
        powerpc/perf/hv-24x7: Define update_event_count()
        powerpc/perf/hv-24x7: Whitespace cleanup
        powerpc/perf/hv-24x7: Define add_event_to_24x7_request()
        powerpc/perf/hv-24x7: Rename hv_24x7_event_update
        powerpc/perf/hv-24x7: Move debug prints to separate function
        powerpc/perf/hv-24x7: Drop event_24x7_request()
        powerpc/perf/hv-24x7: Use pr_devel() to log message
        ...
      
      Conflicts:
      	tools/testing/selftests/powerpc/Makefile
      	tools/testing/selftests/powerpc/tm/Makefile
  3. 16 Apr, 2015 32 commits