1. 25 Sep, 2020 4 commits
  2. 22 Sep, 2020 1 commit
    • R
      iommu/io-pgtable-arm: Clean up faulty sanity check · b9bb694b
      Robin Murphy committed
      Checking for a nonzero dma_pfn_offset was a quick shortcut to validate
      whether the DMA == phys assumption could hold at all. Checking for a
      non-NULL dma_range_map is not quite equivalent, since a map may be
      present to describe a limited DMA window even without an offset, and
      thus this check can now yield false positives.
      
      However, it only ever served to short-circuit going all the way through
      to __arm_lpae_alloc_pages(), failing the canonical test there, and
      having a bit more to clean up. As such, we can simply remove it without
      loss of correctness.
      Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  3. 21 Sep, 2020 1 commit
  4. 18 Sep, 2020 7 commits
  5. 11 Sep, 2020 14 commits
  6. 04 Sep, 2020 2 commits
    • N
      dma-mapping: set default segment_boundary_mask to ULONG_MAX · 135ba11a
      Nicolin Chen committed
      The default segment_boundary_mask was set to DMA_BIT_MASK(32)
      a decade ago by referencing SCSI/block subsystem, as a 32-bit
      mask was good enough for most of the devices.
      
      Now more and more drivers set dma_masks above DMA_BIT_MASK(32)
      while only a handful of them call dma_set_seg_boundary(). This
      means that most drivers have a 4GB segmentation boundary because
      DMA API returns a 32-bit default value, though they might not
      really have such a limit.
      
      The default segment_boundary_mask should mean "no limit" since
      the device doesn't explicitly set the mask. But a 32-bit mask
      certainly limits those devices capable of 32+ bits addressing.
      
      So this patch sets default segment_boundary_mask to ULONG_MAX.
      Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
      Acked-by: Niklas Schnelle <schnelle@linux.ibm.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
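      To illustrate why a 32-bit default mask restricts devices that can address more than 32 bits, here is a minimal userspace sketch. The helper name crosses_seg_boundary() is hypothetical and not a kernel API; uint64_t stands in for unsigned long on a 64-bit kernel, and UINT64_MAX plays the role of ULONG_MAX.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): returns true if the
 * buffer [addr, addr + len) spans two segments under the given
 * segment boundary mask. len must be nonzero.
 */
static bool crosses_seg_boundary(uint64_t addr, uint64_t len, uint64_t mask)
{
	uint64_t last = addr + len - 1;

	/* Bits above the mask identify the segment; a change means a crossing. */
	return (addr & ~mask) != (last & ~mask);
}
```

      With a mask of 0xffffffff, an 8 KiB buffer starting at 0xfffff000 straddles the 4GB line and would have to be split; with an all-ones mask no buffer can cross a boundary, which matches the "no limit" meaning described above.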
    • N
      dma-mapping: introduce dma_get_seg_boundary_nr_pages() · 1e9d90db
      Nicolin Chen committed
      We found that callers of dma_get_seg_boundary mostly do an ALIGN
      with page mask and then do a page shift to get number of pages:
          ALIGN(boundary + 1, 1 << shift) >> shift
      
      However, the boundary might be as large as ULONG_MAX, which means
      that a device has no specific boundary limit. So either "+ 1" or
      passing it to ALIGN() would potentially overflow.
      
      According to kernel defines:
          #define ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask))
          #define ALIGN(x, a)	ALIGN_MASK(x, (typeof(x))(a) - 1)
      
      We can simplify the logic here into a helper function doing:
        ALIGN(boundary + 1, 1 << shift) >> shift
      = ALIGN_MASK(b + 1, (1 << s) - 1) >> s
      = {[b + 1 + (1 << s) - 1] & ~[(1 << s) - 1]} >> s
      = [b + 1 + (1 << s) - 1] >> s
      = [b + (1 << s)] >> s
      = (b >> s) + 1
      
      This patch introduces and applies dma_get_seg_boundary_nr_pages()
      as an overflow-free helper for the dma_get_seg_boundary() callers
      to get the number of pages. It also takes care of the NULL dev case
      for non-DMA API callers.
      Suggested-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
      Acked-by: Niklas Schnelle <schnelle@linux.ibm.com>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Signed-off-by: Christoph Hellwig <hch@lst.de>
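      The derivation in this commit message can be checked with a small userspace sketch. The function names nr_pages_naive() and nr_pages_safe() are made up for illustration, and ALIGN() here casts to unsigned long instead of using the kernel's typeof() so that it compiles as standard C. This is not the kernel helper itself, only the arithmetic it is described as performing.

```c
#include <assert.h>
#include <limits.h>

/* Userspace stand-ins for the kernel macros quoted in the commit message. */
#define ALIGN_MASK(x, mask)	(((x) + (mask)) & ~(mask))
#define ALIGN(x, a)		ALIGN_MASK(x, (unsigned long)(a) - 1)

/* Open-coded form used by callers: "boundary + 1" wraps to 0 at ULONG_MAX. */
static unsigned long nr_pages_naive(unsigned long boundary, unsigned int shift)
{
	return ALIGN(boundary + 1, 1UL << shift) >> shift;
}

/* Simplified form from the derivation: shift first, then add one. */
static unsigned long nr_pages_safe(unsigned long boundary, unsigned int shift)
{
	return (boundary >> shift) + 1;
}
```

      Both forms agree for ordinary boundaries, but for a boundary of ULONG_MAX the open-coded form wraps around and yields 0 pages, while the simplified form still returns a nonzero page count.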
  7. 01 Sep, 2020 3 commits
    • B
      mm: cma: use CMA_MAX_NAME to define the length of cma name array · 2281f797
      Barry Song committed
      CMA_MAX_NAME should be visible to CMA's users as they might need it to set
      the name of CMA areas and avoid hardcoding the size locally.
      So this patch moves CMA_MAX_NAME from the local header file to the
      include/linux header file and removes the hardcoded sizes in both
      hugetlb.c and contiguous.c.
      Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
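      As a rough illustration of sizing a name buffer from a shared constant instead of a local magic number, consider the userspace sketch below. The value 64 for CMA_MAX_NAME, the helper format_area_name(), and the "pernuma%d" naming pattern are assumptions made for this example, not taken from the commit.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed value for illustration; in the kernel this comes from a shared header. */
#define CMA_MAX_NAME 64

/*
 * Hypothetical helper: formats a per-node area name into buf.
 * Returns the number of characters snprintf() would have written.
 */
static int format_area_name(char *buf, size_t len, int nid)
{
	return snprintf(buf, len, "pernuma%d", nid);
}
```

      A caller would declare `char name[CMA_MAX_NAME];` and pass `sizeof(name)`, so the buffer size follows the shared definition if it ever changes.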
    • B
      arm64: mm: reserve per-numa CMA to localize coherent dma buffers · c6303ab9
      Barry Song committed
      Right now, the SMMU uses dma_alloc_coherent() to allocate memory for its
      queues and tables. Typically, on an ARM64 server, there is a single default
      CMA area located on node0, which can be far away from node2, node3, etc.
      With this patch, the SMMU gets memory from its local NUMA node for command
      queues and page tables, which considerably reduces dma_unmap latency.
      Meanwhile, when iommu.passthrough is on, device drivers that call
      dma_alloc_coherent() also get local memory and avoid traffic between
      NUMA nodes.
      Acked-by: Will Deacon <will@kernel.org>
      Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • B
      dma-contiguous: provide the ability to reserve per-numa CMA · b7176c26
      Barry Song committed
      Right now, drivers like ARM SMMU are using dma_alloc_coherent() to get
      coherent DMA buffers to save their command queues and page tables. As
      there is only one default CMA in the whole system, SMMUs on nodes other
      than node0 will get remote memory. This leads to significant latency.
      
      This patch provides per-numa CMA so that drivers like SMMU can get local
      memory. Tests show that localizing CMA significantly decreases dma_unmap
      latency. For instance, before this patch, the SMMU on node2 has to wait
      more than 560ns for the completion of CMD_SYNC in an empty command queue;
      with this patch, it needs only 240ns.
      
      A positive side effect of this patch would be improving performance even
      further for those users who are worried about performance more than DMA
      security and use iommu.passthrough=1 to skip IOMMU. With local CMA, all
      drivers can get local coherent DMA buffers.
      
      Also, this patch changes the default CONFIG_CMA_AREAS to 19 on NUMA, as
      1+CONFIG_CMA_AREAS should be quite enough for most servers on the market
      even if they enable both hugetlb_cma and pernuma_cma:
      2 numa nodes: 2(hugetlb) + 2(pernuma) + 1(default global cma) = 5
      4 numa nodes: 4(hugetlb) + 4(pernuma) + 1(default global cma) = 9
      8 numa nodes: 8(hugetlb) + 8(pernuma) + 1(default global cma) = 17
      Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
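      The area-count arithmetic in this commit message can be written down as a short sketch. The function cma_areas_needed() is a hypothetical name used only here, and MAX_CMA_AREAS is defined in this example as 1 + CONFIG_CMA_AREAS, following the "1+CONFIG_CMA_AREAS" wording above.

```c
#include <assert.h>

/* Default proposed by the commit for NUMA configurations. */
#define CONFIG_CMA_AREAS 19
/* Total number of areas available, per the "1+CONFIG_CMA_AREAS" wording. */
#define MAX_CMA_AREAS (1 + CONFIG_CMA_AREAS)

/*
 * Hypothetical helper: areas needed when both hugetlb_cma and
 * pernuma_cma are enabled, i.e. one of each per node plus the
 * default global area.
 */
static int cma_areas_needed(int nr_nodes)
{
	return nr_nodes /* hugetlb */ + nr_nodes /* pernuma */ + 1 /* global */;
}
```

      For 2, 4 and 8 nodes this gives 5, 9 and 17 areas respectively, all below the limit of 20.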
  8. 31 Aug, 2020 8 commits
    • L
      Linux 5.9-rc3 · f75aef39
      Linus Torvalds committed
    • L
      Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · e43327c7
      Linus Torvalds committed
      Pull crypto fixes from Herbert Xu:
      
       - fix regression in af_alg that affects iwd
      
       - restore polling delay in qat
      
       - fix double free in ingenic on error path
      
       - fix potential build failure in sa2ul due to missing Kconfig dependency
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: af_alg - Work around empty control messages without MSG_MORE
        crypto: sa2ul - add Kconfig selects to fix build error
        crypto: ingenic - Drop kfree for memory allocated with devm_kzalloc
        crypto: qat - add delay before polling mailbox
    • L
      Merge tag 'x86-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · dcc5c6f0
      Linus Torvalds committed
      Pull x86 fixes from Thomas Gleixner:
       "Three interrupt related fixes for X86:
      
         - Move disabling of the local APIC after invoking fixup_irqs() to
           ensure that interrupts which are incoming are noted in the IRR and
           not ignored.
      
         - Unbreak affinity setting.
      
           The rework of the entry code reused the regular exception entry
           code for device interrupts. The vector number is pushed into the
           errorcode slot on the stack which is then lifted into an argument
           and set to -1 because that's regs->orig_ax which is used in quite
           some places to check whether the entry came from a syscall.
      
           But it was overlooked that orig_ax is used in the affinity cleanup
           code to validate whether the interrupt has arrived on the new
           target. It turned out that this vector check is pointless because
           interrupts are never moved from one vector to another on the same
           CPU. That check is a historical leftover from the time where x86
           supported multi-CPU affinities, but no longer needed with the now
           strict single CPU affinity. Famous last words ...
      
         - Add a missing check for an empty cpumask into the matrix allocator.
      
           The affinity change added a warning to catch the case where an
           interrupt is moved on the same CPU to a different vector. This
           triggers because a condition with an empty cpumask returns an
           assignment from the allocator as the allocator uses for_each_cpu()
           without checking the cpumask for being empty. The historical
           inconsistent for_each_cpu() behaviour of ignoring the cpumask and
           unconditionally claiming that CPU0 is in the mask struck again.
           Sigh.
      
        plus a new entry into the MAINTAINER file for the HPE/UV platform"
      
      * tag 'x86-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        genirq/matrix: Deal with the sillyness of for_each_cpu() on UP
        x86/irq: Unbreak interrupt affinity setting
        x86/hotplug: Silence APIC only after all interrupts are migrated
        MAINTAINERS: Add entry for HPE Superdome Flex (UV) maintainers
    • L
      Merge tag 'irq-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d2283cdc
      Linus Torvalds committed
      Pull irq fixes from Thomas Gleixner:
       "A set of fixes for interrupt chip drivers:
      
         - Revert the platform driver conversion of interrupt chip drivers as
           it turned out to create more problems than it solves.
      
         - Fix a trivial typo in the new module helpers which made probing
           reliably fail.
      
         - Small fixes in the STM32 and MIPS Ingenic drivers
      
         - The TI firmware rework which had badly managed dependencies and had
           to wait post rc1"
      
      * tag 'irq-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqchip/ingenic: Leave parent IRQ unmasked on suspend
        irqchip/stm32-exti: Avoid losing interrupts due to clearing pending bits by mistake
        irqchip: Revert modular support for drivers using IRQCHIP_PLATFORM_DRIVER helperse
        irqchip: Fix probing deferal when using IRQCHIP_PLATFORM_DRIVER helpers
        arm64: dts: k3-am65: Update the RM resource types
        arm64: dts: k3-am65: ti-sci-inta/intr: Update to latest bindings
        arm64: dts: k3-j721e: ti-sci-inta/intr: Update to latest bindings
        irqchip/ti-sci-inta: Add support for INTA directly connecting to GIC
        irqchip/ti-sci-inta: Do not store TISCI device id in platform device id field
        dt-bindings: irqchip: Convert ti, sci-inta bindings to yaml
        dt-bindings: irqchip: ti, sci-inta: Update docs to support different parent.
        irqchip/ti-sci-intr: Add support for INTR being a parent to INTR
        dt-bindings: irqchip: Convert ti, sci-intr bindings to yaml
        dt-bindings: irqchip: ti, sci-intr: Update bindings to drop the usage of gic as parent
        firmware: ti_sci: Add support for getting resource with subtype
        firmware: ti_sci: Drop unused structure ti_sci_rm_type_map
        firmware: ti_sci: Drop the device id to resource type translation
    • L
      Merge tag 'sched-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 0063a82d
      Linus Torvalds committed
      Pull scheduler fix from Thomas Gleixner:
       "A single fix for the scheduler:
      
         - Make is_idle_task() __always_inline to prevent the compiler from
           putting it out of line into the wrong section because it's used
           inside noinstr sections"
      
      * tag 'sched-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched: Use __always_inline on is_idle_task()
    • L
      Merge tag 'locking-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b69bea8a
      Linus Torvalds committed
      Pull locking fixes from Thomas Gleixner:
       "A set of fixes for lockdep, tracing and RCU:
      
         - Prevent recursion by using raw_cpu_* operations
      
         - Fixup the interrupt state in the cpu idle code to be consistent
      
         - Push rcu_idle_enter/exit() invocations deeper into the idle path so
           that the lock operations are inside the RCU watching sections
      
         - Move trace_cpu_idle() into generic code so it's called before RCU
           goes idle.
      
         - Handle raw_local_irq* vs. local_irq* operations correctly
      
         - Move the tracepoints out from under the lockdep recursion handling
           which turned out to be fragile and inconsistent"
      
      * tag 'locking-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        lockdep,trace: Expose tracepoints
        lockdep: Only trace IRQ edges
        mips: Implement arch_irqs_disabled()
        arm64: Implement arch_irqs_disabled()
        nds32: Implement arch_irqs_disabled()
        locking/lockdep: Cleanup
        x86/entry: Remove unused THUNKs
        cpuidle: Move trace_cpu_idle() into generic code
        cpuidle: Make CPUIDLE_FLAG_TLB_FLUSHED generic
        sched,idle,rcu: Push rcu_idle deeper into the idle path
        cpuidle: Fixup IRQ state
        lockdep: Use raw_cpu_*() for per-cpu variables
    • L
      Merge tag '5.9-rc2-smb-fix' of git://git.samba.org/sfrench/cifs-2.6 · 3edd8db2
      Linus Torvalds committed
      Pull cifs fix from Steve French:
       "DFS fix for referral problem when using SMB1"
      
      * tag '5.9-rc2-smb-fix' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: fix check of tcon dfs in smb1
    • L
      Merge tag 'powerpc-5.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 8bb5021c
      Linus Torvalds committed
      Pull powerpc fixes from Michael Ellerman:
      
       - Revert our removal of PROT_SAO, at least one user expressed an
         interest in using it on Power9. Instead don't allow it to be used in
         guests unless enabled explicitly at compile time.
      
       - A fix for a crash introduced by a recent change to FP handling.
      
       - Revert a change to our idle code that left Power10 with no idle
         support.
      
       - One minor fix for the new scv system call path to set PPR.
      
       - Fix a crash in our "generic" PMU if branch stack events were enabled.
      
       - A fix for the IMC PMU, to correctly identify host kernel samples.
      
       - The ADB_PMU powermac code was found to be incompatible with
         VMAP_STACK, so make them incompatible in Kconfig until the code can
         be fixed.
      
       - A build fix in drivers/video/fbdev/controlfb.c, and a documentation
         fix.
      
      Thanks to Alexey Kardashevskiy, Athira Rajeev, Christophe Leroy,
      Giuseppe Sacco, Madhavan Srinivasan, Milton Miller, Nicholas Piggin,
      Pratik Rajesh Sampat, Randy Dunlap, Shawn Anastasio, Vaidyanathan
      Srinivasan.
      
      * tag 'powerpc-5.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/32s: Disable VMAP stack which CONFIG_ADB_PMU
        Revert "powerpc/powernv/idle: Replace CPU feature check with PVR check"
        powerpc/perf: Fix reading of MSR[HV/PR] bits in trace-imc
        powerpc/perf: Fix crashes with generic_compat_pmu & BHRB
        powerpc/64s: Fix crash in load_fp_state() due to fpexc_mode
        powerpc/64s: scv entry should set PPR
        Documentation/powerpc: fix malformed table in syscall64-abi
        video: fbdev: controlfb: Fix build for COMPILE_TEST=y && PPC_PMAC=n
        selftests/powerpc: Update PROT_SAO test to skip ISA 3.1
        powerpc/64s: Disallow PROT_SAO in LPARs by default
        Revert "powerpc/64s: Remove PROT_SAO support"