- 05 Nov, 2019 11 commits
-
-
Committed by Robin Murphy
Between VMSAv8-64 and the various 32-bit formats, there is either one 64-bit MAIR or a pair of 32-bit MAIR0/MAIR1 or NMRR/PMRR registers. As such, keeping two 64-bit values in io_pgtable_cfg has always been overkill. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
-
Committed by Robin Murphy
The nature of the LPAE format means that data->pg_shift is always redundant with data->bits_per_level, since they represent the size of a page and the number of PTEs per page respectively, and the size of a PTE is constant. Thus it works out more efficient to only store the latter, and derive the former via a trivial addition where necessary. Signed-off-by: Robin Murphy <robin.murphy@arm.com> [will: Reworked granule check in iopte_to_paddr()] Signed-off-by: Will Deacon <will@kernel.org>
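The "trivial addition" mentioned above can be sketched in plain C as follows. This is a minimal userspace illustration, not the driver's code: `iopte_t`, `log2_pow2()` and `granule_shift()` are hypothetical names, and the PTE is assumed to be 8 bytes wide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's 64-bit PTE type. */
typedef uint64_t iopte_t;

/* log2 of a power-of-two value, computed by repeated halving. */
static unsigned int log2_pow2(size_t v)
{
	unsigned int shift = 0;

	assert(v != 0 && (v & (v - 1)) == 0); /* must be a power of two */
	while (v > 1) {
		v >>= 1;
		shift++;
	}
	return shift;
}

/* Page shift = index bits per level + log2(bytes per PTE). */
static unsigned int granule_shift(unsigned int bits_per_level)
{
	return bits_per_level + log2_pow2(sizeof(iopte_t));
}
```

With 8-byte PTEs, 9 index bits per level correspond to a 4 KiB granule (shift 12), 11 bits to 16 KiB (shift 14) and 13 bits to 64 KiB (shift 16), so only one of the two values needs to be stored.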
-
Committed by Robin Murphy
We use data->pgd_size directly for the one-off allocation and freeing of the top-level table, but otherwise it serves for ARM_LPAE_PGD_IDX() to repeatedly re-calculate the effective number of top-level address bits it represents. Flip this around so we store the form we most commonly need, and derive the lesser-used one instead. This cuts a whole bunch of code out of the map/unmap/iova_to_phys fast-paths. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
-
Committed by Robin Murphy
Beyond a couple of allocation-time calculations, data->levels is only ever used to derive the start level. Storing the start level directly leads to a small reduction in object code, which should help eke out a little more efficiency, and slightly more readable source to boot. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
-
Committed by Robin Murphy
We're merely checking that the relevant upper bits of each address are all zero, so there are cheaper ways to achieve that. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
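One cheap way to test that the upper bits of an address are all zero is a single right shift. The sketch below is illustrative only; `addr_fits()` is a hypothetical helper and the actual patch may express the check differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if 'addr' fits in 'bits' address bits, i.e. every bit at
 * position 'bits' and above is zero. A single shift replaces building
 * a (1ULL << bits) limit and comparing against it.
 */
static bool addr_fits(uint64_t addr, unsigned int bits)
{
	assert(bits > 0 && bits < 64); /* shifting by >= 64 is undefined */
	return (addr >> bits) == 0;
}
```

For example, `addr_fits(iova, ias) && addr_fits(paddr, oas)` would validate both an input and an output address, where `ias` and `oas` stand for the configured address sizes.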
-
Committed by Robin Murphy
It makes little sense to only validate the requested size after we think we've found a matching block size - making the check up-front is simple, and far more logical than waiting to walk off the bottom of the table to infer that we must have been passed a bogus size to start with. We're missing an equivalent check on the unmap path, so add that as well for consistency. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
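An up-front size check of the kind described above might look like the following sketch. It assumes the supported page and block sizes are given as a bitmap with one bit per size; `check_map_size()` is a hypothetical helper, and the power-of-two test is an extra strictness chosen for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Validate a requested mapping size before walking any tables.
 * 'pgsize_bitmap' has one bit set per supported page/block size.
 * Returns 0 if 'size' is exactly one supported size, -EINVAL otherwise.
 */
static int check_map_size(uint64_t size, uint64_t pgsize_bitmap)
{
	if (size == 0)
		return -EINVAL;
	if (size & (size - 1))          /* more than one bit set */
		return -EINVAL;
	if (!(size & pgsize_bitmap))    /* not a supported size */
		return -EINVAL;
	return 0;
}
```

With a bitmap of 4 KiB, 2 MiB and 1 GiB (`0x40201000`), a request for `0x1000` passes while `0x2000` or `0` is rejected immediately, without any table walk.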
-
Committed by Robin Murphy
The selftests run as an initcall, but the annotation of the various callbacks and data seems to be somewhat arbitrary. Add it consistently for everything related to the selftests. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
-
Committed by Vivek Gautam
Add reset hook for sdm845 based platforms to turn off the wait-for-safe sequence.

Understanding how wait-for-safe logic affects USB and UFS performance on MTP845 and DB845 boards: Qcom's implementation of arm,mmu-500 adds a WAIT-FOR-SAFE logic to address under-performance issues in real-time clients, such as Display and Camera. On receiving an invalidation request, the SMMU forwards a SAFE request to these clients and waits for the SAFE ack signal from the real-time clients. The SAFE signal from such clients is used to qualify the start of invalidation. This logic is controlled by chicken bits, one for each of MDP (display), IFE0, and IFE1 (camera), that can be accessed only from secure software on sdm845.

This configuration, however, degrades the performance of non-real time clients, such as USB and UFS. This happens because, with wait-for-safe logic enabled, the hardware tries to throttle non-real time clients while waiting for SAFE ack signals from real-time clients.

On mtp845 and db845 devices, with wait-for-safe logic enabled by the bootloaders, we see degraded performance of USB and UFS when the kernel enables the smmu stage-1 translations for these clients. Turning off this wait-for-safe logic from the kernel gets us back the perf of USB and UFS devices until we re-visit this when we start seeing perf issues on display/camera on upstream supported SDM845 platforms.

The bootloaders on these boards implement secure monitor callbacks to handle a specific command - QCOM_SCM_SVC_SMMU_PROGRAM - with which the logic can be toggled. There are other boards such as cheza whose bootloaders don't enable this logic. Such boards don't implement callbacks to handle the specific SCM call, so disabling this logic for such boards will be a no-op.

This change is inspired by the downstream change from Patrick Daly to address performance issues with display and camera by handling this wait-for-safe within separate io-pagetable ops to do TLB maintenance. So a big thanks to him for the change and for all the offline discussions.

Without this change the UFS reads are pretty slow:

    $ time dd if=/dev/sda of=/dev/zero bs=1048576 count=10 conv=sync
    10+0 records in
    10+0 records out
    10485760 bytes (10.0MB) copied, 22.394903 seconds, 457.2KB/s
    real 0m 22.39s
    user 0m 0.00s
    sys 0m 0.01s

With this change they are back to rock!

    $ time dd if=/dev/sda of=/dev/zero bs=1048576 count=300 conv=sync
    300+0 records in
    300+0 records out
    314572800 bytes (300.0MB) copied, 1.030541 seconds, 291.1MB/s
    real 0m 1.03s
    user 0m 0.00s
    sys 0m 0.54s

Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> Signed-off-by: Will Deacon <will@kernel.org>
-
Committed by Vivek Gautam
Qcom's smmu-500 needs to toggle the wait-for-safe sequence to handle TLB invalidation syncs. A few firmware implementations allow doing that through the SCM interface. Add an API to toggle wait-for-safe from firmware through an SCM call. Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Acked-by: Andy Gross <agross@kernel.org> Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> Signed-off-by: Will Deacon <will@kernel.org>
-
Committed by Vivek Gautam
There are scenarios where drivers are required to make an scm call in atomic context, such as in one of the qcom's arm-smmu-500 errata [1]. [1] ("https://source.codeaurora.org/quic/la/kernel/msm-4.9/tree/drivers/iommu/arm-smmu.c?h=msm-4.9#n4842") Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Acked-by: Andy Gross <agross@kernel.org> Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> Signed-off-by: Will Deacon <will@kernel.org>
-
Committed by Will Deacon
The 'a0' member of 'struct arm_smccc_res' is declared as 'unsigned long', however the Qualcomm SCM firmware interface driver expects to receive negative error codes via this field, so ensure that it's cast to 'long' before comparing to see if it is less than 0. Cc: <stable@vger.kernel.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Will Deacon <will@kernel.org>
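The signedness problem described above can be shown in isolation. This is a small userspace sketch, not the driver code: `scm_result()` is a hypothetical helper that takes the raw 'a0' value, and it assumes the usual two's complement conversion from 'unsigned long' to 'long'.

```c
#include <assert.h>

/*
 * Interpret a raw firmware return value carried in an unsigned long.
 * Comparing the unsigned value directly against 0 can never be true
 * for "< 0", so negative error codes would be missed. Casting to long
 * first recovers the sign (assumes two's complement conversion).
 */
static int scm_result(unsigned long a0)
{
	long ret = (long)a0;

	if (ret < 0)
		return (int)ret; /* negative error code from firmware */
	return 0;                /* success */
}
```

For example, a firmware reply of `(unsigned long)-22L` yields `-22` from this helper, whereas a test written as `a0 < 0` on the unsigned field would silently treat it as success.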
-
- 02 Nov, 2019 1 commit
-
-
Committed by Rob Clark
When games, browser, or anything using a lot of GPU buffers exits, there can be many hundreds or thousands of buffers to unmap and free. If the GPU is otherwise suspended, this can cause arm-smmu to resume/suspend for each buffer, resulting in 5-10 seconds worth of reprogramming the context bank (arm_smmu_write_context_bank()/arm_smmu_write_s2cr()/etc). To the user it would appear that the system just locked up. A simple solution is to use pm_runtime_put_autosuspend() instead, so we don't immediately suspend the SMMU device. Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Will Deacon <will@kernel.org>
-
- 01 Oct, 2019 10 commits
-
-
Committed by Christophe JAILLET
'iommu_group_get_for_dev()' never returns NULL, so this test can be removed. Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Will Deacon <will@kernel.org>
-
Committed by Christophe JAILLET
The memory used by '__init' functions can be freed once the initialization phase has been performed. Mark some 'static const' arrays defined and used within some '__init' functions as '__initconst', so that the corresponding data can also be discarded. Without '__initconst', the data are put in the .rodata section. With the qualifier, they are put in the .init.rodata section. With gcc 8.3.0, the following changes have been measured: Without '__initconst': section size .rodata 00000720 .init.rodata 00000018 With '__initconst': section size .rodata 00000660 .init.rodata 00000058 Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Will Deacon <will@kernel.org>
-
Committed by Robin Murphy
Although CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT is a welcome tool for smoking out inadequate firmware, the failure mode is non-obvious and can be confusing for end users. Add some special-case reporting of Unidentified Stream Faults to help clarify this particular symptom. Since we're adding yet another print to the mix, also break out an explicit ratelimit state to make sure everything stays together (and reduce the static storage footprint a little). Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
-
Committed by Robin Murphy
Now it's just an empty wrapper. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
-
Committed by Robin Murphy
With the .tlb_sync interface no longer exposed directly to io-pgtable, strip away the remains of that abstraction layer. Retain the callback in spirit, though, by transforming it into an implementation override for the low-level sync routine itself, for which we will have at least one user. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
-
Committed by Robin Murphy
Now that the "leaf" flag is no longer part of an external interface, there's no need to use it to infer a register offset at runtime when we can just as easily encode the offset directly in its place. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
-
Committed by Robin Murphy
Fill in 'native' iommu_flush_ops callbacks for all the arm_smmu_flush_ops variants, and clear up the remains of the previous .tlb_inv_range abstraction. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
-
Committed by Robin Murphy
In principle, Midgard GPUs supporting smaller VA sizes should only require 3-level pagetables, since level 0 only resolves bits 48:40 of the address. However, the kbase driver does not appear to have any notion of a variable start level, and empirically T720 and T820 rapidly blow up with translation faults unless given a full 4-level table, despite only supporting a 33-bit VA size. The 'real' IAS value is still valuable in terms of validating addresses on map/unmap, so tweak the allocator to allow smaller values while still forcing the resultant tables to the full 4 levels. As far as I can test, this should make all known Midgard variants happy. Fixes: d08d42de ("iommu: io-pgtable: Add ARM Mali midgard MMU page table format") Tested-by: Neil Armstrong <narmstrong@baylibre.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
-
Committed by Robin Murphy
Whilst Midgard's MEMATTR follows a similar principle to the VMSA MAIR, the actual attribute values differ, so although it currently appears to work to some degree, we probably shouldn't be using our standard stage 1 MAIR for that. Instead, generate a reasonable MEMATTR with attribute values borrowed from the kbase driver; at this point we'll be overriding or ignoring pretty much all of the LPAE config, so just implement these Mali details in a dedicated allocator instead of pretending to subclass the standard VMSA format. Fixes: d08d42de ("iommu: io-pgtable: Add ARM Mali midgard MMU page table format") Tested-by: Neil Armstrong <narmstrong@baylibre.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
-
Committed by Liu Xiang
When alloc_io_pgtable_ops fails, the context bitmap that was just allocated by __arm_smmu_alloc_bitmap should be freed to release the resource. Signed-off-by: Liu Xiang <liuxiang_1999@126.com> Signed-off-by: Will Deacon <will@kernel.org>
-
- 30 Sep, 2019 1 commit
-
-
Committed by Linus Torvalds
For 5.3 we had to revert a nice ext4 IO pattern improvement, because it caused a bootup regression due to lack of entropy at bootup together with arguably broken user space that was asking for secure random numbers when it really didn't need to. See commit 72dbcf72 (Revert "ext4: make __ext4_get_inode_loc plug").

This aims to solve the issue by actively generating entropy noise using the CPU cycle counter when waiting for the random number generator to initialize. This only works when you have a high-frequency time stamp counter available, but that's the case on all modern x86 CPU's, and on most other modern CPU's too.

What we do is to generate jitter entropy from the CPU cycle counter under a somewhat complex load: calling the scheduler while also guaranteeing a certain amount of timing noise by also triggering a timer.

I'm sure we can tweak this, and that people will want to look at other alternatives, but there's been a number of papers written on jitter entropy, and this should really be fairly conservative by crediting one bit of entropy for every timer-induced jump in the cycle counter. Not because the timer itself would be all that unpredictable, but because the interaction between the timer and the loop is going to be.

Even if (and perhaps particularly if) the timer actually happens on another CPU, the cacheline interaction between the loop that reads the cycle counter and the timer itself firing is going to add perturbations to the cycle counter values that get mixed into the entropy pool.

As Thomas pointed out, with a modern out-of-order CPU, even quite simple loops show a fair amount of hard-to-predict timing variability even in the absence of external interrupts. But this tries to take that further by actually having a fairly complex interaction.

This is not going to solve the entropy issue for architectures that have no CPU cycle counter, but it's not clear how (and if) that is solvable, and the hardware in question is largely starting to be irrelevant. And by doing this we can at least avoid some of the even more contentious approaches (like making the entropy waiting time out in order to avoid the possibly unbounded waiting).

Cc: Ahmed Darwish <darwish.07@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Nicholas Mc Guire <hofrat@opentech.at> Cc: Andy Lutomirski <luto@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Willy Tarreau <w@1wt.eu> Cc: Alexander E. Patrakov <patrakov@gmail.com> Cc: Lennart Poettering <mzxreary@0pointer.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 29 Sep, 2019 4 commits
-
-
Committed by Björn Ardö
Add read-only versions of all EEPROMs. These versions are read-only on the i2c side, but can be written from the sysfs side. Signed-off-by: Björn Ardö <bjorn.ardo@axis.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
-
Committed by Jarkko Nikula
Commit b84398d6 ("i2c: i801: Use iTCO version 6 in Cannon Lake PCH and beyond") appears to have accidentally dropped Block Write-Block Read Process Call support for Intel Sunrisepoint, Lewisburg, Denverton and Kaby Lake. That support was added for the above and newer platforms by commit 315cd67c ("i2c: i801: Add Block Write-Block Read Process Call support"), so bring it back for the above platforms. Fixes: b84398d6 ("i2c: i801: Use iTCO version 6 in Cannon Lake PCH and beyond") Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Reviewed-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
-
Committed by Chris Brandt
The NACKF flag should be cleared in INTRIICNAKI interrupt processing as described in the HW manual. This issue shows up quickly when PREEMPT_RT is applied and a device is probed that is not plugged in (like a touchscreen controller). The result is endless interrupts that halt system boot. Fixes: 310c18a4 ("i2c: riic: add driver") Cc: stable@vger.kernel.org Reported-by: Chien Nguyen <chien.nguyen.eb@rvc.renesas.com> Signed-off-by: Chris Brandt <chris.brandt@renesas.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
-
Committed by Lee Jones
We have a production-level laptop (Lenovo Yoga C630) which is exhibiting a rather horrific bug. When I2C HID devices are being scanned for at boot-time the QCom Geni based I2C (Serial Engine) attempts to use DMA. When it does, the laptop reboots and the user never sees the OS. Attempts are being made to debug the reason for the spontaneous reboot. No luck so far, hence the requirement for this hot-fix. This workaround will be removed once we have a viable fix. Signed-off-by: Lee Jones <lee.jones@linaro.org> Tested-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
-
- 28 Sep, 2019 13 commits
-
-
Committed by Joerg Roedel
The traversing of this list requires protection_domain->lock to be taken to avoid nasty races with attach/detach code. Make sure the lock is held on all code-paths traversing this list. Reported-by: Filippo Sironi <sironi@amazon.de> Fixes: 92d420ec ("iommu/amd: Relax locking in dma_ops path") Reviewed-by: Filippo Sironi <sironi@amazon.de> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Committed by Joerg Roedel
Make sure that attaching and detaching a device can't race against each other and protect the iommu_dev_data with a spin_lock in these code paths. Fixes: 92d420ec ("iommu/amd: Relax locking in dma_ops path") Reviewed-by: Filippo Sironi <sironi@amazon.de> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Committed by Joerg Roedel
Check early in attach_device whether the device is already attached to a domain. This also simplifies the code path so that __attach_device() can be removed. Fixes: 92d420ec ("iommu/amd: Relax locking in dma_ops path") Reviewed-by: Filippo Sironi <sironi@amazon.de> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Committed by Joerg Roedel
The code-paths that run before __attach_device() and __detach_device() are called also access and modify domain state, so take the domain lock there too. This allows getting rid of the __detach_device() function. Fixes: 92d420ec ("iommu/amd: Relax locking in dma_ops path") Reviewed-by: Filippo Sironi <sironi@amazon.de> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Committed by Joerg Roedel
The lock is not necessary because the device table does not contain shared state that needs protection. Locking is only needed on an individual entry basis, and that needs to happen on the iommu_dev_data level. Fixes: 92d420ec ("iommu/amd: Relax locking in dma_ops path") Reviewed-by: Filippo Sironi <sironi@amazon.de> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Committed by Joerg Roedel
This struct member was used to track whether a domain change requires updates to the device-table and IOMMU cache flushes. The problem is that access to this field is racy since locking in the common mapping code-paths has been eliminated. Move the updated field to the stack to get rid of all potential races and remove the field from the struct. Fixes: 92d420ec ("iommu/amd: Relax locking in dma_ops path") Reviewed-by: Filippo Sironi <sironi@amazon.de> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Committed by Colin Ian King
There is a statement that is indented too deeply, remove the extraneous tab. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Navid Emamdoost
In nfp_abm_u32_knode_replace if the allocation for match fails it should go to the error handling instead of returning. Updated other gotos to have correct errno returned, too. Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Ben Chuang
Add support for the GL9750 and GL9755 chipsets. Enable v4 mode and wait 5ms after setting 1.8V signal enable for GL9750/GL9755. Fix the value of SDHCI_MAX_CURRENT register and use the vendor tuning flow for GL9750. Co-developed-by: Michael K Johnson <johnsonm@danlj.org> Signed-off-by: Michael K Johnson <johnsonm@danlj.org> Signed-off-by: Ben Chuang <ben.chuang@genesyslogic.com.tw> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
-
Committed by Danielle Ratson
The ASIC can only mirror a packet to one port, but when a user tries to set more than one mirror action, it doesn't fail. Add a check if more than one mirror action was specified per rule and if so, fail for not being supported. Fixes: d0d13c18 ("mlxsw: spectrum_acl: Add support for mirror action") Signed-off-by: Danielle Ratson <danieller@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Ido Schimmel
When a port is created, its VLAN filters are not cleared by the firmware. This causes tagged packets to be later dropped by the ingress STP filters, which default to DISCARD state. The above did not matter much until commit b5ce611f ("mlxsw: spectrum: Add devlink-trap support") where we exposed the drop reason to users. Without this patch, the drop reason users will see is not consistent. If a port is enslaved to a VLAN-aware bridge and a packet with an invalid VLAN tries to ingress the bridge, it will be dropped due to ingress STP filter. If the VLAN is later enabled and then disabled, the packet will be dropped by the ingress VLAN filter despite the above being a seemingly NOP operation. Fix this by clearing all the VLAN filters during port initialization. Adjust the test accordingly. Fixes: b5ce611f ("mlxsw: spectrum: Add devlink-trap support") Reported-by: Alex Kushnarov <alexanderk@mellanox.com> Tested-by: Alex Kushnarov <alexanderk@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Colin Ian King
The memset is indented incorrectly, remove the extraneous tabs. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Colin Ian King
The return statement is indented incorrectly, add in a missing tab and remove an extraneous space after the return. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-