- 17 1月, 2023 6 次提交
-
-
由 Thomas Weißschuh 提交于
As there are no external users this implementation detail does not need to be exported. Signed-off-by: NThomas Weißschuh <linux@weissschuh.net> Reviewed-by: NDavid Rheinsberg <david.rheinsberg@gmail.com> Reviewed-by: NHans de Goede <hdegoede@redhat.com> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
由 Thomas Weißschuh 提交于
As there are no external users this implementation detail does not need to be exported. Signed-off-by: NThomas Weißschuh <linux@weissschuh.net> Reviewed-by: NDavid Rheinsberg <david.rheinsberg@gmail.com> Reviewed-by: NHans de Goede <hdegoede@redhat.com> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
由 Thomas Weißschuh 提交于
As there are no external users this implementation detail does not need to be exported. Signed-off-by: NThomas Weißschuh <linux@weissschuh.net> Reviewed-by: NDavid Rheinsberg <david.rheinsberg@gmail.com> Reviewed-by: NHans de Goede <hdegoede@redhat.com> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
由 Thomas Weißschuh 提交于
As no external users remain this implementation detail does not need to be exported anymore. Signed-off-by: NThomas Weißschuh <linux@weissschuh.net> Reviewed-by: NDavid Rheinsberg <david.rheinsberg@gmail.com> Reviewed-by: NHans de Goede <hdegoede@redhat.com> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
由 Thomas Weißschuh 提交于
As the last user was removed we can delete this function. Signed-off-by: NThomas Weißschuh <linux@weissschuh.net> Reviewed-by: NDavid Rheinsberg <david.rheinsberg@gmail.com> Reviewed-by: NHans de Goede <hdegoede@redhat.com> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
由 Thomas Weißschuh 提交于
By making hid_is_usb() a non-inline function the lowlevel usbhid driver does not have to be exported anymore. Also mark the argument as const as it is not modified. Signed-off-by: NThomas Weißschuh <linux@weissschuh.net> Reviewed-by: NDavid Rheinsberg <david.rheinsberg@gmail.com> Reviewed-by: NHans de Goede <hdegoede@redhat.com> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
- 20 12月, 2022 1 次提交
-
-
由 José Expósito 提交于
HID descriptors with Battery System (0x85) Charging (0x44) usage are ignored and POWER_SUPPLY_STATUS_DISCHARGING is always reported to user space, even when the device is charging. Map this usage and when it is reported set the right charging status. In addition, add KUnit tests to make sure that the charging status is correctly set and reported. They can be run with the usual command: $ ./tools/testing/kunit/kunit.py run --kunitconfig=drivers/hid Signed-off-by: NJosé Expósito <jose.exposito89@gmail.com> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
- 12 12月, 2022 3 次提交
-
-
由 Rong Tao 提交于
Fix the typo of 'suport' in kcov.h Link: https://lkml.kernel.org/r/tencent_922CA94B789587D79FD154445D035AA19E07@qq.comSigned-off-by: NRong Tao <rongtao@cestc.cn> Reviewed-by: NDmitry Vyukov <dvyukov@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Christophe JAILLET 提交于
It is spurious to have some code out-side the include guard in a .h file. Fix it. Link: https://lkml.kernel.org/r/4dbaf427d4300edba6c6bbfaf4d57493b9bec6ee.1669565241.git.christophe.jaillet@wanadoo.fr Fixes: 1fbaf8fc ("mm: add a io_mapping_map_user helper") Signed-off-by: NChristophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Zhang Qilong 提交于
Commit ee62c6b2 ("eventfd: change int to __u64 in eventfd_signal()") forgot to change int to __u64 in the CONFIG_EVENTFD=n stub function. Link: https://lkml.kernel.org/r/20221124140154.104680-1-zhangqilong3@huawei.com Fixes: ee62c6b2 ("eventfd: change int to __u64 in eventfd_signal()") Signed-off-by: NZhang Qilong <zhangqilong3@huawei.com> Cc: Dylan Yudaken <dylany@fb.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Sha Zhengju <handai.szj@taobao.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
- 11 12月, 2022 1 次提交
-
-
由 Dai Ngo 提交于
The delegation reaper is called by nfsd memory shrinker's on the 'count' callback. It scans the client list and sends the courtesy CB_RECALL_ANY to the clients that hold delegations. To avoid flooding the clients with CB_RECALL_ANY requests, the delegation reaper sends only one CB_RECALL_ANY request to each client per 5 seconds. Signed-off-by: NDai Ngo <dai.ngo@oracle.com> [ cel: moved definition of RCA4_TYPE_MASK_RDATA_DLG ] Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
- 10 12月, 2022 5 次提交
-
-
由 Li zeming 提交于
The svc_ungetu32 function is not used, you could remove it. Signed-off-by: NLi zeming <zeming@nfschina.com> Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Tejun Heo 提交于
memcg_write_event_control() accesses the dentry->d_name of the specified control fd to route the write call. As a cgroup interface file can't be renamed, it's safe to access d_name as long as the specified file is a regular cgroup file. Also, as these cgroup interface files can't be removed before the directory, it's safe to access the parent too. Prior to 347c4a87 ("memcg: remove cgroup_event->cft"), there was a call to __file_cft() which verified that the specified file is a regular cgroupfs file before further accesses. The cftype pointer returned from __file_cft() was no longer necessary and the commit inadvertently dropped the file type check with it allowing any file to slip through. With the invarients broken, the d_name and parent accesses can now race against renames and removals of arbitrary files and cause use-after-free's. Fix the bug by resurrecting the file type check in __file_cft(). Now that cgroupfs is implemented through kernfs, checking the file operations needs to go through a layer of indirection. Instead, let's check the superblock and dentry type. Link: https://lkml.kernel.org/r/Y5FRm/cfcKPGzWwl@slm.duckdns.org Fixes: 347c4a87 ("memcg: remove cgroup_event->cft") Signed-off-by: NTejun Heo <tj@kernel.org> Reported-by: NJann Horn <jannh@google.com> Acked-by: NRoman Gushchin <roman.gushchin@linux.dev> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: <stable@vger.kernel.org> [3.14+] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 David Hildenbrand 提交于
We use "unsigned long" to store a PFN in the kernel and phys_addr_t to store a physical address. On a 64bit system, both are 64bit wide. However, on a 32bit system, the latter might be 64bit wide. This is, for example, the case on x86 with PAE: phys_addr_t and PTEs are 64bit wide, while "unsigned long" only spans 32bit. The current definition of SWP_PFN_BITS without MAX_PHYSMEM_BITS misses that case, and assumes that the maximum PFN is limited by an 32bit phys_addr_t. This implies, that SWP_PFN_BITS will currently only be able to cover 4 GiB - 1 on any 32bit system with 4k page size, which is wrong. Let's rely on the number of bits in phys_addr_t instead, but make sure to not exceed the maximum swap offset, to not make the BUILD_BUG_ON() in is_pfn_swap_entry() unhappy. Note that swp_entry_t is effectively an unsigned long and the maximum swap offset shares that value with the swap type. For example, on an 8 GiB x86 PAE system with a kernel config based on Debian 11.5 (-> CONFIG_FLATMEM=y, CONFIG_X86_PAE=y), we will currently fail removing migration entries (remove_migration_ptes()), because mm/page_vma_mapped.c:check_pte() will fail to identify a PFN match as swp_offset_pfn() wrongly masks off PFN bits. For example, split_huge_page_to_list()->...->remap_page() will leave migration entries in place and continue to unlock the page. Later, when we stumble over these migration entries (e.g., via /proc/self/pagemap), pfn_swap_entry_to_page() will BUG_ON() because these migration entries shouldn't exist anymore and the page was unlocked. [ 33.067591] kernel BUG at include/linux/swapops.h:497! [ 33.067597] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI [ 33.067602] CPU: 3 PID: 742 Comm: cow Tainted: G E 6.1.0-rc8+ #16 [ 33.067605] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.fc36 04/01/2014 [ 33.067606] EIP: pagemap_pmd_range+0x644/0x650 [ 33.067612] Code: 00 00 00 00 66 90 89 ce b9 00 f0 ff ff e9 ff fb ff ff 89 d8 31 db e8 48 c6 52 00 e9 23 fb ff ff e8 61 83 56 00 e9 b6 fe ff ff <0f> 0b bf 00 f0 ff ff e9 38 fa ff ff 3e 8d 74 26 00 55 89 e5 57 31 [ 33.067615] EAX: ee394000 EBX: 00000002 ECX: ee394000 EDX: 00000000 [ 33.067617] ESI: c1b0ded4 EDI: 00024a00 EBP: c1b0ddb4 ESP: c1b0dd68 [ 33.067619] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00010246 [ 33.067624] CR0: 80050033 CR2: b7a00000 CR3: 01bbbd20 CR4: 00350ef0 [ 33.067625] Call Trace: [ 33.067628] ? madvise_free_pte_range+0x720/0x720 [ 33.067632] ? smaps_pte_range+0x4b0/0x4b0 [ 33.067634] walk_pgd_range+0x325/0x720 [ 33.067637] ? mt_find+0x1d6/0x3a0 [ 33.067641] ? mt_find+0x1d6/0x3a0 [ 33.067643] __walk_page_range+0x164/0x170 [ 33.067646] walk_page_range+0xf9/0x170 [ 33.067648] ? __kmem_cache_alloc_node+0x2a8/0x340 [ 33.067653] pagemap_read+0x124/0x280 [ 33.067658] ? default_llseek+0x101/0x160 [ 33.067662] ? smaps_account+0x1d0/0x1d0 [ 33.067664] vfs_read+0x90/0x290 [ 33.067667] ? do_madvise.part.0+0x24b/0x390 [ 33.067669] ? debug_smp_processor_id+0x12/0x20 [ 33.067673] ksys_pread64+0x58/0x90 [ 33.067675] __ia32_sys_ia32_pread64+0x1b/0x20 [ 33.067680] __do_fast_syscall_32+0x4c/0xc0 [ 33.067683] do_fast_syscall_32+0x29/0x60 [ 33.067686] do_SYSENTER_32+0x15/0x20 [ 33.067689] entry_SYSENTER_32+0x98/0xf1 Decrease the indentation level of SWP_PFN_BITS and SWP_PFN_MASK to keep it readable and consistent. [david@redhat.com: rely on sizeof(phys_addr_t) and min_t() instead] Link: https://lkml.kernel.org/r/20221206105737.69478-1-david@redhat.com [david@redhat.com: use "int" for comparison, as we're only comparing numbers < 64] Link: https://lkml.kernel.org/r/1f157500-2676-7cef-a84e-9224ed64e540@redhat.com Link: https://lkml.kernel.org/r/20221205150857.167583-1-david@redhat.com Fixes: 0d206b5d ("mm/swap: add swp_offset_pfn() to fetch PFN from swap entry") Signed-off-by: NDavid Hildenbrand <david@redhat.com> Acked-by: NPeter Xu <peterx@redhat.com> Reviewed-by: NYang Shi <shy828301@gmail.com> Cc: Hugh Dickins <hughd@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 William Breathitt Gray 提交于
Provide a public callback handle_mask_sync() that drivers can use when they have more complex IRQ masking logic. The default implementation is regmap_irq_handle_mask_sync(), used if the chip doesn't provide its own callback. Cc: Mark Brown <broonie@kernel.org> Signed-off-by: NWilliam Breathitt Gray <william.gray@linaro.org> Link: https://lore.kernel.org/r/e083474b3d467a86e6cb53da8072de4515bd6276.1669100542.git.william.gray@linaro.orgSigned-off-by: NMark Brown <broonie@kernel.org>
-
由 Roberto Sassu 提交于
The fs_context_parse_param hook already has a description, which seems the right one according to the code. Fixes: 8eb687bc ("lsm: Add/fix return values in lsm_hooks.h and fix formatting") Signed-off-by: NRoberto Sassu <roberto.sassu@huawei.com> Signed-off-by: NPaul Moore <paul@paul-moore.com>
-
- 09 12月, 2022 5 次提交
-
-
由 Jan Kara 提交于
jbd2_submit_inode_data() hardcoded use of jbd2_journal_submit_inode_data_buffers() for submission of data pages. Make it use j_submit_inode_data_buffers hook instead. This effectively switches ext4 fastcommits to use ext4_writepages() for data writeout instead of generic_writepages(). Signed-off-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20221207112722.22220-9-jack@suse.czSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Jan Kara 提交于
When manipulating xattr blocks, we can deadlock infinitely looping inside ext4_xattr_block_set() where we constantly keep finding xattr block for reuse in mbcache but we are unable to reuse it because its reference count is too big. This happens because cache entry for the xattr block is marked as reusable (e_reusable set) although its reference count is too big. When this inconsistency happens, this inconsistent state is kept indefinitely and so ext4_xattr_block_set() keeps retrying indefinitely. The inconsistent state is caused by non-atomic update of e_reusable bit. e_reusable is part of a bitfield and e_reusable update can race with update of e_referenced bit in the same bitfield resulting in loss of one of the updates. Fix the problem by using atomic bitops instead. This bug has been around for many years, but it became *much* easier to hit after commit 65f8b800 ("ext4: fix race when reusing xattr blocks"). Cc: stable@vger.kernel.org Fixes: 6048c64b ("mbcache: add reusable flag to cache entries") Fixes: 65f8b800 ("ext4: fix race when reusing xattr blocks") Reported-and-tested-by: NJeremi Piotrowski <jpiotrowski@linux.microsoft.com> Reported-by: NThilo Fromm <t-lo@linux.microsoft.com> Link: https://lore.kernel.org/r/c77bf00f-4618-7149-56f1-b8d1664b9d07@linux.microsoft.com/Signed-off-by: NJan Kara <jack@suse.cz> Reviewed-by: NAndreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/20221123193950.16758-1-jack@suse.czSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Tejun Heo 提交于
memcg_write_event_control() accesses the dentry->d_name of the specified control fd to route the write call. As a cgroup interface file can't be renamed, it's safe to access d_name as long as the specified file is a regular cgroup file. Also, as these cgroup interface files can't be removed before the directory, it's safe to access the parent too. Prior to 347c4a87 ("memcg: remove cgroup_event->cft"), there was a call to __file_cft() which verified that the specified file is a regular cgroupfs file before further accesses. The cftype pointer returned from __file_cft() was no longer necessary and the commit inadvertently dropped the file type check with it allowing any file to slip through. With the invarients broken, the d_name and parent accesses can now race against renames and removals of arbitrary files and cause use-after-free's. Fix the bug by resurrecting the file type check in __file_cft(). Now that cgroupfs is implemented through kernfs, checking the file operations needs to go through a layer of indirection. Instead, let's check the superblock and dentry type. Signed-off-by: NTejun Heo <tj@kernel.org> Fixes: 347c4a87 ("memcg: remove cgroup_event->cft") Cc: stable@kernel.org # v3.14+ Reported-by: NJann Horn <jannh@google.com> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Acked-by: NRoman Gushchin <roman.gushchin@linux.dev> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dmitry Torokhov 提交于
Drop support for platform data from the driver because there are no users of st33zp24_platform_data structure in the mainline kernel. Signed-off-by: NDmitry Torokhov <dmitry.torokhov@gmail.com> Reviewed-by: NJarkko Sakkinen <jarkko@kernel.org> Signed-off-by: NJarkko Sakkinen <jarkko@kernel.org>
-
由 Christophe JAILLET 提交于
There is no need to include <linux/kernel.h> here. Prefer the less invasive <linux/types.h> and <linux/compiler_types.h> which are needed in this .h file itself. Signed-off-by: NChristophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/c1d479b39e30fe70c4579a1af035d4db49421f56.1670069909.git.christophe.jaillet@wanadoo.frSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 08 12月, 2022 2 次提交
-
-
由 ChiYuan Huang 提交于
Following by the below discussion, there's the potential UAF issue between regulator and mfd. https://lore.kernel.org/all/20221128143601.1698148-1-yangyingliang@huawei.com/ From the analysis of Yingliang CPU A |CPU B mt6370_probe() | devm_mfd_add_devices() | |mt6370_regulator_probe() | regulator_register() | //allocate init_data and add it to devres | regulator_of_get_init_data() i2c_unregister_device() | device_del() | devres_release_all() | // init_data is freed | release_nodes() | | // using init_data causes UAF | regulator_register() It's common to use mfd core to create child device for the regulator. In order to do the DT lookup for init data, the child that registered the regulator would pass its parent as the parameter. And this causes init data resource allocated to its parent, not itself. The issue happen when parent device is going to release and regulator core is still doing some operation of init data constraint for the regulator of child device. To fix it, this patch expand 'regulator_register' API to use the different devices for init data allocation and DT lookup. Reported-by: NYang Yingliang <yangyingliang@huawei.com> Signed-off-by: NChiYuan Huang <cy_huang@richtek.com> Link: https://lore.kernel.org/r/1670311341-32664-1-git-send-email-u0084500@gmail.comSigned-off-by: NMark Brown <broonie@kernel.org>
-
由 Christoph Hellwig 提交于
This macro is obsolete, so replace the last few uses with open coded bi_opf assignments. Signed-off-by: NChristoph Hellwig <hch@lst.de> Acked-by: NColy Li <colyli@suse.de <mailto:colyli@suse.de>> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: NBart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221206144057.720846-1-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 07 12月, 2022 6 次提交
-
-
由 Pavel Begunkov 提交于
Use task_work for completing rsrc removals, it'll be needed later for spinlock optimisations. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/cbba5d53a11ee6fc2194dacea262c1d733c8b529.1670384893.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
This patch adds ctx->task_complete flag. If set, we'll complete all requests in the context of the original task. Note, this extends to completion CQE posting only but not io_kiocb cleanup / free, e.g. io-wq may free the requests in the free calllback. This flag will be used later for optimisations purposes. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/21ece72953f76bb2e77659a72a14326227ab6460.1670384893.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jingbo Xu 提交于
Add prepare_ondemand_read() callback dedicated for the on-demand read scenario, so that callers from this scenario can be decoupled from netfs_io_subrequest. The original cachefiles_prepare_read() is now refactored to a generic routine accepting a parameter list instead of netfs_io_subrequest. There's no logic change, except that the debug id of subrequest and request is removed from trace_cachefiles_prep_read(). Reviewed-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NJingbo Xu <jefflexu@linux.alibaba.com> Acked-by: NDavid Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20221124034212.81892-2-jefflexu@linux.alibaba.comSigned-off-by: NGao Xiang <hsiangkao@linux.alibaba.com>
-
由 Roberto Sassu 提交于
Ensure that for non-void LSM hooks there is a description of the return values. Also, replace spaces with tab for indentation, remove empty lines between the hook description and the list of parameters, adjust semicolons and add the period at the end of the parameter description. Finally, move the description of gfp parameter of the xfrm_policy_alloc_security hook together with the others. Signed-off-by: NRoberto Sassu <roberto.sassu@huawei.com> [PM: /replaces./replaced./] Signed-off-by: NPaul Moore <paul@paul-moore.com>
-
由 Roberto Sassu 提交于
include/linux/lsm_hooks.h reports the result of the LSM infrastructure to the callers, not what LSMs should return to the LSM infrastructure. Clarify that and add that if all LSMs return a positive value __vm_enough_memory() will be called with cap_sys_admin set. If at least one LSM returns 0 or negative, it will be called with cap_sys_admin cleared. Signed-off-by: NRoberto Sassu <roberto.sassu@huawei.com> Signed-off-by: NPaul Moore <paul@paul-moore.com>
-
由 Christoph Hellwig 提交于
With the pktcdvdv removal, bio_copy_data_iter is unused now. Fold the logic into bio_copy_data and remove the separate lower level function. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221206144407.722049-1-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 06 12月, 2022 11 次提交
-
-
由 Thomas Gleixner 提交于
Single vector allocation which allocates the next free index in the IMS space. The free function releases. All allocated vectors are released also via pci_free_vectors() which is also releasing MSI/MSI-X vectors. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NKevin Tian <kevin.tian@intel.com> Acked-by: NBjorn Helgaas <bhelgaas@google.com> Acked-by: NMarc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221124232326.961711347@linutronix.de
-
由 Thomas Gleixner 提交于
IMS (Interrupt Message Store) is a new specification which allows implementation specific storage of MSI messages contrary to the strict standard specified MSI and MSI-X message stores. This requires new device specific interrupt domains to handle the implementation defined storage which can be an array in device memory or host/guest memory which is shared with hardware queues. Add a function to create IMS domains for PCI devices. IMS domains are using the new per device domain mechanism and are configured by the device driver via a template. IMS domains are created as secondary device domains so they work side on side with MSI[-X] on the same device. The IMS domains have a few constraints: - The index space is managed by the core code. Device memory based IMS provides a storage array with a fixed size which obviously requires an index. But there is no association between index and functionality so the core can randomly allocate an index in the array. System memory based IMS does not have the concept of an index as the storage is somewhere in memory. In that case the index is purely software based to keep track of the allocations. - There is no requirement for consecutive index ranges This is currently a limitation of the MSI core and can be implemented if there is a justified use case by changing the internal storage from xarray to maple_tree. For now it's single vector allocation. - The interrupt chip must provide the following callbacks: - irq_mask() - irq_unmask() - irq_write_msi_msg() - The interrupt chip must provide the following optional callbacks when the irq_mask(), irq_unmask() and irq_write_msi_msg() callbacks cannot operate directly on hardware, e.g. in the case that the interrupt message store is in queue memory: - irq_bus_lock() - irq_bus_unlock() These callbacks are invoked from preemptible task context and are allowed to sleep. In this case the mandatory callbacks above just store the information. The irq_bus_unlock() callback is supposed to make the change effective before returning. - Interrupt affinity setting is handled by the underlying parent interrupt domain and communicated to the IMS domain via irq_write_msi_msg(). IMS domains cannot have a irq_set_affinity() callback. That's a reasonable restriction similar to the PCI/MSI device domain implementations. The domain is automatically destroyed when the PCI device is removed. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NKevin Tian <kevin.tian@intel.com> Acked-by: NMarc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221124232326.904316841@linutronix.de -
由 Thomas Gleixner 提交于
Provide the necessary constants for PCI/IMS support: - A new bus token for MSI irqdomain identification - A MSI feature flag for the MSI irqdomains to signal support - A secondary domain id The latter expands the device internal domain pointer storage array from 1 to 2 entries. That extra pointer is mostly unused today, but the alternative solutions would not be free either and would introduce more complexity all over the place. Trade the 8bytes for simplicity. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NKevin Tian <kevin.tian@intel.com> Acked-by: NMarc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221124232326.846169830@linutronix.de
-
由 Thomas Gleixner 提交于
MSI-X vectors can be allocated after the initial MSI-X enablement, but this needs explicit support of the underlying interrupt domains. Provide a function to query the ability and functions to allocate/free individual vectors post-enable. The allocation can either request a specific index in the MSI-X table or with the index argument MSI_ANY_INDEX it allocates the next free vector. The return value is a struct msi_map which on success contains both index and the Linux interrupt number. In case of failure index is negative and the Linux interrupt number is 0. The allocation function is for a single MSI-X index at a time as that's sufficient for the most urgent use case VFIO to get rid of the 'disable MSI-X, reallocate, enable-MSI-X' cycle which is prone to lost interrupts and redirections to the legacy and obviously unhandled INTx. As single index allocation is also sufficient for the use cases Jason Gunthorpe pointed out: Allocation of a MSI-X or IMS vector for a network queue. See Link below. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NKevin Tian <kevin.tian@intel.com> Acked-by: NBjorn Helgaas <bhelgaas@google.com> Acked-by: NMarc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/all/20211126232735.547996838@linutronix.de Link: https://lore.kernel.org/r/20221124232326.731233614@linutronix.de
-
由 Thomas Gleixner 提交于
Provide a new MSI feature flag in preparation for dynamic MSIX allocation after the initial MSI-X enable has been done. This needs to be an explicit MSI interrupt domain feature because quite some implementations (both interrupt domains and legacy allocation mode) have clear expectations that the allocation code is only invoked when MSI-X is about to be enabled. They either talk to hypervisors or do some other work and are not prepared to be invoked on an already MSI-X enabled device. This is also explicit MSI-X only because rewriting the size of the MSI entries is only possible when disabling MSI which in turn might cause lost interrupts on the device. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NJason Gunthorpe <jgg@nvidia.com> Reviewed-by: NKevin Tian <kevin.tian@intel.com> Acked-by: NMarc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221124232326.558843119@linutronix.de
-
由 Thomas Gleixner 提交于
For supporting post MSI-X enable allocations and for the upcoming PCI/IMS support a separate interface is required which allows not only the allocation of a specific index, but also the allocation of any, i.e. the next free index. The latter is especially required for IMS because IMS completely does away with index to functionality mappings which are often found in MSI/MSI-X implementation. But even with MSI-X there are devices where only the first few indices have a fixed functionality and the rest is freely assignable by software, e.g. to queues. msi_domain_alloc_irq_at() is also different from the range based interfaces as it always enforces that the MSI descriptor is allocated by the core code and not preallocated by the caller like the PCI/MSI[-X] enable code path does. msi_domain_alloc_irq_at() can be invoked with the index argument set to MSI_ANY_INDEX which makes the core code pick the next free index. The irq domain can provide a prepare_desc() operation callback in it's msi_domain_ops to do domain specific post allocation initialization before the actual Linux interrupt and the associated interrupt descriptor and hierarchy alloccations are conducted. The function also takes an optional @icookie argument which is of type union msi_instance_cookie. This cookie is not used by the core code and is stored in the allocated msi_desc::data::icookie. The meaning of the cookie is completely implementation defined. In case of IMS this might be a PASID or a pointer to a device queue, but for the MSI core it's opaque and not used in any way. The function returns a struct msi_map which on success contains the allocated index number and the Linux interrupt number so the caller can spare the index to Linux interrupt number lookup. On failure map::index contains the error code and map::virq is 0. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NKevin Tian <kevin.tian@intel.com> Acked-by: NMarc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221124232326.501359457@linutronix.de
-
由 Thomas Gleixner 提交于
The existing MSI domain ops msi_prepare() and set_desc() turned out to be unsuitable for implementing IMS support. msi_prepare() does not operate on the MSI descriptors. set_desc() lacks an irq_domain pointer and has a completely different purpose. Introduce a prepare_desc() op which allows IMS implementations to amend an MSI descriptor which was allocated by the core code, e.g. by adjusting the iomem base or adding some data based on the allocated index. This is way better than requiring that all IMS domain implementations preallocate the MSI descriptor and then allocate the interrupt. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NKevin Tian <kevin.tian@intel.com> Acked-by: NMarc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221124232326.444560717@linutronix.de
-
由 Thomas Gleixner 提交于
The upcoming support for PCI/IMS requires to store some information related to the message handling in the MSI descriptor, e.g. PASID or a pointer to a queue. Provide a generic storage struct which maps over the existing PCI specific storage which means the size of struct msi_desc is not getting bigger. This storage struct has two elements: 1) msi_domain_cookie 2) msi_instance_cookie The domain cookie is going to be used to store domain specific information, e.g. iobase pointer, data pointer. The instance cookie is going to be handed in when allocating an interrupt on an IMS domain so the irq chip callbacks of the IMS domain have the necessary per vector information available. It also comes in handy when cleaning up the platform MSI code for wire to MSI bridges which need to hand down the type information to the underlying interrupt domain. For the core code the cookies are opaque and meaningless. It just stores the instance cookie during an allocation through the upcoming interfaces for IMS and wire to MSI brigdes. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NKevin Tian <kevin.tian@intel.com> Acked-by: NMarc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221124232326.385036043@linutronix.de
-
由 Thomas Gleixner 提交于
A simple struct to hold a MSI index / Linux interrupt number pair. It will be returned from the dynamic vector allocation function and handed back to the corresponding free() function. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NKevin Tian <kevin.tian@intel.com> Acked-by: NMarc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221124232326.326410494@linutronix.de
-
由 Thomas Gleixner 提交于
Remove the global PCI/MSI irqdomain implementation and provide the required MSI parent ops so the PCI/MSI code can detect the new parent and setup per device domains. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NKevin Tian <kevin.tian@intel.com> Acked-by: NMarc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221124232326.209212272@linutronix.de
-
由 Thomas Gleixner 提交于
Remove the global PCI/MSI irqdomain implementation and provide the required MSI parent ops so the PCI/MSI code can detect the new parent and setup per device domains. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NKevin Tian <kevin.tian@intel.com> Acked-by: NMarc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221124232326.151226317@linutronix.de
-