- 17 Jan 2020, 40 commits
-
Committed by Frank Sorenson

commit 86bbd7422ae6a33735df6846fd685e46686da714 upstream.

The filehandle has a length which is defined as a 32-bit "unsigned integer". Change the sign of the length accordingly.

Signed-off-by: Frank Sorenson <sorenson@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
-
Committed by ZhangXiaoxu

commit ded52fbe7020a5696b0b0a0fdbf234e37bf16f94 upstream.

After setxattr, NFSv3 caches the ACL set by the user. But at the backend, the shared filesystem (e.g. ext4) checks the ACL; if it can be merged with the mode, no ACL is added to the file. The cached NFSv3 ACL is therefore redundant. Don't call set_cached_acl() on setxattr.

Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
-
Committed by Amir Goldstein

commit c553ea4fdf2701d64b9e9cca4497a8a2512bb025 upstream.

23d01270 ("fs/sync.c: make sync_file_range(2) use WB_SYNC_NONE writeback") claims that the sync_file_range(2) syscall was "created for userspace to be able to issue background writeout and so waiting for in-flight IO is undesirable there" and changes the writeback (back) to WB_SYNC_NONE.

This claim is only partially true. It is true for users that use the flag SYNC_FILE_RANGE_WRITE by itself, as does PostgreSQL, the user that was the reason for changing to WB_SYNC_NONE writeback. However, the claim is not true for users that use the flag combination SYNC_FILE_RANGE_{WAIT_BEFORE|WRITE|WAIT_AFTER}. Those users explicitly requested to wait for in-flight IO as well as for writeback of dirty pages.

Re-brand that flag combination as SYNC_FILE_RANGE_WRITE_AND_WAIT and use WB_SYNC_ALL writeback to perform the full range sync request.

Link: http://lkml.kernel.org/r/20190409114922.30095-1-amir73il@gmail.com
Link: http://lkml.kernel.org/r/20190419072938.31320-1-amir73il@gmail.com
Fixes: 23d01270 ("fs/sync.c: make sync_file_range(2) use WB_SYNC_NONE")
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Acked-by: Jan Kara <jack@suse.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
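A minimal userspace sketch of the two behaviors described above (fd, off and len are placeholders):

    #define _GNU_SOURCE
    #include <fcntl.h>

    static int flush_range(int fd, off64_t off, off64_t len, int wait)
    {
            /* write-only: background writeout, serviced as WB_SYNC_NONE */
            if (!wait)
                    return sync_file_range(fd, off, len, SYNC_FILE_RANGE_WRITE);

            /* full write-and-wait combination, now serviced as WB_SYNC_ALL */
            return sync_file_range(fd, off, len,
                                   SYNC_FILE_RANGE_WAIT_BEFORE |
                                   SYNC_FILE_RANGE_WRITE |
                                   SYNC_FILE_RANGE_WAIT_AFTER);
    }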
-
Committed by Greg Kroah-Hartman

commit de96e9fea7ba56042f105b6fe163447b280eb800 upstream.

It's rude to crash the system just because the developer did something wrong, as it usually prevents them from even seeing what went wrong. So convert the few BUG_ON() calls that have snuck into the sysfs code over the years to WARN_ON() to make it more "friendly". All of these conditions can be recovered from, so it makes no sense to crash.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
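The recoverable pattern this describes, as a hedged sketch (the null-kobject check is an illustrative condition, not a quote from the sysfs diff):

    /* before: a developer mistake takes down the whole machine */
    BUG_ON(!kobj);

    /* after: log a stack trace, then fail the call and keep running */
    if (WARN_ON(!kobj))
            return -EINVAL;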
-
Committed by zhangyi (F)

commit 9ba55543fc0c6bb1cf8edd63be8802d9ab7e1202 upstream.

If the user specifies a large enough value for the "commit=" option, it may trigger signed integer overflow, which may cause sbi->s_commit_interval to become a large or small value, zero in particular.

UBSAN: Undefined behaviour in ../fs/ext4/super.c:1592:31
signed integer overflow:
536870912 * 1000 cannot be represented in type 'int'
[...]
Call trace:
[...]
[<ffffff9008a2d120>] ubsan_epilogue+0x34/0x9c lib/ubsan.c:166
[<ffffff9008a2d8b8>] handle_overflow+0x228/0x280 lib/ubsan.c:197
[<ffffff9008a2d95c>] __ubsan_handle_mul_overflow+0x4c/0x68 lib/ubsan.c:218
[<ffffff90086d070c>] handle_mount_opt fs/ext4/super.c:1592 [inline]
[<ffffff90086d070c>] parse_options+0x1724/0x1a40 fs/ext4/super.c:1773
[<ffffff90086d51c4>] ext4_remount+0x2ec/0x14a0 fs/ext4/super.c:4834
[...]

Although it is not a big deal, still silence the UBSAN by limiting the input value.

Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
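A hedged sketch of the kind of guard this implies; the bound and variable names are assumptions, the real fix limits the option value in handle_mount_opt() before the multiply:

    /* commit= is multiplied by HZ; reject values that would overflow int */
    if (arg > INT_MAX / HZ)
            return -1;
    sbi->s_commit_interval = HZ * arg;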
-
Committed by Khazhismel Kumykov

commit 4b99faa23c51ca31312b9afb876e8e5878daeb80 upstream.

Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
-
Committed by zhangyi (F)

commit 597599268e3b91cac71faf48743f4783dec682fc upstream.

We do not unmap and clear the dirty flag when forgetting a buffer without a journal, or one that does not belong to any transaction, so the invalid dirty data may still be written to the disk later. It's fine if the corresponding block is never used before the next mount, and it's also fine that we invoke the clean_bdev_aliases() related functions to unmap the block device mapping when re-allocating such a freed block as a data block. But this logic is somewhat fragile and risky, and may lead to data corruption if we forget to clean bdev aliases. So it's better to discard dirty data at forget time.

We have already handled all the cases of forgetting a journalled buffer; this patch deals with the remaining two cases:
- the buffer is not journalled yet,
- the buffer is journalled but doesn't belong to any transaction.

We invoke __bforget() instead of __brelse() when forgetting an un-journalled buffer in jbd2_journal_forget(). After this patch we can remove all clean_bdev_aliases() related calls in ext4.

Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
-
Committed by Nikolay Borisov

commit 82dd124c40b8cda710878b88fb0182301c040ffe upstream.

There is a function which clearly conveys the objective of checking i_writecount. Additionally, the usage in ext4_mb_initialize_context was wrong, since an inode would have wrongfully been reported as writable if i_writecount had a negative value (MMAP_DENY_WRITE).

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
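The helper in question is presumably inode_is_open_for_write() from include/linux/fs.h, whose greater-than-zero comparison also handles the negative-count case correctly:

    static inline bool inode_is_open_for_write(const struct inode *inode)
    {
            return atomic_read(&inode->i_writecount) > 0;
    }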
-
Committed by Ming Lei

commit 594b9a89af8e7629e95a4cd844d188361be32790 upstream.

mp_bvec_for_each_segment() is a bit heavyweight for this iteration, so introduce a light-weight helper for iterating over pages; this saves 32 bytes of stack space.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
-
Committed by Joseph Qi

Cherry-picked from upstream commit 4d633062c1c0794a6b3836b7b55afba4599736e8.

Single-page bvecs can often be seen in small block-size workloads, so introduce bvec_nth_page() to avoid calling nth_page() unnecessarily, which is not cheap.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
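The helper is a thin shortcut around nth_page(); a sketch matching the upstream definition in include/linux/bvec.h:

    static inline struct page *bvec_nth_page(struct page *page, int idx)
    {
            /* idx == 0 is the common single-page case: skip nth_page() */
            return idx == 0 ? page : nth_page(page, idx);
    }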
-
Committed by Christoph Hellwig

commit 81214bab582eeda068e7904d57b6a3095e8f3855 upstream.

Store the request queue the last bio was submitted to in the iocb private data, in addition to the cookie, so that we find the right block device. Also refactor the common direct I/O bio submission code into a nice little helper.

Signed-off-by: Christoph Hellwig <hch@lst.de>
[Modified to use bio_set_polled().]
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
-
Committed by Jens Axboe

commit 0bbb280d7b767e7c86a5adfc87c76a6f09ab0423 upstream.

For the upcoming async polled IO, we can't sleep allocating requests. If we do, we introduce a deadlock where the submitter already has async polled IO in flight, but can't wait for it to complete, since polled requests must be actively found and reaped. Utilize the helper in the blockdev DIRECT_IO code.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
-
Committed by Christoph Hellwig

commit eae83ce10b4713d9f4f3419af16436f89c1a7172 upstream.

Just call blk_poll on the iocb cookie; we can derive the block device from the inode trivially.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
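A sketch of what that ->iopoll handler for block devices looks like, close to the upstream fs/block_dev.c code but reproduced here from memory:

    static int blkdev_iopoll(struct kiocb *kiocb, bool wait)
    {
            /* the inode behind the file is the block device itself */
            struct block_device *bdev = I_BDEV(kiocb->ki_filp->f_mapping->host);
            struct request_queue *q = bdev_get_queue(bdev);

            return blk_poll(q, READ_ONCE(kiocb->ki_cookie), wait);
    }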
-
Committed by Christoph Hellwig

commit fb7e160019f4abb4082740bfeb27a38f6389c745 upstream.

This new method is used to explicitly poll for I/O completion for an iocb. It must be called for any iocb submitted asynchronously (that is, with a non-NULL ki_complete) which has the IOCB_HIPRI flag set. The method is assisted by a new ki_cookie field in struct kiocb to store the polling cookie.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
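A sketch of the two additions (field and method) as they land in include/linux/fs.h; surrounding members are elided:

    struct kiocb {
            /* ... existing fields ... */
            unsigned int            ki_cookie;      /* cookie for ->iopoll() */
    };

    struct file_operations {
            /* ... existing methods ... */
            int (*iopoll)(struct kiocb *kiocb, bool spin);
    };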
-
Committed by Christoph Hellwig

commit d04c406f29d9f4dbcb5eb5aa79ce0445c7e9d652 upstream.

This prevents a HIPRI bio from being submitted through a stacking driver that does not support polling and thus won't poll for I/O completion.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
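A hedged sketch of the check in the bio submission path (the exact call site is an assumption):

    /* a queue that cannot poll must never see a polled bio */
    if (!test_bit(QUEUE_FLAG_POLL, &q->queue_flags))
            bio->bi_opf &= ~REQ_HIPRI;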
-
Committed by Deepa Dinamani

commit 854a6ed56839a40f6b5d02a2962f48841482eec4 upstream.

Refactor the logic to restore the sigmask before the syscall returns into an API. This is useful for versions of syscalls that pass in the sigmask and expect current->sigmask to be changed during the execution and restored after the execution of the syscall.

With the advent of new y2038 syscalls in the subsequent patches, we add two more new versions of the syscalls (for pselect, ppoll and io_pgetevents) in addition to the existing native and compat versions. Adding such an API reduces the logic that would otherwise need to be replicated.

Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
-
Committed by Deepa Dinamani

commit ded653ccbec0335a78fa7a7aff3ec9870349fafb upstream.

Refactor reading a sigset from userspace and updating the sigmask into an API. This is useful for versions of syscalls that pass in the sigmask and expect current->sigmask to be changed during, and restored after, the execution of the syscall.

With the advent of new y2038 syscalls in the subsequent patches, we add two more new versions of the syscalls (for pselect, ppoll, and io_pgetevents) in addition to the existing native and compat versions. Adding such an API reduces the logic that would otherwise need to be replicated.

Note that the calls to sigprocmask() ignored the return value from the API, as the function only returns an error on an invalid first argument that is hardcoded at these call sites. The updated logic uses set_current_blocked() instead.

Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
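Taken together with the restore helper from the previous entry, the pair is used roughly like this inside a sigmask-taking syscall; a sketch with error handling abbreviated:

    sigset_t ksigmask, sigsaved;
    int ret;

    /* copy the user sigset in and install it as the temporary mask */
    ret = set_user_sigmask(sigmask, &ksigmask, &sigsaved, sigsetsize);
    if (ret)
            return ret;

    /* ... body of pselect/ppoll/io_pgetevents runs here ... */

    /* put the original mask back before returning to userspace */
    restore_user_sigmask(sigmask, &sigsaved);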
-
Committed by Christoph Hellwig

commit 529262d56dbebe6a26df5d2fd24cc0e1bc8579e5 upstream.

This was intended to support users like nvme multipath, but it is just getting in the way and adding another indirect call.

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
-
Committed by Jens Axboe

commit 0a1b8b87d064a47fad9ec475316002da28559207 upstream.

blk_poll() has always kept spinning until it found an IO. This is fine for SYNC polling, since we need to find the one request we have pending, but in preparation for ASYNC polling it can be beneficial to just check whether we have any entries available or not. Existing callers are converted to pass in 'spin == true' to retain the old behavior.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
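With the new parameter, the two modes look like this at a call site; q and cookie stand for the target queue and the submission cookie:

    /* sync polling: spin until the pending request is found */
    found = blk_poll(q, cookie, true);

    /* async polling: peek once and return, even if nothing completed yet */
    found = blk_poll(q, cookie, false);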
-
Committed by Jens Axboe

commit 1052b8ac5282daf35df331edcbdb645839d17e6a upstream.

If we want to support async IO polling, then we have to allow finding completions that aren't just for the one we are looking for. Always pass in -1 to the mq_ops->poll() helper, and have it return how many events were found in this poll loop.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
-
Committed by Damien Le Moal

commit 64845a1ddd655574886eb48e9a5eaeeb9b05bf0d upstream.

Define get_current_ioprio() as an inline helper to obtain the caller's I/O priority from its task I/O context. Use this helper in blk_init_request_from_bio() to set a request's ioprio.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
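The helper amounts to the following, matching the upstream definition in include/linux/ioprio.h:

    static inline int get_current_ioprio(void)
    {
            struct io_context *ioc = current->io_context;

            if (ioc)
                    return ioc->ioprio;
            /* no io_context: fall back to the default (no class, level 0) */
            return IOPRIO_PRIO_VALUE(IOPRIO_CLASS_NONE, 0);
    }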
-
Committed by Jens Axboe

commit 85f4d4b65fdd67f1d6dc9eeb1d91923cef07eb6a upstream.

We currently only really support sync poll, i.e. poll with one IO in flight. This prepares us for supporting async poll.

Note that the returned value isn't necessarily 100% accurate. If poll races with IRQ completion, we assume that the fact that the task is now runnable means we found at least one entry. In reality it could be more than one, or not even one. This is fine; the caller will just need to take this into account.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
-
Committed by Jens Axboe

commit d34513d384487e8022f143a3a6b791e6d7f0dad6 upstream.

Inherit the iocb IOCB_HIPRI flag, and pass on REQ_HIPRI for those kinds of requests.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
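A hedged one-line sketch of the flag propagation in the direct-I/O submission path (the exact call site is an assumption):

    /* a polled iocb produces polled bios */
    if (iocb->ki_flags & IOCB_HIPRI)
            bio->bi_opf |= REQ_HIPRI;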
-
Committed by Jens Axboe

commit d1e36282b0bbd5de6a9c4d5275e94ef3b3438f48 upstream.

We use IOCB_HIPRI to poll for IO in the caller instead of scheduling. This information is not available for (or after) IO submission. The driver may make different queue choices based on the type of IO, so make the fact that we will poll for this IO known to the lower layers as well.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
-
Committed by David Howells

commit aa563d7bca6e882ec2bdae24603c8f016401a144 upstream.

In the iov_iter struct, separate the iterator type from the iterator direction and use accessor functions to access them in most places.

Convert a bunch of places to use switch statements to access them rather than chains of bitwise-AND statements. This makes it easier to add further iterator types. It can also be more efficient: to implement a switch over small contiguous integers, the compiler can use roughly 50% fewer compare instructions than it needs for bitwise-AND instructions.

Further, cease passing the iterator type into the iterator setup function. The iterator function can set that itself. Only the direction is required.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Reviewed-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
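A sketch of the resulting style using the accessors this series introduces; describe_iter() is an invented illustration, while iov_iter_type() and iov_iter_rw() are the real accessors:

    static const char *describe_iter(const struct iov_iter *i)
    {
            /* type and direction are now queried separately */
            switch (iov_iter_type(i)) {
            case ITER_IOVEC:
                    return "user iovec";
            case ITER_KVEC:
                    return "kernel kvec";
            case ITER_BVEC:
                    return "bio_vec array";
            default:
                    return iov_iter_rw(i) == WRITE ? "other (write)"
                                                   : "other (read)";
            }
    }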
-
Committed by David Howells

commit 00e23707442a75b404392cef1405ab4fd498de6b upstream.

Use accessor functions to access an iterator's type and direction. This allows for the possibility of using some other method of determining the type of iterator than if-chains with bitwise-AND conditions.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Reviewed-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
-
Committed by Aristeu Rozanski

Cherry-picked from linux-next commit 854bb48018d5da261d438b2232fa683bdb553979.

Both the skx_edac and i10nm_edac drivers are loaded based on the matching CPU being available, which leads to the modules being automatically loaded in virtual machines as well. That will fail due to the missing PCI devices. In both drivers the first function to make use of the PCI devices, skx_get_hi_lo(), will simply print "EDAC skx: Can't get tolm/tohm" for each CPU core, which is noisy. This patch makes it a debug message.

Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20191204212325.c4k47p5hrnn3vpb5@redhat.com
Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
-
Committed by Joseph Qi

This fixes the build error when CONFIG_CGROUP_CPUACCT is not enabled:

kernel/sched/psi.c: In function 'iterate_groups':
kernel/sched/psi.c:732:31: error: 'cpuacct_cgrp_id' undeclared (first use in this function); did you mean 'cpuacct_charge'?

Reported-by: kbuild test robot <lkp@intel.com>
Fixes: 1f49a738 ("alinux: psi: Support PSI under cgroup v1")
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Xunlei Pang <xlpang@linux.alibaba.com>
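The usual shape of such a fix is a config guard around the cgroup-id reference; a hedged sketch in which the variable names are illustrative, not the actual psi.c hunk:

    #ifdef CONFIG_CGROUP_CPUACCT
            cgroup = task_cgroup(task, cpuacct_cgrp_id);
    #else
            cgroup = NULL;  /* cpuacct is not built in; skip its hierarchy */
    #endif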
-
Committed by Joseph Qi

This fixes the following format build warning:

block/blk-iocost.c: In function 'ioc_stat_prfill':
block/blk-iocost.c:2506:17: warning: format '%llu' expects argument of type 'long long unsigned int', but argument 9 has type 'long int' [-Wformat=]

Reported-by: kbuild test robot <lkp@intel.com>
Fixes: 0670363c ("alinux: iocost: add ioc_gq stat")
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
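The standard remedy for this class of warning is either a cast or a matching length modifier; a hedged sketch, with stat_val an invented stand-in for the offending argument:

    /* option 1: cast the argument to what '%llu' expects */
    seq_printf(sf, "%llu", (unsigned long long)stat_val);

    /* option 2: use the modifier that matches the argument's own type */
    seq_printf(sf, "%lu", stat_val);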
-
Committed by kbuild test robot

Fixes: 8b371635 ("alinux: mm: memcontrol: support background async page reclaim")
Signed-off-by: kbuild test robot <lkp@intel.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Xunlei Pang <xlpang@linux.alibaba.com>
Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
-
Committed by Cambda Zhu

commit 853697504de043ff0bfd815bd3a64de1dce73dc7 upstream.

From commit 50895b9d ("tcp: highest_sack fix"), the logic of setting tp->highest_sack to the head of the send queue was removed. Of course the logic is error prone, but it is logical. Before we remove the pointer to the highest sack skb and use the seq instead, we need to set tp->highest_sack to NULL when there is no skb after the last sack, and then replace NULL with the real skb when a new skb is inserted into the rtx queue, because NULL means the highest sack seq is tp->snd_nxt. If tp->highest_sack is NULL and new data is sent, the next ACK with a sack option will increase tp->reordering unexpectedly.

This patch sets tp->highest_sack to the tail of the rtx queue if it's NULL and new data is sent. The patch keeps the rule that highest_sack can only be maintained by sack processing, except for this one case.

Fixes: 50895b9d ("tcp: highest_sack fix")
Signed-off-by: Cambda Zhu <cambda@linux.alibaba.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Dust Li <dust.li@linux.alibaba.com>
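A hedged sketch of the stated rule, placed where new data is queued for transmit; the exact hook is an assumption, while tcp_rtx_queue_tail() is the existing accessor for the rtx-queue tail:

    /* new data sent while no SACKed skb is tracked: the highest sack
     * has a concrete skb again, the tail of the rtx queue */
    if (!tp->highest_sack)
            tp->highest_sack = tcp_rtx_queue_tail(sk);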
-
Committed by Joerg Roedel

commit 46ac18c347b00be29b265c28209b0f3c38a1f142 upstream.

The increase_address_space() function has to check the PM_LEVEL_SIZE() condition again under the domain->lock to avoid a false trigger of the WARN_ON_ONCE() and to avoid increasing the address space more often than necessary.

Reported-by: Qian Cai <cai@lca.pw>
Fixes: 754265bcab78 ("iommu/amd: Fix race in increase_address_space()")
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: WANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
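This is the classic re-check-under-lock pattern; a hedged sketch of the shape of the fix, with the condition taken from the description and the surrounding code assumed:

    spin_lock_irqsave(&domain->lock, flags);

    /* another CPU may have grown the page table while we waited for
     * the lock; re-check before growing it again */
    if (address <= PM_LEVEL_SIZE(domain->mode))
            goto out;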
-
Committed by Filippo Sironi

commit 0b15e02f0cc4fb34a9160de7ba6db3a4013dc1b7 upstream.

To make sure the domain TLB flush completes before the function returns, explicitly wait for its completion.

Signed-off-by: Filippo Sironi <sironi@amazon.de>
Fixes: 42a49f96 ("amd-iommu: flush domain tlb when attaching a new device")
[joro: Added commit message and fixes tag]
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: WANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
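In the amd_iommu driver this pairing looks roughly as follows; a sketch, though both helpers exist in drivers/iommu/amd_iommu.c:

    /* queue the flush for every IOMMU serving this domain ... */
    domain_flush_tlb_pde(domain);
    /* ... and wait until the hardware has actually executed it */
    domain_flush_complete(domain);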
-
Committed by Qian Cai

commit 8cf66504210d308a35cca35fe9c310b1241f9fa7 upstream.

The commit b3aa14f02254 ("iommu: remove the mapping_error dma_map_ops method") incorrectly changed the check on the return of dma_ops_alloc_iova() in map_sg(), causing a crash under memory pressure: dma_ops_alloc_iova() never returns DMA_MAPPING_ERROR on failure, but 0, so the error handling is all wrong.

kernel BUG at drivers/iommu/iova.c:801!
Workqueue: kblockd blk_mq_run_work_fn
RIP: 0010:iova_magazine_free_pfns+0x7d/0xc0
Call Trace:
 free_cpu_cached_iovas+0xbd/0x150
 alloc_iova_fast+0x8c/0xba
 dma_ops_alloc_iova.isra.6+0x65/0xa0
 map_sg+0x8c/0x2a0
 scsi_dma_map+0xc6/0x160
 pqi_aio_submit_io+0x1f6/0x440 [smartpqi]
 pqi_scsi_queue_command+0x90c/0xdd0 [smartpqi]
 scsi_queue_rq+0x79c/0x1200
 blk_mq_dispatch_rq_list+0x4dc/0xb70
 blk_mq_sched_dispatch_requests+0x249/0x310
 __blk_mq_run_hw_queue+0x128/0x200
 blk_mq_run_work_fn+0x27/0x30
 process_one_work+0x522/0xa10
 worker_thread+0x63/0x5b0
 kthread+0x1d2/0x1f0
 ret_from_fork+0x22/0x40

Fixes: b3aa14f02254 ("iommu: remove the mapping_error dma_map_ops method")
Signed-off-by: Qian Cai <cai@lca.pw>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: WANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
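The corrected check therefore has to test for 0, not for the sentinel; a hedged sketch with the argument list approximated:

    address = dma_ops_alloc_iova(dev, dma_dom, npages, dma_mask);
    if (!address)   /* the IOVA allocator reports failure with 0 */
            goto out_err;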
-
Committed by Christoph Hellwig

commit b3aa14f022543ed86823c97c145495b747102fa9 upstream.

Return DMA_MAPPING_ERROR instead of 0 on a DMA mapping failure and let the core dma-mapping code handle the rest. Note that the existing code used AMD_IOMMU_MAPPING_ERROR to check for a 0 return from the IOVA allocator, which is replaced with an explicit 0, as in the implementation and other users of that interface.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: WANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
-
Committed by Geert Uytterhoeven

commit 18b3af4492a0aa6046b86d712f6ba4cbb66100fb upstream.

A change made in the final version of IOMMU debugfs support replaced the public function iommu_debugfs_new_driver_dir() with the public dentry iommu_debugfs_dir in <linux/iommu.h>, but forgot to update both the implementation in iommu-debugfs.c and the patch description. Fix this by exporting iommu_debugfs_dir and removing the reference to, and implementation of, iommu_debugfs_new_driver_dir().

Fixes: bad614b2 ("iommu: Enable debugfs exposure of IOMMU driver internals")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: WANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
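With the dentry exported, a driver creates its own subdirectory directly; a hypothetical usage sketch in which "mydriver" is an invented name:

    /* hang a driver-private directory off /sys/kernel/debug/iommu */
    struct dentry *dir = debugfs_create_dir("mydriver", iommu_debugfs_dir);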
-
Committed by Christoph Hellwig

commit 42ee3cae0ed38b6c04038bf851ea2496da2135bb upstream.

Error handling of the dma_map_single and dma_map_page APIs is a little problematic at the moment, in that we use different encodings in the returned dma_addr_t to indicate an error. That means we require an additional indirect call to figure out whether a dma mapping call returned an error, and a lot of boilerplate code to implement these semantics.

Instead, return the maximum addressable value as the error. As long as we don't allow mapping single-byte ranges with single-byte alignment, this value can never be a valid return. Additionally, if drivers do not check the return value from the dma_map* routines, this value means they will generally not be pointed to actual memory.

Once the default value is added here, we can start removing the various mapping_error methods and just rely on this generic check.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: WANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
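The sentinel and the generic check it enables look like this; the define matches upstream, and the inline checker is the eventual generic form, shown here as a sketch:

    #define DMA_MAPPING_ERROR       (~(dma_addr_t)0)

    static inline int dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
    {
            /* no indirect ->mapping_error call needed any more */
            return dma_addr == DMA_MAPPING_ERROR;
    }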
-
Committed by Yazen Ghannam

commit fc00c6a416381010c4a721a4142ddd0260d68f20 upstream.

AMD systems may support chip select interleaving. However, on family 17h+ this was not taken into account when printing the chip select sizes. Add support to detect whether chip selects are interleaved on family 17h+, and adjust the sizes accordingly.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Kim Phillips <kim.phillips@amd.com>
Cc: James Morse <james.morse@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: https://lkml.kernel.org/r/20190228153558.127292-6-Yazen.Ghannam@amd.com
Signed-off-by: WANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
-
Committed by Yazen Ghannam

commit 0a227af521d6df5286550b62f4b591417170b4ea upstream.

The struct chip_select array that's used for saving chip select bases and masks is fixed at a length of two. There should be one struct chip_select for each controller, so this array should be increased to support systems that may have more than two controllers.

Increase the size of the struct chip_select array to eight, which is the largest number of controllers per die currently supported on AMD systems. Also, carve out the family 17h+ reading of the bases/masks into a separate function. This effectively reverts the original bases/masks reading code to before family 17h support was added.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Kim Phillips <kim.phillips@amd.com>
Cc: James Morse <james.morse@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: https://lkml.kernel.org/r/20190228153558.127292-5-Yazen.Ghannam@amd.com
Signed-off-by: WANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
-
Committed by Yazen Ghannam

commit 7835961d377b75ab9ae77f715e378fcb72508306 upstream.

Future AMD systems may support x16 symbol sizes. Recognize if a system is using x16 symbol size. Also, simplify the print statement.

Note that an x16 syndrome vector table is not necessary as it is with x4 or x8 syndromes. This is because systems that support x16 symbol sizes are SMCA systems, and in that case the syndrome can be directly extracted from the MCA_SYND[Syndrome] field.

[ bp: massage. ]

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Kim Phillips <kim.phillips@amd.com>
Cc: James Morse <james.morse@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: https://lkml.kernel.org/r/20190228153558.127292-4-Yazen.Ghannam@amd.com
Signed-off-by: WANG Siyuan <Siyuan.Wang@amd.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
-