- 13 8月, 2018 1 次提交
-
-
由 Linus Torvalds 提交于
This is purely a preparatory patch for upcoming changes during the 4.19 merge window. We have a function called "boot_cpu_state_init()" that isn't really about the bootup cpu state: that is done much earlier by the similarly named "boot_cpu_init()" (note lack of "state" in name). This function initializes some hotplug CPU state, and needs to run after the percpu data has been properly initialized. It even has a comment to that effect. Except it _doesn't_ actually run after the percpu data has been properly initialized. On x86 it happens to do that, but on at least arm and arm64, the percpu base pointers are initialized by the arch-specific 'smp_prepare_boot_cpu()' hook, which ran _after_ boot_cpu_state_init(). This had some unexpected results, and in particular we have a patch pending for the merge window that did the obvious cleanup of using 'this_cpu_write()' in the cpu hotplug init code: - per_cpu_ptr(&cpuhp_state, smp_processor_id())->state = CPUHP_ONLINE; + this_cpu_write(cpuhp_state.state, CPUHP_ONLINE); which is obviously the right thing to do. Except because of the ordering issue, it actually failed miserably and unexpectedly on arm64. So this just fixes the ordering, and changes the name of the function to be 'boot_cpu_hotplug_init()' to make it obvious that it's about cpu hotplug state, because the core CPU state was supposed to have already been done earlier. Marked for stable, since the (not yet merged) patch that will show this problem is marked for stable. Reported-by: NVlastimil Babka <vbabka@suse.cz> Reported-by: NMian Yousaf Kaukab <yousaf.kaukab@suse.com> Suggested-by: NCatalin Marinas <catalin.marinas@arm.com> Acked-by: NThomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: stable@kernel.org Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 12 8月, 2018 1 次提交
-
-
由 Bart Van Assche 提交于
For legacy queues the only call of blkg_root_lookup() happens after bypass mode has been enabled. Since blkg_lookup() returns NULL for queues in bypass mode, modify the blkg_root_lookup() such that it no longer depends on bypass mode. Rename the function into blk_queue_root_blkg() as suggested by Tejun. Suggested-by: NTejun Heo <tj@kernel.org> Fixes: 6bad9b21 ("blkcg: Introduce blkg_root_lookup()") Signed-off-by: NBart Van Assche <bart.vanassche@wdc.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 09 8月, 2018 3 次提交
-
-
由 Bart Van Assche 提交于
This new function will be used in a later patch to verify whether a queue has been dissociated from the cgroup controller before being released. Signed-off-by: NBart Van Assche <bart.vanassche@wdc.com> Cc: Tejun Heo <tj@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: Omar Sandoval <osandov@fb.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: Alexandru Moise <00moses.alexander00@gmail.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: <stable@vger.kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Bart Van Assche 提交于
Commit 12f5b931 ("blk-mq: Remove generation seqeunce") removed the only seqcount_t and u64_stats_sync instances from <linux/blkdev.h> but did not remove the corresponding #include directives. Since these include directives are no longer needed, remove them. Signed-off-by: NBart Van Assche <bart.vanassche@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Keith Busch <keith.busch@intel.com> Cc: Ming Lei <ming.lei@redhat.com> Cc: Jianchao Wang <jianchao.w.wang@oracle.com> Cc: Hannes Reinecke <hare@suse.com>, Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Crestez Dan Leonard 提交于
The regmap API usually assumes that bulk read operations will read a range of registers but some I2C/SPI devices have certain registers for which a such a read operation will return data from an internal FIFO instead. Add an explicit API to support bulk read without range semantics. Some linux drivers use regmap_bulk_read or regmap_raw_read for such registers, for example mpu6050 or bmi150 from IIO. This only happens to work because when caching is disabled a single regmap read op will map to a single bus read op (as desired). This breaks if caching is enabled and reg+1 happens to be a cacheable register. Without regmap support refactoring a driver to enable regmap caching requires separate I2C and SPI paths. This is exactly what regmap is supposed to help avoid. Suggested-by: NJonathan Cameron <jic23@kernel.org> Signed-off-by: NCrestez Dan Leonard <leonard.crestez@intel.com> Signed-off-by: NStefan Popa <stefan.popa@analog.com> Signed-off-by: NMark Brown <broonie@kernel.org>
-
- 08 8月, 2018 2 次提交
-
-
由 Chaitanya Kulkarni 提交于
Add various definitions from NVMe 1.3 TP 4005. Signed-off-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Hannes Reinecke 提交于
ANA Phase 3 draft had the 'reserved' field in the group descriptor format set to '23:17' (so that the first namespace identifier started at byte 24), but that got move with the approved TP to '31:17' (so that the first namespace identifier started at byte 32). Signed-off-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 07 8月, 2018 1 次提交
-
-
由 Thomas Gleixner 提交于
Josh reported that the late SMT evaluation in cpu_smt_state_init() sets cpu_smt_control to CPU_SMT_NOT_SUPPORTED in case that 'nosmt' was supplied on the kernel command line as it cannot differentiate between SMT disabled by BIOS and SMT soft disable via 'nosmt'. That wreckages the state and makes the sysfs interface unusable. Rework this so that during bringup of the non boot CPUs the availability of SMT is determined in cpu_smt_allowed(). If a newly booted CPU is not a 'primary' thread then set the local cpu_smt_available marker and evaluate this explicitely right after the initial SMP bringup has finished. SMT evaulation on x86 is a trainwreck as the firmware has all the information _before_ booting the kernel, but there is no interface to query it. Fixes: 73d5e2b4 ("cpu/hotplug: detect SMT disabled by BIOS") Reported-by: NJosh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 06 8月, 2018 1 次提交
-
-
由 Pingfan Liu 提交于
At present, "systemctl suspend" and "shutdown" can run in parrallel. A system can suspend after devices_shutdown(), and resume. Then the shutdown task goes on to power off. This causes many devices are not really shut off. Hence replacing reboot_mutex with system_transition_mutex (renamed from pm_mutex) to achieve the exclusion. The renaming of pm_mutex as system_transition_mutex can be better to reflect the purpose of the mutex. Signed-off-by: NPingfan Liu <kernelfans@gmail.com> Acked-by: NPavel Machek <pavel@ucw.cz> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 04 8月, 2018 2 次提交
-
-
由 Al Viro 提交于
open-coded in a quite a few places... Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> -
由 Al Viro 提交于
We don't want open-by-handle picking half-set-up in-core struct inode from e.g. mkdir() having failed halfway through. In other words, we don't want such inodes returned by iget_locked() on their way to extinction. However, we can't just have them unhashed - otherwise open-by-handle immediately *after* that would've ended up creating a new in-core inode over the on-disk one that is in process of being freed right under us. Solution: new flag (I_CREATING) set by insert_inode_locked() and removed by unlock_new_inode() and a new primitive (discard_new_inode()) to be used by such halfway-through-setup failure exits instead of unlock_new_inode() / iput() combinations. That primitive unlocks new inode, but leaves I_CREATING in place. iget_locked() treats finding an I_CREATING inode as failure (-ESTALE, once we sort out the error propagation). insert_inode_locked() treats the same as instant -EBUSY. ilookup() treats those as icache miss. [Fix by Dan Carpenter <dan.carpenter@oracle.com> folded in] Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 03 8月, 2018 2 次提交
-
-
由 Boris Brezillon 提交于
There is no reason to make spi_mem->name modifiable. Moreover, spi_mem_ops->get_name() returns a const char *, which generates a gcc warning when assigning the value returned by spi_mem_ops->get_name() to spi_mem->name. Fixes: 5d27a9c8 ("spi: spi-mem: Extend the SPI mem interface to set a custom memory name") Reported-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NBoris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: NMark Brown <broonie@kernel.org>
-
由 Kees Cook 提交于
There is a lot of needless struct request_sense usage in the CDROM code. These can all be struct scsi_sense_hdr instead, to avoid any confusion over their respective structure sizes. This patch is a lot of noise changing "sense" to "sshdr", but the final code is more readable to distinguish between "sense" meaning "struct request_sense" and "sshdr" meaning "struct scsi_sense_hdr". Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NKees Cook <keescook@chromium.org> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 02 8月, 2018 4 次提交
-
-
由 Frieder Schrempf 提交于
When porting (Q)SPI controller drivers from the MTD layer to the SPI layer, the naming scheme for the memory devices changes. To be able to keep compatibility with the old drivers naming scheme, a name field is added to struct spi_mem and a hook is added to let controller drivers set a custom name for the memory device. Example for the FSL QSPI driver: Name with the old driver: 21e0000.qspi, or with multiple devices: 21e0000.qspi-0, 21e0000.qspi-1, ... Name with the new driver without spi_mem_get_name: spi4.0 Suggested-by: NBoris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: NFrieder Schrempf <frieder.schrempf@exceet.de> Reviewed-by: NBoris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: NMark Brown <broonie@kernel.org>
-
由 Frieder Schrempf 提交于
Fix a typo in the @drvpriv description. Signed-off-by: NFrieder Schrempf <frieder.schrempf@exceet.de> Acked-by: NBoris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: NMark Brown <broonie@kernel.org>
-
由 Al Viro 提交于
The only user is fuse_create_new_entry(), and there it's used to mitigate the same mkdir/open-by-handle race as in nfs_mkdir(). The same solution applies - unhash the mkdir argument, then call d_splice_alias() and if that returns a reference to preexisting alias, dput() and report success. ->mkdir() argument left unhashed negative with the preexisting alias moved in the right place is just fine from the ->mkdir() callers point of view. Cc: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> -
由 Linus Torvalds 提交于
Commit 2c4541e2 ("mm: use vma_init() to initialize VMAs on stack and data segments") tried to initialize various left-over ad-hoc vma's "properly", but actually made things worse for the temporary vma's used for TLB flushing. vma_init() doesn't actually initialize all of the vma, just a few fields, so doing something like - struct vm_area_struct vma = { .vm_mm = tlb->mm, }; + struct vm_area_struct vma; + + vma_init(&vma, tlb->mm); was actually very bad: instead of having a nicely initialized vma with every field but "vm_mm" zeroed, you'd have an entirely uninitialized vma with only a couple of fields initialized. And they weren't even fields that the code in question mostly cared about. The flush_tlb_range() function takes a "struct vma" rather than a "struct mm_struct", because a few architectures actually care about what kind of range it is - being able to only do an ITLB flush if it's a range that doesn't have data accesses enabled, for example. And all the normal users already have the vma for doing the range invalidation. But a few people want to call flush_tlb_range() with a range they just made up, so they also end up using a made-up vma. x86 just has a special "flush_tlb_mm_range()" function for this, but other architectures (arm and ia64) do the "use fake vma" thing instead, and thus got caught up in the vma_init() changes. At the same time, the TLB flushing code really doesn't care about most other fields in the vma, so vma_init() is just unnecessary and pointless. This fixes things by having an explicit "this is just an initializer for the TLB flush" initializer macro, which is used by the arm/arm64/ia64 people who mis-use this interface with just a dummy vma. Fixes: 2c4541e2 ("mm: use vma_init() to initialize VMAs on stack and data segments") Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: John Stultz <john.stultz@linaro.org> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 01 8月, 2018 5 次提交
-
-
由 Lorenzo Bianconi 提交于
Add SPI_3WIRE support to spi-gpio controller introducing set_line_direction function pointer in spi_bitbang data structure. Spi-gpio controller has been tested using hts221 temp/rh iio sensor running in 3wire mode and lsm6dsm running in 4wire mode Signed-off-by: NLorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: NMark Brown <broonie@kernel.org>
-
由 Lorenzo Bianconi 提交于
Add the capability to specify the flag parameter used in bitbang_txrx_be_cpha{0,1} through the txrx_word function pointers of spi_bitbang data structure. That feature will be used to add spi-3wire support to the spi-gpio controller Signed-off-by: NLorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: NMark Brown <broonie@kernel.org> -
由 Miquel Raynal 提交于
Now that it is possible to do dynamic allocations during the identification phase, convert the onfi_params structure (which is only needed with ONFI compliant chips) into a pointer that will be allocated only if needed. Signed-off-by: NMiquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: NBoris Brezillon <boris.brezillon@bootlin.com>
-
由 Brian Norris 提交于
Commit 59b356ff ("mtd: m25p80: restore the status of SPI flash when exiting") is the latest from a long history of attempts to add reboot handling to handle stateful addressing modes on SPI flash. Some prior mostly-related discussions: http://lists.infradead.org/pipermail/linux-mtd/2013-March/046343.html [PATCH 1/3] mtd: m25p80: utilize dedicated 4-byte addressing commands http://lists.infradead.org/pipermail/barebox/2014-September/020682.html [RFC] MTD m25p80 3-byte addressing and boot problem http://lists.infradead.org/pipermail/linux-mtd/2015-February/057683.html [PATCH 2/2] m25p80: if supported put chip to deep power down if not used Previously, attempts to add reboot-time software reset handling were rejected, but the latest attempt was not. Quick summary of the problem: Some systems (e.g., boot ROM or bootloader) assume that they can read initial boot code from their SPI flash using 3-byte addressing. If the flash is left in 4-byte mode after reset, these systems won't boot. The above patch provided a shutdown/remove hook to attempt to reset the addressing mode before we reboot. Notably, this patch misses out on huge classes of unexpected reboots (e.g., crashes, watchdog resets). Unfortunately, it is essentially impossible to solve this problem 100%: if your system doesn't know how to reset the SPI flash to power-on defaults at initialization time, no amount of software can really rescue you -- there will always be a chance of some unexpected reset that leaves your flash in an addressing mode that your boot sequence didn't expect. While it is not directly harmful to perform hacks like the aforementioned commit on all 4-byte addressing flash, a properly-designed system should not need the hack -- and in fact, providing this hack may mask the fact that a given system is indeed broken. So this patch attempts to apply this unsound hack more narrowly, providing a strong suggestion to developers and system designers that this is truly a hack. With luck, system designers can catch their errors early on in their development cycle, rather than applying this hack long term. But apparently enough systems are out in the wild that we still have to provide this hack. Document a new device tree property to denote systems that do not have a proper hardware (or software) reset mechanism, and apply the hack (with a loud warning) only in this case. Signed-off-by: NBrian Norris <computersforpeace@gmail.com> Reviewed-by: NGuenter Roeck <linux@roeck-us.net> Signed-off-by: NBoris Brezillon <boris.brezillon@bootlin.com>
-
由 Hari Vyas 提交于
When a PCI device is detected, pdev->is_added is set to 1 and proc and sysfs entries are created. When the device is removed, pdev->is_added is checked for one and then device is detached with clearing of proc and sys entries and at end, pdev->is_added is set to 0. is_added and is_busmaster are bit fields in pci_dev structure sharing same memory location. A strange issue was observed with multiple removal and rescan of a PCIe NVMe device using sysfs commands where is_added flag was observed as zero instead of one while removing device and proc,sys entries are not cleared. This causes issue in later device addition with warning message "proc_dir_entry" already registered. Debugging revealed a race condition between the PCI core setting the is_added bit in pci_bus_add_device() and the NVMe driver reset work-queue setting the is_busmaster bit in pci_set_master(). As these fields are not handled atomically, that clears the is_added bit. Move the is_added bit to a separate private flag variable and use atomic functions to set and retrieve the device addition state. This avoids the race because is_added no longer shares a memory location with is_busmaster. Link: https://bugzilla.kernel.org/show_bug.cgi?id=200283Signed-off-by: NHari Vyas <hari.vyas@broadcom.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NLukas Wunner <lukas@wunner.de> Acked-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 31 7月, 2018 6 次提交
-
-
由 Jens Axboe 提交于
Fixes a link failure whtn BLK_DEV_INTEGRITY isn't defined. Fixes: 10c41ddd ("block: move dif_prepare/dif_complete functions to block layer") Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Miquel Raynal 提交于
Thanks to the migration of all drivers to use nand_scan() and the related nand_controller_ops, we can now allocate data during the detection phase. Let's do it first for the NAND model parameter which is allocated in nand_detect(). Signed-off-by: NMiquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: NBoris Brezillon <boris.brezillon@bootlin.com>
-
由 Miquel Raynal 提交于
Both nand_scan_ident() and nand_scan_tail() helpers used to be called directly from controller drivers that needed to tweak some ECC-related parameters before nand_scan_tail(). This separation prevented dynamic allocations during the phase of NAND identification, which was inconvenient. All controller drivers have been moved to use nand_scan(), in conjunction with the chip->ecc.[attach|detach]_chip() hooks that actually do the required tweaking sequence between both ident/tail calls, allowing programmers to use dynamic allocation as they need all across the scanning sequence. Declare nand_scan_[ident|tail]() statically now. Signed-off-by: NMiquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: NBoris Brezillon <boris.brezillon@bootlin.com>
-
由 Miquel Raynal 提交于
In order to remove the limitation that forbids dynamic allocation in nand_scan_ident(), we must create a path that will be the same for all controller drivers. The idea is to use nand_scan() instead of the widely used nand_scan_ident()/nand_scan_tail() couple. In order to achieve this, controller drivers will need to adjust some parameters between these two functions depending on the NAND chip wired on them. This takes the form of two new hooks (->{attach,detach}_chip()) that are placed in a new nand_controller_ops structure, which is then attached to the nand_controller object at driver initialization time. ->attach_chip() is called between nand_scan_ident() and nand_scan_tail(), and ->detach_chip() is called in the error path of nand_scan() and in nand_cleanup(). Note that some NAND controller drivers don't have a dedicated nand_controller object and instead rely on the default/dummy one embedded in nand_chip. If you're in this case and still want to initialize the controller ops, you'll have to manipulate chip->dummy_controller directly. Last but not least, it's worth mentioning that we plan to move some of the controller related hooks placed in nand_chip into nand_controller_ops to make the separation between NAND chip and NAND controller methods clearer. Signed-off-by: NMiquel Raynal <miquel.raynal@bootlin.com> Acked-by: NBoris Brezillon <boris.brezillon@bootlin.com> -
由 Miquel Raynal 提交于
In the raw NAND core, a NAND chip is described by a nand_chip structure, while a NAND controller is described with a nand_hw_control structure which is not very meaningful. Rename this structure nand_controller. As the structure gets renamed, it is logical to also rename the core function initializing it from nand_hw_control_init() to nand_controller_init(). Lastly, the 'hwcontrol' entry of the nand_chip structure is not meaningful neither while it has the role of fallback when no controller structure is provided by the driver (the controller driver is dumb and can only control a single chip). Thus, it is renamed dummy_controller. Signed-off-by: NMiquel Raynal <miquel.raynal@bootlin.com> Acked-by: NBoris Brezillon <boris.brezillon@bootlin.com>
-
由 Miquel Raynal 提交于
A report from Colin Ian King pointed a CoverityScan issue where error values on these helpers where not checked in the drivers. These helpers can error out only in case of a software bug in driver code, not because of a runtime/hardware error. Hence, let's WARN_ON() in this case and return 0 which is harmless anyway. Fixes: 8878b126 ("mtd: nand: add ->exec_op() implementation") Signed-off-by: NMiquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: NBoris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: NMiquel Raynal <miquel.raynal@bootlin.com>
-
- 30 7月, 2018 3 次提交
-
-
由 Max Gurtovoy 提交于
Currently these functions are implemented in the scsi layer, but their actual place should be the block layer since T10-PI is a general data integrity feature that is used in the nvme protocol as well. Also, use the tuple size from the integrity profile since it may vary between integrity types. Suggested-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Signed-off-by: NMax Gurtovoy <maxg@mellanox.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Max Gurtovoy 提交于
Currently this function is implemented in the scsi layer, but it's actual place should be the block layer since T10-PI is a general data integrity feature that is used in the nvme protocol as well. Suggested-by: NChristoph Hellwig <hch@lst.de> Cc: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: NMax Gurtovoy <maxg@mellanox.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Josef Bacik 提交于
We need to check in blkcg_bio_issue_check if the bio is flagged as QUEUE_ENTERED, because if it is then we've already accounted for the size of the IO in the cgroup stats. We can still however account for the extra IO since it'll be another request. Reported-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJosef Bacik <josef@toxicpanda.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 28 7月, 2018 3 次提交
-
-
由 Christoph Hellwig 提交于
Add various defintions from NVMe 1.3 TP 4004. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NHannes Reinecke <hare@suse.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
-
由 Christoph Hellwig 提交于
NVMe 1.3 added a new log specific field to the get log page CQ defintion, add it to our get_log_page SQ structure. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NHannes Reinecke <hare@suse.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
-
由 Robin Murphy 提交于
Whilst the notion of an upstream DMA restriction is most commonly seen in PCI host bridges saddled with a 32-bit native interface, a more general version of the same issue can exist on complex SoCs where a bus or point-to-point interconnect link from a device's DMA master interface to another component along the path to memory (often an IOMMU) may carry fewer address bits than the interfaces at both ends nominally support. In order to properly deal with this, the first step is to expand the dma_32bit_limit flag into an arbitrary mask. To minimise the impact on existing code, we'll make sure to only consider this new mask valid if set. That makes sense anyway, since a mask of zero would represent DMA not being wired up at all, and that would be better handled by not providing valid ops in the first place. Signed-off-by: NRobin Murphy <robin.murphy@arm.com> Acked-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 27 7月, 2018 5 次提交
-
-
由 Arnd Bergmann 提交于
The new gasket staging driver ran into a randconfig build failure when CONFIG_EVENTFD is disabled: In file included from drivers/staging/gasket/gasket_interrupt.h:11, from drivers/staging/gasket/gasket_interrupt.c:4: include/linux/eventfd.h: In function 'eventfd_ctx_fdget': include/linux/eventfd.h:51:9: error: implicit declaration of function 'ERR_PTR' [-Werror=implicit-function-declaration] I can't see anything wrong with including eventfd.h before err.h, so the easiest fix is to make it possible to do this by including the file where it is needed. Link: http://lkml.kernel.org/r/20180724110737.3985088-1-arnd@arndb.de Fixes: 9a69f508 ("drivers/staging: Gasket driver framework + Apex driver") Signed-off-by: NArnd Bergmann <arnd@arndb.de> Cc: Eric Biggers <ebiggers@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> -
由 Kirill A. Shutemov 提交于
vma_is_anonymous() relies on ->vm_ops being NULL to detect anonymous VMA. This is unreliable as ->mmap may not set ->vm_ops. False-positive vma_is_anonymous() may lead to crashes: next ffff8801ce5e7040 prev ffff8801d20eca50 mm ffff88019c1e13c0 prot 27 anon_vma ffff88019680cdd8 vm_ops 0000000000000000 pgoff 0 file ffff8801b2ec2d00 private_data 0000000000000000 flags: 0xff(read|write|exec|shared|mayread|maywrite|mayexec|mayshare) ------------[ cut here ]------------ kernel BUG at mm/memory.c:1422! invalid opcode: 0000 [#1] SMP KASAN CPU: 0 PID: 18486 Comm: syz-executor3 Not tainted 4.18.0-rc3+ #136 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:zap_pmd_range mm/memory.c:1421 [inline] RIP: 0010:zap_pud_range mm/memory.c:1466 [inline] RIP: 0010:zap_p4d_range mm/memory.c:1487 [inline] RIP: 0010:unmap_page_range+0x1c18/0x2220 mm/memory.c:1508 Call Trace: unmap_single_vma+0x1a0/0x310 mm/memory.c:1553 zap_page_range_single+0x3cc/0x580 mm/memory.c:1644 unmap_mapping_range_vma mm/memory.c:2792 [inline] unmap_mapping_range_tree mm/memory.c:2813 [inline] unmap_mapping_pages+0x3a7/0x5b0 mm/memory.c:2845 unmap_mapping_range+0x48/0x60 mm/memory.c:2880 truncate_pagecache+0x54/0x90 mm/truncate.c:800 truncate_setsize+0x70/0xb0 mm/truncate.c:826 simple_setattr+0xe9/0x110 fs/libfs.c:409 notify_change+0xf13/0x10f0 fs/attr.c:335 do_truncate+0x1ac/0x2b0 fs/open.c:63 do_sys_ftruncate+0x492/0x560 fs/open.c:205 __do_sys_ftruncate fs/open.c:215 [inline] __se_sys_ftruncate fs/open.c:213 [inline] __x64_sys_ftruncate+0x59/0x80 fs/open.c:213 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe Reproducer: #include <stdio.h> #include <stddef.h> #include <stdint.h> #include <stdlib.h> #include <string.h> #include <sys/types.h> #include <sys/stat.h> #include <sys/ioctl.h> #include <sys/mman.h> #include <unistd.h> #include <fcntl.h> #define KCOV_INIT_TRACE _IOR('c', 1, unsigned long) #define KCOV_ENABLE _IO('c', 100) #define KCOV_DISABLE _IO('c', 101) #define COVER_SIZE (1024<<10) #define KCOV_TRACE_PC 0 #define KCOV_TRACE_CMP 1 int main(int argc, char **argv) { int fd; unsigned long *cover; system("mount -t debugfs none /sys/kernel/debug"); fd = open("/sys/kernel/debug/kcov", O_RDWR); ioctl(fd, KCOV_INIT_TRACE, COVER_SIZE); cover = mmap(NULL, COVER_SIZE * sizeof(unsigned long), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); munmap(cover, COVER_SIZE * sizeof(unsigned long)); cover = mmap(NULL, COVER_SIZE * sizeof(unsigned long), PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0); memset(cover, 0, COVER_SIZE * sizeof(unsigned long)); ftruncate(fd, 3UL << 20); return 0; } This can be fixed by assigning anonymous VMAs own vm_ops and not relying on it being NULL. If ->mmap() failed to set ->vm_ops, mmap_region() will set it to dummy_vm_ops. This way we will have non-NULL ->vm_ops for all VMAs. Link: http://lkml.kernel.org/r/20180724121139.62570-4-kirill.shutemov@linux.intel.comSigned-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Reported-by: syzbot+3f84280d52be9b7083cc@syzkaller.appspotmail.com Acked-by: NLinus Torvalds <torvalds@linux-foundation.org> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Kirill A. Shutemov 提交于
Not all VMAs allocated with vm_area_alloc(). Some of them allocated on stack or in data segment. The new helper can be use to initialize VMA properly regardless where it was allocated. Link: http://lkml.kernel.org/r/20180724121139.62570-2-kirill.shutemov@linux.intel.comSigned-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: NLinus Torvalds <torvalds@linux-foundation.org> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Tejun Heo 提交于
While forking, if delayacct init fails due to memory shortage, it continues expecting all delayacct users to check task->delays pointer against NULL before dereferencing it, which all of them used to do. Commit c96f5471 ("delayacct: Account blkio completion on the correct task"), while updating delayacct_blkio_end() to take the target task instead of always using %current, made the function test NULL on %current->delays and then continue to operated on @p->delays. If %current succeeded init while @p didn't, it leads to the following crash. BUG: unable to handle kernel NULL pointer dereference at 0000000000000004 IP: __delayacct_blkio_end+0xc/0x40 PGD 8000001fd07e1067 P4D 8000001fd07e1067 PUD 1fcffbb067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 4 PID: 25774 Comm: QIOThread0 Not tainted 4.16.0-9_fbk1_rc2_1180_g6b593215b4d7 #9 RIP: 0010:__delayacct_blkio_end+0xc/0x40 Call Trace: try_to_wake_up+0x2c0/0x600 autoremove_wake_function+0xe/0x30 __wake_up_common+0x74/0x120 wake_up_page_bit+0x9c/0xe0 mpage_end_io+0x27/0x70 blk_update_request+0x78/0x2c0 scsi_end_request+0x2c/0x1e0 scsi_io_completion+0x20b/0x5f0 blk_mq_complete_request+0xa2/0x100 ata_scsi_qc_complete+0x79/0x400 ata_qc_complete_multiple+0x86/0xd0 ahci_handle_port_interrupt+0xc9/0x5c0 ahci_handle_port_intr+0x54/0xb0 ahci_single_level_irq_intr+0x3b/0x60 __handle_irq_event_percpu+0x43/0x190 handle_irq_event_percpu+0x20/0x50 handle_irq_event+0x2a/0x50 handle_edge_irq+0x80/0x1c0 handle_irq+0xaf/0x120 do_IRQ+0x41/0xc0 common_interrupt+0xf/0xf Fix it by updating delayacct_blkio_end() check @p->delays instead. Link: http://lkml.kernel.org/r/20180724175542.GP1934745@devbig577.frc2.facebook.com Fixes: c96f5471 ("delayacct: Account blkio completion on the correct task") Signed-off-by: NTejun Heo <tj@kernel.org> Reported-by: NDave Jones <dsj@fb.com> Debugged-by: NDave Jones <dsj@fb.com> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Josh Snyder <joshs@netflix.com> Cc: <stable@vger.kernel.org> [4.15+] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Greg Edwards 提交于
This allows bio_integrity_bytes() to be called from drivers instead of open coding it. Acked-by: NMartin K. Petersen <martin.petersen@oracle.com> Signed-off-by: NGreg Edwards <gedwards@ddn.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 26 7月, 2018 1 次提交
-
-
由 Waiman Long 提交于
There are use cases where it can be useful to have a cpus_read_trylock() function to work around circular lock dependency problem involving the cpu_hotplug_lock. Signed-off-by: NWaiman Long <longman@redhat.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-