- 10 12月, 2015 2 次提交
-
-
由 Brian Norris 提交于
For some of the core partitioning code, it helps to keep info about the parsed partition (and who parsed them) together in one place. Signed-off-by: NBrian Norris <computersforpeace@gmail.com>
-
由 Brian Norris 提交于
The use of kmemdup() complicates the error handling a bit. We don't actually need to allocate new memory, since this reference is treated as const, and it is copied into new memory by the partition registration code anyway. So remove it. Signed-off-by: NBrian Norris <computersforpeace@gmail.com> Reviewed-by: NBoris Brezillon <boris.brezillon@free-electrons.com>
-
- 13 11月, 2015 1 次提交
-
-
由 Brian Norris 提交于
We now stick the device node representing the current MTD (if any) into sysfs, so let's make sure we have a reference to it before doing that. Suggested-by: NBoris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: NBrian Norris <computersforpeace@gmail.com> Reviewed-by: NBoris Brezillon <boris.brezillon@free-electrons.com>
-
- 07 11月, 2015 2 次提交
-
-
由 Mel Gorman 提交于
mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd __GFP_WAIT has been used to identify atomic context in callers that hold spinlocks or are in interrupts. They are expected to be high priority and have access one of two watermarks lower than "min" which can be referred to as the "atomic reserve". __GFP_HIGH users get access to the first lower watermark and can be called the "high priority reserve". Over time, callers had a requirement to not block when fallback options were available. Some have abused __GFP_WAIT leading to a situation where an optimisitic allocation with a fallback option can access atomic reserves. This patch uses __GFP_ATOMIC to identify callers that are truely atomic, cannot sleep and have no alternative. High priority users continue to use __GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep and are willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM to identify callers that want to wake kswapd for background reclaim. __GFP_WAIT is redefined as a caller that is willing to enter direct reclaim and wake kswapd for background reclaim. This patch then converts a number of sites o __GFP_ATOMIC is used by callers that are high priority and have memory pools for those requests. GFP_ATOMIC uses this flag. o Callers that have a limited mempool to guarantee forward progress clear __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall into this category where kswapd will still be woken but atomic reserves are not used as there is a one-entry mempool to guarantee progress. o Callers that are checking if they are non-blocking should use the helper gfpflags_allow_blocking() where possible. This is because checking for __GFP_WAIT as was done historically now can trigger false positives. Some exceptions like dm-crypt.c exist where the code intent is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to flag manipulations. o Callers that built their own GFP flags instead of starting with GFP_KERNEL and friends now also need to specify __GFP_KSWAPD_RECLAIM. The first key hazard to watch out for is callers that removed __GFP_WAIT and was depending on access to atomic reserves for inconspicuous reasons. In some cases it may be appropriate for them to use __GFP_HIGH. The second key hazard is callers that assembled their own combination of GFP flags instead of starting with something like GFP_KERNEL. They may now wish to specify __GFP_KSWAPD_RECLAIM. It's almost certainly harmless if it's missed in most cases as other activity will wake kswapd. Signed-off-by: NMel Gorman <mgorman@techsingularity.net> Acked-by: NVlastimil Babka <vbabka@suse.cz> Acked-by: NMichal Hocko <mhocko@suse.com> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Vitaly Wool <vitalywool@gmail.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Brian Norris 提交于
There are multiple types of users of mtd->reboot_notifier.notifier_call: (1) A while back, the cfi_cmdset_000{1,2} chip drivers implemented a reboot notifier to (on a best effort basis) attempt to reset their flash chips before rebooting. (2) More recently, we implemented a common _reboot() hook so that MTD drivers (particularly, NAND flash) could better halt I/O operations without having to reimplement the same notifier boilerplate. Currently, the WARN_ONCE() condition here was written to handle (2), but at the same time it mis-diagnosed case (1) as an already-registered MTD. Let's fix this by having the WARN_ONCE() condition better imitate the condition that immediately follows it. (Wow, I don't know how I missed that one.) (Side note: Unfortunately, we can't yet combine the reboot notifier code for (1) and (2) with a patch like [1], because some users of (1) also use mtdconcat, and so the mtd_info struct from cfi_cmdset_000{1,2} won't actually get registered with mtdcore, and therefore their reboot notifier won't get registered.) [1] http://patchwork.ozlabs.org/patch/417981/Suggested-by: NGuenter Roeck <linux@roeck-us.net> Signed-off-by: NBrian Norris <computersforpeace@gmail.com> Cc: Jesper Nilsson <jespern@axis.com> Cc: linux-cris-kernel@axis.com Tested-by: NEzequiel Garcia <ezequiel@vanguardiasur.com.ar> Tested-by: NGuenter Roeck <linux@roeck-us.net> Signed-off-by: NBrian Norris <computersforpeace@gmail.com>
-
- 27 10月, 2015 3 次提交
-
-
由 Brian Norris 提交于
Due to wrong assumption in ofpart ofpart fails on Exynos on SPI chips with no partitions because the subnode containing controller data confuses the ofpart parser. Thus compiling in ofpart support automatically fails probing any SPI NOR flash without partitions on Exynos. Compiling in a partitioning scheme should not cause probe of otherwise valid device to fail. Instead, let's do the following: * try parsers until one succeeds * if no parser succeeds, report the first error we saw * even in the failure case, allow MTD to probe, with fallback partitions or no partitions at all -- the master device will still be registered Issue report and comments initially by Michal Suchanek. Reported-by: NMichal Suchanek <hramrach@gmail.com> Signed-off-by: NBrian Norris <computersforpeace@gmail.com>
-
由 Brian Norris 提交于
When CONFIG_MTD_PARTITIONED_MASTER=y, it is fatal to call mtd_device_parse_register() twice on the same MTD, as we try to register the same device/kobject multipile times. When CONFIG_MTD_PARTITIONED_MASTER=n, calling mtd_device_parse_register() is more of just a nuisance, as we can mostly navigate around any conflicting actions. But anyway, doing so is a Bad Thing (TM), and we should complain loudly for any drivers that try to do this. Signed-off-by: NBrian Norris <computersforpeace@gmail.com> Reviewed-by: NRichard Weinberger <richard@nod.at>
-
由 Brian Norris 提交于
Since commit 3efe41be ("mtd: implement common reboot notifier boilerplate"), we might try to register a reboot notifier for an MTD that failed to register. Let's avoid this by making the error path clearer. Signed-off-by: NBrian Norris <computersforpeace@gmail.com> Reviewed-by: NRichard Weinberger <richard@nod.at>
-
- 14 10月, 2015 2 次提交
-
-
由 Frans Klaver 提交于
If a parent device is set, add_mtd_device() has enough knowledge to fill in some sane default values for the module name and owner. Do so if they aren't already set. Signed-off-by: NFrans Klaver <fransklaver@gmail.com> Signed-off-by: NBrian Norris <computersforpeace@gmail.com>
-
由 Frans Klaver 提交于
add_mtd_device() has a comment suggesting that the caller should have set dev.parent. This is required to have the parent device symlink show up in sysfs, but not for proper operation of the mtd device itself. Currently we have five drivers registering mtd devices during module initialization, so they don't actually provide a parent device to link to. That means we cannot WARN_ON() here, as it would trigger false positives. Make the comment a bit less firm in its assertion that dev.parent should be set. Signed-off-by: NFrans Klaver <fransklaver@gmail.com> Signed-off-by: NBrian Norris <computersforpeace@gmail.com>
-
- 29 9月, 2015 1 次提交
-
-
由 Johannes Thumshirn 提交于
Destroy mtd_idr on module_exit, reclaiming the allocated memory. This was detected by the following semantic patch (written by Luis Rodriguez <mcgrof@suse.com>) <SmPL> @ defines_module_init @ declarer name module_init, module_exit; declarer name DEFINE_IDR; identifier init; @@ module_init(init); @ defines_module_exit @ identifier exit; @@ module_exit(exit); @ declares_idr depends on defines_module_init && defines_module_exit @ identifier idr; @@ DEFINE_IDR(idr); @ on_exit_calls_destroy depends on declares_idr && defines_module_exit @ identifier declares_idr.idr, defines_module_exit.exit; @@ exit(void) { ... idr_destroy(&idr); ... } @ missing_module_idr_destroy depends on declares_idr && defines_module_exit && !on_exit_calls_destroy @ identifier declares_idr.idr, defines_module_exit.exit; @@ exit(void) { ... +idr_destroy(&idr); } </SmPL> Signed-off-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NBrian Norris <computersforpeace@gmail.com>
-
- 17 6月, 2015 1 次提交
-
-
由 Brian Norris 提交于
It makes more sense to return error statuses, not 1/0. Signed-off-by: NBrian Norris <computersforpeace@gmail.com> Reviewed-by: NRichard Weinberger <richard@nod.at>
-
- 07 5月, 2015 1 次提交
-
-
由 Lars-Peter Clausen 提交于
Use dev_pm_ops instead of the legacy suspend/resume callbacks for the MTD class suspend and resume operations. While we are at it slightly reorder things to avoid the need for forward declarations. Signed-off-by: NLars-Peter Clausen <lars@metafoo.de> Signed-off-by: NBrian Norris <computersforpeace@gmail.com>
-
- 06 4月, 2015 1 次提交
-
-
由 Dan Ehrenberg 提交于
For many use cases, it helps to have a device node for the entire MTD device as well as device nodes for the individual partitions. For example, this allows querying the entire device's properties. A common idiom is to create an additional partition which spans over the whole device. This patch makes a config option, CONFIG_MTD_PARTITIONED_MASTER, which makes the master partition present even when the device is partitioned. This isn't turned on by default since it presents a backwards-incompatible device numbering. The patch also makes the parent of a partition device be the master, if the config flag is set, now that the master is a full device. Signed-off-by: NDan Ehrenberg <dehrenberg@chromium.org> Signed-off-by: NBrian Norris <computersforpeace@gmail.com>
-
- 08 2月, 2015 1 次提交
-
-
由 Niklas Cassel 提交于
Calling mtd_device_parse_register with the same mtd_info (e.g. registering several partitions on a single device) would add the same reboot notifier twice, causing an infinte loop in notifier_chain_register during boot up. Signed-off-by: NNiklas Cassel <nks@flawful.org> [Brian: add FIXME comments] Signed-off-by: NBrian Norris <computersforpeace@gmail.com> Signed-off-by: NBrian Norris <computersforpeace@gmail.com>
-
- 29 1月, 2015 1 次提交
-
-
由 Arnd Bergmann 提交于
The recently added mtd_mmap_capabilities can be used from loadable modules, in particular romfs, but is not exported, so we get ERROR: "mtd_mmap_capabilities" [fs/romfs/romfs.ko] undefined! This adds the missing export. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Fixes: b4caecd4 ("fs: introduce f_op->mmap_capabilities for nommu mmap support") Acked-by: NChristoph Hellwig <hch@lst.de> Acked-by: NBrian Norris <computersforpeace@gmail.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 21 1月, 2015 1 次提交
-
-
由 Christoph Hellwig 提交于
Since "BDI: Provide backing device capability information [try #3]" the backing_dev_info structure also provides flags for the kind of mmap operation available in a nommu environment, which is entirely unrelated to it's original purpose. Introduce a new nommu-only file operation to provide this information to the nommu mmap code instead. Splitting this from the backing_dev_info structure allows to remove lots of backing_dev_info instance that aren't otherwise needed, and entirely gets rid of the concept of providing a backing_dev_info for a character device. It also removes the need for the mtd_inodefs filesystem. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NTejun Heo <tj@kernel.org> Acked-by: NBrian Norris <computersforpeace@gmail.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 08 1月, 2015 1 次提交
-
-
由 Brian Norris 提交于
cfi_cmdset_000{1,2}.c already implement their own reboot notifiers, and we're going to add one for NAND. Let's put the boilerplate in one place. Signed-off-by: NBrian Norris <computersforpeace@gmail.com> Tested-by: NScott Branden <sbranden@broadcom.com>
-
- 20 8月, 2014 2 次提交
-
-
由 Brian Norris 提交于
MTD used to allow compiling out character device support. This was dropped in the following commit, but some of the accompanying logic was never dropped: commit 660685d9 Author: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Date: Thu Mar 14 13:27:40 2013 +0200 mtd: merge mtdchar module with mtdcore The weird logic was flagged by Coverity. Signed-off-by: NBrian Norris <computersforpeace@gmail.com> Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
-
由 Brian Norris 提交于
When checking the upper boundary (i.e., whether an address is higher than the maximum size of the MTD), we should be doing an inclusive check (greater or equal). For instance, an address of 16MB (0x1000000) on a 16MB device is invalid. The strengthening of this bounds check is redundant for those which already have a address+length check and ensure that the length is non-zero, but let's just fix them all, for completeness. Signed-off-by: NBrian Norris <computersforpeace@gmail.com>
-
- 09 7月, 2014 2 次提交
-
-
由 Ezequiel Garcia 提交于
In addition to mtd_block_isbad(), which checks if a block is bad or reserved, it's needed to check if a block is reserved only (but not bad). This commit adds an MTD interface for it, in a similar fashion to mtd_block_isbad(). While here, fix mtd_block_isbad() so the out-of-bounds checking is done before the callback check. Signed-off-by: NEzequiel Garcia <ezequiel.garcia@free-electrons.com> Tested-by: NPekon Gupta <pekon@ti.com> Signed-off-by: NBrian Norris <computersforpeace@gmail.com>
-
由 Ezequiel Garcia 提交于
These new sysfs device attributes allow us to retrieve the ECC and bad block stats by poking a sysfs file, which is often more convenient than using the ioctl. Signed-off-by: NEzequiel Garcia <ezequiel.garcia@free-electrons.com> Tested-by: NPekon Gupta <pekon@ti.com> Signed-off-by: NBrian Norris <computersforpeace@gmail.com>
-
- 11 3月, 2014 2 次提交
-
-
由 Christian Riesch 提交于
If a write to one time programmable memory (OTP) hits the end of this memory area, no more data can be written. The count variable in mtdchar_write() in drivers/mtd/mtdchar.c is not decreased anymore. We are trapped in the loop forever, mtdchar_write() will never return in this case. The desired behavior of a write in such a case is described in [1]: - Try to write as much data as possible, truncate the write to fit into the available memory and return the number of bytes that actually have been written. - If no data could be written at all, return -ENOSPC. This patch fixes the behavior of OTP write if there is not enough space for all data: 1) mtd_write_user_prot_reg() in drivers/mtd/mtdcore.c is modified to return -ENOSPC if no data could be written at all. 2) mtdchar_write() is modified to handle -ENOSPC correctly. Exit if a write returned -ENOSPC and yield the correct return value, either then number of bytes that could be written, or -ENOSPC, if no data could be written at all. Furthermore the patch harmonizes the behavior of the OTP memory write in drivers/mtd/devices/mtd_dataflash.c with the other implementations and the requirements from [1]. Instead of returning -EINVAL if the data does not fit into the OTP memory, we try to write as much data as possible/truncate the write. [1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/write.htmlSigned-off-by: NChristian Riesch <christian.riesch@omicron.at> Signed-off-by: NBrian Norris <computersforpeace@gmail.com>
-
由 Christian Riesch 提交于
Signed-off-by: NChristian Riesch <christian.riesch@omicron.at> Cc: Artem Bityutskiy <Artem.Bityutskiy@linux.intel.com> Signed-off-by: NBrian Norris <computersforpeace@gmail.com>
-
- 04 1月, 2014 1 次提交
-
-
由 Axel Lin 提交于
Use new ATTRIBUTE_GROUPS macro to declare attribute groups. Signed-off-by: NAxel Lin <axel.lin@ingics.com> Signed-off-by: NBrian Norris <computersforpeace@gmail.com>
-
- 07 11月, 2013 2 次提交
-
-
由 Wei Yongjun 提交于
Remove duplicated include. Signed-off-by: NWei Yongjun <yongjun_wei@trendmicro.com.cn> Acked-by: NEzequiel Garcia <ezequiel.garcia@free-electrons.com> Signed-off-by: NBrian Norris <computersforpeace@gmail.com>
-
由 Ezequiel Garcia 提交于
This patch moves the char and block major number definitions to major.h to be with the rest of the major numbers. While doing this, include major.h in the files that need it. Signed-off-by: NEzequiel Garcia <ezequiel.garcia@free-electrons.com> Signed-off-by: NBrian Norris <computersforpeace@gmail.com>
-
- 28 10月, 2013 1 次提交
-
-
由 Huang Shijie 提交于
The current mtd_type_show() misses the MTD_MLCNANDFLASH case. This patch adds the case for it, and also updates the ABI. Signed-off-by: NHuang Shijie <b32955@freescale.com> Signed-off-by: NBrian Norris <computersforpeace@gmail.com>
-
- 31 8月, 2013 1 次提交
-
-
由 Huang Shijie 提交于
Add a new sys node to show the ecc step size. The application then can uses this node to get the ecc step size. Signed-off-by: NHuang Shijie <b32955@freescale.com> Signed-off-by: NArtem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: NBrian Norris <computersforpeace@gmail.com> Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com>
-
- 04 7月, 2013 1 次提交
-
-
由 Kees Cook 提交于
Calling dev_set_name with a single paramter causes it to be handled as a format string. Many callers are passing potentially dynamic string content, so use "%s" in those cases to avoid any potential accidents, including wrappers like device_create*() and bdi_register(). Signed-off-by: NKees Cook <keescook@chromium.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 30 4月, 2013 1 次提交
-
-
由 David Howells 提交于
Include missing linux/slab.h inclusions where the source file is currently expecting to get kmalloc() and co. through linux/proc_fs.h. Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> cc: linux-s390@vger.kernel.org cc: sparclinux@vger.kernel.org cc: linux-efi@vger.kernel.org cc: linux-mtd@lists.infradead.org cc: devel@driverdev.osuosl.org cc: x86@kernel.org Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 05 4月, 2013 3 次提交
-
-
由 Artem Bityutskiy 提交于
The MTD subsystem has historically tried to be as configurable as possible. The side-effect of this is that its configuration menu is rather large, and we are gradually shrinking it. For example, we recently merged partitions support with the mtdcore. This patch does the next step - it merges the mtdchar module to mtdcore. And in this case this is not only about eliminating too fine-grained separation and simplifying the configuration menu. This is also about eliminating seemingly useless kernel module. Indeed, mtdchar is a module that allows user-space making use of MTD devices via /dev/mtd* character devices. If users do not enable it, they simply cannot use MTD devices at all. They cannot read or write the flash contents. Is it a sane and useful setup? I believe not. And everyone just enables mtdchar. Having mtdchar separate is also a little bit harmful. People sometimes miss the fact that they need to enable an additional configuration option to have user-space MTD interfaces, and then they wonder why on earth the kernel does not allow using the flash? They spend time asking around. Thus, let's just get rid of this module and make it part of mtd core. Note, mtdchar had additional configuration option to enable OTP interfaces, which are present on some flashes. I removed that option as well - it saves a really tiny amount space. [dwmw2: Strictly speaking, you can mount file systems on MTD devices just fine without the mtdchar (or mtdblock) devices; you just can't do other manipulations directly on the underlying device. But still I agree that it makes sense to make this unconditional. And Yay! we get to kill off an instance of checking CONFIG_foo_MODULE, which is an abomination that should never happen.] Signed-off-by: NArtem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com>
-
由 Artem Bityutskiy 提交于
Remove a couple of useles '#ifdef CONFIG_PROC_FS's around procfs functions which anyway turn into empty function in 'proc_fs.h' file when CONFIG_PROC_FS is not defined. Signed-off-by: NArtem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com>
-
由 Artem Bityutskiy 提交于
'mtd_device_parse_register()' and 'parse_mtd_partitions()' functions accept a an array of character pointers. These functions modify neither the pointers nor the characters they point to. The characters are actually names of the MTD parsers. At the moment, the argument type is 'const char **', which means that only the names of the parsers are constant. Let's turn the argument type into 'const char * const *', which means that both names and the pointers which point to them are constant. Signed-off-by: NArtem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com>
-
- 28 2月, 2013 1 次提交
-
-
由 Tejun Heo 提交于
Convert to the much saner new idr interface. Signed-off-by: NTejun Heo <tj@kernel.org> Tested-by: NEzequiel Garcia <ezequiel.garcia@free-electrons.com> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 11 12月, 2012 1 次提交
-
-
由 Linus Torvalds 提交于
This reverts commits a5091539 and d7c3b937. This is a revert of a revert of a revert. In addition, it reverts the even older i915 change to stop using the __GFP_NO_KSWAPD flag due to the original commits in linux-next. It turns out that the original patch really was bogus, and that the original revert was the correct thing to do after all. We thought we had fixed the problem, and then reverted the revert, but the problem really is fundamental: waking up kswapd simply isn't the right thing to do, and direct reclaim sometimes simply _is_ the right thing to do. When certain allocations fail, we simply should try some direct reclaim, and if that fails, fail the allocation. That's the right thing to do for THP allocations, which can easily fail, and the GPU allocations want to do that too. So starting kswapd is sometimes simply wrong, and removing the flag that said "don't start kswapd" was a mistake. Let's hope we never revisit this mistake again - and certainly not this many times ;) Acked-by: NMel Gorman <mgorman@suse.de> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Cc: Rik van Riel <riel@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 01 12月, 2012 1 次提交
-
-
由 Andrew Morton 提交于
It apepars that this patch was innocent, and we hope that "mm: avoid waking kswapd for THP allocations when compaction is deferred or contended" will fix the final kswapd-spinning cause. Cc: Zdenek Kabelac <zkabelac@redhat.com> Cc: Seth Jennings <sjenning@linux.vnet.ibm.com> Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu> Cc: Jiri Slaby <jirislaby@gmail.com> Cc: Rik van Riel <riel@redhat.com> Cc: Robert Jennings <rcj@linux.vnet.ibm.com> Cc: Mel Gorman <mgorman@suse.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 27 11月, 2012 1 次提交
-
-
由 Mel Gorman 提交于
With "mm: vmscan: scale number of pages reclaimed by reclaim/compaction based on failures" reverted, Zdenek Kabelac reported the following Hmm, so it's just took longer to hit the problem and observe kswapd0 spinning on my CPU again - it's not as endless like before - but still it easily eats minutes - it helps to turn off Firefox or TB (memory hungry apps) so kswapd0 stops soon - and restart those apps again. (And I still have like >1GB of cached memory) kswapd0 R running task 0 30 2 0x00000000 Call Trace: preempt_schedule+0x42/0x60 _raw_spin_unlock+0x55/0x60 put_super+0x31/0x40 drop_super+0x22/0x30 prune_super+0x149/0x1b0 shrink_slab+0xba/0x510 The sysrq+m indicates the system has no swap so it'll never reclaim anonymous pages as part of reclaim/compaction. That is one part of the problem but not the root cause as file-backed pages could also be reclaimed. The likely underlying problem is that kswapd is woken up or kept awake for each THP allocation request in the page allocator slow path. If compaction fails for the requesting process then compaction will be deferred for a time and direct reclaim is avoided. However, if there are a storm of THP requests that are simply rejected, it will still be the the case that kswapd is awake for a prolonged period of time as pgdat->kswapd_max_order is updated each time. This is noticed by the main kswapd() loop and it will not call kswapd_try_to_sleep(). Instead it will loopp, shrinking a small number of pages and calling shrink_slab() on each iteration. The temptation is to supply a patch that checks if kswapd was woken for THP and if so ignore pgdat->kswapd_max_order but it'll be a hack and not backed up by proper testing. As 3.7 is very close to release and this is not a bug we should release with, a safer path is to revert "mm: remove __GFP_NO_KSWAPD" for now and revisit it with the view to ironing out the balance_pgdat() logic in general. Signed-off-by: NMel Gorman <mgorman@suse.de> Cc: Zdenek Kabelac <zkabelac@redhat.com> Cc: Seth Jennings <sjenning@linux.vnet.ibm.com> Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu> Cc: Jiri Slaby <jirislaby@gmail.com> Cc: Rik van Riel <riel@redhat.com> Cc: Robert Jennings <rcj@linux.vnet.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 09 10月, 2012 1 次提交
-
-
由 Rik van Riel 提交于
When transparent huge pages were introduced, memory compaction and swap storms were an issue, and the kernel had to be careful to not make THP allocations cause pageout or compaction. Now that we have working compaction deferral, kswapd is smart enough to invoke compaction and the quadratic behaviour around isolate_free_pages has been fixed, it should be safe to remove __GFP_NO_KSWAPD. [minchan@kernel.org: Comment fix] [mgorman@suse.de: Avoid direct reclaim for deferred compaction] Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: NRik van Riel <riel@redhat.com> Signed-off-by: NMel Gorman <mgorman@suse.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 29 9月, 2012 1 次提交
-
-
由 Brian Norris 提交于
mtd_read_oob() has some unexpected similarities to mtd_read(). For instance, when ops->datbuf != NULL, nand_base.c might return max_bitflips; however, when ops->datbuf == NULL, nand_base's code potentially could return -EUCLEAN (no in-tree drivers do this yet). In any case where the driver might return max_bitflips, we should translate this into an appropriate return code using the bitflip_threshold. Essentially, mtd_read_oob() duplicates the logic from mtd_read(). This prevents users of mtd_read_oob() from receiving a positive return value (i.e., from max_bitflips) and interpreting it as an unknown error. Artem: amend comments. Signed-off-by: NBrian Norris <computersforpeace@gmail.com> Reviewed-by: NMike Dunn <mikedunn@newsguy.com> Reviewed-by: NShmulik Ladkani <shmulik.ladkani@gmail.com> Signed-off-by: NArtem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com>
-