- 14 Dec, 2012 1 commit
-
-
Committed by Jeff Garzik
This reverts commit de90cd71.
Shane Huang writes: Please suspend this patch because I just received two new DevSlp drives but found word 78 bit 5 is _not_ set. I'm checking with the drive vendor whether he gave me the wrong information. If bit 5 is not the necessary and sufficient condition, I will implement another patch to replace ata_device->sata_settings into ->devslp_timing.
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
-
- 04 Dec, 2012 4 commits
-
-
Committed by Brian Norris
I failed to include <linux/libata.h>, causing this error:
drivers/ata/pata_of_platform.c:93: error: 'ata_platform_remove_one' undeclared here (not in a function)
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
-
Committed by Brian Norris
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
-
Committed by Brian Norris
This driver does not detach and remove its ata_host properly on device removal. Add the common .remove helper.
Note: I do not know this driver well enough to ensure this is the right thing to do. Merge this patch with caution.
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: David Daney <david.daney@cavium.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
-
Committed by Brian Norris
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
-
- 03 Dec, 2012 15 commits
-
-
Committed by Brian Norris
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
-
Committed by Brian Norris
All users of __pata_platform_remove() have been converted to utilize the common ata_platform_remove_one().
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
-
Committed by Brian Norris
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
-
Committed by Brian Norris
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
-
Committed by Brian Norris
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
-
Committed by Brian Norris
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
-
Committed by Brian Norris
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
-
Committed by Brian Norris
This relatively simple boilerplate code is repeated in several platform drivers. We should implement a common version in libata.
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
-
Committed by Brian Norris
AHCI platform devices may provide an exit() routine, via ahci_platform_data, that powers off the SATA core. Such a routine should be executed from the ata_port_operations host_stop() hook. That way, the ATA subsystem can perform any last-minute hardware cleanup (via devres, for example), then trigger the power-off at the appropriate time.
This patch fixes bus errors triggered during module removal or device unbinding, seen on an SoC SATA core.
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
-
Committed by Brian Norris
The ahci_platform driver can now use the module_platform_driver() macro.
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
-
Committed by Brian Norris
platform_driver_probe() should be used for registering this driver only if we want to "...remove its run-once probe() infrastructure from memory after the driver has bound to the device."
However, we may want to leave the probe infrastructure in place in order to support binding/unbinding a device dynamically. This is useful, for instance, as a power management mechanism, where a device can be totally powered down when unbound (whereas with runtime power management, powering down the SATA core would incur unacceptable loss of functionality).
Thus, convert this driver to use platform_driver_register().
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
-
Committed by Aaron Lu
ata_device->dma_mode's initial value is zero, which is not a valid DMA mode, but ata_dma_enabled will return true for this value. This patch sets dma_mode to 0xff in the reset function so that ata_dma_enabled will not return true in this case; otherwise it causes problems for pata_acpi.
The corresponding bugzilla page is at: https://bugzilla.kernel.org/show_bug.cgi?id=49151
Reported-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Tested-by: Szymon Janc <szymon@janc.net.pl>
Tested-by: Dutra Julio <dutra.julio@gmail.com>
Acked-by: Alan Cox <alan@linux.intel.com>
Cc: <stable@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
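The effect described above can be illustrated with a minimal user-space sketch. It assumes, as the message implies, that the enabled check treats only 0xff as "no DMA mode selected"; the struct and function names below are simplified stand-ins invented for this example, not the libata definitions.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the relevant part of struct ata_device. */
struct sketch_ata_device {
    uint8_t dma_mode;
};

/* Sentinel meaning "no DMA mode selected" (value taken from the commit text). */
#define SKETCH_DMA_MODE_NONE 0xff

/* Models the described check: any value other than the sentinel counts as enabled. */
static inline bool sketch_dma_enabled(const struct sketch_ata_device *dev)
{
    return dev->dma_mode != SKETCH_DMA_MODE_NONE;
}

/* Models what the patch is described as doing in the reset path. */
static inline void sketch_reset(struct sketch_ata_device *dev)
{
    memset(dev, 0, sizeof(*dev));
    dev->dma_mode = SKETCH_DMA_MODE_NONE;
}
```

With a zero-initialized device the check reports DMA as enabled even though no mode was ever negotiated; after the reset helper runs, it reports disabled.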
-
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
-
Committed by Shane Huang
NCQ capability was used to check availability of the SATA Settings page from the Identify Device Data Log, which contains DevSlp timing variables. It does not work on some HDDs and leads to error messages. IDENTIFY word 78 bit 5 (Hardware Feature Control) should be used.
Quoting SATA spec 3.1:
If Hardware Feature Control is supported, then:
a) IDENTIFY DEVICE data word 78 bit 5 (see 13.2.1.18) shall be set to one;
b) the SET FEATURES Select Hardware Feature Control subcommand shall be supported (see 13.3.8);
c) page 08h of the Identify Device Data log (see 13.7.7) shall be supported;
This patch is not tested on a SATA HDD with DevSlp supported.
Reported-by: Borislav Petkov <bp@amd64.org>
Signed-off-by: Shane Huang <shane.huang@amd.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
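As a rough illustration of the bit test this message proposes, here is a minimal sketch. The macro and function names are hypothetical and are not the identifiers used in libata; only the word number and bit position come from the text. Note that the revert at the top of this log reports DevSlp drives on which this bit is not set, so the test may not be a sufficient condition.

```c
#include <stdbool.h>
#include <stdint.h>

/* IDENTIFY DEVICE data is 256 16-bit words. */
#define SKETCH_ID_WORDS 256

/* Word and bit quoted from the commit message; the names are made up here. */
#define SKETCH_ID_WORD_HW_FEATURES 78
#define SKETCH_ID_HW_FEATURE_CTRL  (1u << 5)

/* Returns true when word 78 bit 5 (Hardware Feature Control) is set. */
static inline bool sketch_id_has_hw_feature_ctrl(const uint16_t *id)
{
    return (id[SKETCH_ID_WORD_HW_FEATURES] & SKETCH_ID_HW_FEATURE_CTRL) != 0;
}
```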
-
Committed by Aaron Lu
Commit 66fa7f21 "libata-acpi: improve ACPI disabling" introduced the behaviour of disabling ATA ACPI if ata_acpi_on_devcfg failed the 2nd time, but commit 30dcf76a dropped this behaviour, and this caused a problem for Dimitris Damigos, whose laptop cannot resume correctly.
The bugzilla page for it is: https://bugzilla.kernel.org/show_bug.cgi?id=49331
The problem is that ata_dev_push_id will fail the 2nd time it is invoked, and because the ACPI-disabling code was dropped, ata_acpi_on_devcfg, which calls ata_dev_push_id, keeps failing and eventually gets the device disabled.
This patch restores the original behaviour: if ACPI fails the 2nd time, disable ACPI functionality for the device (and we do not even need to add a debug message for this as it is still there ;-).
Reported-by: Dimitris Damigos <damigos@freemail.gr>
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
-
- 29 Nov, 2012 10 commits
-
-
Committed by Joe Perches
dev_<level> calls take less code than dev_printk(KERN_<LEVEL> and reducing object size is good. Coalesce formats for easier grep.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
-
Committed by Sergei Shtylyov
'acdev->qc', 'acdev->qc->ap', and 'acdev->qc->tf' expressions are used multiple times in this function, so it makes sense to use local variables for them.
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
-
Committed by Sergei Shtylyov
ahci_highbank_hardreset() uses a bare number for the BSY bit of the ATA status register, even though it is #define'd in <linux/ata.h>.
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
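The kind of cleanup described here can be sketched in user space as follows. The constant below is a local copy made for illustration (<linux/ata.h> defines ATA_BUSY as bit 7 of the status register), and both helper names are invented for this example rather than taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copy for illustration; <linux/ata.h> defines ATA_BUSY as (1 << 7). */
#define SKETCH_ATA_BUSY (1u << 7)

/* Before: the bare number hides which status bit is being tested. */
static inline bool sketch_status_busy_bare(uint8_t status)
{
    return (status & 0x80) != 0;
}

/* After: the named constant documents the intent; behaviour is unchanged. */
static inline bool sketch_status_busy_named(uint8_t status)
{
    return (status & SKETCH_ATA_BUSY) != 0;
}
```

Both helpers return the same result for every status value, which is the point of such a change: readability improves without altering behaviour.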
-
Committed by Sergei Shtylyov
... because those functions don't use this parameter. While at it, correctly align the 'total_len' parameter.
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
-
Committed by Wei Yongjun
The variable addr is initialized but never used otherwise, so remove the unused variable.
dpatch engine is used to auto generate this patch. (https://github.com/weiyj/dpatch)
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
-
Committed by Wei Yongjun
The variable addr is initialized but never used otherwise, so remove the unused variable.
dpatch engine is used to auto generate this patch. (https://github.com/weiyj/dpatch)
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
-
Committed by Wei Yongjun
The variable port_flags is initialized but never used otherwise, so remove the unused variable.
dpatch engine is used to auto generate this patch. (https://github.com/weiyj/dpatch)
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
-
Committed by Christian Gmeiner
I am working on a device which uses the cs5536 pata driver. There are some broken hardware revisions out in the field, which can be detected via DMI. On older versions with an embedded BIOS I used libata.dma=0 to disable DMA completely.
Now we are switching to a coreboot/seabios based BIOS where we have DMI support, so I think it's a good idea to get rid of all those hacky kernel parameters, as the same image is used on other devices where libata.dma=0 is not a good idea.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
-
Committed by Olaf Hering
An earlier commit cd006086 ("ata_piix: defer disks to the Hyper-V drivers by default") broke MS Virtual PC guests. Hyper-V guests and Virtual PC guests have nearly identical DMI info. As a result the driver currently ignores the emulated hardware in Virtual PC guests and defers the handling to hv_blkvsc. Since Virtual PC does not offer paravirtualized drivers, no disks will be found in the guest.
One difference in the DMI info is the product version. This patch adds a match for MS Virtual PC 2007 and "unignores" the emulated hardware.
This was reported for openSuSE 12.1 in bugzilla: https://bugzilla.novell.com/show_bug.cgi?id=737532
Here is a detailed list of DMI info from example guests:
hwinfo --bios:
virtual pc guest:
  System Info: #1
    Manufacturer: "Microsoft Corporation"
    Product: "Virtual Machine"
    Version: "VS2005R2"
    Serial: "3178-9905-1533-4840-9282-0569-59"
    UUID: undefined, but settable
    Wake-up: 0x06 (Power Switch)
  Board Info: #2
    Manufacturer: "Microsoft Corporation"
    Product: "Virtual Machine"
    Version: "5.0"
    Serial: "3178-9905-1533-4840-9282-0569-59"
  Chassis Info: #3
    Manufacturer: "Microsoft Corporation"
    Version: "5.0"
    Serial: "3178-9905-1533-4840-9282-0569-59"
    Asset Tag: "7188-3705-6309-9738-9645-0364-00"
    Type: 0x03 (Desktop)
    Bootup State: 0x03 (Safe)
    Power Supply State: 0x03 (Safe)
    Thermal State: 0x01 (Other)
    Security Status: 0x01 (Other)
win2k8 guest:
  System Info: #1
    Manufacturer: "Microsoft Corporation"
    Product: "Virtual Machine"
    Version: "7.0"
    Serial: "9106-3420-9819-5495-1514-2075-48"
    UUID: undefined, but settable
    Wake-up: 0x06 (Power Switch)
  Board Info: #2
    Manufacturer: "Microsoft Corporation"
    Product: "Virtual Machine"
    Version: "7.0"
    Serial: "9106-3420-9819-5495-1514-2075-48"
  Chassis Info: #3
    Manufacturer: "Microsoft Corporation"
    Version: "7.0"
    Serial: "9106-3420-9819-5495-1514-2075-48"
    Asset Tag: "7076-9522-6699-1042-9501-1785-77"
    Type: 0x03 (Desktop)
    Bootup State: 0x03 (Safe)
    Power Supply State: 0x03 (Safe)
    Thermal State: 0x01 (Other)
    Security Status: 0x01 (Other)
win2k12 guest:
  System Info: #1
    Manufacturer: "Microsoft Corporation"
    Product: "Virtual Machine"
    Version: "7.0"
    Serial: "8179-1954-0187-0085-3868-2270-14"
    UUID: undefined, but settable
    Wake-up: 0x06 (Power Switch)
  Board Info: #2
    Manufacturer: "Microsoft Corporation"
    Product: "Virtual Machine"
    Version: "7.0"
    Serial: "8179-1954-0187-0085-3868-2270-14"
  Chassis Info: #3
    Manufacturer: "Microsoft Corporation"
    Version: "7.0"
    Serial: "8179-1954-0187-0085-3868-2270-14"
    Asset Tag: "8374-0485-4557-6331-0620-5845-25"
    Type: 0x03 (Desktop)
    Bootup State: 0x03 (Safe)
    Power Supply State: 0x03 (Safe)
    Thermal State: 0x01 (Other)
    Security Status: 0x01 (Other)
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
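The decision this message describes can be sketched as a small user-space function. This is only an illustration built from the DMI values listed above: the real driver works with kernel DMI match tables, and the struct, field, and function names here are hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified view of the DMI "System Info" fields of interest. */
struct sketch_dmi_info {
    const char *sys_vendor;
    const char *product_name;
    const char *product_version;
};

/* Both Hyper-V and Virtual PC guests report these two values. */
static inline bool sketch_is_ms_virtual_machine(const struct sketch_dmi_info *d)
{
    return strcmp(d->sys_vendor, "Microsoft Corporation") == 0 &&
           strcmp(d->product_name, "Virtual Machine") == 0;
}

/*
 * Returns true when the emulated disks should be left to the Hyper-V
 * paravirtualized driver. "VS2005R2" is the System Info version of the
 * Virtual PC guest in the listing, which has no such driver.
 */
static inline bool sketch_defer_to_hyperv(const struct sketch_dmi_info *d)
{
    if (!sketch_is_ms_virtual_machine(d))
        return false;
    return strcmp(d->product_version, "VS2005R2") != 0;
}
```

The product version is the only field in the listing that separates the Virtual PC guest from the two Hyper-V guests, which is why the sketch keys on it.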
-
Committed by Mikael Pettersson
sata_promise's pdc_hard_reset_port() needs to serialize because it flips a port-specific bit in a controller register that's shared by all ports. The code takes the ata host lock for this, but that's broken because an interrupt may arrive on our irq during the hard reset sequence, and that too will take the ata host lock. With lockdep enabled a big nasty warning is seen.
Fixed by adding private state to the ata host structure, containing a second lock used only for serializing the hard reset sequences. This eliminated the lockdep warnings both on my test rig and on the original reporter's machine.
Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Tested-by: Adko Branil <adkobranil@yahoo.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
-
- 27 Nov, 2012 10 commits
-
-
Committed by Linus Torvalds
Merge misc fixes from Andrew Morton:
"8 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (8 patches)
  futex: avoid wake_futex() for a PI futex_q
  watchdog: using u64 in get_sample_period()
  writeback: put unused inodes to LRU after writeback completion
  mm: vmscan: check for fatal signals iff the process was throttled
  Revert "mm: remove __GFP_NO_KSWAPD"
  proc: check vma->vm_file before dereferencing
  UAPI: strip the _UAPI prefix from header guards during header installation
  include/linux/bug.h: fix sparse warning related to BUILD_BUG_ON_INVALID
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Committed by Linus Torvalds
Pull TTY fix from Greg Kroah-Hartman:
"Here is a single fix for a reported regression in 3.7-rc5 for the tty layer. This fix has been in the linux-next tree and solves the reported problem.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'tty-3.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  tty vt: Fix a regression in command line edition
-
git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6
Committed by Linus Torvalds
Pull MFD fixes from Samuel Ortiz:
 - A twl fix preventing a buffer overflow.
 - A wm5102 register patch fix.
 - A wm5110 error misreport fix.
 - Arizona fixes: Use the right array size when adding subdevices, correctly report underclocked events, synchronize register cache after reset.
 - A twl4030 fix preventing the system from hanging in an interrupt flood.
* tag 'mfd-for-linus-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6:
  mfd: twl4030: Fix chained irq handling on resume from suspend
  mfd: arizona: Sync regcache after reset
  mfd: arizona: Correctly report when AIF2/AIF1 is underclocked
  mfd: arizona: Use correct array for ARRAY_SIZE in mfd_add_devices call
  mfd: wm5110: Disable control interface error report for WM5110 rev B
  mfd: wm5102: Update register patch for latest evaluation
  mfd: twl-core: Fix chip ID for the twl6030-pwm module
-
git://git.linaro.org/people/rmk/linux-arm
Committed by Linus Torvalds
Pull ARM fixes from Russell King:
"Not much here, just a couple minor/cosmetic fixes and a patch for the decompressor which fixes problems with modern GCC and CPUs."
* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
  ARM: 7583/1: decompressor: Enable unaligned memory access for v6 and above
  ARM: 7572/1: proc-v6.S: fix comment
  ARM: 7570/1: quiet down the non make -s output
-
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Committed by Linus Torvalds
Pull ext3 regression fix from Jan Kara:
"Fix an ext3 regression introduced during 3.7 merge window. It leads to deadlock if you stress the filesystem in the right way (luckily only if blocksize < pagesize)."
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  jbd: Fix lock ordering bug in journal_unmap_buffer()
-
Committed by Darren Hart
Dave Jones reported a bug with futex_lock_pi() that his trinity test exposed. Sometime between queue_me() and taking the q.lock_ptr, the lock_ptr became NULL, resulting in a crash.
While futex_wake() is careful to not call wake_futex() on futex_q's with a pi_state or an rt_waiter (which are either waiting for a futex_unlock_pi() or a PI futex_requeue()), futex_wake_op() and futex_requeue() do not perform the same test.
Update futex_wake_op() and futex_requeue() to test for q.pi_state and q.rt_waiter and abort with -EINVAL if detected. To ensure any future breakage is caught, add a WARN() to wake_futex() if the same condition is true.
This fix has seen 3 hours of testing with "trinity -c futex" on an x86_64 VM with 4 CPUS.
[akpm@linux-foundation.org: tidy up the WARN()]
Signed-off-by: Darren Hart <dvhart@linux.intel.com>
Reported-by: Dave Jones <davej@redat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: John Kacur <jkacur@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
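The guard described above can be modelled in a few lines of user-space C. This is a sketch only: the struct is a hypothetical reduction of futex_q to the two fields the message names, and the helper name is invented; the kernel code performs the test inline inside futex_wake_op() and futex_requeue().

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical reduction of futex_q to the two fields named in the message. */
struct sketch_futex_q {
    void *pi_state;   /* non-NULL: waiter belongs to a PI futex */
    void *rt_waiter;  /* non-NULL: waiter is part of a PI requeue */
};

/*
 * Returns 0 if the waiter may be woken by a plain wake path, or -EINVAL
 * if it carries PI state and must not be passed to the wake routine.
 */
static inline int sketch_check_wakeable(const struct sketch_futex_q *q)
{
    if (q->pi_state != NULL || q->rt_waiter != NULL)
        return -EINVAL;
    return 0;
}
```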
-
Committed by Chuansheng Liu
In get_sample_period(), unsigned long is not enough:
  watchdog_thresh * 2 * (NSEC_PER_SEC / 5)
case1: watchdog_thresh is 10 by default, the sample value will be: 0xEE6B2800
case2: set watchdog_thresh to 20, the sample value will be: 0x1 DCD6 5000
In case2, we need to use u64 to express the sample period. Otherwise, changing the threshold through proc often fails.
Signed-off-by: liu chuansheng <chuansheng.liu@intel.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
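The two cases can be checked with a small user-space sketch. It assumes a platform where unsigned long is 32 bits wide, modelled here with uint32_t; the function and macro names are invented for the example, and only the formula and the thresholds come from the message.

```c
#include <stdint.h>

/* Local copy for illustration: nanoseconds per second. */
#define SKETCH_NSEC_PER_SEC 1000000000u

/* 32-bit arithmetic, as with unsigned long on a 32-bit kernel: wraps above 2^32 - 1. */
static inline uint32_t sketch_sample_period_u32(uint32_t watchdog_thresh)
{
    return watchdog_thresh * 2u * (SKETCH_NSEC_PER_SEC / 5u);
}

/* 64-bit arithmetic: widen before multiplying so the product cannot wrap here. */
static inline uint64_t sketch_sample_period_u64(uint32_t watchdog_thresh)
{
    return (uint64_t)watchdog_thresh * 2u * (SKETCH_NSEC_PER_SEC / 5u);
}
```

For a threshold of 10 both versions yield 0xEE6B2800 (4,000,000,000 ns), which still fits in 32 bits. For a threshold of 20 the true value is 0x1DCD65000 (8,000,000,000 ns), and the 32-bit version silently drops the top bit, leaving 0xDCD65000.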
-
Committed by Jan Kara
Commit 169ebd90 ("writeback: Avoid iput() from flusher thread") removed the iget-iput pair from inode writeback. As a side effect, inodes that are dirty during the iput_final() call won't ever be added to the inode LRU (iput_final() doesn't add dirty inodes to the LRU, and later when the inode is cleaned there's no one to add the inode there). Thus inodes are effectively unreclaimable until someone looks them up again.
The practical effect of this bug is limited by the fact that inodes are pinned by a dentry for long enough that the inode gets cleaned. But still the bug can have nasty consequences leading up to OOM conditions under certain circumstances. The following can easily reproduce the problem:
  for (( i = 0; i < 1000; i++ )); do
    mkdir $i
    for (( j = 0; j < 1000; j++ )); do
      touch $i/$j
      echo 2 > /proc/sys/vm/drop_caches
    done
  done
then one needs to run 'sync; ls -lR' to make inodes reclaimable again.
We fix the issue by inserting unused clean inodes into the LRU after writeback finishes in inode_sync_complete().
Signed-off-by: Jan Kara <jack@suse.cz>
Reported-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: <stable@vger.kernel.org> [3.5+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Mel Gorman
Commit 5515061d ("mm: throttle direct reclaimers if PF_MEMALLOC reserves are low and swap is backed by network storage") introduced a check for fatal signals after a process gets throttled for network storage. The intention was that if a process was throttled and got killed, it should not trigger the OOM killer. As pointed out by Minchan Kim and David Rientjes, this check is in the wrong place and too broad. If a system is in an OOM situation and a process is exiting, it can loop in __alloc_pages_slowpath(), calling direct reclaim in a loop. As the fatal signal is pending it returns 1 as if it is making forward progress and can effectively deadlock.
This patch moves the fatal_signal_pending() check after throttling to throttle_direct_reclaim() where it belongs. If the process is killed while throttled, it will return immediately without direct reclaim except now it will have TIF_MEMDIE set and will use the PFMEMALLOC reserves.
Minchan pointed out that it may be better to direct reclaim before returning to avoid using the reserves because there may be pages that can easily be reclaimed that would avoid using the reserves. However, we do no such targeted reclaim and there is no guarantee that suitable pages are available. As it is expected that this throttling happens when swap-over-NFS is used, there is a possibility that the process will instead swap, which may allocate network buffers from the PFMEMALLOC reserves. Hence, in the swap-over-NFS case where a process can be throttled and be killed, it can use the reserves to exit or it can potentially use reserves to swap a few pages and then exit. This patch takes the option of using the reserves if necessary to allow the process to exit quickly.
If this patch passes review it should be considered a -stable candidate for 3.6.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Sonny Rao <sonnyrao@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Mel Gorman
With "mm: vmscan: scale number of pages reclaimed by reclaim/compaction based on failures" reverted, Zdenek Kabelac reported the following:
  Hmm, so it's just took longer to hit the problem and observe kswapd0 spinning on my CPU again - it's not as endless like before - but still it easily eats minutes - it helps to turn off Firefox or TB (memory hungry apps) so kswapd0 stops soon - and restart those apps again. (And I still have like >1GB of cached memory)
  kswapd0 R running task 0 30 2 0x00000000
  Call Trace:
    preempt_schedule+0x42/0x60
    _raw_spin_unlock+0x55/0x60
    put_super+0x31/0x40
    drop_super+0x22/0x30
    prune_super+0x149/0x1b0
    shrink_slab+0xba/0x510
The sysrq+m indicates the system has no swap so it'll never reclaim anonymous pages as part of reclaim/compaction. That is one part of the problem but not the root cause as file-backed pages could also be reclaimed.
The likely underlying problem is that kswapd is woken up or kept awake for each THP allocation request in the page allocator slow path.
If compaction fails for the requesting process then compaction will be deferred for a time and direct reclaim is avoided. However, if there is a storm of THP requests that are simply rejected, it will still be the case that kswapd is awake for a prolonged period of time as pgdat->kswapd_max_order is updated each time. This is noticed by the main kswapd() loop and it will not call kswapd_try_to_sleep(). Instead it will loop, shrinking a small number of pages and calling shrink_slab() on each iteration.
The temptation is to supply a patch that checks if kswapd was woken for THP and if so ignore pgdat->kswapd_max_order but it'll be a hack and not backed up by proper testing. As 3.7 is very close to release and this is not a bug we should release with, a safer path is to revert "mm: remove __GFP_NO_KSWAPD" for now and revisit it with the view to ironing out the balance_pgdat() logic in general.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Zdenek Kabelac <zkabelac@redhat.com>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Cc: Jiri Slaby <jirislaby@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-