- 07 Feb 2014, 2 commits
-
-
Committed by Rafael J. Wysocki

To avoid the need to install a hotplug notify handler for every ACPI namespace node that represents a device and has a matching scan handler, move the check of whether or not ejection of the given device is enabled through its scan handler from acpi_hotplug_notify_cb() to acpi_generic_hotplug_event(). Also move the execution of the ACPI_OST_SC_EJECT_IN_PROGRESS _OST to acpi_generic_hotplug_event(), because in acpi_hotplug_notify_cb() or in acpi_eject_store() we don't really know whether or not the eject is going to be in progress (for example, acpi_hotplug_execute() may still fail without queuing up the work item).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
-
Committed by Rafael J. Wysocki

The ACPI-based PCI hotplug (ACPIPHP) code currently attaches its hotplug context objects directly to the ACPI namespace nodes representing hotplug devices. However, after the recent changes that cause a struct acpi_device to be created for every namespace node representing a device (regardless of its status), that is no longer necessary. Moreover, it is vulnerable to the theoretical issue that the ACPI handle passed in the context between handle_hotplug_event() and hotplug_event_work() may become invalid in the meantime (as a result of a concurrent table unload).

In principle, this issue might be addressed by adding a non-empty release handler for ACPIPHP hotplug context objects, analogous to acpi_scan_drop_device(), but that would duplicate the code in that function and in acpi_device_del_work_fn(). For this reason, it is better to modify ACPIPHP to attach its device hotplug contexts to the struct device objects representing hotplug devices and to make it use acpi_hotplug_notify_cb() as its notify handler. At the same time, acpi_device_hotplug() can be modified to dispatch the new .hp.event() callback pointing to acpiphp_hotplug_event() from ACPI device objects associated with PCI devices, or to use the generic ACPI device hotplug code for device objects with matching scan handlers.

This also reduces the existing code duplication between ACPIPHP and the ACPI core and makes further ACPI-based device hotplug consolidation possible.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 06 Feb 2014, 14 commits
-
-
Committed by Rafael J. Wysocki

Subsequent changes will require the ACPI core to acquire the lock protecting the ACPIPHP hotplug contexts, so move the definition of that lock to the core and rename it to something more generic.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
-
Committed by Rafael J. Wysocki

There is a slight possibility for the ACPI device object pointed to by adev in acpi_hotplug_notify_cb() to become invalid between the acpi_bus_get_device() that it comes from and the subsequent dereference of that pointer under get_device(). Namely, if acpi_scan_drop_device() runs in parallel with acpi_hotplug_notify_cb(), the acpi_device_del_work_fn() queued up by it may delete the device object in question right after a successful execution of acpi_bus_get_device() in acpi_bus_notify(). An analogous problem is present in acpi_bus_notify(), where the device pointer coming from acpi_bus_get_device() may become invalid before its subsequent dereference in the "if" block.

To prevent that from happening, introduce a new function, acpi_bus_get_acpi_device(), working analogously to acpi_bus_get_device() except that it grabs a reference to the ACPI device object it returns, and it does so under the ACPICA namespace mutex. Then, make both acpi_hotplug_notify_cb() and acpi_bus_notify() use acpi_bus_get_acpi_device() instead of acpi_bus_get_device(), so as to ensure that the pointers they use will not become stale at any point.

In addition, introduce acpi_bus_put_acpi_device() as a wrapper around put_device() to be used along with acpi_bus_get_acpi_device(), and make the (new) users of the latter use acpi_bus_put_acpi_device() too.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
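A minimal sketch of the calling pattern this introduces (not the kernel's actual implementation; handle_notify() is a hypothetical caller used only for illustration):

  #include <acpi/acpi_bus.h>

  /* Look up the struct acpi_device behind an ACPI handle while taking a
   * reference, so a concurrent acpi_scan_drop_device() cannot free the
   * object before we are done with it. */
  static void handle_notify(acpi_handle handle)
  {
          struct acpi_device *adev = acpi_bus_get_acpi_device(handle);

          if (!adev)
                  return;                 /* no device object attached */

          /* ... adev can be used safely here, a reference is held ... */

          acpi_bus_put_acpi_device(adev); /* drop the reference */
  }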
-
Committed by Rafael J. Wysocki

Introduce a new function, acpi_get_data_full(), working in analogy with acpi_get_data() except that it can execute a callback provided as its 4th argument right after acpi_ns_get_attached_data() has returned successfully. That will allow Linux to reference-count the object pointed to by *data before the namespace mutex is released, so as to ensure that it will not be freed until the reference to it acquired by acpi_get_data_full() is dropped.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
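A sketch of how such a callback might be used, under the assumptions stated in the text above (lookup_device() is a hypothetical helper; the status return of acpi_get_data_full() is ignored for brevity):

  #include <acpi/acpi_bus.h>

  /* Callback run by acpi_get_data_full() while the namespace mutex is
   * still held: take a reference before anyone can free the object. */
  static void get_acpi_device(void *dev)
  {
          get_device(&((struct acpi_device *)dev)->dev);
  }

  static struct acpi_device *lookup_device(acpi_handle handle)
  {
          struct acpi_device *adev = NULL;

          acpi_get_data_full(handle, acpi_scan_drop_device,
                             (void **)&adev, get_acpi_device);
          return adev;    /* caller must drop the reference when done */
  }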
-
Committed by Rafael J. Wysocki

Since hotplug_event() can get the ACPI handle needed for debug printouts from its context argument, there is no need to pass the handle to it. Moreover, the second argument's type can be changed to struct acpiphp_context *, because that is what is always passed to hotplug_event() as the second argument anyway.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
-
Committed by Rafael J. Wysocki

Make hotplug_event() use acpi_handle_debug() instead of open-coded debug message printing, and clean up the messages it prints.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
-
Committed by Rafael J. Wysocki

A few lines of code can be cut from hotplug_event() by defining and initializing the slot variable at the top of the function, so do that.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
-
Committed by Rafael J. Wysocki

After the recent PCI core changes related to rescan/remove locking, the code sections protected by the crit_sect mutexes in the ACPIPHP slot objects are always executed under the general PCI rescan/remove lock. For this reason, the crit_sect mutexes are simply redundant, so drop them.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
-
Committed by Rafael J. Wysocki

acpiphp_bus_add() is only called from one place, so move its code to that place and drop the function. Also make that code use func_to_acpi_device() to get the struct acpi_device pointer it needs instead of calling acpi_bus_get_device(), which may be costly.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
-
Committed by Rafael J. Wysocki

After the recent modifications of the ACPI core that make it create a struct acpi_device object for every namespace node representing a device, regardless of that device's current status, the ACPIPHP code can store a struct acpi_device pointer instead of an ACPI handle in struct acpiphp_context. This immediately makes it possible to avoid potentially costly calls to acpi_bus_get_device() in two places, and it allows some more simplifications to be made going forward.

This is correct because ACPIPHP only installs notify handlers for namespace nodes that exist when acpiphp_enumerate_slots() is called for their parent bridge. That only happens if the parent bridge has an ACPI companion associated with it, which means that the ACPI namespace scope in question has already been scanned at that point. That, in turn, means that struct acpi_device objects have been created for all namespace nodes in that scope, so pointers to those objects can be stored directly instead of their ACPI handles.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
-
Committed by Rafael J. Wysocki

If a struct acpi_device pointer is passed to acpiphp_no_hotplug() instead of an ACPI handle, the function no longer needs to call acpi_bus_get_device(), which may be costly. Then, trim_stale_devices() can pass the struct acpi_device object it already has directly to acpiphp_no_hotplug(). Make those changes and update slot_no_hotplug() accordingly.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
-
Committed by Rafael J. Wysocki

If trim_stale_devices() calls acpi_bus_trim() directly, a potentially costly acpi_bus_get_device() invocation can be saved. After that change acpiphp_bus_trim() is only called from one place, so move its code to that place and drop the function.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
-
Committed by Rafael J. Wysocki

The err label in register_slot() is only jumped to from one place, so move the code under that label to that place and drop the label.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
-
Committed by Rafael J. Wysocki

Add proper kerneldoc comments describing acpiphp_enumerate_slots() and acpiphp_remove_slots().

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
-
Committed by Rafael J. Wysocki

After the recent PCI core changes related to rescan/remove locking, ACPIPHP's disable_slot() function is only called under the general PCI rescan/remove lock, so it no longer has to use dev_in_slot() to avoid race conditions. Make it simply walk the devices on the bus, dropping the ones in the slot being disabled, and remove dev_in_slot(), which has no more users. Moreover, to avoid the problems described in the changelog of commit 29ed1f29 (PCI: pciehp: Fix null pointer deref when hot-removing SR-IOV device), make disable_slot() carry out the list walk in reverse order.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
-
- 04 Feb 2014, 5 commits
-
-
Committed by Rafael J. Wysocki

If a PCI bridge with an ACPIPHP context attached is removed via sysfs, the code path executed as a result is the following:

  pci_stop_and_remove_bus_device_locked
   pci_remove_bus
    pcibios_remove_bus
     acpi_pci_remove_bus
      acpiphp_remove_slots
       cleanup_bridge
        unregister_hotplug_dock_device (drops dock references to the bridge)
       put_bridge
        free_bridge
         acpiphp_put_context (for each child, under context lock)
          kfree (context)

Now, if a dock event affecting one of the bridge's child devices occurs (roughly at the same time), it will lead to the following code path:

  acpi_dock_deferred_cb
   dock_notify
    handle_eject_request
     hot_remove_dock_devices
      dock_hotplug_event
       hotplug_event (dereferences context)

That may lead to a kernel crash in hotplug_event() if it is executed after the last kfree() in the bridge removal code path.

To prevent that from happening, add a wrapper around hotplug_event() called dock_event() and point the .handler pointer in acpiphp_dock_ops to it. Make that wrapper retrieve the device's ACPIPHP context using acpiphp_get_context() (instead of taking it from the data argument) under acpiphp_context_lock, and check whether the parent bridge's is_going_away flag is set. If that flag is set, the wrapper returns immediately; if it is not set, it grabs a reference to the device's parent bridge before executing hotplug_event(). Then, in the above scenario, the reference to the parent bridge held by dock_event() prevents free_bridge() from being executed for it until hotplug_event() returns.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Rafael J. Wysocki

If a PCI bridge with an ACPIPHP context attached is removed via sysfs, the code path executed as a result is the following:

  pci_stop_and_remove_bus_device_locked
   pci_remove_bus
    pcibios_remove_bus
     acpi_pci_remove_bus
      acpiphp_remove_slots
       cleanup_bridge
       put_bridge
        free_bridge
         acpiphp_put_context (for each child, under context lock)
          kfree (child context)

Now, if a hotplug notify is dispatched for one of the bridge's children and the timing is such that handle_hotplug_event() for that notify runs while free_bridge() above is running, the get_bridge(context->func.parent) in handle_hotplug_event() will not really help, because it is too late to prevent the bridge from going away, and the child's context may be freed before hotplug_event_work() scheduled from handle_hotplug_event() dereferences the pointer to it passed via the data argument. That will cause a kernel crash to happen in hotplug_event_work().

To prevent that from happening, make handle_hotplug_event() check the is_going_away flag of the function's parent bridge (under acpiphp_context_lock) and bail out if it is set. Also, make cleanup_bridge() set the bridge's is_going_away flag under acpiphp_context_lock so that it cannot change between the check and the subsequent get_bridge(context->func.parent) in handle_hotplug_event(). Then, in the above scenario, handle_hotplug_event() will notice that context->func.parent->is_going_away is already set and will exit immediately, preventing the crash from happening.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Rafael J. Wysocki

Since acpiphp_check_bridge(), called by acpiphp_check_host_bridge(), does things that require PCI rescan-remove locking around them, make acpiphp_check_host_bridge() use that locking.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
-
Committed by Rafael J. Wysocki

Commit 9217a984 (ACPI / hotplug / PCI: Use global PCI rescan-remove locking) modified ACPIPHP to protect its PCI device removal and addition code paths from races against sysfs-driven rescan and remove operations with the help of PCI rescan-remove locking. However, it overlooked the fact that hotplug_event_work() is not the only caller of hotplug_event(), which may also be called by dock_hotplug_event(), and that code path is missing the PCI rescan-remove locking. This means that, although the PCI rescan-remove lock is held as appropriate during the handling of events originating from handle_hotplug_event(), the ACPIPHP operations resulting from dock events may still suffer from the race conditions that commit 9217a984 was supposed to eliminate.

To address that problem, move the PCI rescan-remove locking from hotplug_event_work() to hotplug_event() so that it is used regardless of the way that function is invoked.

Revamps: 9217a984 (ACPI / hotplug / PCI: Use global PCI rescan-remove locking)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
-
Committed by Rafael J. Wysocki

According to the changelog of commit 29ed1f29 (PCI: pciehp: Fix null pointer deref when hot-removing SR-IOV device), it is unsafe to walk the bus->devices list of a PCI bus and remove devices from it in direct order, because that may lead to NULL pointer dereferences related to virtual functions. For this reason, change all of the bus->devices list walks in acpiphp_glue.c during which devices may be removed so that they are carried out in reverse order.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
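A reverse walk of that kind might look like the sketch below (illustrative only; remove_slot_devices() is a hypothetical helper and the per-slot filtering done by the real code is omitted):

  #include <linux/pci.h>

  /* Remove devices from a bus in reverse order, so virtual functions go
   * away before the physical function that created them. */
  static void remove_slot_devices(struct pci_bus *bus)
  {
          struct pci_dev *dev, *tmp;

          list_for_each_entry_safe_reverse(dev, tmp, &bus->devices, bus_list) {
                  pci_dev_get(dev);
                  pci_stop_and_remove_bus_device(dev);
                  pci_dev_put(dev);
          }
  }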
-
- 03 Feb 2014, 3 commits
-
-
Committed by Helge Deller

The built-in ROM fonts lack many necessary ASCII characters, which is why it makes sense to prefer the Linux fonts instead if they are available. This makes consoles on STI graphics cards that are not supported by the stifb driver (e.g. Visualize FXe) look much nicer.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v3.13
-
Committed by Jean Delvare

Similar to what was done for the lm75 driver: add "depends on THERMAL", since that is what provides the register/unregister functions above, but only if THERMAL_OF was selected, as this is an optional feature of the driver.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Eduardo Valentin <eduardo.valentin@ti.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
-
Committed by Jean Delvare

Based on an earlier attempt by Randy Dunlap. Fix the SENSORS_LM75 dependencies to eliminate build errors:

  drivers/built-in.o: In function `lm75_remove':
  lm75.c:(.text+0x12bd8c): undefined reference to `thermal_zone_of_sensor_unregister'
  drivers/built-in.o: In function `lm75_probe':
  lm75.c:(.text+0x12c123): undefined reference to `thermal_zone_of_sensor_register'

Add "depends on THERMAL", since that is what provides the register/unregister functions above, but only if THERMAL_OF was selected, as this is an optional feature of the driver.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Eduardo Valentin <eduardo.valentin@ti.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
-
- 02 Feb 2014, 1 commit
-
-
Committed by Rafael J. Wysocki

Revert commit ef83b078 "PCI: Remove from bus_list and release resources in pci_release_dev()", which made some nasty race conditions possible. For example, if a Thunderbolt link is unplugged and then replugged immediately, the pci_release_dev() resulting from the hot-remove code path may race with the hot-add code path, which after that commit causes various kinds of breakage to happen (up to and including a hard crash of the whole system).

Moreover, the problem that commit ef83b078 attempted to address can no longer happen after commit 8a4c5c32 "PCI: Check parent kobject in pci_destroy_dev()", because pci_destroy_dev() will now return immediately if it has already been executed for the given device.

Note, however, that the invocation of msi_remove_pci_irq_vectors() removed by commit ef83b078 from pci_free_resources(), along with the other changes made by it, is not added back because of subsequent code changes depending on that modification.

Fixes: ef83b078 (PCI: Remove from bus_list and release resources in pci_release_dev())
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 01 Feb 2014, 2 commits
-
-
Committed by Tim Kryger

When a clock is specified in the device tree, enable it and use it to determine the external clock frequency.

Signed-off-by: Tim Kryger <tim.kryger@linaro.org>
Reviewed-by: Markus Mayer <markus.mayer@linaro.org>
Reviewed-by: Matt Porter <matt.porter@linaro.org>
Reviewed-by: Christian Daudt <bcm@fixthebug.org>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Christian Daudt <bcm@fixthebug.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
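The general pattern being described might look like this sketch (illustrative only, not the driver's actual code; get_timer_rate() and fallback_rate are hypothetical names):

  #include <linux/clk.h>
  #include <linux/err.h>
  #include <linux/of.h>

  /* If the DT node carries a clock, enable it and use its rate;
   * otherwise fall back to a hard-coded frequency. */
  static unsigned long get_timer_rate(struct device_node *node,
                                      unsigned long fallback_rate)
  {
          struct clk *clk = of_clk_get(node, 0);

          if (IS_ERR(clk) || clk_prepare_enable(clk))
                  return fallback_rate;

          return clk_get_rate(clk);
  }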
-
Committed by Lorenzo Pieralisi

This patch fixes a bug/typo in the CCI driver's kcalloc usage that inadvertently swapped the parameter order in the kcalloc call and went unnoticed.

Reported-by: Xia Feng <xiafeng@allwinnertech.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
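For reference, kcalloc() takes the element count first and the element size second; swapping them still compiles, which is why such a typo is easy to miss. A hedged example (the structure and function names below are made up, not the CCI driver's):

  #include <linux/slab.h>

  struct port_data { int id; };

  static struct port_data *alloc_ports(unsigned int nb_ports)
  {
          /* correct order: count first, element size second */
          return kcalloc(nb_ports, sizeof(struct port_data), GFP_KERNEL);
  }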
-
- 31 Jan 2014, 13 commits
-
-
Committed by Bob Liu

The current xen-selfballoon driver is too aggressive, which may cause OOM to be triggered more often, e.g. this bug reported by James: https://lkml.org/lkml/2013/11/21/158

There are two main reasons:
1) The original goal_page didn't account for some pages used by kernel space, like slab pages and pages used by device drivers.
2) The balloon driver may not give memory back to the guest OS fast enough when the workload suddenly acquires a lot of physical memory.

In both cases, the guest OS will suffer from memory pressure and OOM may be triggered. The fix is to make the xen-selfballoon driver less aggressive by adding an extra 10% of total RAM pages to goal_page. It is more valuable to keep the guest system reliable and responsive than to balloon out those 10% of pages to Xen.

Signed-off-by: Bob Liu <bob.liu@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
Committed by Zoltan Kiss

The grant mapping API does the m2p_override unnecessarily: only gntdev needs it; for blkback and the future netback patches it just causes lock contention, as those pages never go to userspace. Therefore this series does the following:

- the original functions were renamed to __gnttab_[un]map_refs, with a new parameter m2p_override
- based on m2p_override they either follow the original behaviour, or just set the private flag and call set_phys_to_machine
- gnttab_[un]map_refs are now wrappers that call __gnttab_[un]map_refs with m2p_override false (see the sketch after this entry)
- a new function gnttab_[un]map_refs_userspace provides the old behaviour

It also removes a stray space from page.h and changes ret to 0 if XENFEAT_auto_translated_physmap, as that is the only possible return value there.

v2:
- move the storing of the old mfn in page->index to gnttab_map_refs
- move the function header update to a separate patch
v3:
- a new approach to retain the old behaviour where it is needed
- squash the patches into one
v4:
- move out the common bits from the m2p* functions, and pass pfn/mfn as parameters
- clear page->private before doing anything with the page, so m2p_find_override won't race with this
v5:
- change return value handling in __gnttab_[un]map_refs
- remove a stray space in page.h
- add detail on why ret = 0 now in some places
v6:
- don't pass pfn to the m2p* functions, just get it locally

Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Suggested-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
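A rough sketch of that wrapper split. The prototypes here are abridged and hypothetical (the real grant-table functions take additional arguments such as kmap ops); only the wrapper structure is the point:

  #include <linux/types.h>
  #include <xen/grant_table.h>

  static int __gnttab_map_refs(struct gnttab_map_grant_ref *map_ops,
                               struct page **pages, unsigned int count,
                               bool m2p_override)
  {
          /* ... perform the mapping hypercall and, only if m2p_override
           * is set, the m2p override bookkeeping needed for userspace
           * mappings ... */
          return 0;
  }

  int gnttab_map_refs(struct gnttab_map_grant_ref *map_ops,
                      struct page **pages, unsigned int count)
  {
          /* common path (blkback, netback): no m2p override needed */
          return __gnttab_map_refs(map_ops, pages, count, false);
  }

  int gnttab_map_refs_userspace(struct gnttab_map_grant_ref *map_ops,
                                struct page **pages, unsigned int count)
  {
          /* gntdev path: pages reach userspace, keep the old behaviour */
          return __gnttab_map_refs(map_ops, pages, count, true);
  }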
-
Committed by Minchan Kim

Finally, we separated the zram->lock dependency from the 32-bit stat/table handling, so there is no longer any reason to use a rw_semaphore between the read and write paths. This patch therefore removes the lock from the read path entirely and replaces the rw_semaphore with a mutex. So where we previously had:

  read-read:   OK
  read-write:  NO
  write-write: NO

we now have:

  read-read:   OK
  read-write:  OK
  write-write: NO

The data below shows the mixed workload performing 11 times better, and there is also an improvement on the write-write path because the current rw_semaphore doesn't support SPIN_ON_OWNER. That is a side effect, but a welcome one. Write-related tests perform better (from 61% to 1058%), while the read path shows mixed results (from -2.22% to 1.45%) that are all marginal, within stddev.

CPU 12, iozone -t -T -l 12 -u 12 -r 16K -s 60M -I +Z -V 0 (before -> after):

  ==Initial write:  records 10 -> 10, avg 516189.16 -> 839907.96, std 22486.53 (4.36%) -> 47902.17 (5.70%), max 546970.60 -> 909910.35, min 481131.54 -> 751148.38
  ==Rewrite:        records 10 -> 10, avg 509527.98 -> 1050156.37, std 45799.94 (8.99%) -> 40695.44 (3.88%), max 611574.27 -> 1111929.26, min 443679.95 -> 980409.62
  ==Read:           records 10 -> 10, avg 4408624.17 -> 4472546.76, std 281152.61 (6.38%) -> 163662.78 (3.66%), max 4867888.66 -> 4727351.03, min 4058347.69 -> 4126520.88
  ==Re-read:        records 10 -> 10, avg 4462147.53 -> 4363257.75, std 283546.11 (6.35%) -> 247292.63 (5.67%), max 4912894.44 -> 4677241.75, min 4131386.50 -> 4035235.84
  ==Reverse Read:   records 10 -> 10, avg 4565865.97 -> 4485818.08, std 313395.63 (6.86%) -> 248470.10 (5.54%), max 5232749.16 -> 4789749.94, min 4185809.62 -> 3963081.34
  ==Stride read:    records 10 -> 10, avg 4515981.80 -> 4418806.01, std 211192.32 (4.68%) -> 212837.97 (4.82%), max 4889287.28 -> 4686967.22, min 4210362.00 -> 4083041.84
  ==Random read:    records 10 -> 10, avg 4410525.23 -> 4387093.18, std 236693.22 (5.37%) -> 235285.23 (5.36%), max 4713698.47 -> 4669760.62, min 4057163.62 -> 3952002.16
  ==Mixed workload: records 10 -> 10, avg 243234.25 -> 2818677.27, std 28505.07 (11.72%) -> 195569.70 (6.94%), max 288905.23 -> 3126478.11, min 212473.16 -> 2484150.69
  ==Random write:   records 10 -> 10, avg 555887.07 -> 1053057.79, std 70841.98 (12.74%) -> 35195.36 (3.34%), max 683188.28 -> 1096125.73, min 437299.57 -> 992481.93
  ==Pwrite:         records 10 -> 10, avg 501745.93 -> 810363.09, std 16373.54 (3.26%) -> 19245.01 (2.37%), max 518724.52 -> 833359.70, min 464208.73 -> 765501.87
  ==Pread:          records 10 -> 10, avg 4539894.60 -> 4457680.58, std 197094.66 (4.34%) -> 188965.60 (4.24%), max 4877170.38 -> 4689905.53, min 4226326.03 -> 4095739.72

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Minchan Kim

Commit a0c516cb ("zram: don't grab mutex in zram_slot_free_noity") introduced the free-request pending code to avoid scheduling by mutex under a spinlock, and it was a mess that made the code lengthy and increased overhead. Now zram->lock is no longer needed to free a slot, so this patch reverts it; tb_lock should protect it instead.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Minchan Kim

Currently, the zram table is protected by zram->lock, but that is a rather coarse-grained lock and it hurts scalability. Let's use our own rwlock instead of depending on zram->lock. This patch only adds the new locking, so it will obviously be slower for now, but it is just preparation for removing the coarse-grained rw_semaphore (i.e. zram->lock), which is the hurdle for zram scalability. The final patch in this series will remove the lock from the read path and change the rw_semaphore to a mutex in the write path. As a bonus, we can drop the pending-slot-free mess in the next patch.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
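Conceptually, the split looks like the sketch below (field and function names are illustrative, not the exact zram code): a fine-grained lock guards the table entries, so the coarse zram->lock no longer has to be held around table accesses.

  #include <linux/spinlock.h>
  #include <linux/types.h>

  /* Illustrative sketch only, not the actual zram structures. */
  struct zram_meta_sketch {
          rwlock_t tb_lock;                /* protects table[] entries */
          struct {
                  unsigned long handle;    /* zsmalloc handle */
                  u16 size;                /* compressed size */
          } *table;
  };

  static void free_slot(struct zram_meta_sketch *meta, u32 index)
  {
          write_lock(&meta->tb_lock);
          /* ... free the compressed object, then clear the entry ... */
          meta->table[index].handle = 0;
          meta->table[index].size = 0;
          write_unlock(&meta->tb_lock);
  }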
-
Committed by Minchan Kim

Some of the fields in zram->stats are protected by zram->lock, which is rather coarse-grained, so let's use atomic operations without explicit locking. This patch prepares for removing the read path's dependency on zram->lock, which is a very coarse-grained rw_semaphore. Of course, this patch adds new atomic operations, so it could be slower, but my 12-CPU test couldn't spot any regression. All gains/losses are marginal, within stddev.

iozone -t -T -l 12 -u 12 -r 16K -s 60M -I +Z -V 0 (before -> after):

  ==Initial write:  records 50 -> 50, avg 412875.17 -> 415638.23, std 38543.12 (9.34%) -> 36601.11 (8.81%), max 521262.03 -> 502976.72, min 343263.13 -> 351389.12
  ==Rewrite:        records 50 -> 50, avg 416640.34 -> 397914.33, std 60798.92 (14.59%) -> 46150.42 (11.60%), max 543057.07 -> 522669.17, min 304071.67 -> 316588.77
  ==Read:           records 50 -> 50, avg 4147338.63 -> 4070736.51, std 179333.25 (4.32%) -> 223499.89 (5.49%), max 4459295.28 -> 4539514.44, min 3753057.53 -> 3444686.31
  ==Re-read:        records 50 -> 50, avg 4096706.71 -> 4117218.57, std 229735.04 (5.61%) -> 171676.25 (4.17%), max 4430012.09 -> 4459263.94, min 2987217.80 -> 3666904.28
  ==Reverse Read:   records 50 -> 50, avg 4062763.83 -> 4078508.32, std 186208.46 (4.58%) -> 172684.34 (4.23%), max 4401358.78 -> 4424757.22, min 3381625.00 -> 3679359.94
  ==Stride read:    records 50 -> 50, avg 4094933.49 -> 4082170.22, std 185710.52 (4.54%) -> 196346.68 (4.81%), max 4478241.25 -> 4460060.97, min 3732593.23 -> 3584125.78
  ==Random read:    records 50 -> 50, avg 4031070.04 -> 4074847.49, std 192065.51 (4.76%) -> 206911.33 (5.08%), max 4356931.16 -> 4399442.56, min 3481619.62 -> 3548372.44
  ==Mixed workload: records 50 -> 50, avg 149925.73 -> 149675.54, std 7701.26 (5.14%) -> 6902.09 (4.61%), max 191301.56 -> 175162.05, min 133566.28 -> 137762.87
  ==Random write:   records 50 -> 50, avg 404050.11 -> 393021.47, std 58887.57 (14.57%) -> 42813.70 (10.89%), max 601798.09 -> 524533.43, min 325176.99 -> 313255.34
  ==Pwrite:         records 50 -> 50, avg 411217.70 -> 411237.96, std 43114.99 (10.48%) -> 33136.29 (8.06%), max 530766.79 -> 471899.76, min 320786.84 -> 317906.94
  ==Pread:          records 50 -> 50, avg 4154908.65 -> 4087121.92, std 151272.08 (3.64%) -> 219505.04 (5.37%), max 4459478.12 -> 4435857.38, min 3730512.41 -> 3101101.67

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
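The idea of replacing lock-protected counters with atomics could look roughly like this (illustrative; the field names are not necessarily the exact zram ones):

  #include <linux/atomic.h>
  #include <linux/types.h>

  /* Stats kept as atomic64_t can be updated from the read path without
   * taking zram->lock at all. */
  struct zram_stats_sketch {
          atomic64_t num_reads;        /* failed + successful reads */
          atomic64_t failed_reads;     /* e.g. decompression failures */
  };

  static inline void count_read(struct zram_stats_sketch *stats, bool failed)
  {
          atomic64_inc(&stats->num_reads);
          if (failed)
                  atomic64_inc(&stats->failed_reads);
  }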
-
Committed by Minchan Kim

Commit a0c516cb ("zram: don't grab mutex in zram_slot_free_noity") introduced pending zram slot freeing in zram's write path, to cover a slot free that was missed because of a memory allocation failure in zram_slot_free_notify, but it is not necessary because we have already freed the slot right before overwriting it.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Minchan Kim

Sergey reported that we don't need to handle pending free requests on every I/O, so this patch removes that handling from the read path while keeping it in the write path. Consider the following example: the swap subsystem asks zram to free block "A" via swap_slot_free_notify, but zram pends the request without actually freeing. The swap subsystem then allocates block "A" for new data, the long-pending request is finally handled, and zram blindly frees the new data on block "A". :( That is why we cannot remove the handling of pending free requests right before a zram write.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Minchan Kim

Dan and Sergey reported a race between reset and the flushing of pending work, which could cause an oops: reset frees zram->meta while zram_slot_free can still access zram->meta if a new request arrives during the race window. This patch moves the flush to after init_lock is taken, which prevents new requests and closes the race.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Minchan Kim

Add my copyright to the zram source code, which I maintain.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Minchan Kim

Remove the old private compcache project address; upcoming patches should be sent to LKML, because the Linux kernel community will take care of them from now on.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Minchan Kim

Zram has lived in staging for a LONG LONG time and has been fixed and improved by many contributors, so the code is clean and stable now. Of course, there are lots of products using zram in real practice. The major TV companies have used zram as swap for the last two years; recently our production team released an Android smartphone that uses zram as swap, and Android KitKat started to use zram for small-memory smartphones. There was also a report that Google released ChromeOS with zram, and CyanogenMod has used zram for a long time. I have heard that some distros use a zram block device for tmpfs. In addition, I have seen many reports from other people; for example, Lubuntu started to use it.

The benefit of zram is very clear. In my experience, one benefit was removing the jitter of a video application under background memory pressure. That is partly the effect of efficient memory usage through compression, but the bigger issue is whether swap is present in the system at all. Recent mobile platforms use Java, so there are many anonymous pages, but embedded systems are normally reluctant to use eMMC or an SD card as swap because of wear-leveling and latency issues. If we do not use swap, we cannot reclaim anonymous pages, and in the end we may encounter an OOM kill. :( Even with real storage as swap there was a problem, because it sometimes makes the system very unresponsive due to slow swap storage performance.

Quote from Luigi at Google: "Since Chrome OS was mentioned: the main reason why we don't use swap to a disk (rotating or SSD) is because it doesn't degrade gracefully and leads to a bad interactive experience. Generally we prefer to manage RAM at a higher level, by transparently killing and restarting processes. But we noticed that zram is fast enough to be competitive with the latter, and it lets us make more efficient use of the available RAM." And he announced it: http://www.spinics.net/lists/linux-mm/msg57717.html

Another use case is using zram as a plain block device. Zram is a block device, so anyone can format it and mount it; some people on the internet have started using zram as /var/tmp: http://forums.gentoo.org/viewtopic-t-838198-start-0.html

Let's promote zram and enhance/maintain it instead of removing it.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Nitin Gupta <ngupta@vflare.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Minchan Kim

This patch moves zsmalloc under the mm directory. Before that, this description explains why a custom allocator was needed.

Zsmalloc is a new slab-based memory allocator for storing compressed pages. It is designed for low fragmentation and a high allocation success rate for large, but <= PAGE_SIZE, allocations. zsmalloc differs from the kernel slab allocator in two primary ways to achieve these design goals.

First, zsmalloc never requires high-order page allocations to back slabs, or "size classes" in zsmalloc terms. Instead it allows multiple single-order pages to be stitched together into a "zspage" which backs the slab. This allows for a higher allocation success rate under memory pressure.

Second, zsmalloc allows objects to span page boundaries within the zspage. This allows for lower fragmentation than could be had with the kernel slab allocator for objects between PAGE_SIZE/2 and PAGE_SIZE. With the kernel slab allocator, if a page compresses to 60% of its original size, the memory savings gained through compression are lost to fragmentation, because another object of the same size cannot be stored in the leftover space.

This ability to span pages means that zsmalloc allocations are not directly addressable by the user. The user is given a non-dereferenceable handle in response to an allocation request. That handle must be mapped, using zs_map_object(), which returns a pointer to the mapped region that can be used. The mapping is necessary because the object data may reside in two different noncontiguous pages.

zsmalloc fulfills the allocation needs of zram perfectly.

[sjenning@linux.vnet.ibm.com: borrow Seth's quote]
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Nitin Gupta <ngupta@vflare.org>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
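A minimal usage sketch of that handle/map model (illustrative only; exact zsmalloc prototypes have varied across kernel versions, and store_compressed() is a hypothetical caller):

  #include <linux/errno.h>
  #include <linux/string.h>
  #include <linux/zsmalloc.h>

  /* Store an already-compressed buffer: allocate an opaque handle, map
   * it (the object may span two pages), copy the data in, unmap. */
  static int store_compressed(struct zs_pool *pool, const void *src, size_t len)
  {
          unsigned long handle = zs_malloc(pool, len);
          void *dst;

          if (!handle)
                  return -ENOMEM;

          dst = zs_map_object(pool, handle, ZS_MM_WO);
          memcpy(dst, src, len);
          zs_unmap_object(pool, handle);
          return 0;   /* the handle is what a user such as zram stores */
  }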
-