- 07 12月, 2013 4 次提交
-
-
由 Eric Dumazet 提交于
With the introduction of TCP Small Queues, TSO auto sizing, and TCP pacing, we can implement Automatic Corking in the kernel, to help applications doing small write()/sendmsg() to TCP sockets. Idea is to change tcp_push() to check if the current skb payload is under skb optimal size (a multiple of MSS bytes) If under 'size_goal', and at least one packet is still in Qdisc or NIC TX queues, set the TCP Small Queue Throttled bit, so that the push will be delayed up to TX completion time. This delay might allow the application to coalesce more bytes in the skb in following write()/sendmsg()/sendfile() system calls. The exact duration of the delay is depending on the dynamics of the system, and might be zero if no packet for this flow is actually held in Qdisc or NIC TX ring. Using FQ/pacing is a way to increase the probability of autocorking being triggered. Add a new sysctl (/proc/sys/net/ipv4/tcp_autocorking) to control this feature and default it to 1 (enabled) Add a new SNMP counter : nstat -a | grep TcpExtTCPAutoCorking This counter is incremented every time we detected skb was under used and its flush was deferred. Tested: Interesting effects when using line buffered commands under ssh. Excellent performance results in term of cpu usage and total throughput. lpq83:~# echo 1 >/proc/sys/net/ipv4/tcp_autocorking lpq83:~# perf stat ./super_netperf 4 -t TCP_STREAM -H lpq84 -- -m 128 9410.39 Performance counter stats for './super_netperf 4 -t TCP_STREAM -H lpq84 -- -m 128': 35209.439626 task-clock # 2.901 CPUs utilized 2,294 context-switches # 0.065 K/sec 101 CPU-migrations # 0.003 K/sec 4,079 page-faults # 0.116 K/sec 97,923,241,298 cycles # 2.781 GHz [83.31%] 51,832,908,236 stalled-cycles-frontend # 52.93% frontend cycles idle [83.30%] 25,697,986,603 stalled-cycles-backend # 26.24% backend cycles idle [66.70%] 102,225,978,536 instructions # 1.04 insns per cycle # 0.51 stalled cycles per insn [83.38%] 18,657,696,819 branches # 529.906 M/sec [83.29%] 91,679,646 branch-misses # 0.49% of all branches [83.40%] 12.136204899 seconds time elapsed lpq83:~# echo 0 >/proc/sys/net/ipv4/tcp_autocorking lpq83:~# perf stat ./super_netperf 4 -t TCP_STREAM -H lpq84 -- -m 128 6624.89 Performance counter stats for './super_netperf 4 -t TCP_STREAM -H lpq84 -- -m 128': 40045.864494 task-clock # 3.301 CPUs utilized 171 context-switches # 0.004 K/sec 53 CPU-migrations # 0.001 K/sec 4,080 page-faults # 0.102 K/sec 111,340,458,645 cycles # 2.780 GHz [83.34%] 61,778,039,277 stalled-cycles-frontend # 55.49% frontend cycles idle [83.31%] 29,295,522,759 stalled-cycles-backend # 26.31% backend cycles idle [66.67%] 108,654,349,355 instructions # 0.98 insns per cycle # 0.57 stalled cycles per insn [83.34%] 19,552,170,748 branches # 488.244 M/sec [83.34%] 157,875,417 branch-misses # 0.81% of all branches [83.34%] 12.130267788 seconds time elapsed Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jeff Kirsher 提交于
Several files refer to an old address for the Free Software Foundation in the file header comment. Resolve by replacing the address with the URL <http://www.gnu.org/licenses/> so that we do not have to keep updating the header comments anytime the address changes. CC: netfilter@vger.kernel.org CC: Pablo Neira Ayuso <pablo@netfilter.org> CC: Patrick McHardy <kaber@trash.net> CC: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jeff Kirsher 提交于
Several files refer to an old address for the Free Software Foundation in the file header comment. Resolve by replacing the address with the URL <http://www.gnu.org/licenses/> so that we do not have to keep updating the header comments anytime the address changes. Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jeff Kirsher 提交于
Several files refer to an old address for the Free Software Foundation in the file header comment. Resolve by replacing the address with the URL <http://www.gnu.org/licenses/> so that we do not have to keep updating the header comments anytime the address changes. CC: Vlad Yasevich <vyasevich@gmail.com> CC: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 12月, 2013 2 次提交
-
-
由 David S. Miller 提交于
This reverts commit 018c5bba. It causes regressions for people using chips driven by the sungem driver. Suspicion is that the skb->csum value isn't being adjusted properly. The change also has a bug in that if __pskb_trim() fails, we'll leave a corruped skb->csum value in there. We would really need to revert it to it's original value in that case. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Rafael J. Wysocki 提交于
Modify tg3_chip_reset() and tg3_close() to check if the PCI network adapter device is accessible at all in order to skip poking it or trying to handle a carrier loss in vain when that's not the case. Introduce a special PCI helper function pci_device_is_present() for this purpose. Of course, this uncovers the lack of the appropriate RTNL locking in tg3_suspend() and tg3_resume(), so add that locking in there too. These changes prevent tg3 from burning a CPU at 100% load level for solid several seconds after the Thunderbolt link is disconnected from a Matrox DS1 docking station. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: NMichael Chan <mchan@broadcom.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 12月, 2013 1 次提交
-
-
由 Arvid Brodin 提交于
This implements the rtnl_link_ops fill_info routine for HSR. Signed-off-by: NArvid Brodin <arvid.brodin@alten.se> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 11月, 2013 3 次提交
-
-
由 Xufeng Zhang 提交于
Currently retransmitted DATA chunks could also be used for RTT measurements since there are no flag to identify whether the transmitted DATA chunk is a new one or a retransmitted one. This problem is introduced by commit ae19c548 ("sctp: remove 'resent' bit from the chunk") which inappropriately removed the 'resent' bit completely, instead of doing this, we should set the resent bit only for the retransmitted DATA chunks. Signed-off-by: NXufeng Zhang <xufeng.zhang@windriver.com> Acked-by: NVlad Yasevich <vyasevich@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Johannes Berg 提交于
The pmcraid driver is abusing the genetlink API and is using its family ID as the multicast group ID, which is invalid and may belong to somebody else (and likely will.) Make it use the correct API, but since this may already be used as-is by userspace, reserve a family ID for this code and also reserve that group ID to not break userspace assumptions. My previous patch broke event delivery in the driver as I missed that it wasn't using the right API and forgot to update it later in my series. While changing this, I noticed that the genetlink code could use the static group ID instead of a strcmp(), so also do that for the VFS_DQUOT family. Cc: Anil Ravindranath <anil_ravindranath@pmc-sierra.com> Cc: "James E.J. Bottomley" <JBottomley@parallels.com> Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Nicolas Dichtel 提交于
The first netlink attribute (value 0) must always be defined as none/unspec. This is correctly done in inet_diag.h, but other diag interfaces are wrong. Because we cannot change an existing API, I add a comment to point the mistake and avoid to propagate it in a new diag API in the future. CC: Thomas Graf <tgraf@suug.ch> Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: NThomas Graf <tgraf@suug.ch> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 11月, 2013 2 次提交
-
-
由 Steven Rostedt (Red Hat) 提交于
If an TRACE_EVENT() uses __assign_str() or __get_str on a NULL pointer then the following oops will happen: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<c127a17b>] strlen+0x10/0x1a *pde = 00000000 ^M Oops: 0000 [#1] PREEMPT SMP Modules linked in: CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.13.0-rc1-test+ #2 Hardware name: /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006^M task: f5cde9f0 ti: f5e5e000 task.ti: f5e5e000 EIP: 0060:[<c127a17b>] EFLAGS: 00210046 CPU: 1 EIP is at strlen+0x10/0x1a EAX: 00000000 EBX: c2472da8 ECX: ffffffff EDX: c2472da8 ESI: c1c5e5fc EDI: 00000000 EBP: f5e5fe84 ESP: f5e5fe80 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 CR0: 8005003b CR2: 00000000 CR3: 01f32000 CR4: 000007d0 Stack: f5f18b90 f5e5feb8 c10687a8 0759004f 00000005 00000005 00000005 00200046 00000002 00000000 c1082a93 f56c7e28 c2472da8 c1082a93 f5e5fee4 c106bc61^M 00000000 c1082a93 00000000 00000000 00000001 00200046 00200082 00000000 Call Trace: [<c10687a8>] ftrace_raw_event_lock+0x39/0xc0 [<c1082a93>] ? ktime_get+0x29/0x69 [<c1082a93>] ? ktime_get+0x29/0x69 [<c106bc61>] lock_release+0x57/0x1a5 [<c1082a93>] ? ktime_get+0x29/0x69 [<c10824dd>] read_seqcount_begin.constprop.7+0x4d/0x75 [<c1082a93>] ? ktime_get+0x29/0x69^M [<c1082a93>] ktime_get+0x29/0x69 [<c108a46a>] __tick_nohz_idle_enter+0x1e/0x426 [<c10690e8>] ? lock_release_holdtime.part.19+0x48/0x4d [<c10bc184>] ? time_hardirqs_off+0xe/0x28 [<c1068c82>] ? trace_hardirqs_off_caller+0x3f/0xaf [<c108a8cb>] tick_nohz_idle_enter+0x59/0x62 [<c1079242>] cpu_startup_entry+0x64/0x192 [<c102299c>] start_secondary+0x277/0x27c Code: 90 89 c6 89 d0 88 c4 ac 38 e0 74 09 84 c0 75 f7 be 01 00 00 00 89 f0 48 5e 5d c3 55 89 e5 57 66 66 66 66 90 83 c9 ff 89 c7 31 c0 <f2> ae f7 d1 8d 41 ff 5f 5d c3 55 89 e5 57 66 66 66 66 90 31 ff EIP: [<c127a17b>] strlen+0x10/0x1a SS:ESP 0068:f5e5fe80 CR2: 0000000000000000 ---[ end trace 01bc47bf519ec1b2 ]--- New tracepoints have been added that have allowed for NULL pointers being assigned to strings. To fix this, change the TRACE_EVENT() code to check for NULL and if it is, it will assign "(null)" to it instead (similar to what glibc printf does). Reported-by: NShuah Khan <shuah.kh@samsung.com> Reported-by: NJovi Zhangwei <jovi.zhangwei@gmail.com> Link: http://lkml.kernel.org/r/CAGdX0WFeEuy+DtpsJzyzn0343qEEjLX97+o1VREFkUEhndC+5Q@mail.gmail.com Link: http://lkml.kernel.org/r/528D6972.9010702@samsung.com Fixes: 9cbf1176 ("tracing/events: provide string with undefined size support") Cc: stable@vger.kernel.org # 2.6.31+ Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Thierry Reding 提交于
In order to support increased build test coverage for drivers, implement dummies for the powergate implementation. This will allow the drivers to be built without requiring support for Tegra to be selected. This patch solves the following build errors, which can be triggered in v3.13-rc1 by selecting DRM_TEGRA without ARCH_TEGRA: drivers/built-in.o: In function `gr3d_remove': drivers/gpu/drm/tegra/gr3d.c:321: undefined reference to `tegra_powergate_power_off' drivers/gpu/drm/tegra/gr3d.c:325: undefined reference to `tegra_powergate_power_off' drivers/built-in.o: In function `gr3d_probe': drivers/gpu/drm/tegra/gr3d.c:266: undefined reference to `tegra_powergate_sequence_power_up' drivers/gpu/drm/tegra/gr3d.c:273: undefined reference to `tegra_powergate_sequence_power_up' Signed-off-by: NThierry Reding <treding@nvidia.com> [swarren, updated commit description] Signed-off-by: NStephen Warren <swarren@nvidia.com> Signed-off-by: NOlof Johansson <olof@lixom.net>
-
- 25 11月, 2013 2 次提交
-
-
由 Alexandre Courbot 提交于
GPIO mapping properties were defined using the GPIOF_* flags, which are declared in linux/gpio.h. This file is not included when using the GPIO descriptor interface. This patch declares the flags that can be used as GPIO mappings properties in linux/gpio/driver.h, and uses them in gpiolib, so that no deprecated declarations are used by the GPIO descriptor interface. This patch also allows GPIO_OPEN_DRAIN and GPIO_OPEN_SOURCE to be specified as GPIO mapping properties. Signed-off-by: NAlexandre Courbot <acourbot@nvidia.com> Signed-off-by: NLinus Walleij <linus.walleij@linaro.org>
-
由 Randy Dunlap 提交于
Fix kernel-doc warning for duplicate definition of 'kmalloc': Documentation/DocBook/kernel-api.xml:9483: element refentry: validity error : ID API-kmalloc already defined <refentry id="API-kmalloc"> Also combine the kernel-doc info from the 2 kmalloc definitions into one block and remove the "see kcalloc" comment since kmalloc now contains the @flags info. Signed-off-by: NRandy Dunlap <rdunlap@infradead.org> Acked-by: NChristoph Lameter <cl@linux.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 24 11月, 2013 1 次提交
-
-
由 Hannes Frederic Sowa 提交于
Commit bceaa902 ("inet: prevent leakage of uninitialized memory to user in recv syscalls") conditionally updated addr_len if the msg_name is written to. The recv_error and rxpmtu functions relied on the recvmsg functions to set up addr_len before. As this does not happen any more we have to pass addr_len to those functions as well and set it to the size of the corresponding sockaddr length. This broke traceroute and such. Fixes: bceaa902 ("inet: prevent leakage of uninitialized memory to user in recv syscalls") Reported-by: NBrad Spengler <spender@grsecurity.net> Reported-by: Tom Labanowski Cc: mpb <mpb.mail@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 11月, 2013 5 次提交
-
-
由 Kirill A. Shutemov 提交于
I don't know what went wrong, mis-merge or something, but ->pmd_huge_pte placed in wrong union within struct page. In original patch[1] it's placed to union with ->lru and ->slab, but in commit e009bb30 ("mm: implement split page table lock for PMD level") it's in union with ->index and ->freelist. That union seems also unused for pages with table tables and safe to re-use, but it's not what I've tested. Let's move it to original place. It fixes indentation at least. :) [1] https://lkml.org/lkml/2013/10/7/288Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Andrea Arcangeli 提交于
Commit 7cb2ef56 ("mm: fix aio performance regression for database caused by THP") can cause dereference of a dangling pointer if split_huge_page runs during PageHuge() if there are updates to the tail_page->private field. Also it is repeating compound_head twice for hugetlbfs and it is running compound_head+compound_trans_head for THP when a single one is needed in both cases. The new code within the PageSlab() check doesn't need to verify that the THP page size is never bigger than the smallest hugetlbfs page size, to avoid memory corruption. A longstanding theoretical race condition was found while fixing the above (see the change right after the skip_unlock label, that is relevant for the compound_lock path too). By re-establishing the _mapcount tail refcounting for all compound pages, this also fixes the below problem: echo 0 >/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages BUG: Bad page state in process bash pfn:59a01 page:ffffea000139b038 count:0 mapcount:10 mapping: (null) index:0x0 page flags: 0x1c00000000008000(tail) Modules linked in: CPU: 6 PID: 2018 Comm: bash Not tainted 3.12.0+ #25 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Call Trace: dump_stack+0x55/0x76 bad_page+0xd5/0x130 free_pages_prepare+0x213/0x280 __free_pages+0x36/0x80 update_and_free_page+0xc1/0xd0 free_pool_huge_page+0xc2/0xe0 set_max_huge_pages.part.58+0x14c/0x220 nr_hugepages_store_common.isra.60+0xd0/0xf0 nr_hugepages_store+0x13/0x20 kobj_attr_store+0xf/0x20 sysfs_write_file+0x189/0x1e0 vfs_write+0xc5/0x1f0 SyS_write+0x55/0xb0 system_call_fastpath+0x16/0x1b Signed-off-by: NKhalid Aziz <khalid.aziz@oracle.com> Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com> Tested-by: NKhalid Aziz <khalid.aziz@oracle.com> Cc: Pravin Shelar <pshelar@nicira.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ben Hutchings <bhutchings@solarflare.com> Cc: Christoph Lameter <cl@linux.com> Cc: Johannes Weiner <jweiner@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dave Hansen 提交于
Right now, the migration code in migrate_page_copy() uses copy_huge_page() for hugetlbfs and thp pages: if (PageHuge(page) || PageTransHuge(page)) copy_huge_page(newpage, page); So, yay for code reuse. But: void copy_huge_page(struct page *dst, struct page *src) { struct hstate *h = page_hstate(src); and a non-hugetlbfs page has no page_hstate(). This works 99% of the time because page_hstate() determines the hstate from the page order alone. Since the page order of a THP page matches the default hugetlbfs page order, it works. But, if you change the default huge page size on the boot command-line (say default_hugepagesz=1G), then we might not even *have* a 2MB hstate so page_hstate() returns null and copy_huge_page() oopses pretty fast since copy_huge_page() dereferences the hstate: void copy_huge_page(struct page *dst, struct page *src) { struct hstate *h = page_hstate(src); if (unlikely(pages_per_huge_page(h) > MAX_ORDER_NR_PAGES)) { ... Mel noticed that the migration code is really the only user of these functions. This moves all the copy code over to migrate.c and makes copy_huge_page() work for THP by checking for it explicitly. I believe the bug was introduced in commit b32967ff ("mm: numa: Add THP migration for the NUMA working set scanning fault case") [akpm@linux-foundation.org: fix coding-style and comment text, per Naoya Horiguchi] Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Acked-by: NMel Gorman <mgorman@suse.de> Reviewed-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Hillf Danton <dhillf@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Tested-by: NDave Jiang <dave.jiang@intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Johannes Berg 提交于
Fix another really stupid bug - I introduced genl_set_err() precisely to be able to adjust the group and reject invalid ones, but then forgot to do so. Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Johannes Berg 提交于
Unfortunately, I introduced a tremendously stupid bug into genlmsg_multicast() when doing all those multicast group changes: it adjusts the group number, but then passes it to genlmsg_multicast_netns() which does that again. Somehow, my tests failed to catch this, so add a warning into genlmsg_multicast_netns() and remove the offending group ID adjustment. Also add a warning to the similar code in other functions so people who misuse them are more loudly warned. Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 11月, 2013 8 次提交
-
-
由 Bob Moore 提交于
Version 20131115. Signed-off-by: NBob Moore <robert.moore@intel.com> Signed-off-by: NLv Zheng <lv.zheng@intel.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Lv Zheng 提交于
This patch adds protection around ACPI_CHECKSUM_ABORT so that ACPI user space test utilities can re-define it for their own purposes (currently used by ASLTS build environment). This patch doesn't affect Linux kernel behavior. Lv Zheng. Signed-off-by: NLv Zheng <lv.zheng@intel.com> Signed-off-by: NBob Moore <robert.moore@intel.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Madalin Bucur 提交于
Add auto-MDI/MDI-X capability for forced (autonegotiation disabled) 10/100 Mbps speeds on Vitesse VSC82x4 PHYs. Exported previously static function genphy_setup_forced() required by the new config_aneg handler in the Vitesse PHY module. Signed-off-by: NMadalin Bucur <madalin.bucur@freescale.com> Signed-off-by: NShruti Kanetkar <Shruti@freescale.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Hannes Frederic Sowa 提交于
This patch now always passes msg->msg_namelen as 0. recvmsg handlers must set msg_namelen to the proper size <= sizeof(struct sockaddr_storage) to return msg_name to the user. This prevents numerous uninitialized memory leaks we had in the recvmsg handlers and makes it harder for new code to accidentally leak uninitialized memory. Optimize for the case recvfrom is called with NULL as address. We don't need to copy the address at all, so set it to NULL before invoking the recvmsg handler. We can do so, because all the recvmsg handlers must cope with the case a plain read() is called on them. read() also sets msg_name to NULL. Also document these changes in include/linux/net.h as suggested by David Miller. Changes since RFC: Set msg->msg_name = NULL if user specified a NULL in msg_name but had a non-null msg_namelen in verify_iovec/verify_compat_iovec. This doesn't affect sendto as it would bail out earlier while trying to copy-in the address. It also more naturally reflects the logic by the callers of verify_iovec. With this change in place I could remove " if (!uaddr || msg_sys->msg_namelen == 0) msg->msg_name = NULL ". This change does not alter the user visible error logic as we ignore msg_namelen as long as msg_name is NULL. Also remove two unnecessary curly brackets in ___sys_recvmsg and change comments to netdev style. Cc: David Miller <davem@davemloft.net> Suggested-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Steven Rostedt 提交于
Doing an if statement to test some condition to know if we should trigger a tracepoint is pointless when tracing is disabled. This just adds overhead and wastes a branch prediction. This is why the TRACE_EVENT_CONDITION() was created. It places the check inside the jump label so that the branch does not happen unless tracing is enabled. That is, instead of doing: if (em) trace_btrfs_get_extent(root, em); Which is basically this: if (em) if (static_key(trace_btrfs_get_extent)) { Using a TRACE_EVENT_CONDITION() we can just do: trace_btrfs_get_extent(root, em); And the condition trace event will do: if (static_key(trace_btrfs_get_extent)) { if (em) { ... The static key is a non conditional jump (or nop) that is faster than having to check if em is NULL or not. Signed-off-by: NSteven Rostedt <rostedt@goodmis.org> Signed-off-by: NJosef Bacik <jbacik@fusionio.com> Signed-off-by: NChris Mason <chris.mason@fusionio.com>
-
由 Linus Torvalds 提交于
This reverts commit ea1e7ed3. Al points out that while the commit *does* actually create a separate slab for the page->ptl allocation, that slab is never actually used, and the code continues to use kmalloc/kfree. Damien Wyart points out that the original patch did have the conversion to use kmem_cache_alloc/free, so it got lost somewhere on its way to me. Revert the half-arsed attempt that didn't do anything. If we really do want the special slab (remember: this is all relevant just for debug builds, so it's not necessarily all that critical) we might as well redo the patch fully. Reported-by: NAl Viro <viro@zeniv.linux.org.uk> Acked-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Kirill A Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Hannes Reinecke 提交于
The supported ALUA states might be different for individual devices, so store it in a separate field. (nab: Remove unnecessary line continuation) Signed-off-by: NHannes Reinecke <hare@suse.de> Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>
-
由 Hannes Reinecke 提交于
Signed-off-by: NHannes Reinecke <hare@suse.de> Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>
-
- 20 11月, 2013 12 次提交
-
-
由 Toshi Kani 提交于
The PCI host bridge scan handler installs its own notify handler, handle_hotplug_event_root(), by itself. Nevertheless, the ACPI hotplug framework also installs the common notify handler, acpi_hotplug_notify_cb(), for PCI root bridges. This causes acpi_hotplug_notify_cb() to call _OST method with unsupported error as hotplug.enabled is not set. To address this issue, introduce hotplug.ignore flag, which indicates that the scan handler installs its own notify handler by itself. The ACPI hotplug framework does not install the common notify handler when this flag is set. Signed-off-by: NToshi Kani <toshi.kani@hp.com> [rjw: Changed the name of the new flag] Cc: 3.9+ <stable@vger.kernel.org> # 3.9+ Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Thomas Hellstrom 提交于
Addresses "[BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE". In the first occurence it was used to try to be nice while releasing the mmap_sem and retrying the fault to work around a locking inversion. The second occurence was never used. There has been some discussion whether we should change the locking order to mmap_sem -> bo_reserve. This patch doesn't address that issue, and leaves that locking order undefined. The solution that we release the mmap_sem if tryreserve fails and wait for the buffer to become unreserved is something we want in any case, and follows how the core vm system waits for pages to be come unlocked while releasing the mmap_sem. The code also outlines what needs to be changed if we want to establish the locking order as mmap_sem -> bo::reserve. One slight issue that remains with this code is that the fault handler might be prone to starvation if another thread countinously reserves the buffer. IMO that usage pattern is highly unlikely. Signed-off-by: NThomas Hellstrom <thellstrom@vmware.com>
-
由 Johannes Berg 提交于
Register generic netlink multicast groups as an array with the family and give them contiguous group IDs. Then instead of passing the global group ID to the various functions that send messages, pass the ID relative to the family - for most families that's just 0 because the only have one group. This avoids the list_head and ID in each group, adding a new field for the mcast group ID offset to the family. At the same time, this allows us to prevent abusing groups again like the quota and dropmon code did, since we can now check that a family only uses a group it owns. Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Johannes Berg 提交于
This doesn't really change anything, but prepares for the next patch that will change the APIs to pass the group ID within the family, rather than the global group ID. Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Johannes Berg 提交于
Add a static inline to generic netlink to wrap netlink_set_err() to make it easier to use here - use it in openvswitch (the only generic netlink user of netlink_set_err()). Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Johannes Berg 提交于
There's no reason to have the family pointer there since it can just be passed internally where needed, so remove it. Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Johannes Berg 提交于
There are no users of this API remaining, and we'll soon change group registration to be static (like ops are now) Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Johannes Berg 提交于
The quota code is abusing the genetlink API and is using its family ID as the multicast group ID, which is invalid and may belong to somebody else (and likely will.) Make the quota code use the correct API, but since this is already used as-is by userspace, reserve a family ID for this code and also reserve that group ID to not break userspace assumptions. Acked-by: NJan Kara <jack@suse.cz> Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Johannes Berg 提交于
As suggested by David Miller, make genl_register_family_with_ops() a macro and pass only the array, evaluating ARRAY_SIZE() in the macro, this is a little safer. The openvswitch has some indirection, assing ops/n_ops directly in that code. This might ultimately just assign the pointers in the family initializations, saving the struct genl_family_and_ops and code (once mcast groups are handled differently.) Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ben Hutchings 提交于
SIOCSHWTSTAMP returns the real configuration to the application using it, but there is currently no way for any other application to find out the configuration non-destructively. Add a new ioctl for this, making it unprivileged. Signed-off-by: NBen Hutchings <bhutchings@solarflare.com>
-
由 Geert Uytterhoeven 提交于
commit bedd30d9 ("genirq: make irqreturn_t an enum") blindly replaced "0" by "IRQ_NONE" in the "IRQ_RETVAL(x)" macro definition. However, as "x" is a condition, "0" meant "boolean false", not an irqreturn_t value. All of this worked, and kept working after the addition of IRQ_WAKE_THREAD, as - both "boolean false" and "IRQ_NONE" are "0" (for the comparison), - "boolean true" and "boolean false" nicely map to the correct values of "IRQ_HANDLED" and "IRQ_NONE" (for the return value). Correct the macro definition for clarity and future-proofness. Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Jens Axboe 提交于
If disk stats are enabled on the queue, a request needs to be marked with REQ_IO_STAT for accounting to be active on that request. This fixes an issue with virtio-blk not showing up in /proc/diskstats after the conversion to blk-mq. Add QUEUE_FLAG_MQ_DEFAULT, setting stats and same cpu-group completion on by default. Reported-by: NDave Chinner <david@fromorbit.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-