- 06 8月, 2018 2 次提交
-
-
由 Peter Oskolkov 提交于
Tested: see the next patch is the series. Suggested-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NPeter Oskolkov <posk@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Florian Westphal <fw@strlen.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Yangbo Lu 提交于
This patch is to support automatic configuration for ptp timer. If required ptp dts properties are not provided, driver could try to calculate a set of default configurations to initialize the ptp timer. This makes the driver work for many boards which don't have the required ptp dts properties in current kernel. Also the users could set dts properties by themselves according to their requirement. Signed-off-by: NYangbo Lu <yangbo.lu@nxp.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 8月, 2018 3 次提交
-
-
由 Máté Eckl 提交于
Netfilter exposes standard hook priorities in case of ipv4, ipv6 and arp but not in case of bridge. This patch exposes the hook priority values of the bridge family (which are different from the formerly mentioned) via uapi so that they can be used by user-space applications just like the others. Signed-off-by: NMáté Eckl <ecklm94@gmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
The first client of the nf_osf.h userspace header is nft_osf, coming in this batch, rename it to nfnetlink_osf.h as there are no userspace clients for this yet, hence this looks consistent with other nfnetlink subsystem. Suggested-by: NJan Engelhardt <jengelh@inai.de> Signed-off-by: NFernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
All warnings (new ones prefixed by >>): >> ./usr/include/linux/netfilter/nf_osf.h:73: userspace cannot reference function or variable defined in the kernel Fixes: f9324952 ("netfilter: nfnetlink_osf: extract nfnetlink_subsystem code from xt_osf.c") Reported-by: Nkbuild test robot <lkp@intel.com> Signed-off-by: NFernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
- 02 8月, 2018 5 次提交
-
-
由 Linus Torvalds 提交于
Commit 2c4541e2 ("mm: use vma_init() to initialize VMAs on stack and data segments") tried to initialize various left-over ad-hoc vma's "properly", but actually made things worse for the temporary vma's used for TLB flushing. vma_init() doesn't actually initialize all of the vma, just a few fields, so doing something like - struct vm_area_struct vma = { .vm_mm = tlb->mm, }; + struct vm_area_struct vma; + + vma_init(&vma, tlb->mm); was actually very bad: instead of having a nicely initialized vma with every field but "vm_mm" zeroed, you'd have an entirely uninitialized vma with only a couple of fields initialized. And they weren't even fields that the code in question mostly cared about. The flush_tlb_range() function takes a "struct vma" rather than a "struct mm_struct", because a few architectures actually care about what kind of range it is - being able to only do an ITLB flush if it's a range that doesn't have data accesses enabled, for example. And all the normal users already have the vma for doing the range invalidation. But a few people want to call flush_tlb_range() with a range they just made up, so they also end up using a made-up vma. x86 just has a special "flush_tlb_mm_range()" function for this, but other architectures (arm and ia64) do the "use fake vma" thing instead, and thus got caught up in the vma_init() changes. At the same time, the TLB flushing code really doesn't care about most other fields in the vma, so vma_init() is just unnecessary and pointless. This fixes things by having an explicit "this is just an initializer for the TLB flush" initializer macro, which is used by the arm/arm64/ia64 people who mis-use this interface with just a dummy vma. Fixes: 2c4541e2 ("mm: use vma_init() to initialize VMAs on stack and data segments") Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: John Stultz <john.stultz@linaro.org> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Wei Wang 提交于
Introduce a new TCP stats to record the number of reordering events seen and expose it in both tcp_info (TCP_INFO) and opt_stats (SOF_TIMESTAMPING_OPT_STATS). Application can use this stats to track the frequency of the reordering events in addition to the existing reordering stats which tracks the magnitude of the latest reordering event. Note: this new stats tracks reordering events triggered by ACKs, which could often be fewer than the actual number of packets being delivered out-of-order. Signed-off-by: NWei Wang <weiwan@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Acked-by: NSoheil Hassas Yeganeh <soheil@google.com> Acked-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Wei Wang 提交于
Introduce a new TCP stat to record the number of DSACK blocks received (RFC4989 tcpEStatsStackDSACKDups) and expose it in both tcp_info (TCP_INFO) and opt_stats (SOF_TIMESTAMPING_OPT_STATS). Signed-off-by: NWei Wang <weiwan@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Acked-by: NSoheil Hassas Yeganeh <soheil@google.com> Acked-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Wei Wang 提交于
Introduce a new TCP stat to record the number of bytes retransmitted (RFC4898 tcpEStatsPerfOctetsRetrans) and expose it in both tcp_info (TCP_INFO) and opt_stats (SOF_TIMESTAMPING_OPT_STATS). Signed-off-by: NWei Wang <weiwan@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Acked-by: NSoheil Hassas Yeganeh <soheil@google.com> Acked-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Wei Wang 提交于
Introduce a new TCP stat to record the number of bytes sent (RFC4898 tcpEStatsPerfHCDataOctetsOut) and expose it in both tcp_info (TCP_INFO) and opt_stats (SOF_TIMESTAMPING_OPT_STATS). Signed-off-by: NWei Wang <weiwan@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Acked-by: NSoheil Hassas Yeganeh <soheil@google.com> Acked-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 8月, 2018 4 次提交
-
-
由 Christoph Hellwig 提交于
We never use RCU protection for it, just a lot of cargo-cult rcu_deference_protects calls. Note that we do keep the kfree_rcu call for it, as the references through struct sock are RCU protected and thus might require a grace period before freeing. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NEric Dumazet <edumazet@google.com> Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jakub Kicinski 提交于
Currently drivers have to check if they already have a umem installed for a given queue and return an error if so. Make better use of XDP_QUERY_XSK_UMEM and move this functionality to the core. We need to keep rtnl across the calls now. Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: NQuentin Monnet <quentin.monnet@netronome.com> Acked-by: NBjörn Töpel <bjorn.topel@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jakub Kicinski 提交于
We used to depend on real_num_rx_queues as a upper bound for sanity checks. For AF_XDP socket validation it's useful if the check behaves the same regardless of CONFIG_SYSFS setting. Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: NQuentin Monnet <quentin.monnet@netronome.com> Acked-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Hari Vyas 提交于
When a PCI device is detected, pdev->is_added is set to 1 and proc and sysfs entries are created. When the device is removed, pdev->is_added is checked for one and then device is detached with clearing of proc and sys entries and at end, pdev->is_added is set to 0. is_added and is_busmaster are bit fields in pci_dev structure sharing same memory location. A strange issue was observed with multiple removal and rescan of a PCIe NVMe device using sysfs commands where is_added flag was observed as zero instead of one while removing device and proc,sys entries are not cleared. This causes issue in later device addition with warning message "proc_dir_entry" already registered. Debugging revealed a race condition between the PCI core setting the is_added bit in pci_bus_add_device() and the NVMe driver reset work-queue setting the is_busmaster bit in pci_set_master(). As these fields are not handled atomically, that clears the is_added bit. Move the is_added bit to a separate private flag variable and use atomic functions to set and retrieve the device addition state. This avoids the race because is_added no longer shares a memory location with is_busmaster. Link: https://bugzilla.kernel.org/show_bug.cgi?id=200283Signed-off-by: NHari Vyas <hari.vyas@broadcom.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NLukas Wunner <lukas@wunner.de> Acked-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 30 7月, 2018 3 次提交
-
-
由 Pablo Neira Ayuso 提交于
This new function returns the OS genre as a string. Plan is to use to from the new nft_osf extension. Note that this doesn't yet support ttl options, but it could be easily extended to do so. Tested-by: NFernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Stephen Hemminger 提交于
If an invalid MTU value is set through rtnetlink return extra error information instead of putting message in kernel log. For other cases where there is no visible API, keep the error report in the log. Example: # ip li set dev enp12s0 mtu 10000 Error: mtu greater than device maximum. # ifconfig enp12s0 mtu 10000 SIOCSIFMTU: Invalid argument # dmesg | tail -1 [ 2047.795467] enp12s0: mtu greater than device maximum Signed-off-by: NStephen Hemminger <stephen@networkplumber.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Xin Long 提交于
This patch implements the feature described in rfc1812#section-5.3.5.2 and rfc2644. It allows the router to forward directed broadcast when sysctl bc_forwarding is enabled. Note that this feature could be done by iptables -j TEE, but it would cause some problems: - target TEE's gateway param has to be set with a specific address, and it's not flexible especially when the route wants forward all directed broadcasts. - this duplicates the directed broadcasts so this may cause side effects to applications. Besides, to keep consistent with other os router like BSD, it's also necessary to implement it in the route rx path. Note that route cache needs to be flushed when bc_forwarding is changed. Signed-off-by: NXin Long <lucien.xin@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 7月, 2018 2 次提交
-
-
由 Saeed Mahameed 提交于
Move vxlan logic and objects to mlx5 core dirver. Since it going to be used from different mlx5 interfaces. e.g. mlx5e PF NIC netdev and mlx5e E-Switch representors. Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com> Reviewed-by: NOr Gerlitz <ogerlitz@mellanox.com>
-
由 Gal Pressman 提交于
The NIC has a limited number of offloaded VXLAN UDP ports (usually 4). Instead of letting the firmware fail when trying to add more ports than it can handle, let the driver check it on its own. Signed-off-by: NGal Pressman <galp@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
-
- 27 7月, 2018 5 次提交
-
-
由 Zhu Yi 提交于
The existing SocketCAN implementation provides alloc_candev() to allocate a CAN device using a single Tx and Rx queue. This can lead to priority inversion in case the single Tx queue is already full with low priority messages and a high priority message needs to be sent while the bus is fully loaded with medium priority messages. This problem can be solved by using the existing multi-queue support of the network subsytem. The commit makes it possible to use multi-queue in the CAN subsystem in the same way it is used in the Ethernet subsystem by adding an alloc_candev_mqs() call and accompanying macros. With this support a CAN device can use multi-queue qdisc (e.g. mqprio) to avoid the aforementioned priority inversion. The exisiting functionality of alloc_candev() is the same as before. CAN devices need to have prioritized multiple hardware queues or are able to abort waiting for arbitration to make sensible use of multi-queues. Signed-off-by: NZhu Yi <yi.zhu5@cn.bosch.com> Signed-off-by: NMark Jonas <mark.jonas@de.bosch.com> Reviewed-by: NHeiko Schocher <hs@denx.de> Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>
-
由 Arnd Bergmann 提交于
The new gasket staging driver ran into a randconfig build failure when CONFIG_EVENTFD is disabled: In file included from drivers/staging/gasket/gasket_interrupt.h:11, from drivers/staging/gasket/gasket_interrupt.c:4: include/linux/eventfd.h: In function 'eventfd_ctx_fdget': include/linux/eventfd.h:51:9: error: implicit declaration of function 'ERR_PTR' [-Werror=implicit-function-declaration] I can't see anything wrong with including eventfd.h before err.h, so the easiest fix is to make it possible to do this by including the file where it is needed. Link: http://lkml.kernel.org/r/20180724110737.3985088-1-arnd@arndb.de Fixes: 9a69f508 ("drivers/staging: Gasket driver framework + Apex driver") Signed-off-by: NArnd Bergmann <arnd@arndb.de> Cc: Eric Biggers <ebiggers@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Kirill A. Shutemov 提交于
vma_is_anonymous() relies on ->vm_ops being NULL to detect anonymous VMA. This is unreliable as ->mmap may not set ->vm_ops. False-positive vma_is_anonymous() may lead to crashes: next ffff8801ce5e7040 prev ffff8801d20eca50 mm ffff88019c1e13c0 prot 27 anon_vma ffff88019680cdd8 vm_ops 0000000000000000 pgoff 0 file ffff8801b2ec2d00 private_data 0000000000000000 flags: 0xff(read|write|exec|shared|mayread|maywrite|mayexec|mayshare) ------------[ cut here ]------------ kernel BUG at mm/memory.c:1422! invalid opcode: 0000 [#1] SMP KASAN CPU: 0 PID: 18486 Comm: syz-executor3 Not tainted 4.18.0-rc3+ #136 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:zap_pmd_range mm/memory.c:1421 [inline] RIP: 0010:zap_pud_range mm/memory.c:1466 [inline] RIP: 0010:zap_p4d_range mm/memory.c:1487 [inline] RIP: 0010:unmap_page_range+0x1c18/0x2220 mm/memory.c:1508 Call Trace: unmap_single_vma+0x1a0/0x310 mm/memory.c:1553 zap_page_range_single+0x3cc/0x580 mm/memory.c:1644 unmap_mapping_range_vma mm/memory.c:2792 [inline] unmap_mapping_range_tree mm/memory.c:2813 [inline] unmap_mapping_pages+0x3a7/0x5b0 mm/memory.c:2845 unmap_mapping_range+0x48/0x60 mm/memory.c:2880 truncate_pagecache+0x54/0x90 mm/truncate.c:800 truncate_setsize+0x70/0xb0 mm/truncate.c:826 simple_setattr+0xe9/0x110 fs/libfs.c:409 notify_change+0xf13/0x10f0 fs/attr.c:335 do_truncate+0x1ac/0x2b0 fs/open.c:63 do_sys_ftruncate+0x492/0x560 fs/open.c:205 __do_sys_ftruncate fs/open.c:215 [inline] __se_sys_ftruncate fs/open.c:213 [inline] __x64_sys_ftruncate+0x59/0x80 fs/open.c:213 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe Reproducer: #include <stdio.h> #include <stddef.h> #include <stdint.h> #include <stdlib.h> #include <string.h> #include <sys/types.h> #include <sys/stat.h> #include <sys/ioctl.h> #include <sys/mman.h> #include <unistd.h> #include <fcntl.h> #define KCOV_INIT_TRACE _IOR('c', 1, unsigned long) #define KCOV_ENABLE _IO('c', 100) #define KCOV_DISABLE _IO('c', 101) #define COVER_SIZE (1024<<10) #define KCOV_TRACE_PC 0 #define KCOV_TRACE_CMP 1 int main(int argc, char **argv) { int fd; unsigned long *cover; system("mount -t debugfs none /sys/kernel/debug"); fd = open("/sys/kernel/debug/kcov", O_RDWR); ioctl(fd, KCOV_INIT_TRACE, COVER_SIZE); cover = mmap(NULL, COVER_SIZE * sizeof(unsigned long), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); munmap(cover, COVER_SIZE * sizeof(unsigned long)); cover = mmap(NULL, COVER_SIZE * sizeof(unsigned long), PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0); memset(cover, 0, COVER_SIZE * sizeof(unsigned long)); ftruncate(fd, 3UL << 20); return 0; } This can be fixed by assigning anonymous VMAs own vm_ops and not relying on it being NULL. If ->mmap() failed to set ->vm_ops, mmap_region() will set it to dummy_vm_ops. This way we will have non-NULL ->vm_ops for all VMAs. Link: http://lkml.kernel.org/r/20180724121139.62570-4-kirill.shutemov@linux.intel.comSigned-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Reported-by: syzbot+3f84280d52be9b7083cc@syzkaller.appspotmail.com Acked-by: NLinus Torvalds <torvalds@linux-foundation.org> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Kirill A. Shutemov 提交于
Not all VMAs allocated with vm_area_alloc(). Some of them allocated on stack or in data segment. The new helper can be use to initialize VMA properly regardless where it was allocated. Link: http://lkml.kernel.org/r/20180724121139.62570-2-kirill.shutemov@linux.intel.comSigned-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: NLinus Torvalds <torvalds@linux-foundation.org> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Tejun Heo 提交于
While forking, if delayacct init fails due to memory shortage, it continues expecting all delayacct users to check task->delays pointer against NULL before dereferencing it, which all of them used to do. Commit c96f5471 ("delayacct: Account blkio completion on the correct task"), while updating delayacct_blkio_end() to take the target task instead of always using %current, made the function test NULL on %current->delays and then continue to operated on @p->delays. If %current succeeded init while @p didn't, it leads to the following crash. BUG: unable to handle kernel NULL pointer dereference at 0000000000000004 IP: __delayacct_blkio_end+0xc/0x40 PGD 8000001fd07e1067 P4D 8000001fd07e1067 PUD 1fcffbb067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 4 PID: 25774 Comm: QIOThread0 Not tainted 4.16.0-9_fbk1_rc2_1180_g6b593215b4d7 #9 RIP: 0010:__delayacct_blkio_end+0xc/0x40 Call Trace: try_to_wake_up+0x2c0/0x600 autoremove_wake_function+0xe/0x30 __wake_up_common+0x74/0x120 wake_up_page_bit+0x9c/0xe0 mpage_end_io+0x27/0x70 blk_update_request+0x78/0x2c0 scsi_end_request+0x2c/0x1e0 scsi_io_completion+0x20b/0x5f0 blk_mq_complete_request+0xa2/0x100 ata_scsi_qc_complete+0x79/0x400 ata_qc_complete_multiple+0x86/0xd0 ahci_handle_port_interrupt+0xc9/0x5c0 ahci_handle_port_intr+0x54/0xb0 ahci_single_level_irq_intr+0x3b/0x60 __handle_irq_event_percpu+0x43/0x190 handle_irq_event_percpu+0x20/0x50 handle_irq_event+0x2a/0x50 handle_edge_irq+0x80/0x1c0 handle_irq+0xaf/0x120 do_IRQ+0x41/0xc0 common_interrupt+0xf/0xf Fix it by updating delayacct_blkio_end() check @p->delays instead. Link: http://lkml.kernel.org/r/20180724175542.GP1934745@devbig577.frc2.facebook.com Fixes: c96f5471 ("delayacct: Account blkio completion on the correct task") Signed-off-by: NTejun Heo <tj@kernel.org> Reported-by: NDave Jones <dsj@fb.com> Debugged-by: NDave Jones <dsj@fb.com> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Josh Snyder <joshs@netflix.com> Cc: <stable@vger.kernel.org> [4.15+] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 26 7月, 2018 1 次提交
-
-
由 Heiner Kallweit 提交于
Add a helper for checking whether polling is used to detect PHY status changes. Signed-off-by: NHeiner Kallweit <hkallweit1@gmail.com> Reviewed-by: NAndrew Lunn <andrew@lunn.ch> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 7月, 2018 5 次提交
-
-
由 Masami Hiramatsu 提交于
Maintain the tracing on/off setting of the ring_buffer when switching to the trace buffer snapshot. Taking a snapshot is done by swapping the backup ring buffer (max_tr_buffer). But since the tracing on/off setting is defined by the ring buffer, when swapping it, the tracing on/off setting can also be changed. This causes a strange result like below: /sys/kernel/debug/tracing # cat tracing_on 1 /sys/kernel/debug/tracing # echo 0 > tracing_on /sys/kernel/debug/tracing # cat tracing_on 0 /sys/kernel/debug/tracing # echo 1 > snapshot /sys/kernel/debug/tracing # cat tracing_on 1 /sys/kernel/debug/tracing # echo 1 > snapshot /sys/kernel/debug/tracing # cat tracing_on 0 We don't touch tracing_on, but snapshot changes tracing_on setting each time. This is an anomaly, because user doesn't know that each "ring_buffer" stores its own tracing-enable state and the snapshot is done by swapping ring buffers. Link: http://lkml.kernel.org/r/153149929558.11274.11730609978254724394.stgit@devbox Cc: Ingo Molnar <mingo@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Cc: Hiraku Toyooka <hiraku.toyooka@cybertrust.co.jp> Cc: stable@vger.kernel.org Fixes: debdd57f ("tracing: Make a snapshot feature available from userspace") Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org> [ Updated commit log and comment in the code ] Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
-
由 Peter Zijlstra 提交于
Vince reported the perf_fuzzer giving various unwinder warnings and Josh reported: > Deja vu. Most of these are related to perf PEBS, similar to the > following issue: > > b8000586 ("perf/x86/intel: Cure bogus unwind from PEBS entries") > > This is basically the ORC version of that. setup_pebs_sample_data() is > assembling a franken-pt_regs which ORC isn't happy about. RIP is > inconsistent with some of the other registers (like RSP and RBP). And where the previous unwinder only needed BP,SP ORC also requires IP. But we cannot spoof IP because then the sample will get displaced, entirely negating the point of PEBS. So cure the whole thing differently by doing the unwind early; this does however require a means to communicate we did the unwind early. We (ab)use an unused sample_type bit for this, which we set on events that fill out the data->callchain before the normal perf_prepare_sample(). Debugged-by: NJosh Poimboeuf <jpoimboe@redhat.com> Reported-by: NVince Weaver <vincent.weaver@maine.edu> Tested-by: NJosh Poimboeuf <jpoimboe@redhat.com> Tested-by: NPrashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Rosin 提交于
Needed for annotating rt_mutex locks. Tested-by: NJohn Sperbeck <jsperbeck@google.com> Signed-off-by: NPeter Rosin <peda@axentia.se> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Deepa Dinamani <deepadinamani@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Chang <dpf@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: Wolfram Sang <wsa@the-dreams.de> Link: http://lkml.kernel.org/r/20180720083914.1950-2-peda@axentia.seSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Keith Busch 提交于
This is preparing for drivers that want to directly alter the state of their requests. No functional change here. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Florian Westphal 提交于
->start() is called once when dump is being initialized, there is no need to store it in netlink_cb. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 24 7月, 2018 2 次提交
-
-
由 Feras Daoud 提交于
The tracer has one event, event 0x26, with two subtypes: - Subtype 0: Ownership change - Subtype 1: Traces available An ownership change occurs in the following cases: 1- Owner releases his ownership, in this case, an event will be sent to inform others to reattempt acquire ownership. 2- Ownership was taken by a higher priority tool, in this case the owner should understand that it lost ownership, and go through tear down flow. The second subtype indicates that there are traces in the trace buffer, in this case, the driver polls the tracer buffer for new traces, parse them and prepares the messages for printing. The HW starts tracing from the first address in the tracer buffer. Driver receives an event notifying that new trace block exists. HW posts a timestamp event at the last 8B of every 256B block. Comparing the timestamp to the last handled timestamp would indicate that this is a new trace block. Once the new timestamp is detected, the entire block is considered valid. Block validation and parsing, should be done after copying the current block to a different location, in order to avoid block overwritten during processing. Signed-off-by: NFeras Daoud <ferasda@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
-
由 Feras Daoud 提交于
Implement FW tracer logic and registers access, initialization and cleanup flows. Initializing the tracer will be part of load one flow, as multiple PFs will try to acquire ownership but only one will succeed and will be the tracer owner. Signed-off-by: NFeras Daoud <ferasda@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
-
- 23 7月, 2018 1 次提交
-
-
由 Al Viro 提交于
kernel_wait4() expects a userland address for status - it's only rusage that goes as a kernel one (and needs a copyout afterwards) [ Also, fix the prototype of kernel_wait4() to have that __user annotation - Linus ] Fixes: 92ebce5a ("osf_wait4: switch to kernel_wait4()") Cc: stable@kernel.org # v4.13+ Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 22 7月, 2018 3 次提交
-
-
由 YueHaibing 提交于
Fix following warning: net/ipv4/bpfilter/sockopt.c:28:5: error: symbol 'bpfilter_ip_set_sockopt' redeclared with different type net/ipv4/bpfilter/sockopt.c:34:5: error: symbol 'bpfilter_ip_get_sockopt' redeclared with different type Signed-off-by: NYueHaibing <yuehaibing@huawei.com> Acked-by: NMartin KaFai Lau <kafai@fb.com> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Linus Torvalds 提交于
Like vm_area_dup(), it initializes the anon_vma_chain head, and the basic mm pointer. The rest of the fields end up being different for different users, although the plan is to also initialize the 'vm_ops' field to a dummy entry. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Linus Torvalds 提交于
The vm_area_struct is one of the most fundamental memory management objects, but the management of it is entirely open-coded evertwhere, ranging from allocation and freeing (using kmem_cache_[z]alloc and kmem_cache_free) to initializing all the fields. We want to unify this in order to end up having some unified initialization of the vmas, and the first step to this is to at least have basic allocation functions. Right now those functions are literally just wrappers around the kmem_cache_*() calls. This is a purely mechanical conversion: # new vma: kmem_cache_zalloc(vm_area_cachep, GFP_KERNEL) -> vm_area_alloc() # copy old vma kmem_cache_alloc(vm_area_cachep, GFP_KERNEL) -> vm_area_dup(old) # free vma kmem_cache_free(vm_area_cachep, vma) -> vm_area_free(vma) to the point where the old vma passed in to the vm_area_dup() function isn't even used yet (because I've left all the old manual initialization alone). Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 21 7月, 2018 3 次提交
-
-
由 Dmitry Torokhov 提交于
Plumb in get_ownership() callback for devices belonging to a class so that they can be created with uid/gid different from global root. This will allow network devices in a container to belong to container's root and not global root. Signed-off-by: NDmitry Torokhov <dmitry.torokhov@gmail.com> Reviewed-by: NTyler Hicks <tyhicks@canonical.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Dmitry Torokhov 提交于
Normally kobjects and their sysfs representation belong to global root, however it is not necessarily the case for objects in separate namespaces. For example, objects in separate network namespace logically belong to the container's root and not global root. This change lays groundwork for allowing network namespace objects ownership to be transferred to container's root user by defining get_ownership() callback in ktype structure and using it in sysfs code to retrieve desired uid/gid when creating sysfs objects for given kobject. Co-Developed-by: NTyler Hicks <tyhicks@canonical.com> Signed-off-by: NDmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: NTyler Hicks <tyhicks@canonical.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Dmitry Torokhov 提交于
This change allows creating kernfs files and directories with arbitrary uid/gid instead of always using GLOBAL_ROOT_UID/GID by extending kernfs_create_dir_ns() and kernfs_create_file_ns() with uid/gid arguments. The "simple" kernfs_create_file() and kernfs_create_dir() are left alone and always create objects belonging to the global root. When creating symlinks ownership (uid/gid) is taken from the target kernfs object. Co-Developed-by: NTyler Hicks <tyhicks@canonical.com> Signed-off-by: NDmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: NTyler Hicks <tyhicks@canonical.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 7月, 2018 1 次提交
-
-
由 Lu Baolu 提交于
This reverts commit ab96746a. The commit ab96746a ("iommu/vt-d: Clean up pasid quirk for pre-production devices") triggers ECS mode on some platforms which have broken ECS support. As the result, graphic device will be inoperable on boot. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107017 Cc: Ashok Raj <ashok.raj@intel.com> Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com> Signed-off-by: NJoerg Roedel <jroedel@suse.de>
-