- 28 4月, 2016 3 次提交
-
-
由 Matt Fleming 提交于
Abolish the poorly named EFI memory map, 'memmap'. It is shadowed by a bunch of local definitions in various files and having two ways to access the EFI memory map ('efi.memmap' vs. 'memmap') is rather confusing. Furthermore, IA64 doesn't even provide this global object, which has caused issues when trying to write generic EFI memmap code. Replace all occurrences with efi.memmap, and convert the remaining iterator code to use for_each_efi_mem_desc(). Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk> Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Luck, Tony <tony.luck@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1461614832-17633-8-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Matt Fleming 提交于
Most of the users of for_each_efi_memory_desc() are equally happy iterating over the EFI memory map in efi.memmap instead of 'memmap', since the former is usually a pointer to the latter. For those users that want to specify an EFI memory map other than efi.memmap, that can be done using for_each_efi_memory_desc_in_map(). One such example is in the libstub code where the firmware is queried directly for the memory map, it gets iterated over, and then freed. This change goes part of the way toward deleting the global 'memmap' variable, which is not universally available on all architectures (notably IA64) and is rather poorly named. Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk> Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Leif Lindholm <leif.lindholm@linaro.org> Cc: Mark Salter <msalter@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1461614832-17633-7-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Ard Biesheuvel 提交于
The EFI_SYSTEM_TABLES status bit is set by all EFI supporting architectures upon discovery of the EFI system table, but the bit is never tested in any code we have in the tree. So remove it. Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk> Cc: Borislav Petkov <bp@alien8.de> Cc: Leif Lindholm <leif.lindholm@linaro.org> Cc: Luck, Tony <tony.luck@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1461614832-17633-2-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 24 4月, 2016 1 次提交
-
-
由 Andre Przywara 提交于
These new syscalls are implemented as generic code, so enable them for architectures like arm64 which use the generic syscall table. Signed-off-by: NAndre Przywara <andre.przywara@arm.com> Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
- 22 4月, 2016 3 次提交
-
-
由 Huacai Chen 提交于
Signed-off-by: NHuacai Chen <chenhc@lemote.com> Cc: stable@vger.kernel.org Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NDave Airlie <airlied@redhat.com>
-
由 Daniel Jurgens 提交于
Maintain the PCI status and provide wrappers for enabling and disabling the PCI device. Performing the actions more than once without doing its opposite results in warning logs. This occurred when EEH hotplugged the device causing a warning for disabling an already disabled device. Fixes: 2ba5fbd6 ('net/mlx4_core: Handle AER flow properly') Signed-off-by: NDaniel Jurgens <danielj@mellanox.com> Signed-off-by: NYishai Hadas <yishaih@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Martin KaFai Lau 提交于
After receiving sacks, tcp_shifted_skb() will collapse skbs if possible. tx_flags and tskey also have to be merged. This patch reuses the tcp_skb_collapse_tstamp() to handle them. BPF Output Before: ~~~~~ <no-output-due-to-missing-tstamp-event> BPF Output After: ~~~~~ <...>-2024 [007] d.s. 88.644374: : ee_data:14599 Packetdrill Script: ~~~~~ +0 `sysctl -q -w net.ipv4.tcp_min_tso_segs=10` +0 `sysctl -q -w net.ipv4.tcp_no_metrics_save=1` +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +0 bind(3, ..., ...) = 0 +0 listen(3, 1) = 0 0.100 < S 0:0(0) win 32792 <mss 1460,sackOK,nop,nop,nop,wscale 7> 0.100 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7> 0.200 < . 1:1(0) ack 1 win 257 0.200 accept(3, ..., ...) = 4 +0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0 0.200 write(4, ..., 1460) = 1460 +0 setsockopt(4, SOL_SOCKET, 37, [2688], 4) = 0 0.200 write(4, ..., 13140) = 13140 0.200 > P. 1:1461(1460) ack 1 0.200 > . 1461:8761(7300) ack 1 0.200 > P. 8761:14601(5840) ack 1 0.300 < . 1:1(0) ack 1 win 257 <sack 1461:14601,nop,nop> 0.300 > P. 1:1461(1460) ack 1 0.400 < . 1:1(0) ack 14601 win 257 0.400 close(4) = 0 0.400 > F. 14601:14601(0) ack 1 0.500 < F. 1:1(0) ack 14602 win 257 0.500 > . 14602:14602(0) ack 2 Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: NSoheil Hassas Yeganeh <soheil@google.com> Tested-by: NSoheil Hassas Yeganeh <soheil@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 4月, 2016 3 次提交
-
-
由 Takashi Iwai 提交于
HD-audio driver uses regmap cache bypass feature for reading a raw value without the cache. But this is racy since both the cached and the uncached reads may occur concurrently. The former is done via the normal control API access while the latter comes from the proc file read. Even though the regmap itself has the protection against the concurrent accesses, the flag set/reset is done without the protection, so it may lead to inconsistent state of bypass flag that doesn't match with the current read and occasionally result in a kernel WARNING like: WARNING: CPU: 3 PID: 2731 at drivers/base/regmap/regcache.c:499 regcache_cache_only+0x78/0x93 One way to work around such a problem is to wrap with a mutex. But in this case, the solution is simpler: for the uncached read, we just skip the regmap and directly calls its accessor. The verb execution there is protected by itself, so basically it's safe to call individually. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=116171Signed-off-by: NTakashi Iwai <tiwai@suse.de>
-
由 Romain Perier 提交于
The recent decoupling of pagefault disable and preempt disable added an explicit preempt_disable/enable() pair to the futex_atomic_cmpxchg_inatomic() implementation in asm-generic/futex.h. But it forgot to add preempt_enable() calls to the error handling code pathes, which results in a preemption count imbalance. This is observable on boot when the test for atomic_cmpxchg() is calling futex_atomic_cmpxchg_inatomic() on a NULL pointer. Add the missing preempt_enable() calls to the error handling code pathes. [ tglx: Massaged changelog ] Fixes: d9b9ff8c ("sched/preempt, futex: Disable preemption in UP futex_atomic_cmpxchg_inatomic() explicitly") Signed-off-by: NRomain Perier <romain.perier@free-electrons.com> Cc: linux-arch@vger.kernel.org Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1460640963-690-1-git-send-email-romain.perier@free-electrons.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Wei Ni 提交于
The commit 17e8351a consistently use int for temperature, however it missed a few in trip temperature and thermal_core. In current codes, the trip->temperature used "unsigned long" and zone->temperature used"int", if the temperature is negative value, it will get wrong result when compare temperature with trip temperature. This patch can fix it. Signed-off-by: NWei Ni <wni@nvidia.com> Signed-off-by: NEduardo Valentin <edubezval@gmail.com>
-
- 20 4月, 2016 1 次提交
-
-
由 Konstantin Khlebnikov 提交于
skb->sk could point to timewait or request socket which has no sk_classid. Detected as "BUG: KASAN: slab-out-of-bounds in cls_cgroup_classify". Signed-off-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 4月, 2016 1 次提交
-
-
由 Linus Torvalds 提交于
This gets rid of the horrible notion of having that struct inode *ptmx_inode be the linchpin of the interface between the pty code and devpts. By de-emphasizing the ptmx inode, a lot of things actually get cleaner, and we will have a much saner way forward. In particular, this will allow us to associate with any particular devpts instance at open-time, and not be artificially tied to one particular ptmx inode. The patch itself is actually fairly straightforward, and apart from some locking and return path cleanups it's pretty mechanical: - the interfaces that devpts exposes all take "struct pts_fs_info *" instead of "struct inode *ptmx_inode" now. NOTE! The "struct pts_fs_info" thing is a completely opaque structure as far as the pty driver is concerned: it's still declared entirely internally to devpts. So the pty code can't actually access it in any way, just pass it as a "cookie" to the devpts code. - the "look up the pts fs info" is now a single clear operation, that also does the reference count increment on the pts superblock. So "devpts_add/del_ref()" is gone, and replaced by a "lookup and get ref" operation (devpts_get_ref(inode)), along with a "put ref" op (devpts_put_ref()). - the pty master "tty->driver_data" field now contains the pts_fs_info, not the ptmx inode. - because we don't care about the ptmx inode any more as some kind of base index, the ref counting can now drop the inode games - it just gets the ref on the superblock. - the pts_fs_info now has a back-pointer to the super_block. That's so that we can easily look up the information we actually need. Although quite often, the pts fs info was actually all we wanted, and not having to look it up based on some magical inode makes things more straightforward. In particular, now that "devpts_get_ref(inode)" operation should really be the *only* place we need to look up what devpts instance we're associated with, and we do it exactly once, at ptmx_open() time. The other side of this is that one ptmx node could now be associated with multiple different devpts instances - you could have a single /dev/ptmx node, and then have multiple mount namespaces with their own instances of devpts mounted on /dev/pts/. And that's all perfectly sane in a model where we just look up the pts instance at open time. This will eventually allow us to get rid of our odd single-vs-multiple pts instance model, but this patch in itself changes no semantics, only an internal binding model. Cc: Eric Biederman <ebiederm@xmission.com> Cc: Peter Anvin <hpa@zytor.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: Serge Hallyn <serge.hallyn@ubuntu.com> Cc: Willy Tarreau <w@1wt.eu> Cc: Aurelien Jarno <aurelien@aurel32.net> Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk> Cc: Jann Horn <jann@thejh.net> Cc: Greg KH <greg@kroah.com> Cc: Jiri Slaby <jslaby@suse.com> Cc: Florian Weimer <fw@deneb.enyo.de> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 16 4月, 2016 1 次提交
-
-
由 Hariprasad Shenai 提交于
After 104daa71 ("PCI: Determine actual VPD size on first access"), the PCI core computes the valid VPD size by parsing the VPD starting at offset 0x0. We don't attempt to read past that valid size because that causes some devices to crash. However, some devices do have data past that valid size. For example, Chelsio adapters contain two VPD structures, and the driver needs both of them. Add pci_set_vpd_size(). If a driver knows it is safe to read past the end of the VPD data structure at offset 0, it can use pci_set_vpd_size() to allow access to as much data as it needs. [bhelgaas: changelog, split patches, rename to pci_set_vpd_size() and return int (not ssize_t)] Fixes: 104daa71 ("PCI: Determine actual VPD size on first access") Tested-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NCasey Leedom <leedom@chelsio.com> Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
- 15 4月, 2016 4 次提交
-
-
由 Craig Gallek 提交于
With the SO_REUSEPORT socket option, it is possible to create sockets in the AF_INET and AF_INET6 domains which are bound to the same IPv4 address. This is only possible with SO_REUSEPORT and when not using IPV6_V6ONLY on the AF_INET6 sockets. Prior to the commits referenced below, an incoming IPv4 packet would always be routed to a socket of type AF_INET when this mixed-mode was used. After those changes, the same packet would be routed to the most recently bound socket (if this happened to be an AF_INET6 socket, it would have an IPv4 mapped IPv6 address). The change in behavior occurred because the recent SO_REUSEPORT optimizations short-circuit the socket scoring logic as soon as they find a match. They did not take into account the scoring logic that favors AF_INET sockets over AF_INET6 sockets in the event of a tie. To fix this problem, this patch changes the insertion order of AF_INET and AF_INET6 addresses in the TCP and UDP socket lists when the sockets have SO_REUSEPORT set. AF_INET sockets will be inserted at the head of the list and AF_INET6 sockets with SO_REUSEPORT set will always be inserted at the tail of the list. This will force AF_INET sockets to always be considered first. Fixes: e32ea7e7 ("soreuseport: fast reuseport UDP socket selection") Fixes: 125e80b88687 ("soreuseport: fast reuseport TCP socket selection") Reported-by: NMaciej Żenczykowski <maze@google.com> Signed-off-by: NCraig Gallek <kraig@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Martin KaFai Lau 提交于
This patch adds a release_cb for UDPv6. It does a route lookup and updates sk->sk_dst_cache if it is needed. It picks up the left-over job from ip6_sk_update_pmtu() if the sk was owned by user during the pmtu update. It takes a rcu_read_lock to protect the __sk_dst_get() operations because another thread may do ip6_dst_store() without taking the sk lock (e.g. sendmsg). Fixes: 45e4fd26 ("ipv6: Only create RTF_CACHE routes after encountering pmtu exception") Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Reported-by: NWei Wang <weiwan@google.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Wei Wang <weiwan@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Martin KaFai Lau 提交于
There is a case in connected UDP socket such that getsockopt(IPV6_MTU) will return a stale MTU value. The reproducible sequence could be the following: 1. Create a connected UDP socket 2. Send some datagrams out 3. Receive a ICMPV6_PKT_TOOBIG 4. No new outgoing datagrams to trigger the sk_dst_check() logic to update the sk->sk_dst_cache. 5. getsockopt(IPV6_MTU) returns the mtu from the invalid sk->sk_dst_cache instead of the newly created RTF_CACHE clone. This patch updates the sk->sk_dst_cache for a connected datagram sk during pmtu-update code path. Note that the sk->sk_v6_daddr is used to do the route lookup instead of skb->data (i.e. iph). It is because a UDP socket can become connected after sending out some datagrams in un-connected state. or It can be connected multiple times to different destinations. Hence, iph may not be related to where sk is currently connected to. It is done under '!sock_owned_by_user(sk)' condition because the user may make another ip6_datagram_connect() (i.e changing the sk->sk_v6_daddr) while dst lookup is happening in the pmtu-update code path. For the sock_owned_by_user(sk) == true case, the next patch will introduce a release_cb() which will update the sk->sk_dst_cache. Test: Server (Connected UDP Socket): ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Route Details: [root@arch-fb-vm1 ~]# ip -6 r show | egrep '2fac' 2fac::/64 dev eth0 proto kernel metric 256 pref medium 2fac:face::/64 via 2fac::face dev eth0 metric 1024 pref medium A simple python code to create a connected UDP socket: import socket import errno HOST = '2fac::1' PORT = 8080 s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM) s.bind((HOST, PORT)) s.connect(('2fac:face::face', 53)) print("connected") while True: try: data = s.recv(1024) except socket.error as se: if se.errno == errno.EMSGSIZE: pmtu = s.getsockopt(41, 24) print("PMTU:%d" % pmtu) break s.close() Python program output after getting a ICMPV6_PKT_TOOBIG: [root@arch-fb-vm1 ~]# python2 ~/devshare/kernel/tasks/fib6/udp-connect-53-8080.py connected PMTU:1300 Cache routes after recieving TOOBIG: [root@arch-fb-vm1 ~]# ip -6 r show table cache 2fac:face::face via 2fac::face dev eth0 metric 0 cache expires 463sec mtu 1300 pref medium Client (Send the ICMPV6_PKT_TOOBIG): ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ scapy is used to generate the TOOBIG message. Here is the scapy script I have used: >>> p=Ether(src='da:75:4d:36:ac:32', dst='52:54:00:12:34:66', type=0x86dd)/IPv6(src='2fac::face', dst='2fac::1')/ICMPv6PacketTooBig(mtu=1300)/IPv6(src='2fac:: 1',dst='2fac:face::face', nh='UDP')/UDP(sport=8080,dport=53) >>> sendp(p, iface='qemubr0') Fixes: 45e4fd26 ("ipv6: Only create RTF_CACHE routes after encountering pmtu exception") Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Reported-by: NWei Wang <weiwan@google.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Wei Wang <weiwan@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Linus Torvalds 提交于
A lot of seqfile users seem to be using things like %pK that uses the credentials of the current process, but that is actually completely wrong for filesystem interfaces. The unix semantics for permission checking files is to check permissions at _open_ time, not at read or write time, and that is not just a small detail: passing off stdin/stdout/stderr to a suid application and making the actual IO happen in privileged context is a classic exploit technique. So if we want to be able to look at permissions at read time, we need to use the file open credentials, not the current ones. Normal file accesses can just use "f_cred" (or any of the helper functions that do that, like file_ns_capable()), but the seqfile interfaces do not have any such options. It turns out that seq_file _does_ save away the user_ns information of the file, though. Since user_ns is just part of the full credential information, replace that special case with saving off the cred pointer instead, and suddenly seq_file has all the permission information it needs. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 14 4月, 2016 2 次提交
-
-
由 Toshi Kani 提交于
After 'commit fc0c2028 ("x86, pmem: use memcpy_mcsafe() for memcpy_from_pmem()")', probing a PMEM device hits the BUG() error below on X86_32 kernel. kernel BUG at include/linux/pmem.h:48! memcpy_from_pmem() calls arch_memcpy_from_pmem(), which is unimplemented since CONFIG_ARCH_HAS_PMEM_API is undefined on X86_32. Fix the BUG() error by adding default_memcpy_from_pmem(). Acked-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NToshi Kani <toshi.kani@hpe.com> Signed-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
-
由 Hans de Goede 提交于
Add a new NO_REPORT_LUNS quirk and set it for Seagate drives with an usb-id of: 0bc2:331a, as these will fail to respond to a REPORT_LUNS command. Cc: stable@vger.kernel.org Reported-and-tested-by: NDavid Webb <djw@noc.ac.uk> Signed-off-by: NHans de Goede <hdegoede@redhat.com> Acked-by: NAlan Stern <stern@rowland.harvard.edu> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 13 4月, 2016 1 次提交
-
-
由 Jaegeuk Kim 提交于
This patch fixes the issue introduced by the ext4 crypto fix in a same manner. For F2FS, however, we flush the pending IOs and wait for a while to acquire free memory. Fixes: c9af28fd ("ext4 crypto: don't let data integrity writebacks fail with ENOMEM") Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
- 12 4月, 2016 2 次提交
-
-
由 stephen hemminger 提交于
The new devlink.h in uapi was not being installed by make headers_install Signed-off-by: NStephen Hemminger <stephen@networkplumber.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Ahern 提交于
Vivek reported a kernel exception deleting a VRF with an active connection through it. The root cause is that the socket has a cached reference to a dst that is destroyed. Converting the dst_destroy to dst_release and letting proper reference counting kick in does not work as the dst has a reference to the device which needs to be released as well. I talked to Hannes about this at netdev and he pointed out the ipv4 and ipv6 dst handling has dst_ifdown for just this scenario. Rather than continuing with the reinvented dst wheel in VRF just remove it and leverage the ipv4 and ipv6 versions. Fixes: 193125db ("net: Introduce VRF device driver") Fixes: 35402e31 ("net: Add IPv6 support to VRF device") Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 4月, 2016 1 次提交
-
-
由 Marcelo Ricardo Leitner 提交于
Currently on high rate SCTP streams the heartbeat timer refresh can consume quite a lot of resources as timer updates are costly and it contains a random factor, which a) is also costly and b) invalidates mod_timer() optimization for not editing a timer to the same value. It may even cause the timer to be slightly advanced, for no good reason. As suggested by David Laight this patch now removes this timer update from hot path by leaving the timer on and re-evaluating upon its expiration if the heartbeat is still needed or not, similarly to what is done for TCP. If it's not needed anymore the timer is re-scheduled to the new timeout, considering the time already elapsed. For this, we now record the last tx timestamp per transport, updated in the same spots as hb timer was restarted on tx. Also split up sctp_transport_reset_timers into sctp_transport_reset_t3_rtx and sctp_transport_reset_hb_timer, so we can re-arm T3 without re-arming the heartbeat one. On loopback with MTU of 65535 and data chunks with 1636, so that we have a considerable amount of chunks without stressing system calls, netperf -t SCTP_STREAM -l 30, perf looked like this before: Samples: 103K of event 'cpu-clock', Event count (approx.): 25833000000 Overhead Command Shared Object Symbol + 6,15% netperf [kernel.vmlinux] [k] copy_user_enhanced_fast_string - 5,43% netperf [kernel.vmlinux] [k] _raw_write_unlock_irqrestore - _raw_write_unlock_irqrestore - 96,54% _raw_spin_unlock_irqrestore - 36,14% mod_timer + 97,24% sctp_transport_reset_timers + 2,76% sctp_do_sm + 33,65% __wake_up_sync_key + 28,77% sctp_ulpq_tail_event + 1,40% del_timer - 1,84% mod_timer + 99,03% sctp_transport_reset_timers + 0,97% sctp_do_sm + 1,50% sctp_ulpq_tail_event And after this patch, now with netperf -l 60: Samples: 230K of event 'cpu-clock', Event count (approx.): 57707250000 Overhead Command Shared Object Symbol + 5,65% netperf [kernel.vmlinux] [k] memcpy_erms + 5,59% netperf [kernel.vmlinux] [k] copy_user_enhanced_fast_string - 5,05% netperf [kernel.vmlinux] [k] _raw_spin_unlock_irqrestore - _raw_spin_unlock_irqrestore + 49,89% __wake_up_sync_key + 45,68% sctp_ulpq_tail_event - 2,85% mod_timer + 76,51% sctp_transport_reset_t3_rtx + 23,49% sctp_do_sm + 1,55% del_timer + 2,50% netperf [sctp] [k] sctp_datamsg_from_user + 2,26% netperf [sctp] [k] sctp_sendmsg Throughput-wise, from 6800mbps without the patch to 7050mbps with it, ~3.7%. Signed-off-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 4月, 2016 1 次提交
-
-
由 Alexander Duyck 提交于
This patch fixes an issue I found in which we were dropping frames if we had enabled checksums on GRE headers that were encapsulated by either FOU or GUE. Without this patch I was barely able to get 1 Gb/s of throughput. With this patch applied I am now at least getting around 6 Gb/s. The issue is due to the fact that with FOU or GUE applied we do not provide a transport offset pointing to the GRE header, nor do we offload it in software as the GRE header is completely skipped by GSO and treated like a VXLAN or GENEVE type header. As such we need to prevent the stack from generating it and also prevent GRE from generating it via any interface we create. Fixes: c3483384 ("gro: Allow tunnel stacking in the case of FOU/GUE") Signed-off-by: NAlexander Duyck <aduyck@mirantis.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 4月, 2016 3 次提交
-
-
由 Stefan Hajnoczi 提交于
The VIRTIO 1.0 specification added the DEVICE_NEEDS_RESET device status bit in "VIRTIO-98: Add DEVICE_NEEDS_RESET". This patch defines the device status bit in the uapi header file so that both the kernel and userspace applications can use it. The bit is currently unused by the virtio guest drivers and vhost. According to the spec "a good implementation will try to recover by issuing a reset". This is not attempted here because it requires auditing the virtio drivers to ensure there are no resource leaks or crashes if the device needs to be reset mid-operation. See "2.1 Device Status Field" in the VIRTIO 1.0 specification for details. Signed-off-by: NStefan Hajnoczi <stefanha@redhat.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Ingo Molnar 提交于
The pkeys changes brought about a truly hideous set of macros in: cde70140 ("mm/gup: Overload get_user_pages() functions") ... which macros are (ab-)using the fact that __VA_ARGS__ can be used to shift parameter positions in macro arguments without breaking the build and so can be used to call separate C functions depending on the number of arguments of the macro. This allowed easy migration of these 3 GUP APIs, as both these variants worked at the C level: old: ret = get_user_pages(current, current->mm, address, 1, 1, 0, &page, NULL); new: ret = get_user_pages(address, 1, 1, 0, &page, NULL); ... while we also generated a (functionally harmless but noticeable) build time warning if the old API was used. As there are over 300 uses of these APIs, this trick eased the migration of the API and avoided excessive migration pain in linux-next. Now, with its work done, get rid of all of that complication and ugliness: 3 files changed, 16 insertions(+), 140 deletions(-) ... where the linecount of the migration hack was further inflated by the fact that there are NOMMU variants of these GUP APIs as well. Much of the conversion was done in linux-next over the past couple of months, and Linus recently removed all remaining old API uses from the upstream tree in the following upstrea commit: cb107161 ("Convert straggling drivers to new six-argument get_user_pages()") There was one more old-API usage in mm/gup.c, in the CONFIG_HAVE_GENERIC_RCU_GUP code path that ARM, ARM64 and PowerPC uses. After this commit any old API usage will break the build. [ Also fixed a PowerPC/HAVE_GENERIC_RCU_GUP warning reported by Stephen Rothwell. ] Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dave Hansen <dave@sr71.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 WANG Cong 提交于
Fixes: ddf97ccd ("net_sched: add network namespace support for tc actions") Reported-by: NDmitry Vyukov <dvyukov@google.com> Tested-by: NDmitry Vyukov <dvyukov@google.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 4月, 2016 1 次提交
-
-
由 Marcelo Ricardo Leitner 提交于
Use list_* helpers in sctp_list_dequeue, more readable. Signed-off-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 4月, 2016 8 次提交
-
-
由 Paolo Bonzini 提交于
-ftracer can duplicate asm blocks causing compilation to fail in noclone functions. For example, KVM declares a global variable in an asm like asm("2: ... \n .pushsection data \n .global vmx_return \n vmx_return: .long 2b"); and -ftracer causes a double declaration. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Marek <mmarek@suse.cz> Cc: stable@vger.kernel.org Cc: kvm@vger.kernel.org Reported-by: NLinda Walsh <lkml@tlinx.org> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Arnd Bergmann 提交于
iommu drivers that support the standard DT bindings use a of_xlate callback pointer, but that is only part of struct iommu_ops when CONFIG_OF_IOMMU is enabled, leading to build errors in randconfig builds when that is not provided: drivers/iommu/mtk_iommu.c:497:2: error: unknown field 'of_xlate' specified in initializer .of_xlate = mtk_iommu_of_xlate, ^ drivers/iommu/mtk_iommu.c:497:14: error: initialization from incompatible pointer type [-Werror=incompatible-pointer-types] .of_xlate = mtk_iommu_of_xlate, ^~~~~~~~~~~~~~~~~~ drivers/iommu/mtk_iommu.c:497:14: note: (near initialization for 'mtk_iommu_ops.domain_get_attr') We can work around it by adding more #ifdefs in each driver, but it seems nicer to just allow setting the pointer even if it is unused. This makes the driver code look nicer, and it gives better compile-time coverage when test building on other architectures. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Fixes: 0df4fabe ("iommu/mediatek: Add mt8173 IOMMU driver") Reviewed-by: NRobin Murphy <robin.murphy@arm.com> Signed-off-by: NJoerg Roedel <jroedel@suse.de>
-
由 Hannes Reinecke 提交于
The patch "scsi: rescan VPD attributes" introduced a regression in which devices that don't support VPD were being scanned for VPD attributes anyway. This could cause issues for some devices and should be avoided so the check for scsi_level has been moved out of scsi_add_lun and into scsi_attach_vpd so that all callers will not scan VPD for devices that don't support it. [mkp: Merge fix] Fixes: 09e2b0b1 ("scsi: rescan VPD attributes") Cc: <stable@vger.kernel.org> #v4.5+ Suggested-by: NAlexander Duyck <aduyck@mirantis.com> Signed-off-by: NHannes Reinecke <hare@suse.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Luis de Bethencourt 提交于
Add documentation for the flag for duplication check. Fixes the following warning when running make htmldocs: warning: Enum value 'RX_FLAG_DUP_VALIDATED' not described in enum 'mac80211_rx_flags' Signed-off-by: NLuis de Bethencourt <luisbg@osg.samsung.com> [fix description] Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
-
由 Alex Deucher 提交于
Fixes ttm on platforms like PPC460 where the CPU is in 32-bit mode, but the physical addresses are >32 bits. Extracted from a patch by Hans Verkuil. Tested-by: NJulian Margetson <runaway@candw.ms> Acked-by: NThomas Hellstrom <thellstrom@vmware.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Cc: Thomas Hellstrom <thellstrom@vmware.com> Cc: Julian Margetson <runaway@candw.ms> Cc: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Kirill A. Shutemov 提交于
All users gone. We can remove these macros. Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: NMichal Hocko <mhocko@suse.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Kirill A. Shutemov 提交于
Mostly direct substitution with occasional adjustment or removing outdated comments. Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: NMichal Hocko <mhocko@suse.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Kirill A. Shutemov 提交于
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: NMichal Hocko <mhocko@suse.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 04 4月, 2016 1 次提交
-
-
由 Mark Fasheh 提交于
This patch adds tracepoints to the qgroup code on both the reporting side (insert_dirty_extents) and the accounting side. Taken together it allows us to see what qgroup operations have happened, and what their result was. Signed-off-by: NMark Fasheh <mfasheh@suse.de> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 02 4月, 2016 3 次提交
-
-
由 Lucas Stach 提交于
Page isolation has not failed if the fin pfn extends beyond the end pfn and test_pages_isolated checks this correctly. Fix the tracepoint to report the same result as the actual check function. Signed-off-by: NLucas Stach <l.stach@pengutronix.de> Acked-by: NVlastimil Babka <vbabka@suse.cz> Cc: Joonsoo Kim <js1304@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Chen Gang 提交于
The return value of pmd_trans_huge_lock() is a pointer, not a boolean value, so use NULL instead of false as the return value. Signed-off-by: NChen Gang <gang.chen.5i5j@gmail.com> Acked-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Giuseppe CAVALLARO 提交于
Initially the phy_bus_name was added to manipulate the driver name but it was recently just used to manage the fixed-link and then to take some decision at run-time. So the patch uses the is_pseudo_fixed_link and removes the phy_bus_name variable not necessary anymore. The driver can manage the mdio registration by using phy-handle, dwmac-mdio and own parameter e.g. snps,phy-addr. This patch takes care about all these possible configurations and fixes the mdio registration in case of there is a real transceiver or a switch (that needs to be managed by using fixed-link). Signed-off-by: NGiuseppe Cavallaro <peppe.cavallaro@st.com> Reviewed-by: NAndreas Färber <afaerber@suse.de> Tested-by: NFrank Schäfer <fschaefer.oss@googlemail.com> Cc: Gabriel Fernandez <gabriel.fernandez@linaro.org> Cc: Dinh Nguyen <dinh.linux@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Phil Reid <preid@electromag.com.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-