- 06 5月, 2015 3 次提交
-
-
由 Alexander Duyck 提交于
Replace "ntohs(proto) >= ETH_P_802_3_MIN" w/ eth_proto_is_802_3(proto). Signed-off-by: NAlexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Duyck 提交于
This change does two things. First it fixes a sparse error for the fact that the __be16 degrades to an integer. Since that is actually what I am kind of doing I am simply working around that by forcing both sides of the comparison to u16. Also I realized on some compilers I was generating another instruction for big endian systems such as PowerPC since it was masking the value before doing the comparison. So to resolve that I have simply pulled the mask out and wrapped it in an #ifndef __BIG_ENDIAN. Lastly I pulled this all out into its own function. I notices there are similar checks in a number of other places so this function can be reused there to help reduce overhead in these paths as well. Signed-off-by: NAlexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
This patch allows a server application to get the TCP SYN headers for its passive connections. This is useful if the server is doing fingerprinting of clients based on SYN packet contents. Two socket options are added: TCP_SAVE_SYN and TCP_SAVED_SYN. The first is used on a socket to enable saving the SYN headers for child connections. This can be set before or after the listen() call. The latter is used to retrieve the SYN headers for passive connections, if the parent listener has enabled TCP_SAVE_SYN. TCP_SAVED_SYN is read once, it frees the saved SYN headers. The data returned in TCP_SAVED_SYN are network (IPv4/IPv6) and TCP headers. Original patch was written by Tom Herbert, I changed it to not hold a full skb (and associated dst and conntracking reference). We have used such patch for about 3 years at Google. Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Tested-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 5月, 2015 2 次提交
-
-
由 Linus Lüssing 提交于
With this patch, the IGMP and MLD message validation functions are moved from the bridge code to IPv4/IPv6 multicast files. Some small refactoring was done to enhance readibility and to iron out some differences in behaviour between the IGMP and MLD parsing code (e.g. the skb-cloning of MLD messages is now only done if necessary, just like the IGMP part always did). Finally, these IGMP and MLD message validation functions are exported so that not only the bridge can use it but batman-adv later, too. Signed-off-by: NLinus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jamal Hadi Salim 提交于
improves ingress+u32 performance from 22.4 Mpps to 22.9 Mpps Signed-off-by: NJamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com> Acked-by: NFlorian Westphal <fw@strlen.de> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 5月, 2015 4 次提交
-
-
由 Tom Herbert 提交于
Some users of flow keys (well just sch_choke now) need to pass flow_keys in skbuff cb, and use them for exact comparisons of flows so that skb->hash is not sufficient. In order to increase size of the flow_keys structure, we introduce another structure for the purpose of passing flow keys in skbuff cb. We limit this structure to sixteen bytes, and we will technically treat this as a digest of flow_keys struct hence its name flow_keys_digest. In the first incaranation we just copy the flow_keys structure up to 16 bytes-- this is the same information previously passed in the cb. In the future, we'll adapt this for larger flow_keys and could use something like SHA-1 over the whole flow_keys to improve the quality of the digest. Signed-off-by: NTom Herbert <tom@herbertland.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
This calls flow_disect and __skb_get_hash to procure a hash for a packet. Input includes a key to initialize jhash. This function does not set skb->hash. Signed-off-by: NTom Herbert <tom@herbertland.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Duyck 提交于
This change makes it so that we process the address in is_multicast_ether_addr at the same size as the other calls. This allows us to avoid duplicate reads when used with other calls such as is_zero_ether_addr or eth_addr_copy. In addition I have added a 64 bit version of the function so in eth_type_trans we can process the destination address as a 64 bit value throughout. Signed-off-by: NAlexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
This patch divides the IPv6 flow label space into two ranges: 0-7ffff is reserved for flow label manager, 80000-fffff will be used for creating auto flow labels (per RFC6438). This only affects how labels are set on transmit, it does not affect receive. This range split can be disbaled by systcl. Background: IPv6 flow labels have been an unmitigated disappointment thus far in the lifetime of IPv6. Support in HW devices to use them for ECMP is lacking, and OSes don't turn them on by default. If we had these we could get much better hashing in IPv6 networks without resorting to DPI, possibly eliminating some of the motivations to to define new encaps in UDP just for getting ECMP. Unfortunately, the initial specfications of IPv6 did not clarify how they are to be used. There has always been a vague concept that these can be used for ECMP, flow hashing, etc. and we do now have a good standard how to this in RFC6438. The problem is that flow labels can be either stateful or stateless (as in RFC6438), and we are presented with the possibility that a stateless label may collide with a stateful one. Attempts to split the flow label space were rejected in IETF. When we added support in Linux for RFC6438, we could not turn on flow labels by default due to this conflict. This patch splits the flow label space and should give us a path to enabling auto flow labels by default for all IPv6 packets. This is an API change so we need to consider compatibility with existing deployment. The stateful range is chosen to be the lower values in hopes that most uses would have chosen small numbers. Once we resolve the stateless/stateful issue, we can proceed to look at enabling RFC6438 flow labels by default (starting with scaled testing). Signed-off-by: NTom Herbert <tom@herbertland.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 5月, 2015 1 次提交
-
-
由 Florian Westphal 提交于
Not used. pedit sets TC_MUNGED when packet content was altered, but all the core does is unset MUNGED again and then set OK2MUNGE. And the latter isn't tested anywhere. So lets remove both TC_MUNGED and TC_OK2MUNGE. Signed-off-by: NFlorian Westphal <fw@strlen.de> Acked-by: NAlexei Starovoitov <ast@plumgrid.com> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NJamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 5月, 2015 3 次提交
-
-
由 Stefan Hajnoczi 提交于
Here the "other side" refers to the guest or host. Signed-off-by: NStefan Hajnoczi <stefanha@redhat.com> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Martin KaFai Lau 提交于
_rt6i_peer is no longer needed after the last patch, 'ipv6: Stop rt6_info from using inet_peer's metrics'. DST_METRICS_FORCE_OVERWRITE is added by commit e5fd387a ("ipv6: do not overwrite inetpeer metrics prematurely"). Since inetpeer is no longer used for metrics, this bit is also not needed. Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Reviewed-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Cc: Michal Kubeček <mkubecek@suse.cz> Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Martin KaFai Lau 提交于
inet_peer is indexed by the dst address alone. However, the fib6 tree could have multiple routing entries (rt6_info) for the same dst. For example, 1. A /128 dst via multiple gateways. 2. A RTF_CACHE route cloned from a /128 route. In the above cases, all of them will share the same metrics and step on each other. This patch will steer away from inet_peer's metrics and use dst_cow_metrics_generic() for everything. Change Highlights: 1. Remove rt6_cow_metrics() which currently acquires metrics from inet_peer for DST_HOST route (i.e. /128 route). 2. Add rt6i_pmtu to take care of the pmtu update to avoid creating a full size metrics just to override the RTAX_MTU. 3. After (2), the RTF_CACHE route can also share the metrics with its dst.from route, by: dst_init_metrics(&cache_rt->dst, dst_metrics_ptr(cache_rt->dst.from), true); 4. Stop creating RTF_CACHE route by cloning another RTF_CACHE route. Instead, directly clone from rt->dst. [ Currently, cloning from another RTF_CACHE is only possible during rt6_do_redirect(). Also, the old clone is removed from the tree immediately after the new clone is added. ] In case of cloning from an older redirect RTF_CACHE, it should work as before. In case of cloning from an older pmtu RTF_CACHE, this patch will forget the pmtu and re-learn it (if there is any) from the redirected route. The _rt6i_peer and DST_METRICS_FORCE_OVERWRITE will be removed in the next cleanup patch. Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Reviewed-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 30 4月, 2015 5 次提交
-
-
由 Eric Dumazet 提交于
Some Congestion Control modules can provide per flow information, but current way to get this information is to use netlink. Like TCP_INFO, let's add TCP_CC_INFO so that applications can issue a getsockopt() if they have a socket file descriptor, instead of playing complex netlink games. Sample usage would be : union tcp_cc_info info; socklen_t len = sizeof(info); if (getsockopt(fd, SOL_TCP, TCP_CC_INFO, &info, &len) == -1) Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
We would like that optional info provided by Congestion Control modules using netlink can also be read using getsockopt() This patch changes get_info() to put this information in a buffer, instead of skb, like tcp_get_info(), so that following patch can reuse this common infrastructure. Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
This patch tracks total number of payload bytes received on a TCP socket. This is the sum of all changes done to tp->rcv_nxt RFC4898 named this : tcpEStatsAppHCThruOctetsReceived This is a 64bit field, and can be fetched both from TCP_INFO getsockopt() if one has a handle on a TCP socket, or from inet_diag netlink facility (iproute2/ss patch will follow) Note that tp->bytes_received was placed near tp->rcv_nxt for best data locality and minimal performance impact. Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Matt Mathis <mattmathis@google.com> Cc: Eric Salo <salo@google.com> Cc: Martin Lau <kafai@fb.com> Cc: Chris Rapier <rapier@psc.edu> Acked-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
This patch tracks total number of bytes acked for a TCP socket. This is the sum of all changes done to tp->snd_una, and allows for precise tracking of delivered data. RFC4898 named this : tcpEStatsAppHCThruOctetsAcked This is a 64bit field, and can be fetched both from TCP_INFO getsockopt() if one has a handle on a TCP socket, or from inet_diag netlink facility (iproute2/ss patch will follow) Note that tp->bytes_acked was placed near tp->snd_una for best data locality and minimal performance impact. Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NYuchung Cheng <ycheng@google.com> Cc: Matt Mathis <mattmathis@google.com> Cc: Eric Salo <salo@google.com> Cc: Martin Lau <kafai@fb.com> Cc: Chris Rapier <rapier@psc.edu> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Nicolas Dichtel 提交于
NLM_F_MULTI must be used only when a NLMSG_DONE message is sent. In fact, it is sent only at the end of a dump. Libraries like libnl will wait forever for NLMSG_DONE. Fixes: e5a55a89 ("net: create generic bridge ops") Fixes: 815cccbf ("ixgbe: add setlink, getlink support to ixgbe and ixgbevf") CC: John Fastabend <john.r.fastabend@intel.com> CC: Sathya Perla <sathya.perla@emulex.com> CC: Subbu Seetharaman <subbu.seetharaman@emulex.com> CC: Ajit Khaparde <ajit.khaparde@emulex.com> CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com> CC: intel-wired-lan@lists.osuosl.org CC: Jiri Pirko <jiri@resnulli.us> CC: Scott Feldman <sfeldma@gmail.com> CC: Stephen Hemminger <stephen@networkplumber.org> CC: bridge@lists.linux-foundation.org Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 4月, 2015 2 次提交
-
-
由 Peter Zubaj 提交于
Looks like audigy emu10k2 (probably emu10k1 - sb live too) support two modes for DMA. Second mode is useful for 64 bit os with more then 2 GB of ram (fixes problems with big soundfont loading) 1) 32MB from 2 GB address space using 8192 pages (used now as default) 2) 16MB from 4 GB address space using 4096 pages Mode is set using HCFG_EXPANDED_MEM flag in HCFG register. Also format of emu10k2 page table is then different. Signed-off-by: NPeter Zubaj <pzubaj@marticonet.sk> Tested-by: NTakashi Iwai <tiwai@suse.de> Cc: <stable@vger.kernel.org> Signed-off-by: NTakashi Iwai <tiwai@suse.de>
-
由 Olaf Hering 提交于
During commit e252652f ("ACPICA: acpidump: Remove integer types translation protection.") two 'unsigned char' types got converted to 'u8'. The result does not compile with gcc-4.5, it can not cope with duplicate typedefs. Signed-off-by: NOlaf Hering <olaf@aepfle.de> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 28 4月, 2015 3 次提交
-
-
由 Rajeev Kumar 提交于
rajeev-dlh.kumar@st.com email-id doesn't exist anymore as I have left the company. Replace ST's id with Rajeev Kumar <rajeevkumar.linux@gmail.com> Signed-off-by: NRajeev Kumar <rajeevkumar.linux@gmail.com> Signed-off-by: NMark Brown <broonie@kernel.org>
-
由 Frederic Danis 提交于
This is needed by Bluetooth hci_uart module to be able to change speed of Bluetooth controller and local UART. Signed-off-by: NFrederic Danis <frederic.danis@linux.intel.com> Reviewed-by: NPeter Hurley <peter@hurleysoftware.com> Cc: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Hans de Goede 提交于
The usb-storage driver sets max_sectors = 240 in its scsi-host template, for uas we do not want to do that for all devices, but testing has shown that some devices need it. This commit adds a US_FL_MAX_SECTORS_240 flag for such devices, and implements support for it in uas.c, while at it it also adds support for US_FL_MAX_SECTORS_64 to uas.c. Cc: stable@vger.kernel.org # 3.16 Signed-off-by: NHans de Goede <hdegoede@redhat.com> Acked-by: NAlan Stern <stern@rowland.harvard.edu> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 27 4月, 2015 2 次提交
-
-
由 Paolo Bonzini 提交于
This reverts commits 0a4e6be9 and 80f7fdb1. The task migration notifier was originally introduced in order to support the pvclock vsyscall with non-synchronized TSC, but KVM only supports it with synchronized TSC. Hence, on KVM the race condition is only needed due to a bad implementation on the host side, and even then it's so rare that it's mostly theoretical. As far as KVM is concerned it's possible to fix the host, avoiding the additional complexity in the vDSO and the (re)introduction of the task migration notifier. Xen, on the other hand, hasn't yet implemented vsyscall support at all, so we do not care about its plans for non-synchronized TSC. Reported-by: NPeter Zijlstra <peterz@infradead.org> Suggested-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Matan Barak 提交于
The bonding modules currently defines four macros with general names that pollute the global namespace: DRV_VERSION DRV_RELDATE DRV_NAME DRV_DESCRIPTION Fixing that by defining a private bonding_priv.h header files which includes those defines. Signed-off-by: NMatan Barak <matanb@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 4月, 2015 1 次提交
-
-
由 Eric Dumazet 提交于
When I added pfmemalloc support in build_skb(), I forgot netlink was using build_skb() with a vmalloc() area. In this patch I introduce __build_skb() for netlink use, and build_skb() is a wrapper handling both skb->head_frag and skb->pfmemalloc This means netlink no longer has to hack skb->head_frag [ 1567.700067] kernel BUG at arch/x86/mm/physaddr.c:26! [ 1567.700067] invalid opcode: 0000 [#1] PREEMPT SMP KASAN [ 1567.700067] Dumping ftrace buffer: [ 1567.700067] (ftrace buffer empty) [ 1567.700067] Modules linked in: [ 1567.700067] CPU: 9 PID: 16186 Comm: trinity-c182 Not tainted 4.0.0-next-20150424-sasha-00037-g4796e21 #2167 [ 1567.700067] task: ffff880127efb000 ti: ffff880246770000 task.ti: ffff880246770000 [ 1567.700067] RIP: __phys_addr (arch/x86/mm/physaddr.c:26 (discriminator 3)) [ 1567.700067] RSP: 0018:ffff8802467779d8 EFLAGS: 00010202 [ 1567.700067] RAX: 000041000ed8e000 RBX: ffffc9008ed8e000 RCX: 000000000000002c [ 1567.700067] RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffffffffb3fd6049 [ 1567.700067] RBP: ffff8802467779f8 R08: 0000000000000019 R09: ffff8801d0168000 [ 1567.700067] R10: ffff8801d01680c7 R11: ffffed003a02d019 R12: ffffc9000ed8e000 [ 1567.700067] R13: 0000000000000f40 R14: 0000000000001180 R15: ffffc9000ed8e000 [ 1567.700067] FS: 00007f2a7da3f700(0000) GS:ffff8801d1000000(0000) knlGS:0000000000000000 [ 1567.700067] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1567.700067] CR2: 0000000000738308 CR3: 000000022e329000 CR4: 00000000000007e0 [ 1567.700067] Stack: [ 1567.700067] ffffc9000ed8e000 ffff8801d0168000 ffffc9000ed8e000 ffff8801d0168000 [ 1567.700067] ffff880246777a28 ffffffffad7c0a21 0000000000001080 ffff880246777c08 [ 1567.700067] ffff88060d302e68 ffff880246777b58 ffff880246777b88 ffffffffad9a6821 [ 1567.700067] Call Trace: [ 1567.700067] build_skb (include/linux/mm.h:508 net/core/skbuff.c:316) [ 1567.700067] netlink_sendmsg (net/netlink/af_netlink.c:1633 net/netlink/af_netlink.c:2329) [ 1567.774369] ? sched_clock_cpu (kernel/sched/clock.c:311) [ 1567.774369] ? netlink_unicast (net/netlink/af_netlink.c:2273) [ 1567.774369] ? netlink_unicast (net/netlink/af_netlink.c:2273) [ 1567.774369] sock_sendmsg (net/socket.c:614 net/socket.c:623) [ 1567.774369] sock_write_iter (net/socket.c:823) [ 1567.774369] ? sock_sendmsg (net/socket.c:806) [ 1567.774369] __vfs_write (fs/read_write.c:479 fs/read_write.c:491) [ 1567.774369] ? get_lock_stats (kernel/locking/lockdep.c:249) [ 1567.774369] ? default_llseek (fs/read_write.c:487) [ 1567.774369] ? vtime_account_user (kernel/sched/cputime.c:701) [ 1567.774369] ? rw_verify_area (fs/read_write.c:406 (discriminator 4)) [ 1567.774369] vfs_write (fs/read_write.c:539) [ 1567.774369] SyS_write (fs/read_write.c:586 fs/read_write.c:577) [ 1567.774369] ? SyS_read (fs/read_write.c:577) [ 1567.774369] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63) [ 1567.774369] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2594 kernel/locking/lockdep.c:2636) [ 1567.774369] ? trace_hardirqs_on_thunk (arch/x86/lib/thunk_64.S:42) [ 1567.774369] system_call_fastpath (arch/x86/kernel/entry_64.S:261) Fixes: 79930f58 ("net: do not deplete pfmemalloc reserve") Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: NSasha Levin <sasha.levin@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 4月, 2015 3 次提交
-
-
由 Eric Sandeen 提交于
I_DIO_WAKEUP is never directly used, but fix it up anyway. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Jens Axboe 提交于
do_blockdev_direct_IO() increments and decrements the inode ->i_dio_count for each IO operation. It does this to protect against truncate of a file. Block devices don't need this sort of protection. For a capable multiqueue setup, this atomic int is the only shared state between applications accessing the device for O_DIRECT, and it presents a scaling wall for that. In my testing, as much as 30% of system time is spent incrementing and decrementing this value. A mixed read/write workload improved from ~2.5M IOPS to ~9.6M IOPS, with better latencies too. Before: clat percentiles (usec): | 1.00th=[ 33], 5.00th=[ 34], 10.00th=[ 34], 20.00th=[ 34], | 30.00th=[ 34], 40.00th=[ 34], 50.00th=[ 35], 60.00th=[ 35], | 70.00th=[ 35], 80.00th=[ 35], 90.00th=[ 37], 95.00th=[ 80], | 99.00th=[ 98], 99.50th=[ 151], 99.90th=[ 155], 99.95th=[ 155], | 99.99th=[ 165] After: clat percentiles (usec): | 1.00th=[ 95], 5.00th=[ 108], 10.00th=[ 129], 20.00th=[ 149], | 30.00th=[ 155], 40.00th=[ 161], 50.00th=[ 167], 60.00th=[ 171], | 70.00th=[ 177], 80.00th=[ 185], 90.00th=[ 201], 95.00th=[ 270], | 99.00th=[ 390], 99.50th=[ 398], 99.90th=[ 418], 99.95th=[ 422], | 99.99th=[ 438] In other setups, Robert Elliott reported seeing good performance improvements: https://lkml.org/lkml/2015/4/3/557 The more applications accessing the device, the worse it gets. Add a new direct-io flags, DIO_SKIP_DIO_COUNT, which tells do_blockdev_direct_IO() that it need not worry about incrementing or decrementing the inode i_dio_count for this caller. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Elliott, Robert (Server Storage) <elliott@hp.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: NJens Axboe <axboe@fb.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Florian Westphal 提交于
Might not have an outdev yet. We'll oops when iface goes down while skbs are still nfqueue'd: RIP: 0010:[<ffffffff81422a2f>] [<ffffffff81422a2f>] dev_cmp+0x4f/0x80 nfqnl_rcv_dev_event+0xe2/0x150 notifier_call_chain+0x53/0xa0 Fixes: c737b7c4 ("netfilter: bridge: add helpers for fetching physin/outdev") Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
- 24 4月, 2015 6 次提交
-
-
由 Eric Dumazet 提交于
[ 3897.923145] BUG: unable to handle kernel NULL pointer dereference at 0000000000000080 [ 3897.931025] IP: [<ffffffffa9f27686>] reqsk_timer_handler+0x1a6/0x243 There is a race when reqsk_timer_handler() and tcp_check_req() call inet_csk_reqsk_queue_unlink() on the same req at the same time. Before commit fa76ce73 ("inet: get rid of central tcp/dccp listener timer"), listener spinlock was held and race could not happen. To solve this bug, we change reqsk_queue_unlink() to not assume req must be found, and we return a status, to conditionally release a refcount on the request sock. This also means tcp_check_req() in non fastopen case might or not consume req refcount, so tcp_v6_hnd_req() & tcp_v4_hnd_req() have to properly handle this. (Same remark for dccp_check_req() and its callers) inet_csk_reqsk_queue_drop() is now too big to be inlined, as it is called 4 times in tcp and 3 times in dccp. Fixes: fa76ce73 ("inet: get rid of central tcp/dccp listener timer") Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Johannes Berg 提交于
The conversion of mac80211's station table to rhashtable had a bug that I found by accident in code review, that hadn't been found as rhashtable apparently managed to have a maximum hash chain length of one (!) in all our testing. In order to test the bug and verify the fix I set my rhashtable's max_size very low (4) in order to force getting hash collisions. At that point, rhashtable WARNed in rhashtable_insert_rehash() but didn't actually reject the hash table insertion. This caused it to lose insertions - my master list of stations would have 9 entries, but the rhashtable only had 5. This may warrant a deeper look, but that WARN_ON() just shouldn't happen. Fix this by not returning true from rht_grow_above_100() when the rhashtable's max_size has been reached - in this case the user is explicitly configuring it to be at most that big, so even if it's now above 100% it shouldn't attempt to resize. This fixes the "lost insertion" issue and consequently allows my code to display its error (and verify my fix for it.) Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Acked-by: NThomas Graf <tgraf@suug.ch> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Anna Schumaker 提交于
This file is only used internally to the NFS v4 module, so it doesn't need to be in the global include path. I also renamed it from nfs_idmap.h to nfs4idmap.h to emphasize that it's an NFSv4-only include file. Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Anna Schumaker 提交于
The idmapper is completely internal to the NFS v4 module, so this macro will always evaluate to true. This patch also removes unnecessary includes of this file from the generic NFS client. Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Jeff Layton 提交于
v2: gracefully handle the case where some dentry pointers end up NULL and be more dilligent about zeroing out dentry pointers We currently have a problem that SELinux policy is being enforced when creating debugfs files. If a debugfs file is created as a side effect of doing some syscall, then that creation can fail if the SELinux policy for that process prevents it. This seems wrong. We don't do that for files under /proc, for instance, so Bruce has proposed a patch to fix that. While discussing that patch however, Greg K.H. stated: "No kernel code should care / fail if a debugfs function fails, so please fix up the sunrpc code first." This patch converts all of the sunrpc debugfs setup code to be void return functins, and the callers to not look for errors from those functions. This should allow rpc_clnt and rpc_xprt creation to work, even if the kernel fails to create debugfs files for some reason. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: N"J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: NJeff Layton <jeff.layton@primarydata.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Anna Schumaker 提交于
This patch adds a GETATTR to the end of ALLOCATE and DEALLOCATE operations so we can set the updated inode size and change attribute directly. DEALLOCATE will still need to release pagecache pages, so nfs42_proc_deallocate() now calls truncate_pagecache_range() before contacting the server. Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 23 4月, 2015 5 次提交
-
-
由 Johannes Berg 提交于
With the CPU iteration variable called 'i', it's relatively easy to have variable shadowing which sparse will warn about. Avoid that by renaming the variable to __cpu which is less likely to be used in the surrounding context. Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Martin Schwidefsky 提交于
Introduce KEXEC_CONTROL_MEMORY_GFP to allow the architecture code to override the gfp flags of the allocation for the kexec control page. The loop in kimage_alloc_normal_control_pages allocates pages with GFP_KERNEL until a page is found that happens to have an address smaller than the KEXEC_CONTROL_MEMORY_LIMIT. On systems with a large memory size but a small KEXEC_CONTROL_MEMORY_LIMIT the loop will keep allocating memory until the oom killer steps in. Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Charles Keepax 提交于
Correct small copy and paste error where autodisable was not being enabled for the SOC_DAPM_SINGLE_TLV_AUTODISABLE control. Signed-off-by: NCharles Keepax <ckeepax@opensource.wolfsonmicro.com> Signed-off-by: NMark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org
-
由 Robert Shearman 提交于
Add per-device MPLS state to supported interfaces. Use the presence of this state in mpls_route_add to determine that this is a supported interface. Use the presence of mpls_dev to drop packets that arrived on an unsupported interface - previously they were allowed through. Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: NRobert Shearman <rshearma@brocade.com> Reviewed-by: N"Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Linus Torvalds 提交于
This reverts commit 8d63d99a. It causes in VM mapping refcount errors: page:ffffea0010a15040 count:0 mapcount:1 mapping: (null) index:0x0 flags: 0x8000000000008014(referenced|dirty|tail) page dumped because: VM_BUG_ON_PAGE(page_mapcount(page) != 0) ------------[ cut here ]------------ kernel BUG at mm/swap.c:134! as reported by Borislav Petkov Reported-and-tested-by: NBorislav Petkov <bp@alien8.de> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Hugh Dickins <hughd@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linux-mm@kvack.org Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-