- 07 4月, 2016 1 次提交
-
-
由 Jiri Benc 提交于
Implement VXLAN-GPE. Only COLLECT_METADATA is supported for now (it is possible to support static configuration, too, if there is demand for it). The GPE header parsing has to be moved before iptunnel_pull_header, as we need to know the protocol. v2: Removed what was called "L2 mode" in v1 of the patchset. Only "L3 mode" (now called "raw mode") is added by this patch. This mode does not allow Ethernet header to be encapsulated in VXLAN-GPE when using ip route to specify the encapsulation, IP header is encapsulated instead. The patch does support Ethernet to be encapsulated, though, using ETH_P_TEB in skb->protocol. This will be utilized by other COLLECT_METADATA users (openvswitch in particular). If there is ever demand for Ethernet encapsulation with VXLAN-GPE using ip route, it's easy to add a new flag switching the interface to "Ethernet mode" (called "L2 mode" in v1 of this patchset). For now, leave this out, it seems we don't need it. Disallowed more flag combinations, especially RCO with GPE. Added comment explaining that GBP and GPE cannot be set together. Signed-off-by: NJiri Benc <jbenc@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 4月, 2016 2 次提交
-
-
由 Eric Dumazet 提交于
Reporting sk_drops to user space was available for UDP sockets using /proc interface. Add this to sock_diag, so that we can have the same information available to ss users, and we'll be able to add sk_drops indications for TCP sockets as well. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Soheil Hassas Yeganeh 提交于
Accept SO_TIMESTAMPING in control messages of the SOL_SOCKET level as a basis to accept timestamping requests per write. This implementation only accepts TX recording flags (i.e., SOF_TIMESTAMPING_TX_HARDWARE, SOF_TIMESTAMPING_TX_SOFTWARE, SOF_TIMESTAMPING_TX_SCHED, and SOF_TIMESTAMPING_TX_ACK) in control messages. Users need to set reporting flags (e.g., SOF_TIMESTAMPING_OPT_ID) per socket via socket options. This commit adds a tsflags field in sockcm_cookie which is set in __sock_cmsg_send. It only override the SOF_TIMESTAMPING_TX_* bits in sockcm_cookie.tsflags allowing the control message to override the recording behavior per write, yet maintaining the value of other flags. This patch implements validating the control message and setting tsflags in struct sockcm_cookie. Next commits in this series will actually implement timestamping per write for different protocols. Signed-off-by: NSoheil Hassas Yeganeh <soheil@google.com> Acked-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 3月, 2016 1 次提交
-
-
由 Daniel Borkmann 提交于
Make the 2 byte padding in struct bpf_tunnel_key between tunnel_ttl and tunnel_label members explicit. No issue has been observed, and gcc/llvm does padding for the old struct already, where tunnel_label was not yet present, so the current code works, but since it's part of uapi, make sure we don't introduce holes in structs. Therefore, add tunnel_ext that we can use generically in future (f.e. to flag OAM messages for backends, etc). Also add the offset to the compat tests to be sure should some compilers not padd the tail of the old version of bpf_tunnel_key. Fixes: 4018ab18 ("bpf: support flow label for bpf_skb_{set, get}_tunnel_key") Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 3月, 2016 3 次提交
-
-
由 Dmitry Vyukov 提交于
kcov provides code coverage collection for coverage-guided fuzzing (randomized testing). Coverage-guided fuzzing is a testing technique that uses coverage feedback to determine new interesting inputs to a system. A notable user-space example is AFL (http://lcamtuf.coredump.cx/afl/). However, this technique is not widely used for kernel testing due to missing compiler and kernel support. kcov does not aim to collect as much coverage as possible. It aims to collect more or less stable coverage that is function of syscall inputs. To achieve this goal it does not collect coverage in soft/hard interrupts and instrumentation of some inherently non-deterministic or non-interesting parts of kernel is disbled (e.g. scheduler, locking). Currently there is a single coverage collection mode (tracing), but the API anticipates additional collection modes. Initially I also implemented a second mode which exposes coverage in a fixed-size hash table of counters (what Quentin used in his original patch). I've dropped the second mode for simplicity. This patch adds the necessary support on kernel side. The complimentary compiler support was added in gcc revision 231296. We've used this support to build syzkaller system call fuzzer, which has found 90 kernel bugs in just 2 months: https://github.com/google/syzkaller/wiki/Found-Bugs We've also found 30+ bugs in our internal systems with syzkaller. Another (yet unexplored) direction where kcov coverage would greatly help is more traditional "blob mutation". For example, mounting a random blob as a filesystem, or receiving a random blob over wire. Why not gcov. Typical fuzzing loop looks as follows: (1) reset coverage, (2) execute a bit of code, (3) collect coverage, repeat. A typical coverage can be just a dozen of basic blocks (e.g. an invalid input). In such context gcov becomes prohibitively expensive as reset/collect coverage steps depend on total number of basic blocks/edges in program (in case of kernel it is about 2M). Cost of kcov depends only on number of executed basic blocks/edges. On top of that, kernel requires per-thread coverage because there are always background threads and unrelated processes that also produce coverage. With inlined gcov instrumentation per-thread coverage is not possible. kcov exposes kernel PCs and control flow to user-space which is insecure. But debugfs should not be mapped as user accessible. Based on a patch by Quentin Casasnovas. [akpm@linux-foundation.org: make task_struct.kcov_mode have type `enum kcov_mode'] [akpm@linux-foundation.org: unbreak allmodconfig] [akpm@linux-foundation.org: follow x86 Makefile layout standards] Signed-off-by: NDmitry Vyukov <dvyukov@google.com> Reviewed-by: NKees Cook <keescook@chromium.org> Cc: syzkaller <syzkaller@googlegroups.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Tavis Ormandy <taviso@google.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Kees Cook <keescook@google.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: David Drysdale <drysdale@google.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Alexandre Bounine 提交于
Add mport character device driver to provide user space interface to basic RapidIO subsystem operations. See included Documentation/rapidio/mport_cdev.txt for more details. [akpm@linux-foundation.org: fix printk warning on i386] [dan.carpenter@oracle.com: mport_cdev: fix some error codes] Signed-off-by: NAlexandre Bounine <alexandre.bounine@idt.com> Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Tested-by: NBarry Wood <barry.wood@idt.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com> Cc: Barry Wood <barry.wood@idt.com> Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Decotigny 提交于
Updates: commit 793cf87d ("ethtool: Set cmd field in ETHTOOL_GLINKSETTINGS response to wrong nwords") Signed-off-by: NDavid Decotigny <decot@googlers.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 3月, 2016 2 次提交
-
-
由 Eli Cohen 提交于
Add two new NLAs to support configuration of Infiniband node or port GUIDs. New applications can choose to use this interface to configure GUIDs with iproute2 with commands such as: ip link set dev ib0 vf 0 node_guid 00:02:c9:03:00:21:6e:70 ip link set dev ib0 vf 0 port_guid 00:02:c9:03:00:21:6e:78 A new ndo, ndo_sef_vf_guid is introduced to notify the net device of the request to change the GUID. Signed-off-by: NEli Cohen <eli@mellanox.com> Reviewed-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Eric Dumazet 提交于
It can be useful to report dev->gso_max_segs and dev->gso_max_size so that "ip -d link" can display them to help debugging. For the moment, these attributes are read-only. Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Petri Gynther <pgynther@google.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 3月, 2016 1 次提交
-
-
由 Zhang Shengju 提交于
Fix a comment typo. Signed-off-by: NZhang Shengju <zhangshengju@cmss.chinamobile.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 3月, 2016 4 次提交
-
-
由 Jaegeuk Kim 提交于
This patch adds the renamed functions moved from the f2fs crypto files. 1. definitions for per-file encryption used by ext4 and f2fs. 2. crypto.c for encrypt/decrypt functions a. IO preparation: - fscrypt_get_ctx / fscrypt_release_ctx b. before IOs: - fscrypt_encrypt_page - fscrypt_decrypt_page - fscrypt_zeroout_range c. after IOs: - fscrypt_decrypt_bio_pages - fscrypt_pullback_bio_page - fscrypt_restore_control_page 3. policy.c supporting context management. a. For ioctls: - fscrypt_process_policy - fscrypt_get_policy b. For context permission - fscrypt_has_permitted_context - fscrypt_inherit_context 4. keyinfo.c to handle permissions - fscrypt_get_encryption_info - fscrypt_free_encryption_info 5. fname.c to support filename encryption a. general wrapper functions - fscrypt_fname_disk_to_usr - fscrypt_fname_usr_to_disk - fscrypt_setup_filename - fscrypt_free_filename b. specific filename handling functions - fscrypt_fname_alloc_buffer - fscrypt_fname_free_buffer 6. Makefile and Kconfig Cc: Al Viro <viro@ftp.linux.org.uk> Signed-off-by: NMichael Halcrow <mhalcrow@google.com> Signed-off-by: NIldar Muslukhov <ildarm@google.com> Signed-off-by: NUday Savagaonkar <savagaon@google.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Denys Vlasenko 提交于
Sometimes gcc mysteriously doesn't inline very small functions we expect to be inlined. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66122 With this .config: http://busybox.net/~vda/kernel_config_OPTIMIZE_INLINING_and_Os, the following functions get deinlined many times. Examples of disassembly: <get_unaligned_be16> (12 copies, 51 calls): 66 8b 07 mov (%rdi),%ax 55 push %rbp 48 89 e5 mov %rsp,%rbp 86 e0 xchg %ah,%al 5d pop %rbp c3 retq <get_unaligned_be32> (12 copies, 135 calls): 8b 07 mov (%rdi),%eax 55 push %rbp 48 89 e5 mov %rsp,%rbp 0f c8 bswap %eax 5d pop %rbp c3 retq <get_unaligned_be64> (2 copies, 20 calls): 48 8b 07 mov (%rdi),%rax 55 push %rbp 48 89 e5 mov %rsp,%rbp 48 0f c8 bswap %rax 5d pop %rbp c3 retq <__swab16p> (16 copies, 146 calls): 55 push %rbp 89 f8 mov %edi,%eax 86 e0 xchg %ah,%al 48 89 e5 mov %rsp,%rbp 5d pop %rbp c3 retq <__swab32p> (43 copies, ~560 calls): 55 push %rbp 89 f8 mov %edi,%eax 0f c8 bswap %eax 48 89 e5 mov %rsp,%rbp 5d pop %rbp c3 retq <__swab64p> (21 copies, 119 calls): 55 push %rbp 48 89 f8 mov %rdi,%rax 48 0f c8 bswap %rax 48 89 e5 mov %rsp,%rbp 5d pop %rbp c3 retq <__swab32s> (6 copies, 47 calls): 8b 07 mov (%rdi),%eax 55 push %rbp 48 89 e5 mov %rsp,%rbp 0f c8 bswap %eax 89 07 mov %eax,(%rdi) 5d pop %rbp c3 retq This patch fixes this via s/inline/__always_inline/. Code size decrease after the patch is ~4.5k: text data bss dec hex filename 92202377 20826112 36417536 149446025 8e85d89 vmlinux 92197848 20826112 36417536 149441496 8e84bd8 vmlinux5_swap_after Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com> Acked-by: NIngo Molnar <mingo@kernel.org> Cc: Thomas Graf <tgraf@suug.ch> Cc: Peter Zijlstra <peterz@infradead.org> Cc: David Rientjes <rientjes@google.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Rob Landley 提交于
The v850 port was removed by commits f606ddf4 and 07a887d3 in 2008. These #defines are not used in the current kernel. Signed-off-by: NRob Landley <rob@landley.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Igor Redko 提交于
Add a new field, VIRTIO_BALLOON_S_AVAIL, to virtio_balloon memory statistics protocol, corresponding to 'Available' in /proc/meminfo. It indicates to the hypervisor how big the balloon can be inflated without pushing the guest system to swap. Signed-off-by: NIgor Redko <redkoi@virtuozzo.com> Signed-off-by: NDenis V. Lunev <den@openvz.org> Reviewed-by: NRoman Kagan <rkagan@virtuozzo.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 16 3月, 2016 2 次提交
-
-
由 Ian Kent 提交于
Fix some white space format errors. Signed-off-by: NIan Kent <raven@themaw.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ian Kent 提交于
Try and make the coding style completely consistent throughtout the autofs module and inline with kernel coding style recommendations. Signed-off-by: NIan Kent <raven@themaw.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 15 3月, 2016 3 次提交
-
-
由 Jarno Rajahalme 提交于
Extend OVS conntrack interface to cover NAT. New nested OVS_CT_ATTR_NAT attribute may be used to include NAT with a CT action. A bare OVS_CT_ATTR_NAT only mangles existing and expected connections. If OVS_NAT_ATTR_SRC or OVS_NAT_ATTR_DST is included within the nested attributes, new (non-committed/non-confirmed) connections are mangled according to the rest of the nested attributes. The corresponding OVS userspace patch series includes test cases (in tests/system-traffic.at) that also serve as example uses. This work extends on a branch by Thomas Graf at https://github.com/tgraf/ovs/tree/nat. Signed-off-by: NJarno Rajahalme <jarno@ovn.org> Acked-by: NThomas Graf <tgraf@suug.ch> Acked-by: NJoe Stringer <joe@ovn.org> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Jarno Rajahalme 提交于
Remove the definition of IP_CT_NEW_REPLY from the kernel as it does not make sense. This allows the definition of IP_CT_NUMBER to be simplified as well. Signed-off-by: NJarno Rajahalme <jarno@ovn.org> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Martin KaFai Lau 提交于
Per RFC4898, they count segments sent/received containing a positive length data segment (that includes retransmission segments carrying data). Unlike tcpi_segs_out/in, tcpi_data_segs_out/in excludes segments carrying no data (e.g. pure ack). The patch also updates the segs_in in tcp_fastopen_add_skb() so that segs_in >= data_segs_in property is kept. Together with retransmission data, tcpi_data_segs_out gives a better signal on the rxmit rate. v6: Rebase on the latest net-next v5: Eric pointed out that checking skb->len is still needed in tcp_fastopen_add_skb() because skb can carry a FIN without data. Hence, instead of open coding segs_in and data_segs_in, tcp_segs_in() helper is used. Comment is added to the fastopen case to explain why segs_in has to be reset and tcp_segs_in() has to be called before __skb_pull(). v4: Add comment to the changes in tcp_fastopen_add_skb() and also add remark on this case in the commit message. v3: Add const modifier to the skb parameter in tcp_segs_in() v2: Rework based on recent fix by Eric: commit a9d99ce2 ("tcp: fix tcpi_segs_in after connection establishment") Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Cc: Chris Rapier <rapier@psc.edu> Cc: Eric Dumazet <edumazet@google.com> Cc: Marcelo Ricardo Leitner <mleitner@redhat.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 3月, 2016 2 次提交
-
-
由 Sabrina Dubroca 提交于
Signed-off-by: NSabrina Dubroca <sd@queasysnail.net> Reviewed-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Zhang Shengju 提交于
This patch adds macro NETCONFA_ALL to represent all type of netconf attributes for IPv4 and IPv6. Signed-off-by: NZhang Shengju <zhangshengju@cmss.chinamobile.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 12 3月, 2016 3 次提交
-
-
由 Daniel Borkmann 提交于
This patch extends bpf_tunnel_key with a tunnel_label member, that maps to ip_tunnel_key's label so underlying backends like vxlan and geneve can propagate the label to udp_tunnel6_xmit_skb(), where it's being set in the IPv6 header. It allows for having 20 more bits to encode/decode flow related meta information programmatically. Tested with vxlan and geneve. Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
This work adds support for setting the IPv6 flow label for geneve per device and through collect metadata (ip_tunnel_key) frontends. Also here, the geneve dst cache does not need any special considerations, for the cases where caches can be used, the label is static per cache. Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
This work adds support for setting the IPv6 flow label for vxlan per device and through collect metadata (ip_tunnel_key) frontends. The vxlan dst cache does not need any special considerations here, for the cases where caches can be used, the label is static per cache. Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 3月, 2016 5 次提交
-
-
由 Sheng Yang 提交于
TCMU_MAILBOX_FLAG_CAP_OOOC was introduced, and userspace can check the flag for out-of-order completion capability support. Also update the document on how to use the feature. Signed-off-by: NSheng Yang <sheng@yasker.org> Reviewed-by: NAndy Grover <agrover@redhat.com> Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>
-
由 Jason Wang 提交于
This patch tries to poll for new added tx buffer or socket receive queue for a while at the end of tx/rx processing. The maximum time spent on polling were specified through a new kind of vring ioctl. Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Andrew Duggan 提交于
Synaptics uses the Register Mapped Interface (RMI) protocol as a communications interface for their devices. This driver adds the core functionality needed to interface with RMI4 devices. RMI devices can be connected to the host via several transport protocols and can supports a wide variety of functionality defined by RMI functions. Support for transport protocols and RMI functions are implemented in individual drivers. The RMI4 core driver uses a bus architecture to facilitate the various combinations of transport and function drivers needed by a particular device. Signed-off-by: NAndrew Duggan <aduggan@synaptics.com> Signed-off-by: NChristopher Heiny <cheiny@synaptics.com> Tested-by: NBenjamin Tissoires <benjamin.tissoires@redhat.com> Tested-by: NLinus Walleij <linus.walleij@linaro.org> Tested-by: NBjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: NDmitry Torokhov <dmitry.torokhov@gmail.com>
-
由 Amir Vadai 提交于
This patch is based on a patch made by John Fastabend. It adds support for offloading cls_flower. when NETIF_F_HW_TC is on: flags = 0 => Rule will be processed twice - by hardware, and if still relevant, by software. flags = SKIP_HW => Rull will be processed by software only If hardware fail/not capabale to apply the rule, operation will NOT fail. Filter will be processed by SW only. Acked-by: NJiri Pirko <jiri@mellanox.com> Suggested-by: NJohn Fastabend <john.r.fastabend@intel.com> Signed-off-by: NAmir Vadai <amir@vadai.me> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Mauro Carvalho Chehab 提交于
The legacy media controller userspace API exposes entity types that carry both type and function information. The new API replaces the type with a function. It preserves backward compatibility by defining legacy functions for the existing types and using them in drivers. This works fine, as long as newer entity functions won't be added. Unfortunately, some tools, like media-ctl with --print-dot argument rely on the now legacy MEDIA_ENT_T_V4L2_SUBDEV and MEDIA_ENT_T_DEVNODE numeric ranges to identify what entities will be shown. Also, if the entity doesn't match those ranges, it will ignore the major/minor information on devnodes, and won't be getting the devnode name via udev or sysfs. As we're now adding devices outside the old range, the legacy ioctl needs to map the new entity functions into a type at the old range, or otherwise we'll have a regression. Detected on all released media-ctl versions (e. g. versions <= 1.10). Fix this by deriving the type from the function to emulate the legacy API if the function isn't in the legacy functions range. Reported-by: NLaurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: NMauro Carvalho Chehab <mchehab@osg.samsung.com>
-
- 10 3月, 2016 2 次提交
-
-
由 Linus Walleij 提交于
The previous 'o' is in conflict and not very orderly assigned. We want to select an ioctl() major that does not conflict with the existining ones. Add the new reserved major (0xB4) to Documentation/ioctl/ioctl-number.txt Fixes: 3c702e99 ("gpio: add a userspace chardev ABI for GPIOs") Suggested-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NLinus Walleij <linus.walleij@linaro.org>
-
由 Tom Herbert 提交于
This module implements the Kernel Connection Multiplexor. Kernel Connection Multiplexor (KCM) is a facility that provides a message based interface over TCP for generic application protocols. With KCM an application can efficiently send and receive application protocol messages over TCP using datagram sockets. For more information see the included Documentation/networking/kcm.txt Signed-off-by: NTom Herbert <tom@herbertland.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 3月, 2016 4 次提交
-
-
由 Alexei Starovoitov 提交于
If kprobe is placed on spin_unlock then calling kmalloc/kfree from bpf programs is not safe, since the following dead lock is possible: kfree->spin_lock(kmem_cache_node->lock)...spin_unlock->kprobe-> bpf_prog->map_update->kmalloc->spin_lock(of the same kmem_cache_node->lock) and deadlocks. The following solutions were considered and some implemented, but eventually discarded - kmem_cache_create for every map - add recursion check to slow-path of slub - use reserved memory in bpf_map_update for in_irq or in preempt_disabled - kmalloc via irq_work At the end pre-allocation of all map elements turned out to be the simplest solution and since the user is charged upfront for all the memory, such pre-allocation doesn't affect the user space visible behavior. Since it's impossible to tell whether kprobe is triggered in a safe location from kmalloc point of view, use pre-allocation by default and introduce new BPF_F_NO_PREALLOC flag. While testing of per-cpu hash maps it was discovered that alloc_percpu(GFP_ATOMIC) has odd corner cases and often fails to allocate memory even when 90% of it is free. The pre-allocation of per-cpu hash elements solves this problem as well. Turned out that bpf_map_update() quickly followed by bpf_map_lookup()+bpf_map_delete() is very common pattern used in many of iovisor/bcc/tools, so there is additional benefit of pre-allocation, since such use cases are must faster. Since all hash map elements are now pre-allocated we can remove atomic increment of htab->count and save few more cycles. Also add bpf_map_precharge_memlock() to check rlimit_memlock early to avoid large malloc/free done by users who don't have sufficient limits. Pre-allocation is done with vmalloc and alloc/free is done via percpu_freelist. Here are performance numbers for different pre-allocation algorithms that were implemented, but discarded in favor of percpu_freelist: 1 cpu: pcpu_ida 2.1M pcpu_ida nolock 2.3M bt 2.4M kmalloc 1.8M hlist+spinlock 2.3M pcpu_freelist 2.6M 4 cpu: pcpu_ida 1.5M pcpu_ida nolock 1.8M bt w/smp_align 1.7M bt no/smp_align 1.1M kmalloc 0.7M hlist+spinlock 0.2M pcpu_freelist 2.0M 8 cpu: pcpu_ida 0.7M bt w/smp_align 0.8M kmalloc 0.4M pcpu_freelist 1.5M 32 cpu: kmalloc 0.13M pcpu_freelist 0.49M pcpu_ida nolock is a modified percpu_ida algorithm without percpu_ida_cpu locks and without cross-cpu tag stealing. It's faster than existing percpu_ida, but not as fast as pcpu_freelist. bt is a variant of block/blk-mq-tag.c simlified and customized for bpf use case. bt w/smp_align is using cache line for every 'long' (similar to blk-mq-tag). bt no/smp_align allocates 'long' bitmasks continuously to save memory. It's comparable to percpu_ida and in some cases faster, but slower than percpu_freelist hlist+spinlock is the simplest free list with single spinlock. As expeceted it has very bad scaling in SMP. kmalloc is existing implementation which is still available via BPF_F_NO_PREALLOC flag. It's significantly slower in single cpu and in 8 cpu setup it's 3 times slower than pre-allocation with pcpu_freelist, but saves memory, so in cases where map->max_entries can be large and number of map update/delete per second is low, it may make sense to use it. Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
After eBPF being able to programmatically access/manage tunnel key meta data via commit d3aa45ce ("bpf: add helpers to access tunnel metadata") and more recently also for IPv6 through c6c33454 ("bpf: support ipv6 for bpf_skb_{set,get}_tunnel_key"), this work adds two complementary helpers to generically access their auxiliary tunnel options. Geneve and vxlan support this facility. For geneve, TLVs can be pushed, and for the vxlan case its GBP extension. I.e. setting tunnel key for geneve case only makes sense, if we can also read/write TLVs into it. In the GBP case, it provides the flexibility to easily map the group policy ID in combination with other helpers or maps. I chose to model this as two separate helpers, bpf_skb_{set,get}_tunnel_opt(), for a couple of reasons. bpf_skb_{set,get}_tunnel_key() is already rather complex by itself, and there may be cases for tunnel key backends where tunnel options are not always needed. If we would have integrated this into bpf_skb_{set,get}_tunnel_key() nevertheless, we are very limited with remaining helper arguments, so keeping compatibility on structs in case of passing in a flat buffer gets more cumbersome. Separating both also allows for more flexibility and future extensibility, f.e. options could be fed directly from a map, etc. Moreover, change geneve's xmit path to test only for info->options_len instead of TUNNEL_GENEVE_OPT flag. This makes it more consistent with vxlan's xmit path and allows for avoiding to specify a protocol flag in the API on xmit, so it can be protocol agnostic. Having info->options_len is enough information that is needed. Tested with vxlan and geneve. Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
Added by 9a628224 ("ip_tunnel: Add dont fragment flag."), allow to feed df flag into tunneling facilities (currently supported on TX by vxlan, geneve and gre) as a hint from eBPF's bpf_skb_set_tunnel_key() helper. Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
When overwriting parts of the packet with bpf_skb_store_bytes() that were fed previously into skb->hash calculation, we should clear the current hash with skb_clear_hash(), so that a next skb_get_hash() call can determine the correct hash related to this skb. Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 3月, 2016 1 次提交
-
-
由 Wilson Ding 提交于
Armada-3700's uart is a simple serial port, which doesn't support. Configuring the modem control lines. The uart port has a 32 bytes Tx FIFO and a 64 bytes Rx FIFO The uart driver implements the uart core operations. It also support the system (early) console based on Armada-3700's serial port. Known Issue: The uart driver currently doesn't support clock programming, which means the baud-rate stays with the default value configured by the bootloader at boot time [gregory.clement@free-electrons.com: Rewrite many part which are too long to enumerate] Signed-off-by: NWilson Ding <dingwei@marvell.com> Signed-off-by: NNadav Haklai <nadavh@marvell.com> Signed-off-by: NGregory CLEMENT <gregory.clement@free-electrons.com> Acked-by: NRob Herring <robh@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 06 3月, 2016 2 次提交
-
-
由 Dan Williams 提交于
Add the boiler-plate for a 'clear error' command based on section 9.20.7.6 "Function Index 4 - Clear Uncorrectable Error" from the ACPI 6.1 specification, and add a reference implementation in nfit_test. Reviewed-by: NVishal Verma <vishal.l.verma@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Reilly Grant 提交于
The new USBDEVFS_DROP_PRIVILEGES ioctl allows a process to voluntarily relinquish the ability to issue other ioctls that may interfere with other processes and drivers that have claimed an interface on the device. This commit also includes a simple utility to be able to test the ioctl, located at Documentation/usb/usbdevfs-drop-permissions.c Example (with qemu-kvm's input device): $ lsusb ... Bus 001 Device 002: ID 0627:0001 Adomax Technology Co., Ltd $ usb-devices ... C: #Ifs= 1 Cfg#= 1 Atr=a0 MxPwr=100mA I: If#= 0 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=00 Prot=02 Driver=usbhid $ sudo ./usbdevfs-drop-permissions /dev/bus/usb/001/002 OK: privileges dropped! Available options: [0] Exit now [1] Reset device. Should fail if device is in use [2] Claim 4 interfaces. Should succeed where not in use [3] Narrow interface permission mask Which option shall I run?: 1 ERROR: USBDEVFS_RESET failed! (1 - Operation not permitted) Which test shall I run next?: 2 ERROR claiming if 0 (1 - Operation not permitted) ERROR claiming if 1 (1 - Operation not permitted) ERROR claiming if 2 (1 - Operation not permitted) ERROR claiming if 3 (1 - Operation not permitted) Which test shall I run next?: 0 After unbinding usbhid: $ usb-devices ... I: If#= 0 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=00 Prot=02 Driver=(none) $ sudo ./usbdevfs-drop-permissions /dev/bus/usb/001/002 ... Which option shall I run?: 2 OK: claimed if 0 ERROR claiming if 1 (1 - Operation not permitted) ERROR claiming if 2 (1 - Operation not permitted) ERROR claiming if 3 (1 - Operation not permitted) Which test shall I run next?: 1 OK: USBDEVFS_RESET succeeded Which test shall I run next?: 0 After unbinding usbhid and restricting the mask: $ sudo ./usbdevfs-drop-permissions /dev/bus/usb/001/002 ... Which option shall I run?: 3 Insert new mask: 0 OK: privileges dropped! Which test shall I run next?: 2 ERROR claiming if 0 (1 - Operation not permitted) ERROR claiming if 1 (1 - Operation not permitted) ERROR claiming if 2 (1 - Operation not permitted) ERROR claiming if 3 (1 - Operation not permitted) Signed-off-by: NReilly Grant <reillyg@chromium.org> Acked-by: NAlan Stern <stern@rowland.harvard.edu> Signed-off-by: NEmilio López <emilio.lopez@collabora.co.uk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 05 3月, 2016 2 次提交
-
-
由 Nicolas Dichtel 提交于
INT_MAX needs limits.h in userland. When ethtool.h is included by a userland app, we got the following error: .../usr/include/linux/ethtool.h: In function 'ethtool_validate_speed': .../usr/include/linux/ethtool.h:1471:18: error: 'INT_MAX' undeclared (first use in this function) return speed <= INT_MAX || speed == SPEED_UNKNOWN ^ Fixes: e02564ee ("ethtool: make validate_speed accept all speeds between 0 and INT_MAX") CC: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Nicolas Dichtel 提交于
DIV_ROUND_UP is defined in linux/kernel.h only for the kernel. When ethtool.h is included by a userland app, we got the following error: include/linux/ethtool.h:1218:8: error: variably modified 'queue_mask' at file scope __u32 queue_mask[DIV_ROUND_UP(MAX_NUM_QUEUE, 32)]; ^ Let's add a common definition in uapi and use it everywhere. Fixes: ac2c7ad0 ("net/ethtool: introduce a new ioctl for per queue setting") CC: Kan Liang <kan.liang@intel.com> Suggested-by: NBen Hutchings <ben@decadent.org.uk> Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-