1. 20 Apr, 2016 2 commits
  2. 16 Apr, 2016 2 commits
    • sctp: add the sctp_diag.c file · 8f840e47
      Committed by Xin Long
      This patch implements the whole inet_diag interface, inet_diag_handler,
      which includes sctp_diag_dump, sctp_diag_dump_one and sctp_diag_get_info.
      
      It works as a module and registers inet_diag_handler when loaded.
      
      v2->v3:
      - fix the mistake in inet_assoc_attr_size().
      
      - change inet_diag_msg_laddrs_fill() name to inet_diag_msg_sctpladdrs_fill.
      
      - change inet_diag_msg_paddrs_fill() name to inet_diag_msg_sctpaddrs_fill.
      
      - add inet_diag_msg_sctpinfo_fill() to make asoc/ep fill code clearer.
      
      - add inet_diag_msg_sctpasoc_fill() to make asoc fill code clearer.
      
      - merge inet_asoc_diag_fill() and inet_ep_diag_fill() to
        inet_sctp_diag_fill().
      
      - call sctp_diag_get_info() directly instead of via the handler, because
        the caller is in the same file.
      
      - call lock_sock in sctp_tsp_dump_one() to make sure we get the sctp info
        safely.
      
      - after lock_sock(sk), we should check sk != assoc->base.sk.
      
      - change mem[SK_MEMINFO_WMEM_ALLOC] to asoc->sndbuf_used for asoc dump when
        asoc->ep->sndbuf_policy is set. Don't use the INET_DIAG_MEMINFO attr any more.
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/hsr: Added support for HSR v1 · ee1c2797
      Committed by Peter Heise
      This patch adds support for the newer version 1 of the HSR
      networking standard. Version 0 is still the default and the new
      version has to be selected via iproute2.
      
      Main changes are in the supervision frame handling and its
      ethertype field.
      Signed-off-by: Peter Heise <peter.heise@airbus.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 15 Apr, 2016 2 commits
  4. 12 Apr, 2016 2 commits
  5. 08 Apr, 2016 1 commit
    • perf, bpf: allow bpf programs attach to tracepoints · 98b5c2c6
      Committed by Alexei Starovoitov
      Introduce the BPF_PROG_TYPE_TRACEPOINT program type and allow it to be
      attached to the perf tracepoint handler, which will copy the arguments into
      the per-cpu buffer and pass it to the bpf program as its first argument.
      The layout of the fields can be discovered by doing
      'cat /sys/kernel/debug/tracing/events/sched/sched_switch/format'
      prior to the compilation of the program, with the exception that the first
      8 bytes are reserved and not accessible to the program. This area is used to
      store the pointer to 'struct pt_regs' which some of the bpf helpers will use:
      +---------+
      | 8 bytes | hidden 'struct pt_regs *' (inaccessible to bpf program)
      +---------+
      | N bytes | static tracepoint fields defined in tracepoint/format (bpf readonly)
      +---------+
      | dynamic | __dynamic_array bytes of tracepoint (inaccessible to bpf yet)
      +---------+
      
      Note that all of the fields are already dumped to user space via the perf
      ring buffer, and broken applications access them directly without consulting
      tracepoint/format. The same rule applies here: static tracepoint fields
      should only be accessed in the format defined in tracepoint/format. The order
      of fields and field sizes are not an ABI.
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
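The buffer layout described in the tracepoint commit above can be mirrored in a small Python sketch using ctypes. This is only an illustration: the field names and sizes after the 8 hidden bytes are assumptions modelled on a typical sched_switch format, and `SchedSwitchCtx` and `field_offsets` are hypothetical names. The format file on the target kernel remains the only authoritative source.

```python
import ctypes


class SchedSwitchCtx(ctypes.Structure):
    """Assumed view of the sched_switch buffer handed to a tracepoint program."""

    _fields_ = [
        # First 8 bytes: hidden 'struct pt_regs *', inaccessible to the program.
        ("hidden_regs", ctypes.c_uint64),
        # Static, read-only fields. Names and sizes are assumptions; always
        # check them against tracepoint/format on the running kernel.
        ("prev_comm", ctypes.c_char * 16),
        ("prev_pid", ctypes.c_int32),
        ("prev_prio", ctypes.c_int32),
        ("prev_state", ctypes.c_int64),
        ("next_comm", ctypes.c_char * 16),
        ("next_pid", ctypes.c_int32),
        ("next_prio", ctypes.c_int32),
    ]


def field_offsets(struct_type):
    """Return {field name: byte offset} for a ctypes structure."""
    return {name: getattr(struct_type, name).offset
            for name, _ in struct_type._fields_}


if __name__ == "__main__":
    for name, offset in field_offsets(SchedSwitchCtx).items():
        print(f"{offset:3d}  {name}")
```

Printing the offsets makes the point of the commit message concrete: every static field starts after the reserved 8 bytes, and a program that hard-codes these offsets breaks as soon as the format changes.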
  6. 07 Apr, 2016 2 commits
    • virtio: add VIRTIO_CONFIG_S_NEEDS_RESET device status bit · c00bbcf8
      Committed by Stefan Hajnoczi
      The VIRTIO 1.0 specification added the DEVICE_NEEDS_RESET device status
      bit in "VIRTIO-98: Add DEVICE_NEEDS_RESET".  This patch defines the
      device status bit in the uapi header file so that both the kernel and
      userspace applications can use it.
      
      The bit is currently unused by the virtio guest drivers and vhost.
      According to the spec "a good implementation will try to recover by
      issuing a reset".  This is not attempted here because it requires
      auditing the virtio drivers to ensure there are no resource leaks or
      crashes if the device needs to be reset mid-operation.
      
      See "2.1 Device Status Field" in the VIRTIO 1.0 specification for
      details.
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
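As a rough illustration of how userspace might test the new status bit from the virtio commit above, the sketch below decodes a device status byte. The bit values follow the VIRTIO 1.0 device status field. The helper names `device_needs_reset` and `decode_status` are hypothetical and are not part of the kernel header.

```python
# Device status bits as defined by the VIRTIO 1.0 specification.
VIRTIO_CONFIG_S_ACKNOWLEDGE = 0x01
VIRTIO_CONFIG_S_DRIVER = 0x02
VIRTIO_CONFIG_S_DRIVER_OK = 0x04
VIRTIO_CONFIG_S_FEATURES_OK = 0x08
VIRTIO_CONFIG_S_NEEDS_RESET = 0x40  # the bit this commit exposes
VIRTIO_CONFIG_S_FAILED = 0x80

STATUS_BITS = {
    VIRTIO_CONFIG_S_ACKNOWLEDGE: "ACKNOWLEDGE",
    VIRTIO_CONFIG_S_DRIVER: "DRIVER",
    VIRTIO_CONFIG_S_DRIVER_OK: "DRIVER_OK",
    VIRTIO_CONFIG_S_FEATURES_OK: "FEATURES_OK",
    VIRTIO_CONFIG_S_NEEDS_RESET: "NEEDS_RESET",
    VIRTIO_CONFIG_S_FAILED: "FAILED",
}


def device_needs_reset(status: int) -> bool:
    """True if the device has flagged that it needs a reset."""
    return bool(status & VIRTIO_CONFIG_S_NEEDS_RESET)


def decode_status(status: int) -> list:
    """Return the names of all status bits set in the status byte."""
    return [name for bit, name in STATUS_BITS.items() if status & bit]


if __name__ == "__main__":
    status = 0x4F  # hypothetical value read from a device
    print(decode_status(status), device_needs_reset(status))
```

The sketch only reports the bit. Consistent with the commit message, it makes no attempt to recover by resetting the device.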
    • vxlan: implement GPE · e1e5314d
      Committed by Jiri Benc
      Implement VXLAN-GPE. Only COLLECT_METADATA is supported for now (it is
      possible to support static configuration, too, if there is demand for it).
      
      The GPE header parsing has to be moved before iptunnel_pull_header, as we
      need to know the protocol.
      
      v2: Removed what was called "L2 mode" in v1 of the patchset. Only "L3 mode"
          (now called "raw mode") is added by this patch. This mode does not allow
          the Ethernet header to be encapsulated in VXLAN-GPE when using ip route
          to specify the encapsulation; the IP header is encapsulated instead. The
          patch does support Ethernet encapsulation, though, using ETH_P_TEB in
          skb->protocol. This will be utilized by other COLLECT_METADATA users
          (openvswitch in particular).
      
          If there is ever demand for Ethernet encapsulation with VXLAN-GPE using
          ip route, it's easy to add a new flag switching the interface to
          "Ethernet mode" (called "L2 mode" in v1 of this patchset). For now,
          leave this out, it seems we don't need it.
      
          Disallowed more flag combinations, especially RCO with GPE.
          Added comment explaining that GBP and GPE cannot be set together.
      Signed-off-by: Jiri Benc <jbenc@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 06 Apr, 2016 1 commit
  8. 05 Apr, 2016 4 commits
  9. 31 Mar, 2016 1 commit
    • bpf: make padding in bpf_tunnel_key explicit · c0e760c9
      Committed by Daniel Borkmann
      Make the 2 byte padding in struct bpf_tunnel_key between the tunnel_ttl
      and tunnel_label members explicit. No issue has been observed, and
      gcc/llvm already pad the old struct, where tunnel_label was not yet
      present, so the current code works, but since it's part of uapi, make
      sure we don't introduce holes in structs.
      
      Therefore, add tunnel_ext that we can use generically in future
      (e.g. to flag OAM messages for backends, etc). Also add the offset
      to the compat tests to be sure, should some compilers not pad the
      tail of the old version of bpf_tunnel_key.
      
      Fixes: 4018ab18 ("bpf: support flow label for bpf_skb_{set, get}_tunnel_key")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
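The effect of the explicit padding in the bpf_tunnel_key commit above can be checked with a short ctypes sketch. It compares a struct with an implicit 2 byte hole against one with an explicit `tunnel_ext` member. The class names are hypothetical and the member list is reconstructed from the commit message, so treat it as an approximation of the uapi struct rather than a copy.

```python
import ctypes


class RemoteAddr(ctypes.Union):
    _fields_ = [
        ("remote_ipv4", ctypes.c_uint32),
        ("remote_ipv6", ctypes.c_uint32 * 4),
    ]


class TunnelKeyImplicit(ctypes.Structure):
    """Layout with a compiler-inserted hole after tunnel_ttl."""
    _anonymous_ = ("remote",)
    _fields_ = [
        ("tunnel_id", ctypes.c_uint32),
        ("remote", RemoteAddr),
        ("tunnel_tos", ctypes.c_uint8),
        ("tunnel_ttl", ctypes.c_uint8),
        # 2 bytes of implicit padding land here on common ABIs.
        ("tunnel_label", ctypes.c_uint32),
    ]


class TunnelKeyExplicit(ctypes.Structure):
    """Layout with the hole named as tunnel_ext."""
    _anonymous_ = ("remote",)
    _fields_ = [
        ("tunnel_id", ctypes.c_uint32),
        ("remote", RemoteAddr),
        ("tunnel_tos", ctypes.c_uint8),
        ("tunnel_ttl", ctypes.c_uint8),
        ("tunnel_ext", ctypes.c_uint16),
        ("tunnel_label", ctypes.c_uint32),
    ]


if __name__ == "__main__":
    for cls in (TunnelKeyImplicit, TunnelKeyExplicit):
        print(cls.__name__, "tunnel_label at", cls.tunnel_label.offset,
              "size", ctypes.sizeof(cls))
```

With natural alignment both variants place `tunnel_label` at the same offset and have the same size, which is why the commit can name the hole without changing the layout seen by existing programs.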
  10. 30 Mar, 2016 1 commit
  11. 29 Mar, 2016 2 commits
  12. 23 Mar, 2016 3 commits
    • kernel: add kcov code coverage · 5c9a8750
      Committed by Dmitry Vyukov
      kcov provides code coverage collection for coverage-guided fuzzing
      (randomized testing).  Coverage-guided fuzzing is a testing technique
      that uses coverage feedback to determine new interesting inputs to a
      system.  A notable user-space example is AFL
      (http://lcamtuf.coredump.cx/afl/).  However, this technique is not
      widely used for kernel testing due to missing compiler and kernel
      support.
      
      kcov does not aim to collect as much coverage as possible.  It aims to
      collect more or less stable coverage that is a function of syscall inputs.
      To achieve this goal it does not collect coverage in soft/hard
      interrupts, and instrumentation of some inherently non-deterministic or
      non-interesting parts of the kernel is disabled (e.g.  scheduler, locking).
      
      Currently there is a single coverage collection mode (tracing), but the
      API anticipates additional collection modes.  Initially I also
      implemented a second mode which exposes coverage in a fixed-size hash
      table of counters (what Quentin used in his original patch).  I've
      dropped the second mode for simplicity.
      
      This patch adds the necessary support on the kernel side.  The
      complementary compiler support was added in gcc revision 231296.
      
      We've used this support to build syzkaller system call fuzzer, which has
      found 90 kernel bugs in just 2 months:
      
        https://github.com/google/syzkaller/wiki/Found-Bugs
      
      We've also found 30+ bugs in our internal systems with syzkaller.
      Another (yet unexplored) direction where kcov coverage would greatly
      help is more traditional "blob mutation".  For example, mounting a
      random blob as a filesystem, or receiving a random blob over wire.
      
      Why not gcov?  A typical fuzzing loop looks as follows: (1) reset
      coverage, (2) execute a bit of code, (3) collect coverage, repeat.  A
      typical coverage can be just a dozen basic blocks (e.g.  an invalid
      input).  In such a context gcov becomes prohibitively expensive, as the
      reset/collect coverage steps depend on the total number of basic
      blocks/edges in the program (in the case of the kernel it is about 2M).
      The cost of kcov depends only on the number of executed basic
      blocks/edges.  On top of that, the kernel requires per-thread coverage
      because there are always background threads and unrelated processes that
      also produce coverage.  With inlined gcov instrumentation per-thread
      coverage is not possible.
      
      kcov exposes kernel PCs and control flow to user-space which is
      insecure.  But debugfs should not be mapped as user accessible.
      
      Based on a patch by Quentin Casasnovas.
      
      [akpm@linux-foundation.org: make task_struct.kcov_mode have type `enum kcov_mode']
      [akpm@linux-foundation.org: unbreak allmodconfig]
      [akpm@linux-foundation.org: follow x86 Makefile layout standards]
      Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Cc: syzkaller <syzkaller@googlegroups.com>
      Cc: Vegard Nossum <vegard.nossum@oracle.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Tavis Ormandy <taviso@google.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Kostya Serebryany <kcc@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Kees Cook <keescook@google.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: David Drysdale <drysdale@google.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
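The reset/execute/collect loop from the kcov commit above can be modelled with a toy trace buffer. This is a pure simulation, not the kcov interface: `TraceBuffer`, `fuzz_once` and `toy_target` are hypothetical names, and the "count in word 0, PCs after it" layout is an assumption about how a trace-mode buffer is organised. It shows why the cost of each iteration depends only on the number of executed blocks.

```python
class TraceBuffer:
    """Toy per-thread trace buffer: word 0 holds a count, PCs follow."""

    def __init__(self, capacity_words):
        self.words = [0] * capacity_words

    def reset(self):
        # Constant cost: only the count word is cleared, whatever the
        # total number of basic blocks in the instrumented program.
        self.words[0] = 0

    def record(self, pc):
        # Called once per executed basic block; PCs that do not fit
        # in the buffer are silently dropped.
        pos = self.words[0] + 1
        if pos < len(self.words):
            self.words[pos] = pc
            self.words[0] = pos

    def collect(self):
        # Cost proportional to the number of executed blocks.
        count = self.words[0]
        return self.words[1:count + 1]


def fuzz_once(buf, target, data):
    buf.reset()                # (1) reset coverage
    target(buf, data)          # (2) execute a bit of code
    return set(buf.collect())  # (3) collect coverage


def toy_target(buf, data):
    """Stand-in for instrumented code; the PC values are made up."""
    buf.record(0x1000)
    if data.startswith(b"A"):
        buf.record(0x2000)
        if len(data) > 4:
            buf.record(0x3000)


if __name__ == "__main__":
    buf = TraceBuffer(64)
    seen = set()
    for data in (b"zzz", b"Ab", b"ABCDE"):
        new = fuzz_once(buf, toy_target, data) - seen
        seen |= new
        print(data, "new coverage:", sorted(hex(pc) for pc in new))
```

An input that yields new PCs is what a coverage-guided fuzzer would keep and mutate further. Inputs that add nothing are discarded.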
    • rapidio: add mport char device driver · e8de3701
      Committed by Alexandre Bounine
      Add mport character device driver to provide user space interface to
      basic RapidIO subsystem operations.
      
      See included Documentation/rapidio/mport_cdev.txt for more details.
      
      [akpm@linux-foundation.org: fix printk warning on i386]
      [dan.carpenter@oracle.com: mport_cdev: fix some error codes]
      Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Tested-by: Barry Wood <barry.wood@idt.com>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
      Cc: Barry Wood <barry.wood@idt.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ethtool: minor doc update · 5f2d4724
      Committed by David Decotigny
      Updates: commit 793cf87d ("ethtool: Set cmd field in
               ETHTOOL_GLINKSETTINGS response to wrong nwords")
      Signed-off-by: David Decotigny <decot@googlers.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  13. 22 Mar, 2016 2 commits
  14. 19 Mar, 2016 1 commit
  15. 18 Mar, 2016 4 commits
  16. 16 Mar, 2016 2 commits
  17. 15 Mar, 2016 3 commits
    • openvswitch: Interface with NAT. · 05752523
      Committed by Jarno Rajahalme
      Extend OVS conntrack interface to cover NAT.  New nested
      OVS_CT_ATTR_NAT attribute may be used to include NAT with a CT action.
      A bare OVS_CT_ATTR_NAT only mangles existing and expected connections.
      If OVS_NAT_ATTR_SRC or OVS_NAT_ATTR_DST is included within the nested
      attributes, new (non-committed/non-confirmed) connections are mangled
      according to the rest of the nested attributes.
      
      The corresponding OVS userspace patch series includes test cases (in
      tests/system-traffic.at) that also serve as example uses.
      
      This work extends on a branch by Thomas Graf at
      https://github.com/tgraf/ovs/tree/nat.
      Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
      Acked-by: Thomas Graf <tgraf@suug.ch>
      Acked-by: Joe Stringer <joe@ovn.org>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: Remove IP_CT_NEW_REPLY definition. · bfa3f9d7
      Committed by Jarno Rajahalme
      Remove the definition of IP_CT_NEW_REPLY from the kernel as it does
      not make sense.  This allows the definition of IP_CT_NUMBER to be
      simplified as well.
      Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
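A small sketch can show how dropping IP_CT_NEW_REPLY, as in the netfilter commit above, simplifies IP_CT_NUMBER. The constants below are an assumption reconstructed from the conntrack uapi enum around this commit and should be verified against the actual header. The names prefixed with `OLD_` and `NEW_` and the `is_reply` helper are hypothetical.

```python
# Assumed conntrack info values (verify against the uapi header).
IP_CT_ESTABLISHED = 0
IP_CT_RELATED = 1
IP_CT_NEW = 2
IP_CT_IS_REPLY = 3  # values at or above this are in the reply direction
IP_CT_ESTABLISHED_REPLY = IP_CT_ESTABLISHED + IP_CT_IS_REPLY
IP_CT_RELATED_REPLY = IP_CT_RELATED + IP_CT_IS_REPLY

# Before the commit: a NEW state in the reply direction was defined,
# and the count had to be computed to exclude it.
OLD_IP_CT_NEW_REPLY = IP_CT_NEW + IP_CT_IS_REPLY
OLD_IP_CT_NUMBER = IP_CT_IS_REPLY * 2 - 1

# After the commit: the count simply follows the last valid value.
NEW_IP_CT_NUMBER = IP_CT_RELATED_REPLY + 1


def is_reply(ctinfo: int) -> bool:
    """True if the conntrack info value denotes the reply direction."""
    return ctinfo >= IP_CT_IS_REPLY


if __name__ == "__main__":
    print("old count:", OLD_IP_CT_NUMBER, "new count:", NEW_IP_CT_NUMBER)
```

Under these assumed values the number of distinct states is unchanged. The old formula subtracted one to skip IP_CT_NEW_REPLY, and the new form no longer needs that correction.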
    • tcp: Add RFC4898 tcpEStatsPerfDataSegsOut/In · a44d6eac
      Committed by Martin KaFai Lau
      Per RFC4898, they count segments sent/received
      containing a positive length data segment (that includes
      retransmission segments carrying data).  Unlike
      tcpi_segs_out/in, tcpi_data_segs_out/in excludes segments
      carrying no data (e.g. pure ack).
      
      The patch also updates the segs_in in tcp_fastopen_add_skb()
      so that segs_in >= data_segs_in property is kept.
      
      Together with retransmission data, tcpi_data_segs_out
      gives a better signal on the rxmit rate.
      
      v6: Rebase on the latest net-next
      
      v5: Eric pointed out that checking skb->len is still needed in
      tcp_fastopen_add_skb() because skb can carry a FIN without data.
      Hence, instead of open coding segs_in and data_segs_in, tcp_segs_in()
      helper is used.  Comment is added to the fastopen case to explain why
      segs_in has to be reset and tcp_segs_in() has to be called before
      __skb_pull().
      
      v4: Add comment to the changes in tcp_fastopen_add_skb()
      and also add remark on this case in the commit message.
      
      v3: Add const modifier to the skb parameter in tcp_segs_in()
      
      v2: Rework based on recent fix by Eric:
      commit a9d99ce2 ("tcp: fix tcpi_segs_in after connection establishment")
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Cc: Chris Rapier <rapier@psc.edu>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Marcelo Ricardo Leitner <mleitner@redhat.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
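The counting rule described in the tcp commit above can be sketched as a toy model. `SegCounters` and `on_segment` are hypothetical names. The sketch only mirrors the rule stated in the message (every segment counts toward segs_in, only segments carrying data count toward data_segs_in) and is not the kernel's tcp_segs_in() helper.

```python
from dataclasses import dataclass


@dataclass
class SegCounters:
    segs_in: int = 0
    data_segs_in: int = 0

    def on_segment(self, payload_len: int, nsegs: int = 1) -> None:
        """Account for one received packet covering nsegs segments."""
        segs = max(1, nsegs)
        self.segs_in += segs
        # Only segments with a positive-length payload count as data
        # segments; pure ACKs and a bare FIN do not.
        if payload_len > 0:
            self.data_segs_in += segs


if __name__ == "__main__":
    c = SegCounters()
    c.on_segment(payload_len=0)     # pure ACK
    c.on_segment(payload_len=1448)  # data segment
    c.on_segment(payload_len=1448)  # retransmission carrying data
    c.on_segment(payload_len=0)     # FIN without data
    print(c)
```

Because both counters are updated in one place, the `segs_in >= data_segs_in` property mentioned in the commit message holds by construction in this model.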
  18. 14 Mar, 2016 2 commits
  19. 12 Mar, 2016 3 commits