1. 25 Feb, 2016 1 commit
  2. 24 Feb, 2016 2 commits
  3. 23 Feb, 2016 20 commits
  4. 22 Feb, 2016 10 commits
    •
      bpf: don't emit mov A,A on return · 6205b9cf
      Daniel Borkmann authored
      While debugging with bpf_jit_disasm I noticed emissions of
      'mov %eax,%eax', and found that these come from BPF_RET | BPF_A
      translations from classic BPF. Emitting this is unnecessary, as
      BPF_REG_A is already mapped into BPF_REG_0, so only emit a mov when an
      immediate is used as the return value (sketched below this entry).
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6205b9cf
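      A minimal sketch of the resulting converter logic, not the exact
      upstream diff (BPF_RVAL, BPF_MOV32_IMM and BPF_EXIT_INSN are existing
      kernel macros; fp points at the classic BPF instruction):

          case BPF_RET | BPF_K:
          case BPF_RET | BPF_A:
                  /* A is mapped onto R0, so RET A needs no register move;
                   * only an immediate return value still needs a mov.
                   */
                  if (BPF_RVAL(fp->code) == BPF_K)
                          *insn++ = BPF_MOV32_IMM(BPF_REG_0, fp->k);
                  *insn = BPF_EXIT_INSN();
                  break;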
    •
      bpf: fix csum update in bpf_l4_csum_replace helper for udp · 2f72959a
      Daniel Borkmann authored
      When using this helper to update UDP checksums, we need to extend it
      to write CSUM_MANGLED_0 for csum computations that result in a sum of
      0. We need this because packets with a checksum could otherwise be
      incorrectly marked as packets without a checksum. Likewise, if the
      user indicates BPF_F_MARK_MANGLED_0, then we should not turn packets
      without a checksum into ones with a checksum (the rule is sketched
      below this entry).
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2f72959a
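      A hedged sketch of that write-back rule (the helper name csum_store
      is illustrative, not an upstream function):

          static void csum_store(__sum16 *ptr, __sum16 sum, u64 flags)
          {
                  bool mmzero = flags & BPF_F_MARK_MANGLED_0;

                  if (mmzero && !*ptr)
                          return;         /* packet carried no checksum */
                  *ptr = sum;
                  if (mmzero && !*ptr)
                          *ptr = CSUM_MANGLED_0;  /* 0 would mean "no csum" */
          }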
    •
      bpf: try harder on clones when writing into skb · 3697649f
      Daniel Borkmann authored
      When we're dealing with clones and the area is not writable, try
      harder and get a copy via pskb_expand_head(). Also replace other
      occurrences in tc actions with the new skb_try_make_writable()
      (sketched below this entry).
      Reported-by: Ashhad Sheikh <ashhadsheikh394@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3697649f
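      A sketch of what the new helper boils down to, per the description
      above (treat the exact body as an assumption):

          /* Make write_len bytes at the head of skb writable; for a shared
           * clone, take a private copy of the head via pskb_expand_head().
           */
          static inline int skb_try_make_writable(struct sk_buff *skb,
                                                  unsigned int write_len)
          {
                  return skb_cloned(skb) &&
                         !skb_clone_writable(skb, write_len) ?
                         pskb_expand_head(skb, 0, 0, GFP_ATOMIC) : 0;
          }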
    •
      bpf: remove artificial bpf_skb_{load, store}_bytes buffer limitation · 21cafc1d
      Daniel Borkmann authored
      We currently limit the bpf_skb_store_bytes() and bpf_skb_load_bytes()
      helpers to storing or loading a maximum buffer of 16 bytes. Thus,
      loading, rewriting and storing headers requires several
      bpf_skb_load_bytes() and bpf_skb_store_bytes() calls.

      Here, too, we can use a per-cpu scratch buffer instead so as not to
      pressure stack space any further. I suspect this limit was mainly put
      in place for that very reason. So, ease program development by
      removing the limitation, and make the scratchpad generic so it can be
      reused (see the sketch below this entry).
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      21cafc1d
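      A hedged sketch of a generic per-cpu scratchpad (struct layout and
      names are assumptions based on the description):

          /* One scratch area per CPU, sized like the BPF stack, shared by
           * the load/store and checksum helpers instead of an on-stack
           * buffer.
           */
          struct bpf_scratchpad {
                  union {
                          __be32 diff[MAX_BPF_STACK / sizeof(__be32)];
                          u8     buff[MAX_BPF_STACK];
                  };
          };

          static DEFINE_PER_CPU(struct bpf_scratchpad, bpf_sp);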
    •
      bpf: add generic bpf_csum_diff helper · 7d672345
      Daniel Borkmann authored
      For L4 checksums, we currently have the bpf_l4_csum_replace() helper.
      It's limited to handling 2 and 4 byte changes in a header and feeds
      the from/to values into the inet_proto_csum_replace{2,4}() helpers of
      the kernel. When working with IPv6, for example, this makes it rather
      cumbersome to deal with, similarly so when editing larger parts of a
      header.

      Instead, extend the API in a more generic way: for
      bpf_l4_csum_replace(), add a case for a header field mask of 0 to
      change the checksum at a given offset through
      inet_proto_csum_replace_by_diff(), and provide a helper
      bpf_csum_diff() that can generically calculate a from/to diff for
      arbitrary amounts of data.

      This can be used in multiple ways: for the bpf_l4_csum_replace() part
      alone, it even gives us the option to insert precalculated diffs from
      user space, e.g. from a map, or from bpf_csum_diff() at runtime.

      bpf_csum_diff() has an optional from/to stack buffer input, so we can
      calculate a diff using a scratch buffer for scenarios where we're
      inserting (from is NULL), removing (to is NULL) or diffing (from/to
      buffers need not be of equal size) data. Also, bpf_csum_diff() allows
      a previous csum to be fed into csum_partial(), so the function can be
      cascaded (a usage sketch follows this entry).
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7d672345
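      A hedged usage sketch from the BPF program side, assuming a tc context
      where skb, the checksum offset csum_off and the old/new IPv6 address
      words are already at hand:

          /* Rewrite a 16-byte IPv6 address and fix up the L4 checksum:
           * compute the from/to diff with bpf_csum_diff(), then apply it
           * via bpf_l4_csum_replace() with a header field mask of 0.
           */
          __be32 from[4], to[4];          /* old and new address words */
          s64 diff;

          diff = bpf_csum_diff(from, sizeof(from), to, sizeof(to), 0);
          bpf_l4_csum_replace(skb, csum_off, 0, diff, BPF_F_PSEUDO_HDR);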
    •
      ila: autoload module · 84a8cbe4
      Robert Shearman authored
      Avoid users having to load the module manually by adding a module
      alias that lets the lwt infrastructure autoload it (the alias is
      sketched below this entry).
      Signed-off-by: Robert Shearman <rshearma@brocade.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      84a8cbe4
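      Both this commit and the mpls one below amount to a one-line alias
      each; a sketch assuming the MODULE_ALIAS_RTNL_LWT macro that the
      lwtunnel commit further down introduces:

          MODULE_ALIAS_RTNL_LWT(ILA);     /* in the ila module */
          MODULE_ALIAS_RTNL_LWT(MPLS);    /* in the mpls lwt module */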
    •
      mpls: autoload lwt module · b2b04edc
      Robert Shearman authored
      Avoid users having to load the module manually by adding a module
      alias that lets the lwt infrastructure autoload it (see the alias
      sketch under the ila entry above).
      Signed-off-by: Robert Shearman <rshearma@brocade.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b2b04edc
    •
      lwtunnel: autoload of lwt modules · 745041e2
      Robert Shearman authored
      The lwt implementations that use net devices can already autoload via
      the existing IFLA_INFO_KIND mechanism. However, there is no
      equivalent mechanism for lwt modules that do not use a net device.

      Therefore, add the ability to autoload modules registering lwt
      operations for lwt implementations not using a net device, so that
      users don't have to load the modules manually.

      Only users with the CAP_NET_ADMIN capability can cause modules to be
      loaded, which is ensured by rtnetlink_rcv_msg rejecting non-RTM_GETxxx
      messages for users without this capability, and by
      lwtunnel_build_state not being called in response to RTM_GETxxx
      messages (a condensed sketch of the autoload path follows this entry).
      Signed-off-by: Robert Shearman <rshearma@brocade.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      745041e2
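      A condensed, hedged sketch of that path (the alias pattern and
      lwtunnel_encap_str follow the commit's description; the lookup-retry
      shape and the lwtun_encaps table name are simplified assumptions):

          #define MODULE_ALIAS_RTNL_LWT(encap) \
                  MODULE_ALIAS("rtnl-lwt-" __stringify(encap))

          /* On an unknown encap type, request the module by its alias and
           * re-check the ops table; only CAP_NET_ADMIN callers reach this.
           */
          static const struct lwtunnel_encap_ops *
          lwtunnel_get_ops(u16 encap_type)
          {
                  const struct lwtunnel_encap_ops *ops;

                  ops = rcu_access_pointer(lwtun_encaps[encap_type]);
                  if (!ops) {
                          request_module("rtnl-lwt-%s",
                                         lwtunnel_encap_str(encap_type));
                          ops = rcu_access_pointer(lwtun_encaps[encap_type]);
                  }
                  return ops;
          }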
    •
      vlan: turn on unicast filtering on vlan device · e817af27
      Zhang Shengju authored
      Currently a vlan device inherits the unicast filtering flag from the
      underlying device. If the underlying device doesn't support unicast
      filtering, this puts the vlan device into promiscuous mode when it's
      stacked.

      Turn on IFF_UNICAST_FLT on the vlan device in any case, so that it
      does not go into promiscuous mode needlessly. If the underlying
      device does not support unicast filtering, that device will enter
      promiscuous mode instead (sketched below this entry).
      Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e817af27
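      A minimal sketch, assuming the flag is set unconditionally in the
      vlan device's setup path (simplified from the real vlan_setup):

          static void vlan_setup_sketch(struct net_device *dev)
          {
                  ether_setup(dev);
                  /* Always advertise unicast filtering on the vlan device;
                   * if the real device lacks it, that device (not the
                   * vlan) goes promiscuous when unicast filters are needed.
                   */
                  dev->priv_flags |= IFF_UNICAST_FLT;
          }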
    •
      sctp: Fix port hash table size computation · d9749fb5
      Neil Horman authored
      Dmitry Vyukov noted recently that the sctp_port_hashtable had an
      error in its size computation, observing that the current method
      never guaranteed that the hashsize (measured in number of entries)
      would be a power of two, which the input hash function for that table
      requires. The root cause of the problem is that two values need to be
      computed (one, the allocation order of the storage required, as
      passed to __get_free_pages, and two, the number of entries for the
      hash table). Both need to be powers of two, but for different
      reasons, and the existing code simply computes one order value and
      uses it as the basis for both, which is wrong (i.e. it assumes that
      ((1<<order)*PAGE_SIZE)/sizeof(bucket) is still a power of two when
      it's not).

      To fix this, we change the logic slightly. We start by computing a
      goal allocation order (which is limited by the maximum size hash
      table we want to support). Then we attempt to allocate a table of
      that size, decreasing the order until an allocation succeeds. With
      the resulting successful order, we compute the number of buckets that
      hash table supports, which we then round down to the nearest power of
      two, giving us the number of entries the table actually supports (see
      the sketch below this entry).

      I've tested this locally here, using non-debug and spinlock-debug
      kernels, and the number of entries in the hashtable consistently
      works out to be a power of two in all cases.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      CC: Dmitry Vyukov <dvyukov@google.com>
      CC: Vladislav Yasevich <vyasevich@gmail.com>
      CC: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d9749fb5
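      A hedged sketch of that sizing logic (goal_order and the function
      wrapper are illustrative; rounddown_pow_of_two is the kernel helper
      from linux/log2.h):

          static int sctp_alloc_port_hash(int goal_order)
          {
                  int order = goal_order; /* capped by the max table size */

                  /* back off until an allocation succeeds */
                  do {
                          sctp_port_hashtable =
                                  (struct sctp_bind_hashbucket *)
                                  __get_free_pages(GFP_KERNEL | __GFP_NOWARN,
                                                   order);
                  } while (!sctp_port_hashtable && --order >= 0);

                  if (!sctp_port_hashtable)
                          return -ENOMEM;

                  /* entries the pages can hold, rounded down to 2^n */
                  sctp_port_hashsize = rounddown_pow_of_two(
                          ((1UL << order) * PAGE_SIZE) /
                          sizeof(struct sctp_bind_hashbucket));

                  return 0;
          }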
  5. 20 Feb, 2016 7 commits