1. 24 Mar, 2017 25 commits
    • J
      i40e: add parsing of flexible filter fields from userdef · e793095e
      Jacob Keller committed
      Add code to parse the user-def field into a data structure format. This
      code is intended to allow future extensions of the user-def field by
      keeping all code that actually reads and writes the field into a single
      location. This ensures that we do not litter the driver with references
      to the user-def field and minimizes the amount of bitwise operations we
      need to do on the data.
      
      Add code which parses the lower 32 bits into a flexible word and its
      offset. This will be used in a future patch to enable flexible filters
      which can match on some arbitrary data in the packet payload. For now,
      we just return -EOPNOTSUPP when this is used.
      
      Add code to fill in the user-def field when reporting the filter back,
      even though we don't actually implement any user-def fields yet.
      
      Additionally, ensure that we mask the extended FLOW_EXT bit from the
      flow_type now that we will be accepting filters which have the FLOW_EXT
      bit set (and thus make use of the user-def field).
      
      Change-Id: I238845035c179380a347baa8db8223304f5f6dd7
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      e793095e
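The parsing described in the commit above (e793095e) can be sketched as follows. This is a minimal illustration, not the driver's code: the bit layout (word in bits 15:0, offset in bits 31:16), the host-order `uint64_t` input, and all names (`userdef_data`, `parse_userdef`, `fill_userdef`) are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: bits 15:0 = flexible word, bits 31:16 = its offset. */
#define USERDEF_FLEX_WORD_MASK    0x000000000000FFFFULL
#define USERDEF_FLEX_OFFSET_MASK  0x00000000FFFF0000ULL
#define USERDEF_FLEX_OFFSET_SHIFT 16
#define USERDEF_FLEX_FILTER_MASK  0x00000000FFFFFFFFULL

/* Parsed form of the user-def field (hypothetical structure). */
struct userdef_data {
    bool flex_filter;      /* true when any of the lower 32 bits is set */
    uint16_t flex_word;
    uint16_t flex_offset;
};

/* Single place that reads the field. Returns 0 on success, or -1 when
 * bits outside the known sub-fields are set. */
int parse_userdef(uint64_t value, struct userdef_data *out)
{
    out->flex_filter = false;
    out->flex_word = 0;
    out->flex_offset = 0;

    if (value & ~USERDEF_FLEX_FILTER_MASK)
        return -1;

    if (value & USERDEF_FLEX_FILTER_MASK) {
        out->flex_filter = true;
        out->flex_word = (uint16_t)(value & USERDEF_FLEX_WORD_MASK);
        out->flex_offset = (uint16_t)((value & USERDEF_FLEX_OFFSET_MASK) >>
                                      USERDEF_FLEX_OFFSET_SHIFT);
    }
    return 0;
}

/* Single place that writes the field when reporting a filter back. */
uint64_t fill_userdef(const struct userdef_data *in)
{
    if (!in->flex_filter)
        return 0;
    return ((uint64_t)in->flex_offset << USERDEF_FLEX_OFFSET_SHIFT) |
           in->flex_word;
}
```

Keeping both directions in one pair of functions is what lets the rest of a driver work with the parsed structure instead of repeating the bitwise operations.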
    • J
      i40e: partition the ring_cookie to get VF index · 43b15697
      Jacob Keller committed
      Do not use the user-def field for determining the VF target. Instead,
      similar to ixgbe, partition the ring_cookie value into 8 bits of VF
      index, along with 32 bits of queue number. This is better than using the
      user-def field, because it leaves the field open for extension in
      a future patch which will enable flexible data. Also, this matches the
      convention used by ixgbe and other drivers.
      
      Change-Id: Ie36745186d817216b12f0313b99ec95cb8a9130c
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      43b15697
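The ring_cookie partition from the commit above (43b15697) can be illustrated with a small sketch. The macro and function names are hypothetical; the layout assumed is the one the commit message describes, with the queue number in the low 32 bits and the VF index in the 8 bits directly above it.

```c
#include <stdint.h>

/* Assumed layout: bits 31:0 = queue number, bits 39:32 = VF index. */
#define COOKIE_RING_MASK 0x00000000FFFFFFFFULL
#define COOKIE_VF_MASK   0x000000FF00000000ULL
#define COOKIE_VF_SHIFT  32

/* Extract the queue number from a 64-bit ring_cookie. */
uint32_t cookie_ring(uint64_t ring_cookie)
{
    return (uint32_t)(ring_cookie & COOKIE_RING_MASK);
}

/* Extract the VF index from a 64-bit ring_cookie. */
uint8_t cookie_vf(uint64_t ring_cookie)
{
    return (uint8_t)((ring_cookie & COOKIE_VF_MASK) >> COOKIE_VF_SHIFT);
}
```

For example, a cookie of `0x0000000300000007` would decode to queue 7 and VF field 3 under this layout.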
    • J
      i40e: allow changing input set for ntuple filters · 9229e993
      Jacob Keller committed
      Add support to detect when we can update the input set for each flow
      type.
      
      Because the hardware only supports a single input set for all flows of
      that matching type, the driver shall only allow the input set to change
      if there are no other configured filters for that flow type.
      
      Thus, the first filter added for each flow type is allowed to change the
      input set, and all future filters must match the same input set. Display
      a diagnostic message whenever the filter input set changes, and
      a warning whenever a filter cannot be accepted because it does not match
      the configured input set.
      
      Change-Id: Ic22e1c267ae37518bb036aca4a5694681449f283
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      9229e993
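The acceptance rule in the commit above (9229e993) amounts to a small piece of per-flow-type state. The sketch below shows that rule only; the structure, the function name, and the representation of an input set as a 64-bit mask are assumptions for illustration, not the driver's implementation.

```c
#include <stdint.h>

/* Hypothetical per-flow-type bookkeeping. */
struct flow_type_state {
    uint64_t input_set;    /* fields the hardware currently matches on */
    unsigned int filters;  /* filters installed for this flow type */
};

/* Returns 0 if the filter is accepted, -1 if it is rejected because its
 * input set differs from the one already in use. */
int add_filter(struct flow_type_state *st, uint64_t wanted_set)
{
    if (st->filters == 0) {
        /* No other filters: the input set may be reprogrammed. */
        st->input_set = wanted_set;
    } else if (st->input_set != wanted_set) {
        /* Hardware has a single input set per flow type. */
        return -1;
    }
    st->filters++;
    return 0;
}
```

A real driver would also print the diagnostic and warning messages the commit mentions at the two branches.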
    • J
      i40e: restore default input set for each flow type · 3bcee1e6
      Jacob Keller committed
      Ensure that the default input set is correctly reprogrammed when
      cleaning up after disabling flow director support. This ensures that the
      programmed value will be in a clean state.
      
      Although we do not yet have support for SCTPv4 filters, a future patch
      will add support for this protocol, so we will correctly restore the
      SCTPv4 input set here as well. Note that strictly speaking the default
      hardware value for SCTP includes matching the verification tag. However,
      the ethtool API does not have support for specifying this value, so
      there is no reason to keep the verification field enabled.
      
      This patch is the next step on the way to enabling partial tuple filters
      which will be implemented in a following patch.
      
      Change-Id: Ic22e1c267ae37518bb036aca4a5694681449f283
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      3bcee1e6
    • J
      i40e: check current configured input set when adding ntuple filters · 36777d9f
      Jacob Keller committed
      Do not assume that hardware has been programmed with the default mask,
      but instead read the input set registers to determine what is currently
      programmed. This ensures that all programmed filters match exactly how
      the hardware will interpret them, avoiding confusion regarding filter
      behavior.
      
      This sets the initial ground-work for allowing custom input sets where
      some fields are disabled. A future patch will fully implement this
      feature.
      
      Instead of using bitwise negation, we'll just explicitly check for the
      correct value. htonl and htons are used to silence sparse warnings.
      The compiler should be able to handle the constant value and avoid
      actually performing a byteswap.
      
      Change-Id: I3d8db46cb28ea0afdaac8c5b31a2bfb90e3a4102
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      36777d9f
    • J
      i40e: correctly honor the mask fields for ETHTOOL_SRXCLSRLINS · faa16e0f
      Jacob Keller committed
      The current implementation of .set_rxnfc does not properly read the mask
      field for filter entries. This results in incorrect driver behavior, as
      we do not reject filters which have masks set to ignore some fields. The
      current implementation simply assumes that every part of the tuple or
      "input set" is specified. This results in filters not behaving as
      expected, and not working correctly.
      
      As a first step in supporting some partial filters, add code which
      checks the mask fields and rejects any filters which do not have an
      acceptable mask. For now, we just assume that all fields must be set.
      
      This will get the driver one step towards allowing some partial filters.
      At a minimum, the ethtool commands which previously installed filters
      that would not function will now return a non-zero exit code indicating
      failure instead.
      
      We should now be meeting the minimum requirements of the .set_rxnfc API,
      by ensuring that all filters we program have a valid mask value for each
      field.
      
      Finally, add code to report the mask correctly so that the ethtool
      command properly reports the mask to the user.
      
      Note that the typecast to (__be16) when checking source and destination
      port masks is required because the ~ bitwise negation operator promotes
      operands narrower than int to int before negating them.
      
      Change-Id: Ia020149e07c87aa3fcec7b2283621b887ef0546f
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      faa16e0f
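The note about the (__be16) typecast in the commit above (faa16e0f) comes down to C integer promotion. The sketch below uses plain `uint16_t` in place of the kernel's `__be16` and assumes `int` is wider than 16 bits; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Without a cast, mask is promoted to int before ~ is applied, so the
 * upper bits of the result are set and it never compares equal to 0. */
bool is_full_mask_naive(uint16_t mask)
{
    return ~mask == 0;
}

/* Casting back to the 16-bit type discards the promoted upper bits. */
bool is_full_mask_cast(uint16_t mask)
{
    return (uint16_t)~mask == 0;
}
```

With a mask of `0xFFFF`, the naive check evaluates `~0x0000FFFF`, which is non-zero as an `int`, while the cast version truncates the result to `0x0000` and reports a full mask.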
    • D
      sched: act_csum: don't mangle TCP and UDP GSO packets · add641e7
      Davide Caratti committed
      After act_csum computes the checksum on skbs carrying GSO TCP/UDP packets,
      subsequent segmentation fails because skb_needs_check(skb, true) returns
      true. Because of that, skb_warn_bad_offload() is invoked and the following
      message is displayed:
      
      WARNING: CPU: 3 PID: 28 at net/core/dev.c:2553 skb_warn_bad_offload+0xf0/0xfd
      <...>
      
        [<ffffffff8171f486>] skb_warn_bad_offload+0xf0/0xfd
        [<ffffffff8161304c>] __skb_gso_segment+0xec/0x110
        [<ffffffff8161340d>] validate_xmit_skb+0x12d/0x2b0
        [<ffffffff816135d2>] validate_xmit_skb_list+0x42/0x70
        [<ffffffff8163c560>] sch_direct_xmit+0xd0/0x1b0
        [<ffffffff8163c760>] __qdisc_run+0x120/0x270
        [<ffffffff81613b3d>] __dev_queue_xmit+0x23d/0x690
        [<ffffffff81613fa0>] dev_queue_xmit+0x10/0x20
      
      Since GSO is able to compute checksum on individual segments of such skbs,
      we can simply skip mangling the packet.
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      add641e7
    • J
      net: dwc-xlgmac: use dual license · 67ff2c71
      Jie Deng committed
      The driver "dwc-xlgmac" is dual-licensed.
      Declare the dual license with MODULE_LICENSE().
      Signed-off-by: Jie Deng <jiedeng@synopsys.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      67ff2c71
    • J
      net: dwc-xlgmac: declaration of dual license in headers · ea8c1c64
      Jie Deng committed
      The driver "dwc-xlgmac" is dual-licensed. This patch adds
      declaration of dual license in file headers.
      Signed-off-by: Jie Deng <jiedeng@synopsys.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ea8c1c64
    • D
      Merge branch 'bpf-socket-cookie-uid' · 101a6e83
      David S. Miller committed
      Chenbo Feng says:
      
      ====================
      net: core: Two helper functions for socket information
      
      Introduce two eBPF helper functions to get the socket cookie and
      socket uid for each packet. The helper functions are useful when
      the *sk field inside sk_buff is not empty. These helper functions
      can be used in socket- and uid-based traffic monitoring programs.
      
      Change since V7:
      * change the user namespace of uid helper function to sock_net(sk)->user_ns
      
      Change since V6:
      * change the user namespace of uid helper function back to init_user_ns
        since in some situations, for example, a pinned bpf object, the current
        user namespace is not always applicable.
      
      Change since V5:
      * Delete unnecessary blank lines in sample program.
      * Refine the variable orders in get_uid helper function.
      
      Change since V4:
      * Using current user namespace to get uid instead of using init_ns.
      * Add compiling setup of example program into Makefile.
      * Change the name style of the example program binaries.
      
      Change since V3:
      * Fixed some typos and incorrect comments in sample program
      * replaced raw insns with BPF_STX_XADD and add it to libbpf.h
      * Use a temp dir as mount point instead and added a check for
        the user input string.
      * Make the get uid helper function return the user namespace uid
        instead of kuid.
      * Return an overflowuid instead of 0 when no uid information is found.
      
      Change since V2:
      * Add a sample program to demonstrate the usage of the helper function.
      * Moved the helper function proto invoking place.
      * Add function header into tools/include
      * Apply sk_to_full_sk() before getting uid.
      
      Change since V1:
      * Removed the unnecessary declarations and export command
      * resolved conflict with master branch.
      * Examine if the socket is a full socket before getting the uid.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      101a6e83
    • C
      A Sample of using socket cookie and uid for traffic monitoring · 51570a5a
      Chenbo Feng committed
      Add a sample program to demonstrate the possible usage of the
      get_socket_cookie and get_socket_uid helper functions. The program
      counts the bytes and packets of in/out traffic monitored by iptables
      and stores the stats in a bpf map on a per-socket basis. The owner uid
      of the socket is stored as part of the data entry. A shell script for
      running the program is also included.
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: Chenbo Feng <fengc@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      51570a5a
    • C
      Add a eBPF helper function to retrieve socket uid · 6acc5c29
      Chenbo Feng committed
      Returns the owner uid of the socket inside a sk_buff. This is useful to
      perform per-UID accounting of network traffic or per-UID packet
      filtering. The socket needs to be a full socket; otherwise overflowuid
      is returned.
      Signed-off-by: Chenbo Feng <fengc@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6acc5c29
    • C
      Add a helper function to get socket cookie in eBPF · 91b8270f
      Chenbo Feng committed
      Retrieve the socket cookie generated by sock_gen_cookie() from a sk_buff
      with a known socket. Generates a new cookie if one was not yet set. If
      the socket pointer inside sk_buff is NULL, 0 is returned. The helper
      function could be useful for monitoring per-socket networking traffic
      statistics and provides a unique socket identifier per namespace.
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: Chenbo Feng <fengc@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      91b8270f
    • D
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 16ae1f22
      David S. Miller committed
      Conflicts:
      	drivers/net/ethernet/broadcom/genet/bcmmii.c
      	drivers/net/hyperv/netvsc.c
      	kernel/bpf/hashtab.c
      
      Almost entirely overlapping changes.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      16ae1f22
    • L
      Merge tag 'sound-4.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · d038e3dc
      Linus Torvalds committed
      Pull sound fixes from Takashi Iwai:
       "This contains the collection of small fixes for 4.11 that were pending
        during my vacation:
      
         - a few HD-audio quirks (more Dell headset support, docking station
           support on HP laptops)
      
         - a regression fix for the previous ctxfi DMA mask fix
      
         - a correction of the new CONFIG_SND_X86 menu entry
      
         - a fix for the races in ALSA sequencer core spotted by syzkaller"
      
      * tag 'sound-4.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: hda - Adding a group of pin definition to fix headset problem
        ALSA: seq: Fix racy cell insertions during snd_seq_pool_done()
        ALSA: x86: Make CONFIG_SND_X86 bool
        ALSA: hda - add support for docking station for HP 840 G3
        ALSA: hda - add support for docking station for HP 820 G2
        ALSA: ctxfi: Fix the incorrect check of dma_set_mask() call
      d038e3dc
    • A
      qedf: fix wrong le16 conversion · 6f359f99
      Arnd Bergmann committed
      gcc points out that we are converting a 16-bit integer into a 32-bit
      little-endian type, and that assigning the result to a 16-bit
      little-endian field will end up with a zero:
      
      drivers/scsi/qedf/drv_fcoe_fw_funcs.c: In function 'init_initiator_rw_fcoe_task':
      include/uapi/linux/byteorder/big_endian.h:32:26: error: large integer implicitly truncated to unsigned type [-Werror=overflow]
        t_st_ctx->read_write.rx_id = cpu_to_le32(FCOE_RX_ID);
      
      The correct solution appears to be to just use a 16-bit byte swap instead.
      
      Fixes: be086e7c ("qed*: Utilize Firmware 8.15.3.0")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Chad Dupuis <chad.dupuis@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6f359f99
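The truncation described in the commit above (6f359f99) can be reproduced in isolation. The sketch below models what a big-endian CPU does for a to-little-endian conversion, namely a byte swap, using hand-written swap helpers; the function names and the sample value are assumptions for the example.

```c
#include <stdint.h>

/* Byte swaps, standing in for what a cpu-to-little-endian conversion
 * does on a big-endian machine. */
uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v >> 8) | (v << 8));
}

/* Buggy pattern: 32-bit swap, then store into a 16-bit field. The
 * meaningful bytes land in the upper half and are truncated away. */
uint16_t store_via_32bit_swap(uint16_t id)
{
    return (uint16_t)swap32(id);
}

/* Fixed pattern: swap at the width of the destination field. */
uint16_t store_via_16bit_swap(uint16_t id)
{
    return swap16(id);
}
```

For an input of `0x1234`, the 32-bit swap yields `0x34120000`, whose low 16 bits are zero, while the 16-bit swap yields `0x3412`.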
    • D
      Merge branch 'qed-management-interaction-and-feature-changes' · 028ba8aa
      David S. Miller committed
      Yuval Mintz says:
      
      ====================
      qed: Management interaction & feature changes
      
      All patches in this series either affect direct interaction with the
      management firmware, or change logic relating to some values retrieved
      from it.
      
      Patch #1 revises the basic logic for sending messages to the management
      firmware and their completion, and is the most significant [at least
      code-wise] of the bunch.
      
      Patch #2 changes infrastructure in a way that should better protect us from
      mistakes leading to stack corruption such as was fixed in
      bb480242 ("qed: Prevent stack corruption on MFW interaction").
      
      Patch #3 corrects some update API endian issue [sent here as it would
      create conflicts with #2, and because its lack would create a rather
      insignificant problem].
      
      Patch #4 removes some unnecessary logging, allowing cleaner forward
      compatibility with future management firmware versions.
      
      Patches #5, #6 slightly change the number of possible L2 queues in some
      scenarios, leading to the possibility of having more queues / VFs.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      028ba8aa
    • M
      qed: Reserve VF feature before PF · dec26533
      Mintz, Yuval committed
      Align the driver feature distribution with the flow utilized
      by the management firmware - first reserve L2 queues for
      VFs and use all the remaining for the PF.
      
      The current distribution might lead to PFs with an enormous
      amount of queues, but at the same time leave us with insufficient
      resources for starting all VFs.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      dec26533
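The ordering described in the commit above (dec26533), VFs first and the PF taking the remainder, can be shown as simple arithmetic. The structure and function names are hypothetical, and the real driver works with more resource types than a single queue count.

```c
/* Hypothetical result of splitting L2 queues between VFs and the PF. */
struct queue_split {
    unsigned int vf;  /* queues reserved for VFs */
    unsigned int pf;  /* queues left for the PF */
};

/* Reserve what the VFs need first (capped at the total), then hand
 * everything that remains to the PF. */
struct queue_split split_queues(unsigned int total, unsigned int vf_needed)
{
    struct queue_split s;

    s.vf = vf_needed < total ? vf_needed : total;
    s.pf = total - s.vf;
    return s;
}
```

With 128 queues and 96 needed by VFs, this gives 96 to the VFs and 32 to the PF, instead of letting the PF claim queues the VFs depend on.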
    • M
      qed: Don't waste SBs unused by RoCE · 810bb1f0
      Mintz, Yuval committed
      When RoCE is enabled on a given L2 interface, the interrupt lines
      are divided equally between L2 and RoCE.
      But in case the number of lines needed for RoCE is limited by the number
      of available CNQs, we can utilize the additional lines for L2.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      810bb1f0
    • M
      qed: Reduce verbosity of unimplemented MFW messages · 39815944
      Mintz, Yuval committed
      Management firmware and driver are meant to be both backward and forward
      compatible with each other.
      
      If a new management firmware works with an older driver,
      it's possible that the driver would receive indications which are meaningless
      to it. That's perfectly acceptable from the firmware's side, so there is no need to
      log such messages at default verbosity; that would only serve to confuse
      users.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      39815944
    • M
      qed: Correct endian order of MAC passed to MFW · 17991002
      Mintz, Yuval committed
      The management firmware runs on a Big Endian processor,
      and when running on an LE platform, HW is configured to swap access
      to memory shared between management firmware and driver at
      32-bit granularity.
      
      As a result, for simplicity most of the APIs between
      driver and management firmware are based on 32-bit variables.
      MAC settings are one exception, as the driver needs to fill a byte
      array when indicating to management firmware that the primary MAC
      has changed.
      Due to the swap, the driver must make sure that the MAC that was
      provided in byte order is translated into native order,
      otherwise after the swap the management firmware would read
      it swapped.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      17991002
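One way to read the translation described in the commit above (17991002) is to pack the MAC bytes into 32-bit values by arithmetic rather than copying the byte array. The sketch below is an illustration under that assumption; the function name and the two-word layout are hypothetical and not taken from the driver.

```c
#include <stdint.h>

/* Pack a 6-byte MAC into two native-order 32-bit words. Because the
 * words are built with shifts, their numeric value is the same on any
 * host, and a 32-bit access swap applied on a little-endian host then
 * presents the bytes to a big-endian reader in their original order. */
void mac_to_words(const uint8_t mac[6], uint32_t words[2])
{
    words[0] = (uint32_t)mac[0] << 24 | (uint32_t)mac[1] << 16 |
               (uint32_t)mac[2] << 8 | (uint32_t)mac[3];
    words[1] = (uint32_t)mac[4] << 24 | (uint32_t)mac[5] << 16;
}
```

For the address 00:11:22:33:44:55, this produces the words `0x00112233` and `0x44550000`.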
    • T
      qed: Pass src/dst sizes when interacting with MFW · 2f67af8c
      Tomer Tayar committed
      The driver interaction with management firmware involves a union
      of all the data-members relating to the commands the driver prepares.
      
      The current interface assumes the caller always passes such a union,
      but that's cumbersome as well as risky [chancing a stack corruption
      in case the caller accidentally passes a smaller member instead of the union].
      
      Change the implementation so that the caller can pass a pointer to any
      of the members instead of the union.
      Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
      Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2f67af8c
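The idea in the commit above (2f67af8c) is that the callee copies exactly as many bytes as the caller says it has, instead of assuming a full union. The sketch below shows that pattern in general terms; the union contents and the function name are hypothetical and far smaller than anything in the driver.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mailbox payload: a union of per-command members. */
union mbox_data {
    struct { uint32_t a; } small;
    struct { uint32_t w[8]; } large;
};

/* Copy src_size bytes from any member into a zeroed union. Returns 0 on
 * success, or -1 if the caller claims more data than the union holds. */
int mbox_prepare(union mbox_data *mbox, const void *src, size_t src_size)
{
    if (src_size > sizeof(*mbox))
        return -1;

    memset(mbox, 0, sizeof(*mbox));
    if (src != NULL && src_size != 0)
        memcpy(mbox, src, src_size);
    return 0;
}
```

Because the size travels with the pointer, passing a 4-byte member never causes a read of the bytes that follow it on the caller's stack.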
    • T
      qed: Revise MFW command locking · 4ed1eea8
      Tomer Tayar committed
      Interaction of driver -> management firmware is based
      on a one-pending mailbox [per interface], and various
      mailbox commands need to be synchronized.
      
      The current scheme is messy and difficult to extend,
      as it deals differently with various commands and
      makes assumptions about the required behavior for load/unload
      requests.
      
      Replace the current scheme with a completion-list-based approach;
      each flow tries sending the command when possible,
      allowing one flow to complete another flow's completion and
      relieve the mailbox before sending its own command.
      Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
      Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4ed1eea8
    • L
      Merge branch 'for-linus-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs · 131fbf4f
      Linus Torvalds committed
      Pull btrfs fixes from Chris Mason:
       "Zygo tracked down a very old bug with inline compressed extents.
      
        I didn't tag this one for stable because I want to do individual
        tested backports. It's a little tricky and I'd rather do some extra
        testing on it along the way"
      
      * 'for-linus-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
        btrfs: add missing memset while reading compressed inline extents
        Btrfs: fix regression in lock_delalloc_pages
        btrfs: remove btrfs_err_str function from uapi/linux/btrfs.h
      131fbf4f
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · f341d9f0
      Linus Torvalds committed
      Pull networking fixes from David Miller:
      
       1) Several netfilter fixes from Pablo and the crew:
            - Handle fragmented packets properly in netfilter conntrack, from
              Florian Westphal.
            - Fix SCTP ICMP packet handling, from Ying Xue.
            - Fix big-endian bug in nftables, from Liping Zhang.
            - Fix alignment of fake conntrack entry, from Steven Rostedt.
      
       2) Fix feature flags setting in fjes driver, from Taku Izumi.
      
       3) Openvswitch ipv6 tunnel source address not set properly, from Or
          Gerlitz.
      
       4) Fix jumbo MTU handling in amd-xgbe driver, from Thomas Lendacky.
      
       5) sk->sk_frag.page not released properly in some cases, from Eric
          Dumazet.
      
       6) Fix RTNL deadlocks in nl80211, from Johannes Berg.
      
       7) Fix erroneous RTNL lockdep splat in crypto, from Herbert Xu.
      
       8) Cure improper inflight handling during AF_UNIX GC, from Andrey
          Ulanov.
      
       9) sch_dsmark doesn't write to packet headers properly, from Eric
          Dumazet.
      
      10) Fix SCM_TIMESTAMPING_OPT_STATS handling in TCP, from Soheil Hassas
          Yeganeh.
      
      11) Add some IDs for Motorola qmi_wwan chips, from Tony Lindgren.
      
      12) Fix nametbl deadlock in tipc, from Ying Xue.
      
      13) GRO and LRO packets not counted correctly in mlx5 driver, from Gal
          Pressman.
      
      14) Fix reset of internal PHYs in bcmgenet, from Doug Berger.
      
      15) Fix hashmap allocation handling, from Alexei Starovoitov.
      
      16) nl_fib_input() needs stronger netlink message length checking, from
          Eric Dumazet.
      
      17) Fix double-free of sk->sk_filter during sock clone, from Daniel
          Borkmann.
      
      18) Fix RX checksum offloading in aquantia driver, from Pavel Belous.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (85 commits)
        net:ethernet:aquantia: Fix for RX checksum offload.
        amd-xgbe: Fix the ECC-related bit position definitions
        sfc: cleanup a condition in efx_udp_tunnel_del()
        Bluetooth: btqcomsmd: fix compile-test dependency
        inet: frag: release spinlock before calling icmp_send()
        tcp: initialize icsk_ack.lrcvtime at session start time
        genetlink: fix counting regression on ctrl_dumpfamily()
        socket, bpf: fix sk_filter use after free in sk_clone_lock
        ipv4: provide stronger user input validation in nl_fib_input()
        bpf: fix hashmap extra_elems logic
        enic: update enic maintainers
        net: bcmgenet: remove bcmgenet_internal_phy_setup()
        ipv6: make sure to initialize sockc.tsflags before first use
        fjes: Do not load fjes driver if extended socket device is not power on.
        fjes: Do not load fjes driver if system does not have extended socket device.
        net/mlx5e: Count LRO packets correctly
        net/mlx5e: Count GSO packets correctly
        net/mlx5: Increase number of max QPs in default profile
        net/mlx5e: Avoid supporting udp tunnel port ndo for VF reps
        net/mlx5e: Use the proper UAPI values when offloading TC vlan actions
        ...
      f341d9f0
  2. 23 Mar, 2017 15 commits