1. 17 May, 2016 11 commits
    • D
      Merge branch 'cls_u32_hw_sw' · 1ca46734
      David S. Miller committed
      Sridhar Samudrala says:
      
      ====================
      Enable SW only or HW only offloads with u32 classifier
      
      This set of patches exports TCA_CLS_FLAGS_SKIP_HW to userspace and also
      introduces another flag, TCA_CLS_FLAGS_SKIP_SW. These flags restrict a
      u32 filter to either SW only or HW only.
      
      The default semantics with no flags is to add the filter to HW if possible and
      also into SW.
      With SKIP_HW flag, the filter is only added to SW.
      With SKIP_SW flag, the filter is added to HW and an error is returned
      to user on failure.
      These flags are mutually exclusive.
      There was an earlier discussion on these semantics in the following email
      thread.
      	http://thread.gmane.org/gmane.linux.network/401733
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1ca46734
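The flag semantics described in this merge can be modelled in a few lines. The following is a minimal Python sketch, not the kernel implementation: the names `SKIP_HW`, `SKIP_SW`, `flags_valid` and `add_filter` are hypothetical, the bit values are assumptions chosen for illustration, and `hw_add` stands in for whatever driver call attempts the hardware offload.

```python
# Hypothetical bit values for this sketch; they stand in for
# TCA_CLS_FLAGS_SKIP_HW and TCA_CLS_FLAGS_SKIP_SW.
SKIP_HW = 1 << 0
SKIP_SW = 1 << 1


def flags_valid(flags):
    """Reject unknown bits and the mutually exclusive combination."""
    if flags & ~(SKIP_HW | SKIP_SW):
        return False
    if (flags & SKIP_HW) and (flags & SKIP_SW):
        return False
    return True


def add_filter(flags, hw_add):
    """Return the set of places the filter was installed ("hw", "sw").

    hw_add is a callable returning True when the hardware accepted the
    filter. It is an assumption of this model, not a real API.
    """
    if not flags_valid(flags):
        raise ValueError("skip_hw and skip_sw are mutually exclusive")

    installed = set()
    if not flags & SKIP_HW:
        if hw_add():
            installed.add("hw")
        elif flags & SKIP_SW:
            # With skip_sw, a hardware failure is reported to the user.
            raise RuntimeError("hardware offload failed with skip_sw set")
        # Without flags, a hardware failure is silently ignored.
    if not flags & SKIP_SW:
        installed.add("sw")
    return installed
```

In this model, `add_filter(0, hw_add)` always installs the software filter and adds the hardware one only when `hw_add()` succeeds, which matches the default behaviour the cover letter describes.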
    • S
      net: cls_u32: Add support for skip-sw flag to tc u32 classifier. · d34e3e18
      Samudrala, Sridhar committed
      On devices that support TC U32 offloads, this flag enables a filter to be
      added only to HW. skip-sw and skip-hw are mutually exclusive flags. By
      default without any flags, the filter is added to both HW and SW, but no
      error checks are done in case of failure to add to HW. With skip-sw,
      failure to add to HW is treated as an error.
      
      Here is a sample script that adds 2 filters, one with skip-sw and the other
      with skip-hw flag.
      
         # add ingress qdisc
         tc qdisc add dev p4p1 ingress
      
         # enable hw tc offload.
         ethtool -K p4p1 hw-tc-offload on
      
         # add u32 filter with skip-sw flag.
         tc filter add dev p4p1 parent ffff: protocol ip prio 99 \
            handle 800:0:1 u32 ht 800: flowid 800:1 \
            skip-sw \
            match ip src 192.168.1.0/24 \
            action drop
      
         # add u32 filter with skip-hw flag.
         tc filter add dev p4p1 parent ffff: protocol ip prio 99 \
            handle 800:0:2 u32 ht 800: flowid 800:2 \
            skip-hw \
            match ip src 192.168.2.0/24 \
            action drop
      Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
      Acked-by: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d34e3e18
    • S
    • D
      Merge branch 'hv_netvsc-races' · 860d7ef6
      David S. Miller committed
      Vitaly Kuznetsov says:
      
      ====================
      hv_netvsc: avoid races on mtu change/set channels
      
      Changes since v1:
      - Rebased to net-next [Haiyang Zhang]
      
      Original description:
      
      MTU change and set channels operations are implemented as netvsc device
      re-creation, destroying internal structures (struct net_device stays). This
      is really unfortunate but there is no support from the Hyper-V host to do it
      in a different way. Such re-creation is unsurprisingly racy: Haiyang
      reported a crash when netvsc_change_mtu() races with
      netvsc_link_change(), and I was able to identify additional races upon
      investigation. Both netvsc_set_channels() and netvsc_change_mtu() race
      against:
      1) netvsc_link_change()
      2) netvsc_remove()
      3) netvsc_send()
      
      To solve these issues without introducing new locks, some refactoring is
      required. We need to get rid of the very complex link graph in all the
      internal structures and avoid traveling through structures which are being
      removed.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      860d7ef6
    • V
      hv_netvsc: set nvdev link after populating chn_table · 88098834
      Vitaly Kuznetsov committed
      A crash in netvsc_send() is observed when the netvsc device is re-created
      on mtu change/set channels. The crash is caused by dereferencing a NULL
      channel pointer which comes from chn_table. The root cause is a mixture
      of two facts:
      - we set nvdev pointer in net_device_context in alloc_net_device()
        before we populate chn_table.
      - we populate chn_table[0] only.
      
      The issue could be papered over by checking channel != NULL in
      netvsc_send() but populating the whole chn_table and writing the
      nvdev pointer afterwards seems more appropriate.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      88098834
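The ordering this commit describes (fill every chn_table slot, then publish the nvdev pointer) can be illustrated with a small Python model. This is only a sketch under stated assumptions: the classes, `device_add`, `send` and the table size `VRSS_CHANNEL_MAX = 4` are hypothetical stand-ins, and Python cannot express the memory ordering a real driver would also have to guarantee.

```python
VRSS_CHANNEL_MAX = 4  # hypothetical table size for this sketch


class NetvscDevice:
    def __init__(self):
        # Every slot starts out empty, as in a freshly allocated device.
        self.chn_table = [None] * VRSS_CHANNEL_MAX


class NetDeviceContext:
    def __init__(self):
        # Readers reach the device through this pointer only.
        self.nvdev = None


def device_add(ctx, primary_channel):
    """Populate the whole table first, publish the device pointer last."""
    nvdev = NetvscDevice()
    for i in range(VRSS_CHANNEL_MAX):
        nvdev.chn_table[i] = primary_channel
    ctx.nvdev = nvdev  # published only after every slot is valid
    return nvdev


def send(ctx, queue_index):
    """Return the channel a sender would use, or None if no device yet."""
    nvdev = ctx.nvdev
    if nvdev is None:
        return None
    return nvdev.chn_table[queue_index]
```

Because the pointer is written last, a sender in this model either sees no device at all or sees a table with no empty slots.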
    • V
      hv_netvsc: synchronize netvsc_change_mtu()/netvsc_set_channels() with netvsc_remove() · 6da7225f
      Vitaly Kuznetsov committed
      When the netvsc device is removed during mtu change or channels setup we
      get into trouble as both paths are trying to remove the device. Synchronize
      them with the start_remove flag and the rtnl lock.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6da7225f
    • V
      hv_netvsc: get rid of struct net_device pointer in struct netvsc_device · 0a1275ca
      Vitaly Kuznetsov committed
      Simplify the netvsc pointer graph by getting rid of the redundant ndev
      pointer. We can always get a pointer to struct net_device from somewhere
      else.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0a1275ca
    • V
      hv_netvsc: untangle the pointer mess · 3d541ac5
      Vitaly Kuznetsov committed
      We have the following structures keeping netvsc adapter state:
      - struct net_device
      - struct net_device_context
      - struct netvsc_device
      - struct rndis_device
      - struct hv_device
      and there are pointers/dependencies between them:
      - struct net_device_context is contained in struct net_device
      - struct hv_device has driver_data pointer which points to
        'struct net_device' OR 'struct netvsc_device' depending on driver's
        state (!).
      - struct net_device_context has a pointer to 'struct hv_device'.
      - struct netvsc_device has pointers to 'struct hv_device' and
        'struct net_device_context'.
      - struct rndis_device has a pointer to 'struct netvsc_device'.
      
      Different functions get different structures as parameters and use these
      pointers for traveling. The problem is (in addition to keeping in mind
      this complex graph) that some of these structures (struct netvsc_device
      and struct rndis_device) are being removed and re-created on mtu change
      (as we implement it as re-creation of hyper-v device) so our travel using
      these pointers is dangerous.
      
      Simplify this to the following:
      - add struct netvsc_device pointer to struct net_device_context (which is
        a part of struct net_device and thus never disappears)
      - remove struct hv_device and struct net_device_context pointers from
        struct netvsc_device
      - in struct rndis_device, replace pointer to 'struct netvsc_device' with
        pointer to 'struct net_device'.
      - always keep 'struct net_device' in hv_device driver_data.
      
      We'll end up with the following 'circular' structure:
      
      net_device:
       [net_device_context] -> netvsc_device -> rndis_device -> net_device
                            -> hv_device -> net_device
      
      On MTU change we'll be removing the 'netvsc_device -> rndis_device'
      branch and re-creating it making the synchronization easier.
      
      There is one additional redundant pointer left, it is struct net_device
      link in struct netvsc_device, it is going to be removed in a separate
      commit.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3d541ac5
    • V
      hv_netvsc: use start_remove flag to protect netvsc_link_change() · 1bdcec8a
      Vitaly Kuznetsov committed
      netvsc_link_change() can race with netvsc_change_mtu() or
      netvsc_set_channels() as these functions destroy struct netvsc_device and
      the rndis filter. Use the start_remove flag for synchronization. As
      netvsc_change_mtu()/netvsc_set_channels() are called with the rtnl lock
      held, we need to take it before checking the start_remove value in
      netvsc_link_change().
      Reported-by: Haiyang Zhang <haiyangz@microsoft.com>
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1bdcec8a
    • V
      hv_netvsc: move start_remove flag to net_device_context · f580aec4
      Vitaly Kuznetsov committed
      struct netvsc_device is destroyed on mtu change so keeping the
      protection flag there is not a good idea. Move it to struct
      net_device_context which is preserved.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f580aec4
    • U
      phy: add support for a reset-gpio specification · da47b457
      Uwe Kleine-König committed
      The framework only asserts (for now) that the reset gpio is not active.
      Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Reviewed-by: Roger Quadros <rogerq@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      da47b457
  2. 16 May, 2016 12 commits
  3. 15 May, 2016 6 commits
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 272911b8
      Linus Torvalds committed
      Pull networking fixes from David Miller:
      
       1) Fix mvneta/bm dependencies, from Arnd Bergmann.
      
       2) RX completion hw bug workaround in bnxt_en, from Michael Chan.
      
       3) Kernel pointer leak in nf_conntrack, from Linus.
      
       4) Hoplimit route attribute limits not enforced properly, from Paolo
          Abeni.
      
       5) qlcnic driver NULL deref fix from Dan Carpenter.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
        arm64: bpf: jit JMP_JSET_{X,K}
        net/route: enforce hoplimit max value
        nf_conntrack: avoid kernel pointer value leak in slab name
        drivers: net: xgene: fix register offset
        drivers: net: xgene: fix statistics counters race condition
        drivers: net: xgene: fix ununiform latency across queues
        drivers: net: xgene: fix sharing of irqs
        drivers: net: xgene: fix IPv4 forward crash
        xen-netback: fix extra_info handling in xenvif_tx_err()
        net: mvneta: bm: fix dependencies again
        bnxt_en: Add workaround to detect bad opaque in rx completion (part 2)
        bnxt_en: Add workaround to detect bad opaque in rx completion (part 1)
        qlcnic: potential NULL dereference in qlcnic_83xx_get_minidump_template()
      272911b8
    • F
      net: switchdev: Drop EXPERIMENTAL from description · 8fbb89c6
      Florian Fainelli committed
      Switchdev has been around for quite a while now, so putting "EXPERIMENTAL"
      in the description is no longer accurate. Drop it.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8fbb89c6
    • Z
      arm64: bpf: jit JMP_JSET_{X,K} · 98397fc5
      Zi Shen Lim committed
      Original implementation commit e54bcde3 ("arm64: eBPF JIT compiler")
      had the relevant code paths, but due to an oversight JITing always failed.
      
      As a result, we had been falling back to the BPF interpreter whenever a BPF
      program had JMP_JSET_{X,K} instructions.
      
      With this fix, we confirm that the corresponding tests in lib/test_bpf
      continue to pass, and are now also JITed.
      
      ...
      [    2.784553] test_bpf: #30 JSET jited:1 188 192 197 PASS
      [    2.791373] test_bpf: #31 tcpdump port 22 jited:1 325 677 625 PASS
      [    2.808800] test_bpf: #32 tcpdump complex jited:1 323 731 991 PASS
      ...
      [    3.190759] test_bpf: #237 JMP_JSET_K: if (0x3 & 0x2) return 1 jited:1 110 PASS
      [    3.192524] test_bpf: #238 JMP_JSET_K: if (0x3 & 0xffffffff) return 1 jited:1 98 PASS
      [    3.211014] test_bpf: #249 JMP_JSET_X: if (0x3 & 0x2) return 1 jited:1 120 PASS
      [    3.212973] test_bpf: #250 JMP_JSET_X: if (0x3 & 0xffffffff) return 1 jited:1 89 PASS
      ...
      
      Fixes: e54bcde3 ("arm64: eBPF JIT compiler")
      Signed-off-by: Zi Shen Lim <zlim.lnx@gmail.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Acked-by: Yang Shi <yang.shi@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      98397fc5
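For readers unfamiliar with the instruction, JSET takes the jump when the bitwise AND of its two operands is non-zero, which is what the test names in the log above spell out ("if (0x3 & 0x2) return 1"). A minimal Python sketch of that condition follows; `jset_taken` and `run_jset_test` are hypothetical helper names, and the model ignores details such as how an immediate operand is widened.

```python
U64_MASK = (1 << 64) - 1  # operands are treated as 64-bit values here


def jset_taken(dst, src):
    """Return True when a JSET jump would be taken: (dst & src) != 0."""
    return ((dst & U64_MASK) & (src & U64_MASK)) != 0


def run_jset_test(dst, src):
    """Mirror the 'if (dst & src) return 1' shape of the test names."""
    return 1 if jset_taken(dst, src) else 0
```

The same condition applies to both variants; JMP_JSET_X takes the second operand from a register and JMP_JSET_K takes it from an immediate.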
    • P
      net/route: enforce hoplimit max value · 626abd59
      Paolo Abeni committed
      Currently, when creating or updating a route, no check is performed
      on the hoplimit value in either the ipv4 or the ipv6 code.
      
      The caller can, e.g., set hoplimit to 256, and when such a route is
      used, packets will be sent with hoplimit/ttl equal to 0.
      
      This commit adds checks for the RTAX_HOPLIMIT value in both the ipv4
      and ipv6 route code, substituting any value greater than 255 with 255.
      
      This is consistent with what is currently done for ADVMSS and MTU
      in the ipv4 code.
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      626abd59
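The arithmetic behind this fix is easy to check: the hop limit/TTL header field is 8 bits wide, so an unchecked value of 256 keeps only its low byte and becomes 0. The Python sketch below contrasts that truncation with the clamp the commit describes; `truncated_hoplimit` and `clamp_hoplimit` are hypothetical names for illustration, not kernel functions.

```python
HOPLIMIT_MAX = 255  # largest value that fits the 8-bit hop limit/TTL field


def truncated_hoplimit(value):
    """Model the old behaviour: only the low 8 bits reach the packet."""
    return value & 0xFF


def clamp_hoplimit(value):
    """Model the fixed behaviour: values above 255 are replaced by 255."""
    return min(value, HOPLIMIT_MAX)


for requested in (64, 255, 256, 1000):
    print(requested, truncated_hoplimit(requested), clamp_hoplimit(requested))
```

Values up to 255 are unchanged by both functions; the two only diverge once the requested hop limit no longer fits in one byte.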
    • L
      nf_conntrack: avoid kernel pointer value leak in slab name · 31b0b385
      Linus Torvalds committed
      The slab name ends up being visible in the directory structure under
      /sys, and even if you don't have access rights to the file you can see
      the filenames.
      
      Just use a 64-bit counter instead of the pointer to the 'net' structure
      to generate a unique name.
      
      This code will go away in 4.7 when the conntrack code moves to a single
      kmemcache, but this is the backportable simple solution to avoiding
      leaking kernel pointers to user space.
      
      Fixes: 5b3501fa ("netfilter: nf_conntrack: per netns nf_conntrack_cachep")
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
      31b0b385
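The idea of deriving a unique name from a counter instead of an address can be sketched as follows. This is a Python illustration only: the `nf_conntrack_` prefix, the helper names and the use of `id()` as a stand-in for a kernel pointer are assumptions made for the example, not the kernel code.

```python
import itertools

# Monotonic counter standing in for the 64-bit counter the commit mentions.
_unique_id = itertools.count(1)


def leaky_slab_name(net):
    """Old approach (modelled): embed an object's address in the name."""
    return "nf_conntrack_%x" % id(net)


def slab_name():
    """New approach (modelled): embed only a counter value."""
    return "nf_conntrack_%d" % next(_unique_id)
```

Each call to `slab_name()` yields a distinct name that reveals nothing about where the namespace structure lives in memory, whereas the modelled old approach exposes the address to anyone who can list the directory.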
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 6ba5b85f
      Linus Torvalds committed
      Pull vfs fixes from Al Viro:
       "Overlayfs fixes from Miklos, assorted fixes from me.
      
        Stable fodder of varying severity, all sat in -next for a while"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        ovl: ignore permissions on underlying lookup
        vfs: add lookup_hash() helper
        vfs: rename: check backing inode being equal
        vfs: add vfs_select_inode() helper
        get_rock_ridge_filename(): handle malformed NM entries
        ecryptfs: fix handling of directory opening
        atomic_open(): fix the handling of create_error
        fix the copy vs. map logics in blk_rq_map_user_iov()
        do_splice_to(): cap the size before passing to ->splice_read()
      6ba5b85f
  4. 14 May, 2016 11 commits