1. 16 Nov, 2016 25 commits
    • F
    • F
    • F
    • F
      fe1eb9c5
    • F
      fc5e353c
    • F
    • F
      b3de7f36
    • F
      net: phy: Add phy_ethtool_nway_reset · e86a8987
      Florian Fainelli committed
      This function just calls into genphy_restart_aneg() to perform an
      autonegotiation restart.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e86a8987
    • D
      vxlan: Fix uninitialized variable warnings. · 8ebd115b
      David S. Miller committed
      drivers/net/vxlan.c: In function ‘vxlan_xmit_one’:
      drivers/net/vxlan.c:2141:10: warning: ‘err’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8ebd115b
    • D
      Merge branch 'vxlan-xmit-improvements' · 81fea579
      David S. Miller committed
      Pravin B Shelar says:
      
      ====================
      vxlan: xmit improvements.
      
      Following patch series improves vxlan fast path, removes
      duplicate code and simplifies vxlan xmit code path.
      
      v2-v3:
      Removed unrelated warning fix from patch 2.
      rearranged error handling from patch 3
      Fixed stats updates in vxlan route lookup in patch 4
      
      v1-v2:
      Fix compilation error when IPv6 support is not enabled.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      81fea579
    • P
      9efdb92d
    • P
      vxlan: simplify vxlan xmit · 0770b53b
      pravin shelar committed
      The existing vxlan xmit function handles two distinct cases:
      1. vxlan net device
      2. vxlan lwt device.
      Separating the initialization of these two cases makes the egress
      path look better.
      Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
      Acked-by: Jiri Benc <jbenc@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0770b53b
    • P
      vxlan: simplify RTF_LOCAL handling. · fee1fad7
      pravin shelar committed
      Avoid duplicate code for handling RTF_LOCAL routes.
      Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
      Acked-by: Jiri Benc <jbenc@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fee1fad7
    • P
      vxlan: improve vxlan route lookup checks. · 655c3de1
      pravin shelar committed
      Move route sanity check to respective vxlan[4/6]_get_route functions.
      This allows us to perform all sanity checks before caching the dst so
      that we can avoid these checks on subsequent packets.
      This gives more accurate metadata information for packets from
      fill_metadata_dst().
      Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
      Acked-by: Jiri Benc <jbenc@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      655c3de1
    • P
      vxlan: simplify exception handling · c46b7897
      pravin shelar committed
      The vxlan egress path error handling has become complicated; it
      needs to handle both IPv4 and IPv6 tunnel cases.
      An earlier patch removed vlan handling from vxlan_build_skb(), so
      vxlan_build_skb() does not need to free the skb and we can simplify
      the xmit path by having a single error handling path for both types
      of tunnels.
      Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
      Acked-by: Jiri Benc <jbenc@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c46b7897
    • P
      vxlan: avoid checking socket multiple times. · 03dc52a8
      pravin shelar committed
      Check the vxlan socket in vxlan6_getroute().
      Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
      Acked-by: Jiri Benc <jbenc@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      03dc52a8
    • P
      vxlan: avoid vlan processing in vxlan device. · 4a4f86cc
      pravin shelar committed
      The VxLAN device does not have special handling for vlan tagging on egress.
      Therefore it does not make sense to expose the vlan offloading feature.
      This patch does not change vxlan functionality.
      Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
      Acked-by: Jiri Benc <jbenc@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4a4f86cc
    • P
      udplite: fix NULL pointer dereference · c915fe13
      Paolo Abeni committed
      The commit 850cbadd ("udp: use it's own memory accounting schema")
      assumes that the socket proto has memory accounting enabled,
      but this is not the case for UDPLITE.
      Fix it by enabling memory accounting for UDPLITE and performing
      fwd allocated memory reclaiming on socket shutdown.
      UDP and UDPLITE now share the same memory accounting limits.
      Also drop the backlog receive operation, since it is no longer needed.
      
      Fixes: 850cbadd ("udp: use it's own memory accounting schema")
      Reported-by: Andrei Vagin <avagin@gmail.com>
      Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c915fe13
    • D
      Merge branch 'bpf-lru' · e6ca4f16
      David S. Miller committed
      Martin KaFai Lau says:
      
      ====================
      bpf: LRU map
      
      This patch set adds LRU map implementation to the existing BPF map
      family.
      
      The first few patches introduce the basic BPF LRU list
      implementation.
      
      The later patches introduce BPF_MAP_TYPE_LRU_[PERCPU_]HASH, the LRU
      versions of the existing BPF_MAP_TYPE_[PERCPU_]HASH maps, by
      leveraging the BPF LRU list.
      
      v2:
      - Added a percpu LRU list option which can be specified as
        a map attribute.
      
        [Note: percpu LRU list has nothing to do with the map's value]
      
      - Removed the cpu variable from the struct bpf_lru_locallist
        since it is not needed.
      
      - Changed the __bpf_lru_node_move_out to __bpf_lru_node_move_to_free in
        patch 1 to prepare the percpu LRU list in patch 2.
      
      - Moved the test_lru_map under selftests
      
      - Refactored a few things in the test codes
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e6ca4f16
    • M
      bpf: Add tests for the LRU bpf_htab · 5db58faf
      Martin KaFai Lau committed
      This patch has some unit tests and a test_lru_dist.
      
      The test_lru_dist reads in the numeric keys from a file.
      The files used here are generated by a modified fio-genzipf tool
      that originated from the fio test suite.  The sample data file can be
      found here: https://github.com/iamkafai/bpf-lru
      
      The zipf.* data files have 100k numeric keys, and the keys also
      range from 1 to 100k.
      
      The test_lru_dist outputs the number of unique keys (nr_unique).
      For example, nr_unique:61239 in the output below means that 61239
      out of the 100k keys are unique.  nr_misses counts the keys that
      cannot be found in the LRU map, so nr_misses must be >= nr_unique.
      test_lru_dist also simulates a perfect LRU map as a comparison:
      
      [root@arch-fb-vm1 ~]# ~/devshare/fb-kernel/linux/samples/bpf/test_lru_dist \
      /root/zipf.100k.a1_01.out 4000 1
      ...
      test_parallel_lru_dist (map_type:9 map_flags:0x0):
          task:0 BPF LRU: nr_unique:23093(/100000) nr_misses:31603(/100000)
          task:0 Perfect LRU: nr_unique:23093(/100000 nr_misses:34328(/100000)
      ....
      test_parallel_lru_dist (map_type:9 map_flags:0x2):
          task:0 BPF LRU: nr_unique:23093(/100000) nr_misses:31710(/100000)
          task:0 Perfect LRU: nr_unique:23093(/100000 nr_misses:34328(/100000)
      
      [root@arch-fb-vm1 ~]# ~/devshare/fb-kernel/linux/samples/bpf/test_lru_dist \
      /root/zipf.100k.a0_01.out 40000 1
      ...
      test_parallel_lru_dist (map_type:9 map_flags:0x0):
          task:0 BPF LRU: nr_unique:61239(/100000) nr_misses:67054(/100000)
          task:0 Perfect LRU: nr_unique:61239(/100000 nr_misses:66993(/100000)
      ...
      test_parallel_lru_dist (map_type:9 map_flags:0x2):
          task:0 BPF LRU: nr_unique:61239(/100000) nr_misses:67068(/100000)
          task:0 Perfect LRU: nr_unique:61239(/100000 nr_misses:66993(/100000)
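
The "Perfect LRU" comparison described above can be approximated in user space. The following is a minimal Python sketch, not the code of test_lru_dist itself: the function name `simulate_perfect_lru` and its arguments are hypothetical, and it assumes a strict LRU cache that evicts exactly the least recently used key once the capacity is exceeded.

```python
from collections import OrderedDict


def simulate_perfect_lru(keys, capacity):
    """Return (nr_unique, nr_misses) for a strict LRU cache of `capacity` keys."""
    cache = OrderedDict()  # oldest entry first, most recently used last
    seen = set()
    nr_misses = 0
    for key in keys:
        seen.add(key)
        if key in cache:
            cache.move_to_end(key)          # hit: refresh recency
        else:
            nr_misses += 1                  # miss: key is not in the cache
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)   # evict the least recently used key
    return len(seen), nr_misses
```

Every key misses at least once (on first sight), which is why nr_misses can never be smaller than nr_unique.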
      
      LRU map has also been added to map_perf_test:
      /* Global LRU */
      [root@kerneltest003.31.prn1 ~]# for i in 1 4 8; do echo -n "$i cpus: "; \
      ./map_perf_test 16 $i | awk '{r += $3}END{print r " updates"}'; done
       1 cpus: 2934082 updates
       4 cpus: 7391434 updates
       8 cpus: 6500576 updates
      
      /* Percpu LRU */
      [root@kerneltest003.31.prn1 ~]# for i in 1 4 8; do echo -n "$i cpus: "; \
      ./map_perf_test 32 $i | awk '{r += $3}END{print r " updates"}'; done
        1 cpus: 2896553 updates
        4 cpus: 9766395 updates
        8 cpus: 17460553 updates
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5db58faf
    • M
      bpf: Add BPF_MAP_TYPE_LRU_PERCPU_HASH · 8f844938
      Martin KaFai Lau committed
      Provide a LRU version of the existing BPF_MAP_TYPE_PERCPU_HASH.
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8f844938
    • M
      bpf: Add BPF_MAP_TYPE_LRU_HASH · 29ba732a
      Martin KaFai Lau committed
      Provide a LRU version of the existing BPF_MAP_TYPE_HASH.
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      29ba732a
    • M
      bpf: Refactor codes handling percpu map · fd91de7b
      Martin KaFai Lau committed
      Refactor the code that populates the value
      of a htab_elem in a BPF_MAP_TYPE_PERCPU_HASH
      typed bpf_map.
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fd91de7b
    • M
      bpf: Add percpu LRU list · 961578b6
      Martin KaFai Lau committed
      Instead of having a common LRU list, this patch allows a
      percpu LRU list which can be selected by specifying a map
      attribute.  The map attribute will be added in the later
      patch.
      
      While the common use case for LRU is #reads >> #updates,
      percpu LRU list allows bpf prog to absorb unusual #updates
      in pathological cases (e.g. an external traffic facing machine
      which could be under attack).
      
      The percpu LRUs are isolated from each other.  The LRU nodes (including
      free nodes) cannot be moved across different LRU lists.
      
      Here is the update performance comparison between the
      common LRU list and the percpu LRU list (the test code is
      in the last patch):
      
      [root@kerneltest003.31.prn1 ~]# for i in 1 4 8; do echo -n "$i cpus: "; \
      ./map_perf_test 16 $i | awk '{r += $3}END{print r " updates"}'; done
       1 cpus: 2934082 updates
       4 cpus: 7391434 updates
       8 cpus: 6500576 updates
      
      [root@kerneltest003.31.prn1 ~]# for i in 1 4 8; do echo -n "$i cpus: "; \
      ./map_perf_test 32 $i | awk '{r += $3}END{print r " updates"}'; done
        1 cpus: 2896553 updates
        4 cpus: 9766395 updates
        8 cpus: 17460553 updates
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      961578b6
    • M
      bpf: LRU List · 3a08c2fd
      Martin KaFai Lau committed
      Introduce bpf_lru_list which will provide LRU capability to
      the bpf_htab in the later patch.
      
      * General Thoughts:
      1. Target use case.  Reads are more frequent than updates
         (i.e. bpf_lookup_elem() is called more often than bpf_update_elem()).
         If bpf_prog does a bpf_lookup_elem() first and then an in-place
         update, it still counts as a read operation as far as the LRU list
         is concerned.
      2. It may be useful to think of it as a LRU cache
      3. Optimize the read case
         3.1 No lock in read case
         3.2 The LRU maintenance is only done during bpf_update_elem()
      4. If there is a percpu LRU list, it will lose the system-wide LRU
         property.  A completely isolated percpu LRU list has the best
         performance but the memory utilization is not ideal considering
         the workload may be imbalanced.
      5. Hence, this patch starts the LRU implementation with a global LRU
         list with batched operations before accessing the global LRU list.
         For an LRU cache where #read >> #update/#insert operations, it will
         work well.
      6. There is a local list (for each cpu) which is named
         'struct bpf_lru_locallist'.  This local list is not used to sort
         the LRU property.  Instead, the local list is to batch enough
         operations before acquiring the lock of the global LRU list.  More
         details on this later.
      7. In the later patch, it allows a percpu LRU list by specifying a
         map-attribute for scalability reasons and for use cases that need to
         prepare for the worst (and pathological) case, such as a DoS attack.
         The percpu LRU lists are completely isolated from each other and the
         LRU nodes (including free nodes) cannot be moved across the lists.  The
         following description is for the global LRU list but is mostly
         applicable to the percpu LRU list as well.
      
      * Global LRU List:
      1. It has three sub-lists: active-list, inactive-list and free-list.
      2. The two list idea, active and inactive, is borrowed from the
         page cache.
      3. All nodes are pre-allocated and all sit at the free-list (of the
         global LRU list) at the beginning.  The pre-allocation reasoning
         is similar to the existing BPF_MAP_TYPE_HASH.  However,
         opting-out prealloc (BPF_F_NO_PREALLOC) is not supported in
         the LRU map.
      
      * Active/Inactive List (of the global LRU list):
      1. The active list, as its name says, maintains the active set of
         the nodes.  We can think of it as the working set or more frequently
         accessed nodes.  The access frequency is approximated by a ref-bit.
         The ref-bit is set during the bpf_lookup_elem().
      2. The inactive list, as its name also says, maintains a less
         active set of nodes.  They are the candidates to be removed
         from the bpf_htab when we are running out of free nodes.
      3. The ordering of these two lists acts as a rough clock.
         The tail of the inactive list holds the older nodes, which
         should be released first if the bpf_htab needs a free element.
      
      * Rotating the Active/Inactive List (of the global LRU list):
      1. It is the basic operation to maintain the LRU property of
         the global list.
      2. The active list is only rotated when the inactive list is running
         low.  This idea is similar to the current page cache.
         Inactive running low is currently defined as
         "# of inactive < # of active".
      3. The active list rotation always starts from the tail.  It moves
         nodes without the ref-bit set to the head of the inactive list.
         It moves nodes with the ref-bit set back to the head of the active
         list and then clears their ref-bit.
      4. The inactive rotation is pretty simple.
         It walks the inactive list and moves a node back to the head of
         the active list if its ref-bit is set.  The ref-bit is cleared after
         moving to the active list.
         If the node does not have the ref-bit set, it is left as it is
         because it is already in the inactive list.
      
      * Shrinking the Inactive List (of the global LRU list):
      1. Shrinking is the operation to get free nodes when the bpf_htab is
         full.
      2. It usually only shrinks the inactive list to get free nodes.
      3. During shrinking, it will walk the inactive list from the tail and
         delete the nodes without the ref-bit set from the bpf_htab.
      4. If no free node is found after step (3), it will forcefully get
         one node from the tail of the inactive or active list.  Forcefully is
         in the sense that it ignores the ref-bit.
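
The rotation and shrinking rules above can be modelled in a few lines of user-space Python. This is only an illustrative sketch under stated assumptions, not the kernel implementation: the class `GlobalLru` and its method names are hypothetical, it has no locking, it rotates and shrinks whole lists instead of a bounded number of nodes, and it assumes that newly inserted nodes enter at the head of the inactive list.

```python
from collections import deque


class GlobalLru:
    """Toy model of the active/inactive/free lists; index 0 is the head."""

    def __init__(self, nr_nodes):
        self.free = nr_nodes      # all nodes start on the free list
        self.active = deque()
        self.inactive = deque()
        self.ref = {}             # key -> ref-bit

    def lookup(self, key):
        """Reads only set the ref-bit; they never move nodes."""
        if key in self.ref:
            self.ref[key] = True
            return True
        return False

    def rotate_active(self):
        """Rotate from the tail of the active list when inactive runs low."""
        if len(self.inactive) >= len(self.active):
            return                                # "# inactive < # active" not met
        for _ in range(len(self.active)):
            key = self.active.pop()               # take from the tail
            if self.ref[key]:
                self.ref[key] = False
                self.active.appendleft(key)       # referenced: back to the head
            else:
                self.inactive.appendleft(key)     # unreferenced: demote

    def rotate_inactive(self):
        """Promote referenced inactive nodes; leave the rest in place."""
        for key in list(self.inactive):
            if self.ref[key]:
                self.inactive.remove(key)
                self.ref[key] = False
                self.active.appendleft(key)

    def shrink(self):
        """Free unreferenced nodes from the inactive tail, else force one."""
        freed = [k for k in reversed(self.inactive) if not self.ref[k]]
        if not freed:
            victims = self.inactive or self.active
            freed = [victims[-1]]                 # forced: ignores the ref-bit
        for key in freed:
            (self.inactive if key in self.inactive else self.active).remove(key)
            del self.ref[key]
            self.free += 1
        return freed

    def update(self, key):
        """Insert a key, reclaiming nodes first when no free node is left."""
        if key in self.ref:
            return []
        evicted = []
        if self.free == 0:
            self.rotate_active()
            self.rotate_inactive()
            evicted = self.shrink()
        self.free -= 1
        self.ref[key] = False
        self.inactive.appendleft(key)             # assumption, see lead-in
        return evicted
```

With three nodes holding "a", "b" and "c", a lookup of "a" followed by an update of "d" promotes "a" to the active list and evicts the unreferenced "b" and "c". If every node has its ref-bit set, the forced path takes one node from the tail regardless.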
      
      * Local List:
      1. Each CPU has a 'struct bpf_lru_locallist'.  The purpose is to
         batch enough operations before acquiring the lock of the
         global LRU.
      2. A local list has two sub-lists, free-list and pending-list.
      3. During bpf_update_elem(), it will try to get a node from the
         free-list (of the current CPU's local list).
      4. If the local free-list is empty, it will acquire from the
         global LRU list.  The global LRU list can either satisfy it
         by its global free-list or by shrinking the global inactive
         list.  Since we have acquired the global LRU list lock,
         it will try to get at most LOCAL_FREE_TARGET elements
         to the local free list.
      5. When a new element is added to the bpf_htab, it will
         first sit at the pending-list (of the local list).
         The pending-list will be flushed to the global LRU list
         the next time it needs to acquire free nodes from the
         global list.
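
The batching idea behind the local list can be sketched as follows. This is a hedged user-space model, not the kernel code: `GlobalList`, `LocalList`, `pop_free` and the `lock_acquisitions` counter are hypothetical names, the value of `LOCAL_FREE_TARGET` is illustrative, and shrinking of the global inactive list is left out.

```python
import threading

LOCAL_FREE_TARGET = 4  # illustrative value, not taken from the patch


class GlobalList:
    """Global free list plus the nodes already handed over by local lists."""

    def __init__(self, nr_nodes):
        self.lock = threading.Lock()
        self.free = list(range(nr_nodes))
        self.in_use = []              # nodes flushed from local pending-lists
        self.lock_acquisitions = 0    # instrumentation for the example only


class LocalList:
    """Per-CPU free-list and pending-list in front of the global list."""

    def __init__(self, global_list):
        self.g = global_list
        self.free = []
        self.pending = []

    def _refill(self):
        with self.g.lock:
            self.g.lock_acquisitions += 1
            # The lock is held anyway, so flush the pending-list now.
            self.g.in_use.extend(self.pending)
            self.pending.clear()
            # Take at most LOCAL_FREE_TARGET nodes in one go.
            self.free.extend(self.g.free[:LOCAL_FREE_TARGET])
            del self.g.free[:LOCAL_FREE_TARGET]

    def pop_free(self):
        """Get a node for a new element; the global lock is taken only on refill."""
        if not self.free:
            self._refill()
        if not self.free:
            return None  # the described design would shrink the inactive list here
        node = self.free.pop()
        self.pending.append(node)     # new elements wait on the pending-list
        return node
```

In this model, eight consecutive `pop_free()` calls on one local list take the global lock only twice, which is the point of batching.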
      
      * Lock Consideration:
      The LRU list has a lock (lru_lock).  Each bucket of htab has a
      lock (buck_lock).  If both locks need to be acquired together,
      the lock order is always lru_lock -> buck_lock and this only
      happens in the bpf_lru_list.c logic.
      
      In hashtab.c, the two locks are never held together (i.e. one
      lock is always released before the other is acquired).
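
The locking rule can be illustrated with two ordinary locks. This is a sketch only: the function names `lru_shrink_one` and `htab_update`, the single bucket lock and the `order_log` list are assumptions made for the example, and the `locked()` check is meaningful only in this single-threaded demonstration.

```python
import threading

lru_lock = threading.Lock()
buck_lock = threading.Lock()   # a single bucket, for brevity
order_log = []                 # records the nesting order for the example


def lru_shrink_one(htab, key):
    """bpf_lru_list.c-style path: nested, always lru_lock -> buck_lock."""
    with lru_lock:
        order_log.append("lru_lock")
        with buck_lock:
            order_log.append("buck_lock")
            htab.pop(key, None)


def htab_update(htab, free_nodes, key, value):
    """hashtab.c-style path: the two locks are never held together."""
    with lru_lock:
        node = free_nodes.pop()
    # lru_lock has been released before the bucket lock is taken.
    with buck_lock:
        assert not lru_lock.locked()   # single-threaded demo check only
        htab[key] = (node, value)
```

Because the nested path always takes the locks in the same order and the other path never nests them, no two code paths can wait on each other in a cycle.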
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3a08c2fd
  2. 15 Nov, 2016 15 commits
    • D
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · bb598c1b
      David S. Miller committed
      Several cases of bug fixes in 'net' overlapping other changes in
      'net-next'.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bb598c1b
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · e76d21c4
      Linus Torvalds committed
      Pull networking fixes from David Miller:
      
       1) Fix off by one wrt. indexing when dumping /proc/net/route entries,
          from Alexander Duyck.
      
       2) Fix lockdep splats in iwlwifi, from Johannes Berg.
      
       3) Cure panic when inserting certain netfilter rules when NFT_SET_HASH
          is disabled, from Liping Zhang.
      
       4) Memory leak when nft_expr_clone() fails, also from Liping Zhang.
      
       5) Disable UFO when path will apply IPSEC transformations, from Jakub
          Sitnicki.
      
       6) Don't bogusly double cwnd in dctcp module, from Florian Westphal.
      
       7) skb_checksum_help() should never actually use the value "0" for the
          resulting checksum, that has a special meaning, use CSUM_MANGLED_0
          instead. From Eric Dumazet.
      
       8) Per-tx/rx queue statistic strings are wrong in qed driver, fix from
          Yuval Mintz.
      
       9) Fix SCTP reference counting of associations and transports in
          sctp_diag. From Xin Long.
      
      10) When we hit ip6tunnel_xmit() we could have come from an ipv4 path in
          a previous layer or similar, so explicitly clear the ipv6 control
          block in the skb. From Eli Cooper.
      
      11) Fix bogus sleeping inside of inet_wait_for_connect(), from WANG
          Cong.
      
      12) Correct device ID of T6 adapter in cxgb4 driver, from Hariprasad
          Shenai.
      
      13) Fix potential access past the end of the skb page frag array in
          tcp_sendmsg(). From Eric Dumazet.
      
      14) 'skb' can legitimately be NULL in inet{,6}_exact_dif_match(). Fix
          from David Ahern.
      
      15) Don't return an error in tcp_sendmsg() if we wrote any bytes
          successfully, from Eric Dumazet.
      
      16) Extraneous unlocks in netlink_diag_dump(), we removed the locking
          but forgot to purge these unlock calls. From Eric Dumazet.
      
      17) Fix memory leak in error path of __genl_register_family(). We leak
          the attrbuf, from WANG Cong.
      
      18) cgroupstats netlink policy table is mis-sized, from WANG Cong.
      
      19) Several XDP bug fixes in mlx5, from Saeed Mahameed.
      
      20) Fix several device refcount leaks in network drivers, from Johan
          Hovold.
      
      21) icmp6_send() should use skb dst device not skb->dev to determine L3
          routing domain. From David Ahern.
      
      22) ip_vs_genl_family sets maxattr incorrectly, from WANG Cong.
      
      23) We leak new macvlan port in some cases of macvlan_common_newlink()
          errors. Fix from Gao Feng.
      
      24) Similar to the icmp6_send() fix, icmp_route_lookup() should
          determine L3 routing domain using skb_dst(skb)->dev not skb->dev.
          Also from David Ahern.
      
      25) Several fixes for route offloading and FIB notification handling in
          mlxsw driver, from Jiri Pirko.
      
      26) Properly cap __skb_flow_dissect()'s return value, from Eric Dumazet.
      
      27) Fix long standing regression in ipv4 redirect handling, wrt.
          validating the new neighbour's reachability. From Stephen Suryaputra
          Lin.
      
      28) If sk_filter() trims the packet excessively, handle it reasonably in
          tcp input instead of exploding. From Eric Dumazet.
      
      29) Fix handling of napi hash state when copying channels in sfc driver,
          from Bert Kenward.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (121 commits)
        mlxsw: spectrum_router: Flush FIB tables during fini
        net: stmmac: Fix lack of link transition for fixed PHYs
        sctp: change sk state only when it has assocs in sctp_shutdown
        bnx2: Wait for in-flight DMA to complete at probe stage
        Revert "bnx2: Reset device during driver initialization"
        ps3_gelic: fix spelling mistake in debug message
        net: ethernet: ixp4xx_eth: fix spelling mistake in debug message
        ibmvnic: Fix size of debugfs name buffer
        ibmvnic: Unmap ibmvnic_statistics structure
        sfc: clear napi_hash state when copying channels
        mlxsw: spectrum_router: Correctly dump neighbour activity
        mlxsw: spectrum: Fix refcount bug on span entries
        bnxt_en: Fix VF virtual link state.
        bnxt_en: Fix ring arithmetic in bnxt_setup_tc().
        Revert "include/uapi/linux/atm_zatm.h: include linux/time.h"
        tcp: take care of truncations done by sk_filter()
        ipv4: use new_gw for redirect neigh lookup
        r8152: Fix error path in open function
        net: bpqether.h: remove if_ether.h guard
        net: __skb_flow_dissect() must cap its return value
        ...
      e76d21c4
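
Item 7 of the fix list above says that a computed checksum must never be stored as the value 0, because 0 has a special meaning (UDP over IPv4, for instance, uses a zero checksum field to signal that no checksum was computed). A minimal Python sketch of the idea follows. It is not the kernel's skb_checksum_help(): the function names are hypothetical, and the only value borrowed from the kernel is CSUM_MANGLED_0, which is 0xffff, the other representation of zero in ones' complement arithmetic.

```python
CSUM_MANGLED_0 = 0xFFFF  # all-ones: equivalent to zero in ones' complement


def fold_checksum(data: bytes) -> int:
    """16-bit ones' complement Internet checksum of `data`."""
    if len(data) % 2:
        data += b"\x00"                               # pad to a 16-bit boundary
    total = sum(int.from_bytes(data[i:i + 2], "big")
                for i in range(0, len(data), 2))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)      # fold the carries back in
    return ~total & 0xFFFF


def checksum_never_zero(data: bytes) -> int:
    """Return the checksum, substituting CSUM_MANGLED_0 for a result of 0."""
    csum = fold_checksum(data)
    return csum if csum != 0 else CSUM_MANGLED_0
```

For the two-byte input `b"\xff\xff"` the plain checksum folds to 0, so `checksum_never_zero()` returns 0xffff instead.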
    • L
      Merge branch 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile · d4b95323
      Linus Torvalds committed
      Pull arch/tile bugfix from Chris Metcalf:
       "This just fixes an incompatibility with tile __ro_after_init"
      
      * 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
        tile: handle __ro_after_init like parisc does
      d4b95323
    • L
      Merge tag 'rtc-4.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux · ac38126b
      Linus Torvalds committed
      Pull RTC fixes from Alexandre Belloni:
       "Here are a few driver fixes for 4.9. It has been calm for a while so I
        don't expect more for this cycle.
      
        Drivers:
         - asm9260: fix module autoload
         - cmos: fix crashes
         - omap: fix clock handling"
      
      * tag 'rtc-4.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
        rtc: omap: prevent disabling of clock/module during suspend
        rtc: omap: Fix selecting external osc
        rtc: cmos: Don't enable interrupts in the middle of the interrupt handler
        rtc: cmos: remove all __exit_p annotations
        rtc: asm9260: fix module autoload
      ac38126b
    • C
      tile: handle __ro_after_init like parisc does · e123386b
      Chris Metcalf committed
      The tile architecture already marks RO_DATA as read-only in
      the kernel, so grouping RO_AFTER_INIT_DATA with RO_DATA, as is
      done by default, means the kernel faults in init when it tries
      to write to RO_AFTER_INIT_DATA.  For now, just arrange that
      __ro_after_init is handled like __write_once, i.e. __read_mostly.
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
      e123386b
    • I
      mlxsw: spectrum_router: Flush FIB tables during fini · ac571de9
      Ido Schimmel committed
      Since commit b45f64d1 ("mlxsw: spectrum_router: Use FIB notifications
      instead of switchdev calls") we reflect to the device the entire FIB
      table and not only FIBs that point to netdevs created by the driver.
      
      During module removal, FIBs of the second type are removed following
      NETDEV_UNREGISTER events sent. The other FIBs are still present in both
      the driver's cache and the device's table.
      
      Fix this by iterating over all the FIB tables in the device and flushing
      them. There's no need to take locks, as we're the only writer.
      
      Fixes: b45f64d1 ("mlxsw: spectrum_router: Use FIB notifications instead of switchdev calls")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ac571de9
    • F
      mdio: Demote print from info to debug in mdio_driver_register · eb2ca35f
      Florian Fainelli committed
      While it is useful to know which MDIO driver is being registered, demote
      the pr_info() to a pr_debug().
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      eb2ca35f
    • F
      net: stmmac: Fix lack of link transition for fixed PHYs · c51e424d
      Florian Fainelli committed
      Commit 52f95bbf ("stmmac: fix adjust link call in case of a switch
      is attached") added some logic to avoid polling the fixed PHY and
      therefore invoking the adjust_link callback more than once, since this
      is a fixed PHY and link events won't be generated.
      
      This works fine the first time, because we start with phydev->irq =
      PHY_POLL, so we call adjust_link, then we set phydev->irq =
      PHY_IGNORE_INTERRUPT and we stop polling the PHY.
      
      Now, if we called ndo_close(), which calls both phy_stop() and does an
      explicit netif_carrier_off(), we end up with a link down. Upon calling
      ndo_open() again, despite starting the PHY state machine, we have
      PHY_IGNORE_INTERRUPT set, and we generate no link event at all, so the
      link is permanently down.
      
      Fixes: 52f95bbf ("stmmac: fix adjust link call in case of a switch is attached")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c51e424d
    • G
      driver: macvlan: Replace integer number with bool value · d94d0254
      Gao Feng committed
      The return value of function macvlan_addr_busy is used as a bool value,
      so use bool values instead of the integer numbers "1" and "0".
      Signed-off-by: Gao Feng <gfree.wind@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d94d0254
    • M
      bpf: Use u64_to_user_ptr() · 535e7b4b
      Mickaël Salaün committed
      Replace the custom u64_to_ptr() function with the u64_to_user_ptr()
      macro.
      Signed-off-by: Mickaël Salaün <mic@digikod.net>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      535e7b4b
    • X
      sctp: change sk state only when it has assocs in sctp_shutdown · 5bf35ddf
      Xin Long committed
      Now when users shut down a sock with SEND_SHUTDOWN in sctp, even if
      this sock has no connection (assoc), the sk state would be changed to
      SCTP_SS_CLOSING, which is not what we expect.
      
      Besides, if users then try to listen on this sock, the kernel
      could even panic when it dereferences sctp_sk(sk)->bind_hash in
      sctp_inet_listen, as bind_hash is null when the sock has no assoc.
      
      This patch moves the sk state change after the check that sk assocs
      is not empty, and also merges these two if() conditions and reduces
      the indent level.
      
      Fixes: d46e416c ("sctp: sctp should change socket state when shutdown is received")
      Reported-by: Andrey Konovalov <andreyknvl@google.com>
      Tested-by: Andrey Konovalov <andreyknvl@google.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5bf35ddf
    • D
      Merge branch 'bnx2-kdump-fix' · 193f5122
      David S. Miller committed
      Baoquan He says:
      
      ====================
      bnx2: Wait for in-flight DMA to complete at probe stage
      
      This is v2 post.
      
      In commit 3e1be7ad ("bnx2: Reset device during driver initialization"),
      firmware requesting code was moved from open stage to probe stage.
      The reason is that in the kdump kernel the hardware iommu needs the
      device to be reset in the driver probe stage; otherwise in-flight DMA
      from the 1st kernel will keep going and look up the newly created
      io-page tables.  However, resetting the bnx2 chip involves requesting
      firmware, which needs to be done in open stage.
      
      Michael Chan suggested we can just wait for the old in-flight DMA to
      complete at probe stage; then, even without resetting the device, we
      don't need to worry that the old in-flight DMA could continue looking
      up the newly created io-page tables.
      
      v1->v2:
          Michael suggested to wait for the in-flight DMA to complete at probe
          stage. So drop the old method of trying to reset the chip at probe
          stage and take the new approach instead.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      193f5122
    • B
      bnx2: Wait for in-flight DMA to complete at probe stage · 6df77862
      Baoquan He committed
      In-flight DMA from the 1st kernel could continue in the kdump kernel.
      A new io-page table has been created before bnx2 does its reset at
      open stage.  We have to wait at probe stage for the in-flight DMA to
      complete, so that it does not look up the newly created io-page table.
      Suggested-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: Baoquan He <bhe@redhat.com>
      Acked-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6df77862
    • B
      Revert "bnx2: Reset device during driver initialization" · 5d0d4b91
      Baoquan He committed
      This reverts commit 3e1be7ad.
      
      When the bnx2 driver is built into the kernel, it fails to detect
      and load firmware because the firmware is contained in the initramfs
      and the initramfs has not been uncompressed yet during do_initcalls. So
      revert commit 3e1be7ad and work out a new way in the later patch.
      Signed-off-by: Baoquan He <bhe@redhat.com>
      Acked-by: Paul Menzel <pmenzel@molgen.mpg.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5d0d4b91
    • C
      ps3_gelic: fix spelling mistake in debug message · 7020637b
      Colin Ian King committed
      Trivial fix to spelling mistake "unmached" to "unmatched" in
      debug message.
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7020637b