1. 25 Feb, 2017 2 commits
  2. 24 Feb, 2017 3 commits
  3. 23 Feb, 2017 35 commits
    • D
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · ccaba062
      David S. Miller committed
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following patchset contains Netfilter fixes for your net tree,
      they are:
      
      1) Revisit warning logic when not applying default helper assignment.
         Jiri Kosina considers we are breaking existing setups and not warning
         our users accordingly now that automatic helper assignment has been
         turned off by default. So let's make him happy by emitting the warning
         when we find a helper but cannot attach it, instead of warning on the
         former deprecated behaviour. Patch from Jiri Kosina.
      
      2) Two patches to fix a regression in ctnetlink interfaces with
         nfnetlink_queue. Specifically, perform more relaxed checks on CTA_STATUS
         and do not bail out if CTA_HELP indicates the same helper that we
         already have. Patches from Kevin Cernekee.
      
      3) A couple of bugfixes for ipset via Jozsef Kadlecsik. They fix wrong
         index logic in hash set types and a null pointer dereference in the
         list:set type.
      
      4) hashlimit bails out with correct userspace parameters due to wrong
         arithmetic in the code that avoids "divide by zero" when
         transforming the userspace timing in milliseconds to token credits.
         Patch from Alban Browaeys.
      
      5) Fix incorrect NFQA_VLAN_MAX definition, patch from
         Ken-ichirou MATSUZAWA.
      
      6) Do not declare the nfnetlink batch error list as static, since this
         may be used by several subsystems at the same time. Patch from
         Liping Zhang.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ccaba062
    • D
      Merge branch 'mlx4-misc-fixes' · e65ade77
      David S. Miller committed
      Tariq Toukan says:
      
      ====================
      mlx4 misc fixes
      
      This patchset contains misc bug fixes from Eric Dumazet and our team
      to the mlx4 Core and Eth drivers.
      
      Series generated against net commit:
      eee2faab tcp: account for ts offset only if tsecr not zero
      
      v3:
      * Rebased, conflict solved.
      
      v2:
      * Added Eric's fix (patch 5/5).
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e65ade77
    • E
      net/mlx4_en: Use __skb_fill_page_desc() · 7f0137e2
      Eric Dumazet committed
      Or we might miss the fact that a page was allocated from memory reserves.
      
      Fixes: dceeab0e ("mlx4: support __GFP_MEMALLOC for rx")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7f0137e2
    • J
      net/mlx4_core: Use cq quota in SRIOV when creating completion EQs · 6ed63d84
      Jack Morgenstein committed
      When creating EQs to handle CQ completion events for the PF
      or for VFs, we create enough EQE entries to handle completions
      for the max number of CQs that can use that EQ.
      
      When SRIOV is activated, the max number of CQs a VF (or the PF) can
      obtain is its CQ quota (determined by the Hypervisor resource tracker).
      Therefore, when creating an EQ, the number of EQE entries that the VF
      should request for that EQ is the CQ quota value (and not the total
      number of CQs available in the FW).
      
      Under SRIOV, the PF must also use its CQ quota, because
      the resource tracker also controls how many CQs the PF can obtain.

      Using the FW total CQs instead of the CQ quota when creating EQs resulted
      in wasted MTT entries, due to allocating more EQEs than were needed.
      
      Fixes: 5a0d0a61 ("mlx4: Structures and init/teardown for VF resource quotas")
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Reported-by: Dexuan Cui <decui@microsoft.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6ed63d84
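As an illustration of the sizing rule described in the mlx4_core CQ quota commit above, here is a minimal Python sketch. It is a toy model, not driver code: the function name, its parameters, and the example numbers are all hypothetical, and any extra entries or rounding a real driver may apply are left out.

```python
def completion_eq_entries(fw_total_cqs: int, cq_quota: int, sriov_active: bool) -> int:
    """Return how many EQEs a function requests for one completion EQ (toy model)."""
    if sriov_active:
        # Under SR-IOV the resource tracker caps both the VFs and the PF,
        # so the quota is the largest number of CQs that can use this EQ.
        return cq_quota
    # Without SR-IOV the function may use every CQ the firmware exposes.
    return fw_total_cqs


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show the difference.
    print(completion_eq_entries(fw_total_cqs=65536, cq_quota=1024, sriov_active=True))
    print(completion_eq_entries(fw_total_cqs=65536, cq_quota=1024, sriov_active=False))
```

Sizing by the firmware total in the SR-IOV case is what the commit message describes as allocating more EQEs than were needed.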
    • M
      net/mlx4_core: Fix VF overwrite of module param which disables DMFS on new probed PFs · 95f1ba9a
      Majd Dibbiny committed
      In the VF driver, module parameter mlx4_log_num_mgm_entry_size was
      mistakenly overwritten -- and in a manner which overrode the
      device-managed flow steering option encoded in the parameter.
      
      log_num_mgm_entry_size is a global module parameter which
      affects all ConnectX-3 PFs installed on that host.
      If a VF changes log_num_mgm_entry_size, this will affect all PFs
      which are probed subsequent to the change (by disabling DMFS for
      those PFs).
      
      Fixes: 3c439b55 ("mlx4_core: Allow choosing flow steering mode")
      Signed-off-by: Majd Dibbiny <majd@mellanox.com>
      Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      95f1ba9a
    • E
      net/mlx4: Spoofcheck and zero MAC can't coexist · 745d8ae4
      Eugenia Emantayev committed
      Spoofcheck can't be enabled if VF MAC is zero.
      Vice versa, can't zero MAC if spoofcheck is on.
      
      Fixes: 8f7ba3ca ('net/mlx4: Add set VF mac address support')
      Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      745d8ae4
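The mutual exclusion stated in the spoofcheck commit above can be sketched as a pair of guarded setters. This Python model is only an illustration: the class, its method names, and the choice of PermissionError as the failure signal are assumptions, not the mlx4 implementation.

```python
class VfAdminState:
    """Toy model of a VF's admin settings; not the mlx4 data structure."""

    def __init__(self) -> None:
        self.mac = 0          # zero MAC means "no MAC assigned"
        self.spoofchk = False

    def set_spoofchk(self, enable: bool) -> None:
        # Spoofcheck has nothing to check against while the MAC is zero.
        if enable and self.mac == 0:
            raise PermissionError("cannot enable spoofcheck while VF MAC is zero")
        self.spoofchk = enable

    def set_mac(self, mac: int) -> None:
        # Zeroing the MAC would leave an enabled spoofcheck without a reference.
        if mac == 0 and self.spoofchk:
            raise PermissionError("cannot zero VF MAC while spoofcheck is on")
        self.mac = mac
```

Both setters check the other field before changing their own, so the invalid combination (spoofcheck on, MAC zero) cannot be reached from either direction.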
    • O
      net/mlx4: Change ENOTSUPP to EOPNOTSUPP · 423b3aec
      Or Gerlitz committed
      As ENOTSUPP is specific to NFS, change the return error value to
      EOPNOTSUPP in various places in the mlx4 driver.
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Suggested-by: Yotam Gigi <yotamg@mellanox.com>
      Reviewed-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      423b3aec
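One way to see why the ENOTSUPP change above matters to userspace is to look the two codes up from a program. The Python sketch below assumes the kernel-internal value of ENOTSUPP is 524; the helper name and the placeholder string are hypothetical.

```python
import errno

# Kernel-internal ENOTSUPP value (assumed to be 524); it has no name in
# the userspace errno tables.
KERNEL_ENOTSUPP = 524


def describe(code: int) -> str:
    """Return 'code: NAME', or a placeholder when userspace has no name for it."""
    name = errno.errorcode.get(code, "<no userspace name>")
    return f"{code}: {name}"


if __name__ == "__main__":
    print(describe(errno.EOPNOTSUPP))   # a code userspace can name
    print(describe(KERNEL_ENOTSUPP))    # a code userspace cannot name
```

A caller that receives the second code cannot map it to a symbolic name or a meaningful message, which is the practical argument for returning EOPNOTSUPP instead.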
    • D
      uapi: fix linux/rds.h userspace compilation errors · c12f4d76
      Dmitry V. Levin committed
      Consistently use types from linux/types.h to fix the following
      linux/rds.h userspace compilation errors:
      
      /usr/include/linux/rds.h:198:2: error: unknown type name 'u8'
        u8 rx_traces;
      /usr/include/linux/rds.h:199:2: error: unknown type name 'u8'
        u8 rx_trace_pos[RDS_MSG_RX_DGRAM_TRACE_MAX];
      /usr/include/linux/rds.h:203:2: error: unknown type name 'u8'
        u8 rx_traces;
      /usr/include/linux/rds.h:204:2: error: unknown type name 'u8'
        u8 rx_trace_pos[RDS_MSG_RX_DGRAM_TRACE_MAX];
      /usr/include/linux/rds.h:205:2: error: unknown type name 'u64'
        u64 rx_trace[RDS_MSG_RX_DGRAM_TRACE_MAX];
      
      Fixes: 3289025a ("RDS: add receive message trace used by application")
      Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c12f4d76
    • D
      uapi: fix linux/seg6.h and linux/seg6_iptunnel.h userspace compilation errors · ea3ebc73
      Dmitry V. Levin committed
      Include <linux/in6.h> in uapi/linux/seg6.h to fix the following
      linux/seg6.h userspace compilation error:
      
      /usr/include/linux/seg6.h:31:18: error: array type has incomplete element type 'struct in6_addr'
        struct in6_addr segments[0];
      
      Include <linux/seg6.h> in uapi/linux/seg6_iptunnel.h to fix
      the following linux/seg6_iptunnel.h userspace compilation error:
      
      /usr/include/linux/seg6_iptunnel.h:26:21: error: array type has incomplete element type 'struct ipv6_sr_hdr'
        struct ipv6_sr_hdr srh[0];
      
      Fixes: a50a05f4 ("ipv6: sr: add missing Kbuild export for header files")
      Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ea3ebc73
    • J
      lib: Remove string from parman config selection · 50ab3af1
      Jiri Pirko committed
      As reported by Geert, remove the string so the user does not see this
      config option. The option is explicitly selected only as a dependency of
      in-kernel users.
      Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Fixes: 44091d29 ("lib: Introduce priority array area manager")
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      50ab3af1
    • Z
      forcedeth: Remove return from a void function · ca92aea9
      Zhu Yanjun committed
      A void function does not need a trailing return statement.
      Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ca92aea9
    • C
      bpf: fix spelling mistake: "proccessed" -> "processed" · bc1750f3
      Colin Ian King committed
      Trivial fix to a spelling mistake in a verbose log message.
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bc1750f3
    • D
      uapi: fix linux/llc.h userspace compilation error · 40df93be
      Dmitry V. Levin committed
      Include <linux/if.h> to fix the following linux/llc.h userspace
      compilation error:
      
      /usr/include/linux/llc.h:26:27: error: 'IFHWADDRLEN' undeclared here (not in a function)
        unsigned char   sllc_mac[IFHWADDRLEN];
      Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      40df93be
    • D
      uapi: fix linux/ip6_tunnel.h userspace compilation errors · 557d7acd
      Dmitry V. Levin committed
      Include <linux/if.h> and <linux/in6.h> to fix the following
      linux/ip6_tunnel.h userspace compilation errors:
      
      /usr/include/linux/ip6_tunnel.h:23:12: error: 'IFNAMSIZ' undeclared here (not in a function)
        char name[IFNAMSIZ]; /* name of tunnel device */
      /usr/include/linux/ip6_tunnel.h:30:18: error: field 'laddr' has incomplete type
        struct in6_addr laddr; /* local tunnel end-point address */
      Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      557d7acd
    • D
      Merge branch 'mlx5-fixes' · 79873fb6
      David S. Miller committed
      Saeed Mahameed says:
      
      ====================
      Mellanox mlx5e fixes for 4.11-rc1
      
      This series includes some important bug fixes for mlx5e driver.
      
      Three misc fixes:
      From Mohamad, a compilation fix on s390 systems.
      From me, a fix for driver unload when switchdev mode is on.
      From Tariq, a HW LRO frag size optimization for when build_skb is not used
      (striding RQ mode).

      Three CQE compression related fixes:
      Two fixes from Tariq and me, to correctly set up CQE compression
      parameters on driver load and on arbitrary user modifications.
      The last patch fixes a very critical issue that was originally reported
      by Tom, where the driver reported csum errors or even page ref issues
      when CQE compression is enabled and rapidly active.
      
      For your convenience this series was generated on top of net-next branch:
      005c3490 ('Revert "ath10k: Search SMBIOS for OEM board file extension"')
      
      for -stable:
      net/mlx5e: Register/unregister vport representors on interface (for kernel >= 4.9)
      net/mlx5e: Do not reduce LRO WQE size when not using build_skb (for kernel >= 4.9)
      net/mlx5e: Fix broken CQE compression initialization (for kernel >= 4.9)
      net/mlx5e: Update MPWQE stride size when modifying CQE compress state (for kernel >= 4.7)
      net/mlx5e: Fix wrong CQE decompression (for kernel >= 4.7)
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      79873fb6
    • T
      net/mlx5e: Fix wrong CQE decompression · 36154be4
      Tariq Toukan committed
      In cqe compression with striding RQ, the decompression of the CQE field
      wqe_counter was done with a wrong wraparound value.
      This caused handling cqes with a wrong pointer to wqe (rx descriptor)
      and creating SKBs with wrong data, pointing to wrong (and already consumed)
      strides/pages.
      
      In striding RQ, the CQE field wqe_counter holds the stride index
      instead of the WQE index. Hence, when decompressing a CQE,
      wqe_counter should have wrapped around at the number of strides
      in a single multi-packet WQE.

      We dropped this wrap-around mask altogether in CQE decompression of
      striding RQ. It is not needed, as in such cases the CQE compression
      session would break because of a different value of the wqe_id field,
      starting a new compression session.
      
      Tested:
       ethtool -K ethxx lro off/on
       ethtool --set-priv-flags ethxx rx_cqe_compress on
       super_netperf 16 {ipv4,ipv6} -t TCP_STREAM -m 50 -D
       verified no csum errors and no page refcount issues.
      
      Fixes: 7219ab34 ("net/mlx5e: CQE compression")
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Reported-by: Tom Herbert <tom@herbertland.com>
      Cc: kernel-team@fb.com
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      36154be4
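The effect of a wrong wrap-around value, as described in the CQE decompression commit above, can be shown with a few lines of arithmetic. This Python sketch is a toy model under stated assumptions: it assumes the wrong value was a mask sized for the number of WQEs in the RQ, and the function name and all numbers are hypothetical.

```python
from typing import Optional


def next_counter(counter: int, wrap_mask: Optional[int]) -> int:
    """Advance a decompressed wqe_counter by one, optionally applying a wrap mask."""
    nxt = counter + 1
    # wrap_mask=None models the fixed behaviour: no mask at all.
    return nxt if wrap_mask is None else nxt & wrap_mask


if __name__ == "__main__":
    rq_wqe_mask = 16 - 1   # hypothetical RQ with 16 WQEs
    stride_index = 17      # hypothetical stride index inside one multi-packet WQE

    # Masking a stride index with a WQE-count mask folds it back to a low
    # value, so the driver would look at the wrong stride.
    print(next_counter(stride_index, rq_wqe_mask))
    # Without the mask the stride index simply advances.
    print(next_counter(stride_index, None))
```

With these numbers the masked result lands on a stride that was already consumed, which matches the symptom the commit message reports.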
    • S
      net/mlx5e: Update MPWQE stride size when modifying CQE compress state · 6dc4b54e
      Saeed Mahameed committed
      When the admin enables/disables cqe compression, updating
      mpwqe stride size is required:
          CQE compress ON  ==> stride size = 256B
          CQE compress OFF ==> stride size = 64B
      
      This is already done on driver load via mlx5e_set_rq_type_params, all we
      need is just to call it on arbitrary admin changes of cqe compression
      state via priv flags or when changing timestamping state
      (as it is mutually exclusive with cqe compression).
      
      This bug introduces no functional damage, it only makes cqe compression
      occur less often, since in ConnectX4-LX CQE compression is performed
      only on packets smaller than stride size.
      
      Tested:
       ethtool --set-priv-flags ethxx rx_cqe_compress on
       pktgen with  64 < pkt size < 256 and netperf TCP_STREAM (IPv4/IPv6)
       verify `ethtool -S ethxx | grep compress` are advancing more often
       (rapidly)
      
      Fixes: 7219ab34 ("net/mlx5e: CQE compression")
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Cc: kernel-team@fb.com
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6dc4b54e
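The rule in the MPWQE stride size commit above (256B with CQE compression on, 64B with it off, recomputed on every state change) can be sketched as follows. This is a Python toy model: the class and method names are hypothetical stand-ins, and only the two stride sizes come from the commit message.

```python
STRIDE_SIZE_COMPRESS_ON = 256   # bytes, from the commit message
STRIDE_SIZE_COMPRESS_OFF = 64   # bytes, from the commit message


class RqParams:
    """Toy model of RQ parameters that depend on the CQE compression state."""

    def __init__(self, cqe_compress: bool) -> None:
        self.cqe_compress = cqe_compress
        self.stride_size = 0
        self.set_rq_type_params()   # derived once at load time

    def set_rq_type_params(self) -> None:
        # Derive the stride size from the current compression state.
        self.stride_size = (
            STRIDE_SIZE_COMPRESS_ON if self.cqe_compress else STRIDE_SIZE_COMPRESS_OFF
        )

    def modify_cqe_compress(self, enable: bool) -> None:
        self.cqe_compress = enable
        # Re-derive on every admin change; skipping this call leaves a stale
        # stride size, which is the behaviour the commit describes fixing.
        self.set_rq_type_params()
```

Keeping the derivation in one function and calling it from both the load path and the modify path means the two values cannot drift apart.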
    • T
      net/mlx5e: Fix broken CQE compression initialization · b0d4660b
      Tariq Toukan committed
      Some of the RQ type parameters are derived from the CQE compression state
      flag, but that flag was initialized only after the RQ type parameters
      were set up. This loads the RQ with a stride size smaller than what we
      want when CQE compression is on.
      
      This bug introduces no functional damage, it only makes CQE compression
      occur less often, since in ConnectX4-LX CQE compression is performed
      only on packets smaller than stride size.
      
      Fix this by marking default status of CQE compression in PFLAG prior to
      calling mlx5e_set_rq_priv_params(), as it inits some fields based on it.
      
      Tested:
       load driver on systems where rx CQE compress will be on (MH)
       pktgen with  64 < pkt size < 256 and netperf TCP_STREAM (IPv4/IPv6)
       verify `ethtool -S ethxx | grep compress` are advancing more often
       (rapidly)
      
      Fixes: 2fc4bfb7 ("net/mlx5e: Dynamic RQ type infrastructure")
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Cc: kernel-team@fb.com
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b0d4660b
    • T
      net/mlx5e: Do not reduce LRO WQE size when not using build_skb · 4078e637
      Tariq Toukan committed
      When rq_type is Striding RQ, no room for SKB_RESERVE is needed
      as SKB allocation is not done via build_skb.
      
      Fixes: e4b85508 ("net/mlx5e: Slightly reduce hardware LRO size")
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4078e637
    • S
      net/mlx5e: Register/unregister vport representors on interface attach/detach · 6f08a22c
      Saeed Mahameed committed
      Currently vport representors are added only on driver load and removed on
      driver unload.  Apparently we forgot to handle them when we added the
      seamless reset flow feature.  This left the representor netdevs alive
      and active with open HW resources on pci shutdown and on error reset
      flows.
      
      To overcome this we move their handling to interface attach/detach, so
      they would be cleaned up on shutdown and recreated on reset flows.
      
      Fixes: 26e59d80 ("net/mlx5e: Implement mlx5e interface attach/detach callbacks")
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6f08a22c
    • M
      net/mlx5e: s390 system compilation fix · 18bcf742
      Mohamad Haj Yahia committed
      Add the header includes needed for s390 arch compilation.
      
      Fixes: e586b3b0 ("net/mlx5: Ethernet Datapath files")
      Fixes: d605d668 ("net/mlx5e: Add support for ethtool self..")
      Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      18bcf742
    • A
      tcp: account for ts offset only if tsecr not zero · eee2faab
      Alexey Kodanev committed
      We can get SYN with zero tsecr, don't apply offset in this case.
      
      Fixes: ee684b6f ("tcp: send packets with a socket timestamp")
      Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      eee2faab
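The guard described in the tsecr commit above amounts to one conditional. The Python sketch below is an illustration only: the function name is hypothetical, and it assumes the offset is removed from the echoed value by 32-bit modular subtraction.

```python
U32_MASK = 0xFFFFFFFF


def adjust_rcv_tsecr(rcv_tsecr: int, tsoffset: int) -> int:
    """Strip the per-connection offset from an echoed timestamp (toy model)."""
    if rcv_tsecr == 0:
        # A zero tsecr (for example on a SYN) echoes nothing of ours,
        # so there is no offset to remove.
        return 0
    # Assumed behaviour: 32-bit wrap-around subtraction of the offset.
    return (rcv_tsecr - tsoffset) & U32_MASK
```

Without the zero check, a SYN carrying tsecr 0 would be turned into a large bogus echo value by the subtraction.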
    • A
      tcp: setup timestamp offset when write_seq already set · 00355fa5
      Alexey Kodanev committed
      Found that when randomized tcp offsets are enabled (by default)
      TCP client can still start new connections without them. Later,
      if server does active close and re-uses sockets in TIME-WAIT
      state, new SYN from client can be rejected on PAWS check inside
      tcp_timewait_state_process(), because either tw_ts_recent or
      rcv_tsval doesn't really have an offset set.
      
      Here is how to reproduce it with LTP netstress tool:
          netstress -R 1 &
          netstress -H 127.0.0.1 -lr 1000000 -a1
      
          [...]
          < S  seq 1956977072 win 43690 TS val 295618 ecr 459956970
          > .  ack 1956911535 win 342 TS val 459967184 ecr 1547117608
          < R  seq 1956911535 win 0 length 0
      +1. < S  seq 1956977072 win 43690 TS val 296640 ecr 459956970
          > S. seq 657450664 ack 1956977073 win 43690 TS val 459968205 ecr 296640
      
      Fixes: 95a22cae ("tcp: randomize tcp timestamp offsets for each connection")
      Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      00355fa5
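The PAWS rejection mentioned in the timestamp offset commit above comes down to a signed 32-bit comparison between the remembered timestamp and the one on the new segment. The Python sketch below models only that comparison. The helper names are hypothetical, the window of 1 is assumed to mirror the kernel's TCP_PAWS_WINDOW, the kernel's additional conditions are omitted, and the timestamp values are made up for illustration.

```python
PAWS_WINDOW = 1  # assumed to mirror TCP_PAWS_WINDOW


def s32(value: int) -> int:
    """Interpret the low 32 bits of value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def paws_reject(ts_recent: int, rcv_tsval: int) -> bool:
    """Return True when rcv_tsval looks older than ts_recent (simplified check)."""
    return s32(ts_recent - rcv_tsval) > PAWS_WINDOW


if __name__ == "__main__":
    ts_recent = 459956970   # remembered value that includes an offset
    # New SYN sent without the offset: looks far in the past, so it is rejected.
    print(paws_reject(ts_recent, 296640))
    # New SYN sent with the same offset applied: slightly newer, so it passes.
    print(paws_reject(ts_recent, 459968205))
```

When one side of the comparison carries the random offset and the other does not, the difference is large and positive, and the new SYN is treated as an old duplicate.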
    • A
      net/dccp: fix use after free in tw_timer_handler() · ec7cb62d
      Andrey Ryabinin committed
      DCCP doesn't purge timewait sockets on network namespace shutdown.
      So, after the net namespace is destroyed we could still have an active
      timer which will trigger a use after free in tw_timer_handler():
      
          BUG: KASAN: use-after-free in tw_timer_handler+0x4a/0xa0 at addr ffff88010e0d1e10
          Read of size 8 by task swapper/1/0
          Call Trace:
           __asan_load8+0x54/0x90
           tw_timer_handler+0x4a/0xa0
           call_timer_fn+0x127/0x480
           expire_timers+0x1db/0x2e0
           run_timer_softirq+0x12f/0x2a0
           __do_softirq+0x105/0x5b4
           irq_exit+0xdd/0xf0
           smp_apic_timer_interrupt+0x57/0x70
           apic_timer_interrupt+0x90/0xa0
      
          Object at ffff88010e0d1bc0, in cache net_namespace size: 6848
          Allocated:
           save_stack_trace+0x1b/0x20
           kasan_kmalloc+0xee/0x180
           kasan_slab_alloc+0x12/0x20
           kmem_cache_alloc+0x134/0x310
           copy_net_ns+0x8d/0x280
           create_new_namespaces+0x23f/0x340
           unshare_nsproxy_namespaces+0x75/0xf0
           SyS_unshare+0x299/0x4f0
           entry_SYSCALL_64_fastpath+0x18/0xad
          Freed:
           save_stack_trace+0x1b/0x20
           kasan_slab_free+0xae/0x180
           kmem_cache_free+0xb4/0x350
           net_drop_ns+0x3f/0x50
           cleanup_net+0x3df/0x450
           process_one_work+0x419/0xbb0
           worker_thread+0x92/0x850
           kthread+0x192/0x1e0
           ret_from_fork+0x2e/0x40
      
      Add .exit_batch hook to dccp_v4_ops()/dccp_v6_ops() which will purge
      timewait sockets on net namespace destruction and prevent the above issue.
      
      Fixes: f2bf415c ("mib: add net to NET_ADD_STATS_BH")
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ec7cb62d
    • D
      uapi: fix linux/if.h userspace compilation errors · 2618be7d
      Dmitry V. Levin committed
      Include <sys/socket.h> (guarded by ifndef __KERNEL__) to fix
      the following linux/if.h userspace compilation errors:
      
      /usr/include/linux/if.h:234:19: error: field 'ifru_addr' has incomplete type
         struct sockaddr ifru_addr;
      /usr/include/linux/if.h:235:19: error: field 'ifru_dstaddr' has incomplete type
         struct sockaddr ifru_dstaddr;
      /usr/include/linux/if.h:236:19: error: field 'ifru_broadaddr' has incomplete type
         struct sockaddr ifru_broadaddr;
      /usr/include/linux/if.h:237:19: error: field 'ifru_netmask' has incomplete type
         struct sockaddr ifru_netmask;
      /usr/include/linux/if.h:238:20: error: field 'ifru_hwaddr' has incomplete type
         struct  sockaddr ifru_hwaddr;
      
      This also fixes userspace compilation of the following uapi headers:
        linux/atmbr2684.h
        linux/gsmmux.h
        linux/if_arp.h
        linux/if_bonding.h
        linux/if_frad.h
        linux/if_pppox.h
        linux/if_tunnel.h
        linux/netdevice.h
        linux/route.h
        linux/wireless.h
      
      As no uapi header provides a definition of struct sockaddr, inclusion
      of <sys/socket.h> seems to be the most conservative and the only safe
      fix available.
      
      All current users of <linux/if.h> are very likely to be including
      <sys/socket.h> already because the latter is the sole provider
      of struct sockaddr definition in libc, so adding a uapi header
      with a definition of struct sockaddr would create a potential
      conflict with <sys/socket.h>.
      
      Replacing struct sockaddr in the definition of struct ifreq with
      a different type would create a potential incompatibility with current
      users of struct ifreq who might rely on ifru_addr et al members being
      of type struct sockaddr.
      Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2618be7d
    • R
      l2tp: Avoid schedule while atomic in exit_net · 12d656af
      Ridge Kennedy committed
      While destroying a network namespace that contains an L2TP tunnel, a
      "BUG: scheduling while atomic" can be observed.
      
      Enabling lockdep shows that this is happening because l2tp_exit_net()
      is calling l2tp_tunnel_closeall() (via l2tp_tunnel_delete()) from
      within an RCU critical section.
      
      l2tp_exit_net() takes rcu_read_lock_bh()
        << list_for_each_entry_rcu() >>
        l2tp_tunnel_delete()
          l2tp_tunnel_closeall()
            __l2tp_session_unhash()
              synchronize_rcu() << Illegal inside RCU critical section >>
      
      BUG: sleeping function called from invalid context
      in_atomic(): 1, irqs_disabled(): 0, pid: 86, name: kworker/u16:2
      INFO: lockdep is turned off.
      CPU: 2 PID: 86 Comm: kworker/u16:2 Tainted: G        W  O    4.4.6-at1 #2
      Hardware name: Xen HVM domU, BIOS 4.6.1-xs125300 05/09/2016
      Workqueue: netns cleanup_net
       0000000000000000 ffff880202417b90 ffffffff812b0013 ffff880202410ac0
       ffffffff81870de8 ffff880202417bb8 ffffffff8107aee8 ffffffff81870de8
       0000000000000c51 0000000000000000 ffff880202417be0 ffffffff8107b024
      Call Trace:
       [<ffffffff812b0013>] dump_stack+0x85/0xc2
       [<ffffffff8107aee8>] ___might_sleep+0x148/0x240
       [<ffffffff8107b024>] __might_sleep+0x44/0x80
       [<ffffffff810b21bd>] synchronize_sched+0x2d/0xe0
       [<ffffffff8109be6d>] ? trace_hardirqs_on+0xd/0x10
       [<ffffffff8105c7bb>] ? __local_bh_enable_ip+0x6b/0xc0
       [<ffffffff816a1b00>] ? _raw_spin_unlock_bh+0x30/0x40
       [<ffffffff81667482>] __l2tp_session_unhash+0x172/0x220
       [<ffffffff81667397>] ? __l2tp_session_unhash+0x87/0x220
       [<ffffffff8166888b>] l2tp_tunnel_closeall+0x9b/0x140
       [<ffffffff81668c74>] l2tp_tunnel_delete+0x14/0x60
       [<ffffffff81668dd0>] l2tp_exit_net+0x110/0x270
       [<ffffffff81668d5c>] ? l2tp_exit_net+0x9c/0x270
       [<ffffffff815001c3>] ops_exit_list.isra.6+0x33/0x60
       [<ffffffff81501166>] cleanup_net+0x1b6/0x280
       ...
      
      This bug can easily be reproduced with a few steps:
      
       $ sudo unshare -n bash  # Create a shell in a new namespace
       # ip link set lo up
       # ip addr add 127.0.0.1 dev lo
       # ip l2tp add tunnel remote 127.0.0.1 local 127.0.0.1 tunnel_id 1 \
          peer_tunnel_id 1 udp_sport 50000 udp_dport 50000
       # ip l2tp add session name foo tunnel_id 1 session_id 1 \
          peer_session_id 1
       # ip link set foo up
       # exit  # Exit the shell, in turn exiting the namespace
       $ dmesg
       ...
       [942121.089216] BUG: scheduling while atomic: kworker/u16:3/13872/0x00000200
       ...
      
      To fix this, move the call to l2tp_tunnel_closeall() out of the RCU
      critical section, and instead call it from l2tp_tunnel_del_work(), which
      is running from the l2tp_wq workqueue.
      
      Fixes: 2b551c6e ("l2tp: close sessions before initiating tunnel delete")
      Signed-off-by: Ridge Kennedy <ridge.kennedy@alliedtelesis.co.nz>
      Acked-by: Guillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      12d656af
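The shape of the l2tp fix above, moving work that may sleep out of a non-sleepable section and into a worker, can be sketched generically. This Python model is only an analogy: the lock stands in for the RCU read-side section, the queue stands in for the workqueue, and every name in it is hypothetical.

```python
import queue
import threading
from typing import List


class Tunnel:
    """Toy tunnel whose teardown stands in for work that may sleep."""

    def __init__(self, tunnel_id: int) -> None:
        self.tunnel_id = tunnel_id
        self.closed = False

    def close_all_sessions(self) -> None:
        self.closed = True


def exit_net(tunnels: List[Tunnel], section: threading.Lock, work_q: queue.Queue) -> None:
    # Inside the protected section only queue the teardown; do not run it here.
    with section:
        for tunnel in tunnels:
            work_q.put(tunnel)


def delete_worker(work_q: queue.Queue) -> None:
    # Runs outside the protected section, so sleeping work is allowed.
    while True:
        tunnel = work_q.get()
        if tunnel is None:   # sentinel: no more work
            break
        tunnel.close_all_sessions()
```

The protected loop stays short and non-blocking, and the teardown runs later in a context where waiting is legal.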
    • B
      qlogic: netxen: constify bin_attribute structures · ff292458
      Bhumika Goyal committed
      Declare bin_attribute structures as const as they are only passed as
      arguments to the functions device_remove_bin_file and
      device_create_bin_file. These function arguments are of type const, so
      bin_attribute structures having this property can be made const too.
      Done using Coccinelle:
      
      @r1 disable optional_qualifier @
      identifier i;
      position p;
      @@
      static struct bin_attribute i@p = {...};
      
      @ok1@
      identifier r1.i;
      position p,p1;
      @@
      (
      device_remove_bin_file(...,&i@p)
      |
      device_create_bin_file(..., &i@p1)
      )
      
      @bad@
      position p!={r1.p,ok1.p,ok1.p1};
      identifier r1.i;
      @@
      i@p
      
      @depends on !bad disable optional_qualifier@
      identifier r1.i;
      @@
      +const
      struct bin_attribute i;
      Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ff292458
    • B
      qlogic: qlcnic_sysfs: constify bin_attribute structures · 0ccea221
      Bhumika Goyal committed
      Declare bin_attribute structures as const as they are only passed as
      arguments to the functions device_remove_bin_file and
      device_create_bin_file. These function arguments are of type const, so
      bin_attribute structures having this property can be made const too.
      Done using Coccinelle:
      
      @r1 disable optional_qualifier @
      identifier i;
      position p;
      @@
      static struct bin_attribute i@p = {...};
      
      @ok1@
      identifier r1.i;
      position p,p1;
      @@
      (
      device_remove_bin_file(...,&i@p)
      |
      device_create_bin_file(..., &i@p1)
      )
      
      @bad@
      position p!={r1.p,ok1.p,ok1.p1};
      identifier r1.i;
      @@
      i@p
      
      @depends on !bad disable optional_qualifier@
      identifier r1.i;
      @@
      +const
      struct bin_attribute i;
      Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0ccea221
    • C
      net: emac: add support for device-tree based PHY discovery and setup · a577ca6b
      Christian Lamparter committed
      This patch adds glue-code that allows the EMAC driver to interface
      with the existing dt-supported PHYs in drivers/net/phy.
      
      Currently, the emac driver maintains a small library of supported
      PHYs in a private phy.c file located in the driver's directory.

      The support is limited to mostly single Ethernet transceivers such as the
      CIS8201, BCM5248, ET1011C, Marvell 88E1111 and 88E1112, AR8035.
      
      However, routers like the Netgear WNDR4700 and Cisco Meraki MX60(W)
      have a 5-port switch (AR8327N) attached to the EMAC. The switch chip
      is supported by the qca8k mdio driver, which uses the generic phy
      library. Another reason is that PHYLIB also supports the BCM54610,
      which was used for the Western Digital My Book Live.
      
      This will now also make EMAC select PHYLIB.
      Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a577ca6b
    • L
      Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · ca78d317
      Linus Torvalds committed
      Pull arm64 updates from Will Deacon:
       - Errata workarounds for Qualcomm's Falkor CPU
       - Qualcomm L2 Cache PMU driver
       - Qualcomm SMCCC firmware quirk
       - Support for DEBUG_VIRTUAL
       - CPU feature detection for userspace via MRS emulation
       - Preliminary work for the Statistical Profiling Extension
       - Misc cleanups and non-critical fixes
      
      * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (74 commits)
        arm64/kprobes: consistently handle MRS/MSR with XZR
        arm64: cpufeature: correctly handle MRS to XZR
        arm64: traps: correctly handle MRS/MSR with XZR
        arm64: ptrace: add XZR-safe regs accessors
        arm64: include asm/assembler.h in entry-ftrace.S
        arm64: fix warning about swapper_pg_dir overflow
        arm64: Work around Falkor erratum 1003
        arm64: head.S: Enable EL1 (host) access to SPE when entered at EL2
        arm64: arch_timer: document Hisilicon erratum 161010101
        arm64: use is_vmalloc_addr
        arm64: use linux/sizes.h for constants
        arm64: uaccess: consistently check object sizes
        perf: add qcom l2 cache perf events driver
        arm64: remove wrong CONFIG_PROC_SYSCTL ifdef
        ARM: smccc: Update HVC comment to describe new quirk parameter
        arm64: do not trace atomic operations
        ACPI/IORT: Fix the error return code in iort_add_smmu_platform_device()
        ACPI/IORT: Fix iort_node_get_id() mapping entries indexing
        arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA
        perf: xgene: Include module.h
        ...
      ca78d317
    • L
      Merge tag 'arc-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc · a4ee7bac
      Committed by Linus Torvalds
      Pull ARC updates from Vineet Gupta:
      
       - Intc improvements [Yuriy]
      
       - VDK platform updates [Alexey]
      
      * tag 'arc-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
        ARC: [plat-*] ARC_HAS_COH_CACHES no longer relevant
        ARCv2: intc: Delete useless comments in Device Trees
        ARCv2: IDU-intc: Delete deprecated parameters in Device Trees
        ARCv2: IDU-intc: mask all common interrupts by default
        ARCv2: IDU-intc: Use build registers for getting numbers of interrupts
        ARCv2: intc: Set default priority for all core interrupts
        ARCv2: intc: Use runtime value of irq count for setting up intc
        ARCv2: intc: Rework the build time irq count information
        ARC: [intc-*]: confine NR_CPU_IRQS to intc code
        ARCv2: intc: Use ARC_REG_STATUS32 for addressing STATUS32 reg
        arc: vdk: Add support of UIO
        arc: vdk: Add support of MMC controller
        arc: vdk: Disable halt on reset
      a4ee7bac
    • L
      Merge tag 'powerpc-4.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 38705613
      Committed by Linus Torvalds
      Pull powerpc updates from Michael Ellerman:
       "Highlights include:
      
         - Support for direct mapped LPC on POWER9, giving Linux direct access
           to devices that may be on there such as a UART.
      
         - Memory hotplug support for the Power9 Radix MMU.
      
         - Add new AUX vectors describing the processor's cache geometry, to
           be used by glibc.
      
         - The ability for a guest to ask the hypervisor to resize the guest's
           hash table, and in addition support for doing so automatically when
           memory is hotplugged into/out-of the guest. This allows the hash
           table to be sized based on the current memory usage of the guest,
           rather than the maximum possible memory usage.
      
         - Implementation of optprobes (kprobe optimisation) for powerpc.
      
        In addition there's the topic branch shared with the KVM tree, which
        includes support for guests to use the Radix MMU on Power9.
      
        Thanks to:
          Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anju T, Anton
          Blanchard, Benjamin Herrenschmidt, Chris Packham, Daniel Axtens,
          Daniel Borkmann, David Gibson, Finn Thain, Gautham R. Shenoy, Gavin
          Shan, Greg Kurz, Joel Stanley, John Allen, Madhavan Srinivasan,
          Mahesh Salgaonkar, Markus Elfring, Michael Neuling, Nathan Fontenot,
          Naveen N. Rao, Nicholas Piggin, Paul Mackerras, Ravi Bangoria, Reza
          Arbab, Shailendra Singh, Vaibhav Jain, Wei Yongjun"
      
      * tag 'powerpc-4.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (129 commits)
        powerpc/mm/radix: Skip ptesync in pte update helpers
        powerpc/mm/radix: Use ptep_get_and_clear_full when clearing pte for full mm
        powerpc/mm/radix: Update pte update sequence for pte clear case
        powerpc/mm: Update PROTFAULT handling in the page fault path
        powerpc/xmon: Fix data-breakpoint
        powerpc/mm: Fix build break with BOOK3S_64=n and MEMORY_HOTPLUG=y
        powerpc/mm: Fix build break when CMA=n && SPAPR_TCE_IOMMU=y
        powerpc/mm: Fix build break with RADIX=y & HUGETLBFS=n
        powerpc/pseries: Fix typo in parameter description
        powerpc/kprobes: Remove kprobe_exceptions_notify()
        kprobes: Introduce weak variant of kprobe_exceptions_notify()
        powerpc/ftrace: Fix confusing help text for DISABLE_MPROFILE_KERNEL
        powerpc/powernv: Fix opal_exit tracepoint opcode
        powerpc: Add a prototype for mcount() so it can be versioned
        powerpc: Drop GPL from of_node_to_nid() export to match other arches
        powerpc/kprobes: Optimize kprobe in kretprobe_trampoline()
        powerpc/kprobes: Implement Optprobes
        powerpc/kprobes: Fixes for kprobe_lookup_name() on BE
        powerpc: Add helper to check if offset is within relative branch range
        powerpc/bpf: Introduce __PPC_SH64()
        ...
      38705613
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · ff47d8c0
      Committed by Linus Torvalds
      Pull s390 updates from Martin Schwidefsky:
      
       - New entropy generation for the pseudo random number generator.
      
       - Early boot printk output via sclp to help debug crashes on boot. This
         needs to be enabled with a kernel parameter.
      
       - Add proper no-execute support with a bit in the page table entry.
      
       - Bug fixes and cleanups.
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (65 commits)
        s390/syscall: fix single stepped system calls
        s390/zcrypt: make ap_bus explicitly non-modular
        s390/zcrypt: Removed unneeded debug feature directory creation.
        s390: add missing "do {} while (0)" loop constructs to multiline macros
        s390/mm: add cond_resched call to kernel page table dumper
        s390: get rid of MACHINE_HAS_PFMF and MACHINE_HAS_HPAGE
        s390/mm: make memory_block_size_bytes available for !MEMORY_HOTPLUG
        s390: replace ACCESS_ONCE with READ_ONCE
        s390: Audit and remove any remaining unnecessary uses of module.h
        s390: mm: Audit and remove any unnecessary uses of module.h
        s390: kernel: Audit and remove any unnecessary uses of module.h
        s390/kdump: Use "LINUX" ELF note name instead of "CORE"
        s390: add no-execute support
        s390: report new vector facilities
        s390: use correct input data address for setup_randomness
        s390/sclp: get rid of common response code handling
        s390/sclp: don't add new lines to each printed string
        s390/sclp: make early sclp code readable
        s390/sclp: disable early sclp code as soon as the base sclp driver is active
        s390/sclp: move early printk code to drivers
        ...
      ff47d8c0
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next · 3051bf36
      Committed by Linus Torvalds
      Pull networking updates from David Miller:
       "Highlights:
      
         1) Support TX_RING in AF_PACKET TPACKET_V3 mode, from Sowmini
            Varadhan.
      
         2) Simplify classifier state on sk_buff in order to shrink it a bit.
            From Willem de Bruijn.
      
         3) Introduce SIPHASH and its usage for secure sequence numbers and
            syncookies. From Jason A. Donenfeld.
      
         4) Reduce CPU usage for ICMP replies we are going to limit or
            suppress, from Jesper Dangaard Brouer.
      
         5) Introduce Shared Memory Communications socket layer, from Ursula
            Braun.
      
         6) Add RACK loss detection and allow it to actually trigger fast
            recovery instead of just assisting after other algorithms have
            triggered it. From Yuchung Cheng.
      
         7) Add xmit_more and BQL support to mvneta driver, from Simon Guinot.
      
         8) skb_cow_data avoidance in esp4 and esp6, from Steffen Klassert.
      
         9) Export MPLS packet stats via netlink, from Robert Shearman.
      
        10) Significantly improve inet port bind conflict handling, especially
            when an application is restarted and changes its setting of
            reuseport. From Josef Bacik.
      
        11) Implement TX batching in vhost_net, from Jason Wang.
      
        12) Extend the dummy device so that VF (virtual function) features,
            such as configuration, can be more easily tested. From Phil
            Sutter.
      
        13) Avoid two atomic ops per page on x86 in bnx2x driver, from Eric
            Dumazet.
      
        14) Add new bpf MAP, implementing a longest prefix match trie. From
            Daniel Mack.
      
        15) Packet sample offloading support in mlxsw driver, from Yotam Gigi.
      
        16) Add new aquantia driver, from David VomLehn.
      
        17) Add bpf tracepoints, from Daniel Borkmann.
      
        18) Add support for port mirroring to b53 and bcm_sf2 drivers, from
            Florian Fainelli.
      
        19) Remove custom busy polling in many drivers; it has been done in
            the core networking code since 4.5. From Eric Dumazet.
      
        20) Support XDP adjust_head in virtio_net, from John Fastabend.
      
        21) Fix several major holes in neighbour entry confirmation, from
            Julian Anastasov.
      
        22) Add XDP support to bnxt_en driver, from Michael Chan.
      
        23) VXLAN offloads for enic driver, from Govindarajulu Varadarajan.
      
        24) Add IPVTAP driver (IP-VLAN based tap driver) from Sainath Grandhi.
      
        25) Support GRO in IPSEC protocols, from Steffen Klassert"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1764 commits)
        Revert "ath10k: Search SMBIOS for OEM board file extension"
        net: socket: fix recvmmsg not returning error from sock_error
        bnxt_en: use eth_hw_addr_random()
        bpf: fix unlocking of jited image when module ronx not set
        arch: add ARCH_HAS_SET_MEMORY config
        net: napi_watchdog() can use napi_schedule_irqoff()
        tcp: Revert "tcp: tcp_probe: use spin_lock_bh()"
        net/hsr: use eth_hw_addr_random()
        net: mvpp2: enable building on 64-bit platforms
        net: mvpp2: switch to build_skb() in the RX path
        net: mvpp2: simplify MVPP2_PRS_RI_* definitions
        net: mvpp2: fix indentation of MVPP2_EXT_GLOBAL_CTRL_DEFAULT
        net: mvpp2: remove unused register definitions
        net: mvpp2: simplify mvpp2_bm_bufs_add()
        net: mvpp2: drop useless fields in mvpp2_bm_pool and related code
        net: mvpp2: remove unused 'tx_skb' field of 'struct mvpp2_tx_queue'
        net: mvpp2: release reference to txq_cpu[] entry after unmapping
        net: mvpp2: handle too large value in mvpp2_rx_time_coal_set()
        net: mvpp2: handle too large value handling in mvpp2_rx_pkts_coal_set()
        net: mvpp2: remove useless arguments in mvpp2_rx_{pkts, time}_coal_set
        ...
      3051bf36
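
Item 14 of the networking highlights mentions a bpf map implementing a longest prefix match trie. As a rough illustration of the lookup idea only, here is a minimal userspace sketch of a binary trie over IPv4 addresses. It is not the kernel's implementation, and all names (`lpm_node`, `lpm_insert`, `lpm_lookup`, `lpm_free`) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical node type: one node per prefix bit, two children (bit 0 / bit 1). */
struct lpm_node {
    struct lpm_node *child[2];
    int has_value;   /* non-zero if a prefix ends at this node */
    int value;       /* payload stored for that prefix */
};

/* Allocates a zeroed node; returns NULL on allocation failure. */
struct lpm_node *lpm_new(void)
{
    return calloc(1, sizeof(struct lpm_node));
}

/* Stores value for addr/prefix_len. Returns 0 on success, -1 on allocation failure. */
int lpm_insert(struct lpm_node *root, uint32_t addr, unsigned prefix_len, int value)
{
    assert(root != NULL && prefix_len <= 32);
    struct lpm_node *n = root;
    for (unsigned i = 0; i < prefix_len; i++) {
        unsigned bit = (addr >> (31 - i)) & 1u;   /* walk from the most significant bit */
        if (!n->child[bit]) {
            n->child[bit] = lpm_new();
            if (!n->child[bit])
                return -1;
        }
        n = n->child[bit];
    }
    n->has_value = 1;
    n->value = value;
    return 0;
}

/* Returns the value of the longest stored prefix matching addr, or fallback if none. */
int lpm_lookup(const struct lpm_node *root, uint32_t addr, int fallback)
{
    int best = fallback;
    const struct lpm_node *n = root;
    for (unsigned i = 0; n; i++) {
        if (n->has_value)
            best = n->value;          /* remember the deepest match seen so far */
        if (i == 32)
            break;                    /* all address bits consumed */
        n = n->child[(addr >> (31 - i)) & 1u];
    }
    return best;
}

/* Frees the whole trie recursively. */
void lpm_free(struct lpm_node *n)
{
    if (!n)
        return;
    lpm_free(n->child[0]);
    lpm_free(n->child[1]);
    free(n);
}
```

With 10.0.0.0/8 and 10.1.0.0/16 inserted, a lookup of 10.1.2.3 walks past the /8 node and returns the /16 value, while 10.2.3.4 falls back to the /8 value. A one-bit-per-level trie keeps the sketch short; a production structure would typically compress paths.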
    • L
      Merge tag 'gcc-plugins-v4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 1e74a2eb
      Linus Torvalds 提交于
      Pull gcc-plugins updates from Kees Cook:
       "This includes infrastructure updates and the structleak plugin, which
        performs forced initialization of certain structures to avoid possible
        information exposures to userspace.
      
        Summary:
      
         - infrastructure updates (gcc-common.h)
      
         - introduce structleak plugin for forced initialization of some
           structures"
      
      * tag 'gcc-plugins-v4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        gcc-plugins: Add structleak for more stack initialization
        gcc-plugins: consolidate on PASS_INFO macro
        gcc-plugins: add PASS_INFO and build_const_char_string()
      1e74a2eb
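
The gcc-plugins pull above describes structleak as forcing initialization of certain structures to avoid information exposure to userspace. The sketch below illustrates the class of problem in plain userspace C: a structure with padding that is only partly assigned can carry stale bytes, and zeroing it first avoids that. It is an illustration of the idea, not the plugin's mechanism; `struct reply` and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reply structure; the compiler usually pads between tag and value. */
struct reply {
    char tag;
    long value;
};

/* Risky pattern: assigns the named fields only, leaving padding bytes as they were. */
void fill_reply_fields_only(struct reply *r, char tag, long value)
{
    assert(r != NULL);
    r->tag = tag;
    r->value = value;
}

/* Safer pattern: zero every byte first, then assign the fields. */
void fill_reply_zeroed(struct reply *r, char tag, long value)
{
    assert(r != NULL);
    memset(r, 0, sizeof(*r));
    r->tag = tag;
    r->value = value;
}

/* Counts non-zero bytes in the gap between tag and value (the padding). */
size_t count_nonzero_padding(const struct reply *r)
{
    const unsigned char *p = (const unsigned char *)r;
    size_t start = offsetof(struct reply, tag) + sizeof(r->tag);
    size_t end = offsetof(struct reply, value);
    size_t n = 0;

    for (size_t i = start; i < end; i++)
        if (p[i] != 0)
            n++;
    return n;
}
```

If such a structure were copied out byte-for-byte after `fill_reply_fields_only()`, whatever previously occupied the padding could go with it. The C standard leaves padding contents unspecified after member stores, so the zeroed variant relies on common compiler behaviour rather than a language guarantee.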