1. 03 Oct, 2016 17 commits
    • J
      fm10k: use generic ethtool_op_get_ts_info callback · bab02a69
      Jacob Keller committed
      This generic callback is for drivers which have software Tx timestamp
      support enabled. Without this, PTP applications requesting software
      timestamps may complain that the requested mode is not supported.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      bab02a69
    • A
      cxgb4: unexport cxgb4_dcb_enabled · 7c70c4f8
      Arnd Bergmann committed
      A recent cleanup marked cxgb4_dcb_enabled as 'static', which is correct, but this ignored
      how the symbol is also exported. In addition, the export can be compiled out when modules
      are disabled, causing a harmless compiler warning in configurations for which it is not
      used at all:
      
      drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:282:12: error: 'cxgb4_dcb_enabled' defined but not used [-Werror=unused-function]
      
      This removes the export and moves the function into the correct #ifdef so we only build
      it when there are users.
      
      Fixes: 50935857 ("cxgb4: mark symbols static where possible")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7c70c4f8
    • A
      net: rtnl: avoid uninitialized data in IFLA_VF_VLAN_LIST handling · fa34cd94
      Arnd Bergmann committed
      With the newly added support for IFLA_VF_VLAN_LIST netlink messages,
      we get a warning about potential uninitialized variable use in
      the parsing of the user input when enabling the -Wmaybe-uninitialized
      warning:
      
      net/core/rtnetlink.c: In function 'do_setvfinfo':
      net/core/rtnetlink.c:1756:9: error: 'ivvl$' may be used uninitialized in this function [-Werror=maybe-uninitialized]
      
      I have not been able to prove whether it is possible to arrive in
      this code with an empty IFLA_VF_VLAN_LIST block, but if we do,
      then ndo_set_vf_vlan gets called with uninitialized arguments.
      
      This adds an explicit check for an empty list, making it obvious
      to the reader and the compiler that this cannot happen.
      
      Fixes: 79aab093 ("net: Update API for VF vlan protocol 802.1ad support")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fa34cd94
    • P
      net: pktgen: fix pkt_size · 63d75463
      Paolo Abeni committed
      The commit 879c7220 ("net: pktgen: Observe needed_headroom
      of the device") increased the 'pkt_overhead' field value by
      LL_RESERVED_SPACE.
      As a side effect the generated packet size, computed as:
      
      	/* Eth + IPh + UDPh + mpls */
      	datalen = pkt_dev->cur_pkt_size - 14 - 20 - 8 -
      		  pkt_dev->pkt_overhead;
      
      is decreased by the same value.
      The above changed slightly the behavior of existing pktgen users,
      and made the procfs interface somewhat inconsistent.
      Fix it by restoring the previous pkt_overhead value and using
      LL_RESERVED_SPACE as extralen in skb allocation.
      Also, change pktgen_alloc_skb() to only partially reserve
      the headroom to allow the caller to prefetch from ll header
      start.
      
      v1 -> v2:
       - fixed some typos in the comments
      
      Fixes: 879c7220 ("net: pktgen: Observe needed_headroom of the device")
      Suggested-by: Ben Greear <greearb@candelatech.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      63d75463
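The side effect described in the pktgen commit above can be illustrated with a small standalone sketch. `pktgen_datalen()` is a hypothetical helper that mirrors only the quoted `datalen` expression; it is not the kernel function, and the 16 bytes of reserved link-layer space used in the note below is an assumed value for illustration.

```c
/*
 * Hypothetical helper: reproduces only the datalen arithmetic quoted in
 * the commit message. The constants 14, 20 and 8 are the Ethernet, IPv4
 * and UDP header sizes named in the original comment.
 */
static int pktgen_datalen(int cur_pkt_size, int pkt_overhead)
{
	/* Eth + IPh + UDPh + extra overhead (e.g. mpls) */
	return cur_pkt_size - 14 - 20 - 8 - pkt_overhead;
}
```

With a requested packet size of 1000 bytes and no extra overhead, the payload is 958 bytes. Folding an assumed 16 bytes of reserved headroom into `pkt_overhead` shrinks it to 942, which is the kind of change in generated packet size that the fix undoes.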
    • G
      net: fec: set mac address unconditionally · b82d44d7
      Gavin Schenk committed
      If the mac address origin is not dt, you can only safely assign a mac
      address after "link up" of the device. If the link is off the clocks are
      disabled and because of issues assigning registers when clocks are off the
      new mac address cannot be written in .ndo_set_mac_address() on some soc's.
      This fix sets the mac address unconditionally in fec_restart(...) and
      ensures consistency between fec registers and the network layer.
      Signed-off-by: Gavin Schenk <g.schenk@eckelmann.de>
      Acked-by: Fugang Duan <fugang.duan@nxp.com>
      Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Fixes: 9638d19e ("net: fec: add netif status check before set mac address")
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b82d44d7
    • B
      net: ethernet: mediatek: mark symbols static where possible · 3a82e78c
      Baoyou Xie committed
      We get 2 warnings when building kernel with W=1:
      drivers/net/ethernet/mediatek/mtk_eth_soc.c:2041:5: warning: no previous prototype for 'mtk_get_link_ksettings' [-Wmissing-prototypes]
      drivers/net/ethernet/mediatek/mtk_eth_soc.c:2052:5: warning: no previous prototype for 'mtk_set_link_ksettings' [-Wmissing-prototypes]
      
      In fact, these functions are only used in the file in which they are
      declared and don't need a declaration, but can be made static.
      So this patch marks these functions with 'static'.
      Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3a82e78c
    • B
      cxgb4: mark cxgb_setup_tc() static · 8efebd6e
      Baoyou Xie committed
      We get 1 warning when building kernel with W=1:
      drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:2715:5: warning: no previous prototype for 'cxgb_setup_tc' [-Wmissing-prototypes]
      
      In fact, this function is only used in the file in which it is
      declared and doesn't need a declaration, but can be made static.
      So this patch marks this function with 'static'.
      Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8efebd6e
    • M
      ipv6 addrconf: remove addrconf_sysctl_hop_limit() · cb9e684e
      Maciej Żenczykowski committed
      This is an effective no-op in terms of user observable behaviour.
      
      By preventing the overwrite of non-null extra1/extra2 fields
      in addrconf_sysctl() we can enable the use of proc_dointvec_minmax().
      
      This allows us to eliminate the constant min/max (1..255) trampoline
      function that is addrconf_sysctl_hop_limit().
      
      This is nice because it simplifies the code, and allows future
      sysctls with constant min/max limits to also not require trampolines.
      
      We still can't eliminate the trampoline for mtu because it isn't
      actually a constant (it depends on other tunables of the device)
      and thus requires at-write-time logic to enforce range.
      Signed-off-by: Maciej Żenczykowski <maze@google.com>
      Acked-by: Erik Kline <ek@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cb9e684e
    • S
      netfilter: bridge: clarify bridge/netfilter message · d4ef9f72
      Stefan Agner committed
      When using bridge without bridge netfilter enabled the message
      displayed is rather confusing and leads one to believe that a deprecated
      feature is in use. Use IS_MODULE to be explicit that the message only
      affects users which use bridge netfilter as module and reword the
      message.
      Signed-off-by: Stefan Agner <stefan@agner.ch>
      Acked-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d4ef9f72
    • D
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · b50afd20
      David S. Miller committed
      Three sets of overlapping changes.  Nothing serious.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b50afd20
    • L
      Linux 4.8 · c8d2bc9b
      Linus Torvalds committed
      c8d2bc9b
    • L
      Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm · f76d9c61
      Linus Torvalds committed
      Pull ARM fixes from Russell King:
       "Three relatively small fixes for ARM:
      
         - Roger noticed that dma_max_pfn() was calculating the upper limit
           wrongly, by adding the PFN offset of memory twice.
      
         - A fix from Robin to correct parsing of MPIDR values when the
           address size is larger than one BE32 unit.
      
         - A fix from Srinivas to ensure that we do not rely on the boot
           loader (or previous Linux kernel) setting the translation table
           base register a certain way in the decompressor, which can lead to
           crashes"
      
      * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
        ARM: 8618/1: decompressor: reset ttbcr fields to use TTBR0 on ARMv7
        ARM: 8617/1: dma: fix dma_max_pfn()
        ARM: 8616/1: dt: Respect property size when parsing CPUs
      f76d9c61
    • S
      ARM: 8618/1: decompressor: reset ttbcr fields to use TTBR0 on ARMv7 · 117e5e9c
      Srinivas Ramana committed
      If the bootloader uses the long descriptor format and jumps to
      kernel decompressor code, TTBCR may not be in a right state.
      Before enabling the MMU, it is required to clear the TTBCR.PD0
      field to use TTBR0 for translation table walks.
      
      The commit dbece458 ("ARM: 7501/1: decompressor:
      reset ttbcr for VMSA ARMv7 cores") does the reset of TTBCR.N, but
      doesn't consider all the bits for the size of TTBCR.N.
      
      Clear TTBCR.PD0 field and reset all the three bits of TTBCR.N to
      indicate the use of TTBR0 and the correct base address width.
      
      Fixes: dbece458 ("ARM: 7501/1: decompressor: reset ttbcr for VMSA ARMv7 cores")
      Acked-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Srinivas Ramana <sramana@codeaurora.org>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      117e5e9c
    • L
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · be67d60b
      Linus Torvalds committed
      Pull x86 fixes from Thomas Gleixner:
       "The last regression fixes for 4.8 final:
      
         - Two patches addressing the fallout of the CR4 optimizations which
           caused CR4-less machines to fail.
      
         - Fix the VDSO build on big endian machines
      
         - Take care of FPU initialization if no CPUID is available otherwise
           task struct size ends up being zero
      
         - Fix up context tracking in case load_gs_index fails"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/entry/64: Fix context tracking state warning when load_gs_index fails
        x86/boot: Initialize FPU and X86_FEATURE_ALWAYS even if we don't have CPUID
        x86/vdso: Fix building on big endian host
        x86/boot: Fix another __read_cr4() case on 486
        x86/init: Fix cr4_init_shadow() on CR4-less machines
      be67d60b
    • L
      Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus · 66188fb1
      Linus Torvalds committed
      Pull MIPS fixes from Ralf Baechle:
       "Another round of fixes:
      
         - CM: Fix mips_cm_max_vp_width for non-MT kernels on MT systems
         - CPS: Avoid BUG() when offlining pre-r6 CPUs
         - DEC: Avoid gas warnings due to suspicious instruction scheduling by
           manually expanding assembler macros.
         - FTLB: Fix configuration by moving configuration after probing
         - FTLB: clear execution hazard after changing FTLB enable
         - Highmem: Fix detection of unsupported highmem with cache aliases
         - I6400: Don't touch FTLBP chicken bits
         - microMIPS: Fix BUILD_ROLLBACK_PROLOGUE
         - Malta: Fix IOCU disable switch read for MIPS64
         - Octeon: Fix probing of devices attached to GPIO lines
         - uprobes: Misc small fixes"
      
      * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
        MIPS: CM: Fix mips_cm_max_vp_width for non-MT kernels on MT systems
        MIPS: Fix detection of unsupported highmem with cache aliases
        MIPS: Malta: Fix IOCU disable switch read for MIPS64
        MIPS: Fix BUILD_ROLLBACK_PROLOGUE for microMIPS
        MIPS: clear execution hazard after changing FTLB enable
        MIPS: Configure FTLB after probing TLB sizes from config4
        MIPS: Stop setting I6400 FTLBP
        MIPS: DEC: Avoid la pseudo-instruction in delay slots
        MIPS: Octeon: mark GPIO controller node not populated after IRQ init.
        MIPS: uprobes: fix use of uninitialised variable
        MIPS: uprobes: remove incorrect set_orig_insn
        MIPS: fix uretprobe implementation
        MIPS: smp-cps: Avoid BUG() when offlining pre-r6 CPUs
      66188fb1
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · 0c7fc30f
      Linus Torvalds committed
      Pull sparc fixes from David Miller:
      
       1) Fix section mismatches in some builds, from Paul Gortmaker.
      
       2) Need to count huge zero page mappings when doing TSB sizing, from
          Mike Kravetz.
      
       3) Fix handling of cpu_possible_mask when nr_cpus module option is
          specified, from Atish Patra.
      
       4) Don't allocate irq stacks until nr_irqs has been processed, also
          from Atish Patra.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
        sparc64: Fix non-SMP build.
        sparc64: Fix irq stack bootmem allocation.
        sparc64: Fix cpu_possible_mask if nr_cpus is set
        sparc64 mm: Fix more TSB sizing issues
        sparc64: fix section mismatch in find_numa_latencies_for_group
      0c7fc30f
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · bb6bbc7c
      Linus Torvalds committed
      Pull networking fixes from David Miller:
      
       1) Fix wrong TCP checksums on MTU probing when checksum offloading is
          disabled, from Douglas Caetano dos Santos.
      
       2) Fix qdisc backlog updates in qfq and sfb schedulers, from Cong Wang.
      
       3) Route lookup flow key protocol value is wrong in ip6gre_xmit_other(),
          fix from Lance Richardson.
      
       4) Scheduling while atomic in multicast routing code of ipv4 and ipv6,
          fix from Nikolay Aleksandrov.
      
       5) Fix packet alignment in fec driver, from Eric Nelson.
      
       6) Fix perf regression in sctp due to struct layout and cache misses,
          from Xin Long.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
        sctp: fix the issue sctp_diag uses lock_sock in rcu_read_lock
        sctp: change to check peer prsctp_capable when using prsctp polices
        sctp: remove prsctp_param from sctp_chunk
        sctp: move sent_count to the memory hole in sctp_chunk
        tg3: Avoid NULL pointer dereference in tg3_io_error_detected()
        act_ife: Fix false encoding
        act_ife: Fix external mac header on encode
        VSOCK: Don't dec ack backlog twice for rejected connections
        Revert "net: ethernet: bcmgenet: use phydev from struct net_device"
        net: fec: align IP header in hardware
        net: fec: remove QUIRK_HAS_RACC from i.mx27
        net: fec: remove QUIRK_HAS_RACC from i.mx25
        ipmr, ip6mr: fix scheduling while atomic and a deadlock with ipmr_get_route
        ip6_gre: fix flowi6_proto value in ip6gre_xmit_other()
        tcp: fix a compile error in DBGUNDO()
        tcp: fix wrong checksum calculation on MTU probing
        sch_sfb: keep backlog updated with qlen
        sch_qfq: keep backlog updated with qlen
        can: dev: fix deadlock reported after bus-off
      bb6bbc7c
  2. 02 Oct, 2016 1 commit
    • P
      MIPS: CM: Fix mips_cm_max_vp_width for non-MT kernels on MT systems · 6605d156
      Paul Burton committed
      When discovering the number of VPEs per core, smp_num_siblings will be
      incorrect for kernels built without support for the MIPS MultiThreading
      (MT) ASE running on systems which implement said ASE. This leads to
      accesses to VPEs in secondary cores being performed incorrectly since
      mips_cm_vp_id calculates the wrong ID to write to the local "other"
      registers. Fix this by examining the number of VPEs in the core as
      reported by the CM.
      
      This patch presumes that the number of VPEs will be the same in each
      core of the system. As this path only applies to systems with CM version
      2.5 or lower, and this property is true of all such known systems, this
      is likely to be fine but is described in a comment for good measure.
      Signed-off-by: Paul Burton <paul.burton@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/14338/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      6605d156
  3. 01 Oct, 2016 8 commits
  4. 30 Sep, 2016 14 commits
    • W
      x86/entry/64: Fix context tracking state warning when load_gs_index fails · 2fa5f04f
      Wanpeng Li committed
      This warning:
      
       WARNING: CPU: 0 PID: 3331 at arch/x86/entry/common.c:45 enter_from_user_mode+0x32/0x50
       CPU: 0 PID: 3331 Comm: ldt_gdt_64 Not tainted 4.8.0-rc7+ #13
       Call Trace:
        dump_stack+0x99/0xd0
        __warn+0xd1/0xf0
        warn_slowpath_null+0x1d/0x20
        enter_from_user_mode+0x32/0x50
        error_entry+0x6d/0xc0
        ? general_protection+0x12/0x30
        ? native_load_gs_index+0xd/0x20
        ? do_set_thread_area+0x19c/0x1f0
        SyS_set_thread_area+0x24/0x30
        do_int80_syscall_32+0x7c/0x220
        entry_INT80_compat+0x38/0x50
      
      ... can be reproduced by running the GS testcase of the ldt_gdt test unit in
      the x86 selftests.
      
      do_int80_syscall_32() will call enter_from_user_mode() to convert context
      tracking state from user state to kernel state. The load_gs_index() call
      can fail with user gsbase; if this happens, gsbase will be fixed up and
      execution proceeds.

      However, enter_from_user_mode() will be called again in the fixed up path
      even though context tracking is already in kernel state.

      This patch fixes it by just fixing up gsbase and telling lockdep that IRQs
      are off once load_gs_index() failed with user gsbase.
      Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
      Acked-by: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1475197266-3440-1-git-send-email-wanpeng.li@hotmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      2fa5f04f
    • A
      x86/boot: Initialize FPU and X86_FEATURE_ALWAYS even if we don't have CPUID · 05fb3c19
      Andy Lutomirski committed
      Otherwise arch_task_struct_size == 0 and we die.  While we're at it,
      set X86_FEATURE_ALWAYS, too.
      Reported-by: David Saggiorato <david@saggiorato.net>
      Tested-by: David Saggiorato <david@saggiorato.net>
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Fixes: aaeb5c01c5b ("x86/fpu, sched: Introduce CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT and use it on x86")
      Link: http://lkml.kernel.org/r/8de723afbf0811071185039f9088733188b606c9.1475103911.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      05fb3c19
    • S
      x86/vdso: Fix building on big endian host · e4aad645
      Segher Boessenkool committed
      We need to call GET_LE to read hdr->e_type.
      
      Fixes: 57f90c3d ("x86/vdso: Error out if the vDSO isn't a valid DSO")
      Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org>
      Acked-by: Andy Lutomirski <luto@kernel.org>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: linux-next@vger.kernel.org
      Link: http://lkml.kernel.org/r/20160929193442.GA16617@gate.crashing.org
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      e4aad645
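The vdso commit above concerns a build-time tool that reads little-endian ELF header fields while running on a big-endian host. The following is a minimal sketch of that idea only: `read_le16()` is a hypothetical stand-in and not the kernel's `GET_LE` macro. It decodes a 16-bit little-endian field byte by byte, so the result does not depend on host byte order.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not the kernel's GET_LE): decode a 16-bit
 * little-endian value from raw bytes. Assembling the value from
 * individual bytes gives the same result on little- and big-endian hosts.
 */
static uint16_t read_le16(const unsigned char *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}
```

For instance, the bytes `03 00` decode to 3, which is the ELF `ET_DYN` type value. Reading the same two bytes through a native `uint16_t` load on a big-endian host would yield 0x0300 instead.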
    • A
      x86/boot: Fix another __read_cr4() case on 486 · 192d1dcc
      Andy Lutomirski committed
      The condition for reading CR4 was wrong: there are some CPUs with
      CPUID but not CR4.  Rather than trying to make the condition exact,
      use __read_cr4_safe().
      
      Fixes: 18bc7bd5 ("x86/boot: Synchronize trampoline_cr4_features and mmu_cr4_features directly")
      Reported-by: david@saggiorato.net
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Link: http://lkml.kernel.org/r/8c453a61c4f44ab6ff43c29780ba04835234d2e5.1475178369.git.luto@kernel.org
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      192d1dcc
    • C
      mlx5: Add ndo_poll_controller() implementation · 80378384
      Calvin Owens committed
      This implements ndo_poll_controller in net_device_ops callbacks for mlx5,
      which is necessary to use netconsole with this driver.
      Acked-By: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Calvin Owens <calvinowens@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      80378384
    • J
      nfp: bpf: zero extend 4 byte context loads · 6cd80b55
      Jakub Kicinski committed
      Set upper 32 bits of destination register to zeros after
      load from the context structure.
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6cd80b55
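The semantics requested by the nfp commit above (a 4-byte load must leave the upper 32 bits of the 64-bit destination cleared) can be sketched in plain C. `load_ctx_u32_zext()` is a hypothetical name and models only the intended result, not the NFP JIT code itself.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical model of a 4-byte context load into a 64-bit register:
 * copy 32 bits, then widen through an unsigned type so the upper
 * 32 bits of the result are always zero.
 */
static uint64_t load_ctx_u32_zext(const void *ctx, size_t off)
{
	uint32_t v;

	memcpy(&v, (const unsigned char *)ctx + off, sizeof(v));
	return (uint64_t)v; /* zero extension, never sign extension */
}
```

Loading the bytes `ff ff ff ff` this way gives 0x00000000ffffffff, whereas a sign-extending load would produce all ones in the upper half.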
    • X
      sctp: fix the issue sctp_diag uses lock_sock in rcu_read_lock · 1cceda78
      Xin Long committed
      When sctp dumps all the ep->assocs, it needs to lock_sock first,
      but now it locks sock in rcu_read_lock, and lock_sock may sleep,
      which would break rcu_read_lock.
      
      This patch gets and holds one sock when traversing the list. After
      leaving rcu_read_lock, it locks and dumps that sock. Then it
      traverses the list again to get the next one, until all sctp socks
      are dumped.
      
      For sctp_diag_dump_one, it fixes this issue by holding asoc and
      moving cb() out of rcu_read_lock in sctp_transport_lookup_process.
      
      Fixes: 8f840e47 ("sctp: add the sctp_diag.c file")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1cceda78
    • D
      Merge branch 'sctp-fixes' · 75b005b9
      David S. Miller committed
      Xin Long says:
      
      ====================
      sctp: a bunch of fixes for prsctp polices
      
      This patchset fixes 2 issues for prsctp policies:

        1. patches 1 and 2 fix the "netperf-Throughput_Mbps -37.2% regression"
           issue when overloading the CPU.

        2. patch 3 fixes "prsctp policies should check both sides' prsctp_capable,
           instead of only local side".
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      75b005b9
    • X
      sctp: change to check peer prsctp_capable when using prsctp polices · be4947bf
      Xin Long committed
      Now before using prsctp policies, sctp uses asoc->prsctp_enable to
      check if prsctp is enabled. However, asoc->prsctp_enable being set only
      means that the local host supports prsctp; sctp should not abandon
      packets if the peer host doesn't enable prsctp.

      So this patch uses asoc->peer.prsctp_capable to check if prsctp
      is enabled on both sides, instead of asoc->prsctp_enable, as asoc's
      peer.prsctp_capable is set only when local and peer both enable prsctp.
      
      Fixes: a6c2f792 ("sctp: implement prsctp TTL policy")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      be4947bf
    • X
      sctp: remove prsctp_param from sctp_chunk · 0605483f
      Xin Long committed
      Now sctp uses chunk->prsctp_param to save the prsctp param for all the
      prsctp policies, but we didn't need to introduce prsctp_param to sctp_chunk.
      We can just use chunk->sinfo.sinfo_timetolive for RTX and BUF policies,
      and reuse msg->expires_at for TTL policy, as the prsctp policies and old
      expires policy are mutually exclusive.

      This patch removes prsctp_param from sctp_chunk, and reuses msg's
      expires_at for TTL and chunk's sinfo.sinfo_timetolive for RTX and BUF
      policies.

      Note that sctp can't use chunk's sinfo.sinfo_timetolive for TTL policy,
      as it needs a u64 variable to save the expires_at time.
      
      This one also fixes the "netperf-Throughput_Mbps -37.2% regression"
      issue.
      
      Fixes: a6c2f792 ("sctp: implement prsctp TTL policy")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0605483f
    • X
      sctp: move sent_count to the memory hole in sctp_chunk · 73dca124
      Xin Long committed
      Running pahole on sctp_chunk now shows 2 memory holes:
         struct sctp_chunk {
      	struct list_head           list;
      	atomic_t                   refcnt;
      	/* XXX 4 bytes hole, try to pack */
      	...
      	long unsigned int          prsctp_param;
      	int                        sent_count;
      	/* XXX 4 bytes hole, try to pack */
      
      This patch is to move up sent_count to fill the 1st one and eliminate
      the 2nd one.
      
      It's not just another struct compaction, it also fixes the
      "netperf-Throughput_Mbps -37.2% regression" issue when overloading the CPU.

      Fixes: a6c2f792 ("sctp: implement prsctp TTL policy")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      73dca124
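The hole-filling described in the sctp_chunk commit above can be reproduced on a cut-down pair of structs. This is only a sketch: `chunk_before`, `chunk_after`, `list_head_sketch` and the `other` member are hypothetical stand-ins, `int` replaces `atomic_t`, and the elided fields of the real `sctp_chunk` are collapsed into a single `unsigned long`.

```c
#include <stddef.h>

/* Stand-in for the kernel's struct list_head (two pointers). */
struct list_head_sketch { void *next, *prev; };

/* Layout modelled on the pahole excerpt, heavily reduced. */
struct chunk_before {
	struct list_head_sketch list;
	int refcnt;                 /* stands in for atomic_t */
	/* hole here when the next member needs 8-byte alignment */
	unsigned long other;        /* stands in for the elided members */
	unsigned long prsctp_param;
	int sent_count;
	/* tail padding here on targets with 8-byte struct alignment */
};

/* Same members, with sent_count moved up next to refcnt. */
struct chunk_after {
	struct list_head_sketch list;
	int refcnt;
	int sent_count;             /* occupies the former hole */
	unsigned long other;
	unsigned long prsctp_param;
};

/* Reordering never makes this sketch larger. */
_Static_assert(sizeof(struct chunk_after) <= sizeof(struct chunk_before),
	       "moving sent_count up must not grow the struct");
```

On a typical LP64 target the first layout should come out at 48 bytes and the second at 40; on 32-bit targets both are usually the same size because no hole exists there. The real struct is far larger, so its absolute numbers differ.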
    • D
      mlx4: remove unused fields · 5038056e
      David Decotigny committed
      This can also address the following UBSAN warnings:
      [   36.640343] ================================================================================
      [   36.648772] UBSAN: Undefined behaviour in drivers/net/ethernet/mellanox/mlx4/fw.c:857:26
      [   36.656853] shift exponent 64 is too large for 32-bit type 'int'
      [   36.663348] ================================================================================
      [   36.671783] ================================================================================
      [   36.680213] UBSAN: Undefined behaviour in drivers/net/ethernet/mellanox/mlx4/fw.c:861:27
      [   36.688297] shift exponent 35 is too large for 32-bit type 'int'
      [   36.694702] ================================================================================
      
      Tested:
        reboot with UBSAN, no warning.
      Signed-off-by: David Decotigny <decot@googlers.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5038056e
    • M
      ipv6 addrconf: implement RFC7559 router solicitation backoff · bd11f074
      Maciej Żenczykowski committed
      This implements:
        https://tools.ietf.org/html/rfc7559
      
      Backoff is performed according to RFC3315 section 14:
        https://tools.ietf.org/html/rfc3315#section-14
      
      We allow setting /proc/sys/net/ipv6/conf/*/router_solicitations
      to a negative value meaning an unlimited number of retransmits,
      and we make this the new default (in line with the RFC).
      
      We also add a new setting:
        /proc/sys/net/ipv6/conf/*/router_solicitation_max_interval
      defaulting to 1 hour (per RFC recommendation).
      Signed-off-by: Maciej Żenczykowski <maze@google.com>
      Acked-by: Erik Kline <ek@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bd11f074
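The retransmission timing that the router solicitation commit above refers to (RFC 3315 section 14) can be sketched as follows. This is not the kernel implementation: `rs_next_rt()` is a hypothetical helper, times are in milliseconds, and the random factor RAND (uniform in [-0.1, +0.1] per the RFC) is passed in as an integer in thousandths so the sketch stays deterministic.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sketch of the RFC 3315 section 14 backoff rules:
 *   first transmission:  RT = IRT + RAND*IRT
 *   later transmissions: RT = 2*RTprev + RAND*RTprev
 *   if MRT != 0 and RT > MRT: RT = MRT + RAND*MRT
 * rt_prev == 0 means "no previous transmission".
 * rand_permille is RAND scaled by 1000, so it must lie in [-100, 100].
 */
static int64_t rs_next_rt(int64_t rt_prev, int64_t irt, int64_t mrt,
			  int rand_permille)
{
	int64_t rt;

	assert(rand_permille >= -100 && rand_permille <= 100);

	if (rt_prev == 0)
		rt = irt + irt * rand_permille / 1000;
	else
		rt = 2 * rt_prev + rt_prev * rand_permille / 1000;

	/* MRT == 0 means no upper bound on the interval. */
	if (mrt != 0 && rt > mrt)
		rt = mrt + mrt * rand_permille / 1000;

	return rt;
}
```

Assuming an initial interval of 4000 ms and the 1 hour maximum mentioned above (3600000 ms), with RAND fixed at 0 the interval doubles (4 s, 8 s, 16 s, ...) until it is clamped at the maximum. A non-zero RAND spreads each step by up to 10% in either direction.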
    • D
      Merge branch 'net_proc_perf' · bcdc6efa
      David S. Miller committed
      Jia He says:
      
      ====================
      Reduce cache miss for snmp_fold_field
      
      In a PowerPC server with a large number of CPUs (160), besides commit
      a3a77372 ("net: Optimize snmp stat aggregation by walking all
      the percpu data at once"), I observed several other snmp_fold_field
      callsites which would cause a high cache miss rate.
      
      test source code:
      ================
      My simple test case, which reads from the procfs items endlessly:
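      /***********************************************************/
      #include <fcntl.h>
      #include <stdio.h>
      #include <string.h>
      #include <unistd.h>

      /* LINELEN is not defined in the quoted message; 2560 is an assumed buffer size. */
      #define LINELEN 2560
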
      int main(int argc, char **argv)
      {
              int i;
              int fd = -1 ;
              int rdsize = 0;
              char buf[LINELEN+1];
      
              buf[LINELEN] = 0;
              memset(buf,0,LINELEN);
      
              if(1 >= argc) {
                      printf("file name empty\n");
                      return -1;
              }
      
              fd = open(argv[1], O_RDWR, 0644);
              if(0 > fd){
                      printf("open error\n");
                      return -2;
              }
      
              for(i=0;i<0xffffffff;i++) {
                      while(0 < (rdsize = read(fd,buf,LINELEN))){
                              //nothing here
                      }
      
                      lseek(fd, 0, SEEK_SET);
              }
      
              close(fd);
              return 0;
      }
      /**********************************************************/
      
      compile and run:
      ================
      gcc test.c -o test
      
      perf stat -d -e cache-misses ./test /proc/net/snmp
      perf stat -d -e cache-misses ./test /proc/net/snmp6
      perf stat -d -e cache-misses ./test /proc/net/sctp/snmp
      perf stat -d -e cache-misses ./test /proc/net/xfrm_stat
      
      before the patch set:
      ====================
       Performance counter stats for 'system wide':
      
               355911097      cache-misses                                                 [40.08%]
              2356829300      L1-dcache-loads                                              [60.04%]
               355642645      L1-dcache-load-misses     #   15.09% of all L1-dcache hits   [60.02%]
               346544541      LLC-loads                                                    [59.97%]
                  389763      LLC-load-misses           #    0.11% of all LL-cache hits    [40.02%]
      
             6.245162638 seconds time elapsed
      
      After the patch set:
      ===================
       Performance counter stats for 'system wide':
      
               194992476      cache-misses                                                 [40.03%]
              6718051877      L1-dcache-loads                                              [60.07%]
               194871921      L1-dcache-load-misses     #    2.90% of all L1-dcache hits   [60.11%]
               187632232      LLC-loads                                                    [60.04%]
                  464466      LLC-load-misses           #    0.25% of all LL-cache hits    [39.89%]
      
             6.868422769 seconds time elapsed
      The cache-miss rate can be reduced from 15% to 2.9%
      
      changelog
      =========
      v6:
      - correct v5
      v5:
      - order local variables from longest to shortest line
      v4:
      - move memset into one block of if statement in snmp6_seq_show_item
      - remove the changes in netstat_seq_show, considering the stack usage is too large
      v3:
      - introduce generic interface (suggested by Marcelo Ricardo Leitner)
      - use max_t instead of self defined macro (suggested by David Miller)
      v2:
      - fix bug in udplite statistics.
      - snmp_seq_show is split into 2 parts
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bcdc6efa