1. 10 Sep, 2017 2 commits
    • sparc64: Handle additional cases of no fault loads · b6fe1089
      Rob Gardner committed
      Load instructions using ASI_PNF or other no-fault ASIs should not
      cause a SIGSEGV or SIGBUS.
      
      A garden variety unmapped address follows the TSB miss path, and when
      no valid mapping is found in the process page tables, the miss handler
      checks to see if the access was via a no-fault ASI.  It then fixes up
      the target register with a zero, and skips the no-fault load
      instruction.
      
      But different paths are taken for data access exceptions and alignment
      traps, and these do not respect the no-fault ASI. We add checks in
      these paths for the no-fault ASI, and fix up the target register and
      TPC just like in the TSB miss case.
      Signed-off-by: Rob Gardner <rob.gardner@oracle.com>
      Acked-by: Sam Ravnborg <sam@ravnborg.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
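As a rough illustration of the fix-up this commit describes, the toy model below (Python, not kernel code) zeroes the destination register and steps past the load when the access used a no-fault ASI. The `Regs` class, `handle_load_trap`, and the register naming are hypothetical. The ASI values 0x82, 0x83, 0x8a, and 0x8b are the SPARC V9 no-fault ASIs, and advancing the trap PC by one 4-byte instruction is a simplifying assumption about how the skip is performed.

```python
from dataclasses import dataclass, field

# SPARC V9 no-fault ASIs: primary/secondary, big- and little-endian.
NO_FAULT_ASIS = {0x82, 0x83, 0x8A, 0x8B}


@dataclass
class Regs:
    """Hypothetical register snapshot at trap time (toy model only)."""
    tpc: int                      # address of the trapping instruction
    tnpc: int                     # address of the instruction after it
    gpr: dict = field(default_factory=dict)


def handle_load_trap(regs, asi, rd):
    """Model a data access exception or alignment trap on a load.

    Returns "fixed" when the load used a no-fault ASI: the target
    register is set to zero and the load is skipped. Returns "signal"
    otherwise, meaning the normal SIGSEGV/SIGBUS path would be taken.
    """
    if asi in NO_FAULT_ASIS:
        regs.gpr[rd] = 0          # fix up the target register with a zero
        regs.tpc = regs.tnpc      # skip the no-fault load instruction
        regs.tnpc += 4            # assumption: fixed 4-byte instructions
        return "fixed"
    return "signal"
```

The same check applied on every trap path is what keeps the behavior consistent with the TSB miss case: a no-fault load never reaches the signal path, whichever trap it raises.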
    • sparc64: speed up etrap/rtrap on NG2 and later processors · a7159a87
      Anthony Yznaga committed
      For many sun4v processor types, reading or writing a privileged register
      has a latency of 40 to 70 cycles.  Use a combination of the low-latency
      allclean, otherw, normalw, and nop instructions in etrap and rtrap to
      replace 2 rdpr and 5 wrpr instructions and improve etrap/rtrap
      performance.  allclean, otherw, and normalw are available on NG2 and
      later processors.
      
      The average ticks to execute the flush windows trap ("ta 0x3") with and
      without this patch on select platforms:
      
       CPU            Not patched     Patched    % Latency Reduction
      
       NG2            1762            1558            -11.58
       NG4            3619            3204            -11.47
       M7             3015            2624            -12.97
       SPARC64-X      829             770              -7.12
      Signed-off-by: Anthony Yznaga <anthony.yznaga@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 30 Aug, 2017 1 commit
  3. 29 Aug, 2017 3 commits
  4. 16 Aug, 2017 18 commits
  5. 11 Aug, 2017 12 commits
    • Merge branch 'sparc64-M7-memcpy' · fa5dc772
      David S. Miller committed
      Babu Moger says:
      
      ====================
      sparc64: Update memcpy, memset etc. for M7/M8 architectures
      
      This patch series updates memcpy, memset, copy_to_user, copy_from_user,
      etc. for the SPARC M7/M8 architecture.
      
      The new algorithm takes advantage of the M7/M8 block init store ASIs to
      improve performance. More details are in the code comments.
      
      Tested and compared the latency measured in ticks (NG4memcpy vs. the new M7memcpy).
      
      1. Memset numbers (aligned memset)
      
      No.of bytes   NG4memset	   M7memset    	Delta ((B-A)/A)*100
      	     (Avg.Ticks A) (Avg.Ticks B) (latency reduction)
        3		77		25		-67.53
        7		43		33		-23.25
        32		72		68		 -5.55
        128		164		44		-73.17
        256		335		68		-79.70
        512		511		220		-56.94
        1024		1552		627		-59.60
        2048		3515		1322		-62.38
        4096		6303		2472		-60.78
        8192		13118		4867		-62.89
        16384		26206		10371		-60.42
        32768		52501		18569		-64.63
        65536		100219		35899		-64.17
      
      2. Memcpy numbers (aligned memcpy)
      
      No.of bytes   NG4memcpy	   M7memcpy    	Delta ((B-A)/A)*100
      	     (Avg.Ticks A) (Avg.Ticks B) (latency reduction)
        3		20		19		-5
        7		29		27		-6.89
        32		30		28		-6.66
        128		89		69		-22.47
        256		142		143		 0.70
        512		341		283		-17.00
        1024		1588		655		-58.75
        2048		3553		1357		-61.80
        4096		7218		2590		-64.11
        8192		13701		5231		-61.82
        16384		28304		10716		-62.13
        32768		56516		22995		-59.31
        65536		115443		50840		-55.96
      
      3. Memset numbers (unaligned memset)
      
      No.of bytes   NG4memset	   M7memset    	Delta ((B-A)/A)*100
      	     (Avg.Ticks A) (Avg.Ticks B) (latency reduction)
        3		40		31		-22.5
        7		52		29		-44.2307692308
        32		89		86		-3.3707865169
        128		201		74		-63.184079602
        256		340		154		-54.7058823529
        512		961		335		-65.1404786681
        1024		1799		686		-61.8677042802
        2048		3575		1260		-64.7552447552
        4096		6560		2627		-59.9542682927
        8192		13161		6018		-54.273991338
        16384		26465		10439		-60.5554505951
        32768		52119		18649		-64.2184232238
        65536		101593		35724		-64.8361599717
      
      4. Memcpy numbers (unaligned memcpy)
      
      No.of bytes   NG4memcpy	   M7memcpy    	Delta ((B-A)/A)*100
      	     (Avg.Ticks A) (Avg.Ticks B) (latency reduction)
        3		26		19		-26.9230769231
        7		48		45		-6.25
        32		52		49		-5.7692307692
        128		284		334		17.6056338028
        256		430		482		12.0930232558
        512		646		690		6.8111455108
        1024		1051		1016		-3.3301617507
        2048		1787		1818		1.7347509793
        4096		3309		3376		2.0247809006
        8192		8151		7444		-8.673782358
        16384		34222		34556		0.9759803635
        32768		87851		95044		8.1877269468
        65536		158331		159572		0.7838010244
      
      There is not much difference in the numbers for unaligned copies
      between NG4memcpy and M7memcpy because they both mostly use the
      same algorithms.
      
      v2:
       1. Fixed indentation issues found by David Miller.
       2. Used ENTRY and ENDPROC for the labels in M7patch.S, as suggested by David Miller.
       3. M8 now also uses M7memcpy. Also tested on an M8 config.
       4. These patches are created on top of the M8 patches below:
          https://patchwork.ozlabs.org/patch/792661/
          https://patchwork.ozlabs.org/patch/792662/
          However, I did not see these patches in the sparc-next tree; they may still be
          in the queue. Until all M8 patches are in the sparc-next tree, this series
          might cause build problems.
      
      v0: Initial version
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
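The Delta column in the tables above follows the stated formula ((B-A)/A)*100, where A is the baseline tick count and B is the new one. The short Python sketch below recomputes it for a few rows of the aligned memset table. The function name `latency_delta_percent` is made up for illustration, and the tick values are copied from the table rather than measured again.

```python
def latency_delta_percent(baseline_ticks, new_ticks):
    """Return ((B - A) / A) * 100.

    A negative result means the new routine needed fewer ticks than the
    baseline; a positive result means it needed more.
    """
    if baseline_ticks <= 0:
        raise ValueError("baseline ticks must be positive")
    return (new_ticks - baseline_ticks) / baseline_ticks * 100.0


# Rows copied from the aligned memset table: (bytes, NG4memset, M7memset).
rows = [(3, 77, 25), (128, 164, 44), (256, 335, 68)]

for size, ng4_ticks, m7_ticks in rows:
    delta = latency_delta_percent(ng4_ticks, m7_ticks)
    print(f"{size:>6} {ng4_ticks:>6} {m7_ticks:>6} {delta:8.2f}")
```

Reading the sign this way also explains the positive entries in the unaligned memcpy table: those rows are cases where M7memcpy took more ticks than NG4memcpy.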
    • arch/sparc: Add accurate exception reporting in M7memcpy · 34060b8f
      Babu Moger committed
      Add accurate exception reporting in M7memcpy
      Signed-off-by: Babu Moger <babu.moger@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • arch/sparc: Optimized memcpy, memset, copy_to_user, copy_from_user for M7/M8 · b3a04ed5
      Babu Moger committed
      New algorithm that takes advantage of the M7/M8 block init store
      ASI, i.e., overlapping pipelines and miss buffer filling.
      Full details are in the code comments.
      Signed-off-by: Babu Moger <babu.moger@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • arch/sparc: Rename exception handlers · 1ab32693
      Babu Moger committed
      Rename exception handlers to memcpy_xxx as these
      are going to be used by new memcpy routines and these
      handlers are not exclusive to NG4memcpy anymore.
      Signed-off-by: Babu Moger <babu.moger@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • arch/sparc: Separate the exception handlers from NG4memcpy · de5c073e
      Babu Moger committed
      Separate the exception handlers from NG4memcpy so that they can be
      used with the new memcpy routines. Make a separate file for all these handlers.
      Signed-off-by: Babu Moger <babu.moger@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sparc64: update comments in U3memcpy · 061273f9
      Sam Ravnborg committed
      Update the comments about the range that each part of the code
      copies; the original comments were wrong.
      
      Introduce a few descriptive labels too.
      
      No functional changes.
      Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • 2f7043a3
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 26273939
      Linus Torvalds committed
      Pull networking fixes from David Miller:
      
       1) Fix handling of initial STATE message in TIPC, from Jon Paul Maloy.
      
       2) Fix stats handling in bcm_sysport_get_stats(), from Florian
          Fainelli.
      
       3) Reject 16777215 VNI value in geneve_validate(), from Girish
          Moodalbail.
      
       4) Fix initial IGMP sysctl setting regression, from Nikolay Borisov.
      
       5) Once a UFO fragmented frame is treated as UFO, we should continue
          doing so. Likewise once a frame has been segmented, we should
          continue doing that and not try to convert it to a UFO frame. From
          Willem de Bruijn.
      
       6) Test the AF_PACKET RX/TX ring pg_vec state under the socket lock to
          prevent races. From Willem de Bruijn.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
        packet: fix tp_reserve race in packet_set_ring
        udp: consistently apply ufo or fragmentation
        net: sched: set xt_tgchk_param par.nft_compat as 0 in ipt_init_target
        igmp: Fix regression caused by igmp sysctl namespace code.
        geneve: maximum value of VNI cannot be used
        net: systemport: Fix software statistics for SYSTEMPORT Lite
        tipc: remove premature ESTABLISH FSM event at link synchronization
    • packet: fix tp_reserve race in packet_set_ring · c27927e3
      Willem de Bruijn committed
      Updates to tp_reserve can race with reads of the field in
      packet_set_ring. Avoid this by holding the socket lock during
      updates in setsockopt PACKET_RESERVE.
      
      This bug was discovered by syzkaller.
      
      Fixes: 8913336a ("packet: add PACKET_RESERVE sockopt")
      Reported-by: Andrey Konovalov <andreyknvl@google.com>
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
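The locking idea in this fix can be sketched with a small Python model. It is only an analogy: `ToyPacketSock`, `set_reserve`, and `set_ring` are hypothetical names, a `threading.Lock` stands in for the socket lock, and the sketch assumes the reader side already runs under that same lock. The "EBUSY" refusal once a ring exists follows the pull summary's description of testing the ring state under the socket lock.

```python
import threading


class ToyPacketSock:
    """Toy stand-in for a packet socket (not the kernel structure)."""

    def __init__(self):
        self.lock = threading.Lock()   # plays the role of the socket lock
        self.tp_reserve = 0
        self.ring_allocated = False    # stands in for the ring pg_vec state
        self.ring_reserve = None       # value the ring was sized with

    def set_reserve(self, val):
        """Model of the PACKET_RESERVE update, done under the lock."""
        with self.lock:
            if self.ring_allocated:
                return "EBUSY"         # too late: a ring already uses it
            self.tp_reserve = val
            return "OK"

    def set_ring(self):
        """Model of ring setup, which reads tp_reserve under the lock."""
        with self.lock:
            self.ring_reserve = self.tp_reserve
            self.ring_allocated = True
            return self.ring_reserve
```

Because both the check-and-update and the read take the same lock, the ring can never be sized from a value that changes halfway through setup.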
    • udp: consistently apply ufo or fragmentation · 85f1bd9a
      Willem de Bruijn committed
      When iteratively building a UDP datagram with MSG_MORE and that
      datagram exceeds MTU, consistently choose UFO or fragmentation.
      
      Once skb_is_gso, always apply ufo. Conversely, once a datagram is
      split across multiple skbs, do not consider ufo.
      
      Sendpage already maintains the first invariant, only add the second.
      IPv6 does not have a sendpage implementation to modify.
      
      A gso skb must have a partial checksum, do not follow sk_no_check_tx
      in udp_send_skb.
      
      Found by syzkaller.
      
      Fixes: e89e9cf5 ("[IPv4/IPv6]: UFO Scatter-gather approach")
      Reported-by: Andrey Konovalov <andreyknvl@google.com>
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
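The two invariants in this commit message can be written as a small decision rule. The Python sketch below is a simplification for illustration, not the kernel's check: `choose_ufo` and its parameters are hypothetical, "split across multiple skbs" is modeled as more than one queued skb, and the real code path tests further conditions (device features, socket type, checksum settings) that are folded here into a single `ufo_capable` flag.

```python
def choose_ufo(skb_is_gso, queued_skbs, length, mtu, ufo_capable=True):
    """Return True to take the UFO path, False to use fragmentation.

    skb_is_gso   -- the datagram being built is already a GSO skb
    queued_skbs  -- number of skbs the datagram currently spans
    length, mtu  -- size of the pending data and the path MTU
    ufo_capable  -- assumption: all other UFO preconditions hold
    """
    if skb_is_gso:
        return True                # once UFO, keep applying UFO
    if queued_skbs > 1:
        return False               # already fragmented: never switch to UFO
    return ufo_capable and length > mtu
```

For example, `choose_ufo(False, 2, 4000, 1500)` stays on the fragmentation path even though the data exceeds the MTU, because the datagram has already been split.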
    • sparc64: Revert 16GB huge page support. · 4d9fbf53
      David S. Miller committed
      It overflows the amount of space available in the initial .text section
      of trap handler assembler in some configurations, resulting in build
      failures.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · f213ad38
      Linus Torvalds committed
      Pull sparc updates from David Miller:
      
       1) Recognize M8 cpus, just basic chip ID matching, from Allen Pais.
      
       2) Prevent crashes when bringing up sunvdc virtual block devices in
          some environments. From Jim Quigley.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
        sunvdc: prevent sunvdc panic when mpgroup disk added to guest domain
        sparc64: Increase max_phys_bits to 51 and VA bits to 53 for M8.
        sparc64: recognize and support sparc M8 cpu type
        sparc64: properly name the cpu constants
  6. 10 Aug, 2017 4 commits