1. 01 2月, 2017 1 次提交
    • F
      sched/cputime: Remove generic asm headers · b672592f
      Frederic Weisbecker 提交于
      cputime_t is now only used by two architectures:
      
      	* powerpc (when CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y)
      	* s390
      
      And since the core doesn't use it anymore, we don't need any arch support
      from the others. So we can remove their stub implementations.
      
      A final cleanup would be to provide an efficient pure arch
      implementation of cputime_to_nsec() for s390 and powerpc and finally
      remove include/linux/cputime.h .
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Stanislaw Gruszka <sgruszka@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Wanpeng Li <wanpeng.li@hotmail.com>
      Link: http://lkml.kernel.org/r/1485832191-26889-36-git-send-email-fweisbec@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      b672592f
  2. 26 1月, 2017 1 次提交
  3. 25 12月, 2016 1 次提交
  4. 15 12月, 2016 1 次提交
  5. 29 11月, 2016 2 次提交
  6. 17 11月, 2016 1 次提交
  7. 16 11月, 2016 2 次提交
    • C
      locking/core, arch: Remove cpu_relax_lowlatency() · 5bd0b85b
      Christian Borntraeger 提交于
      As there are no users left, we can remove cpu_relax_lowlatency()
      implementations from every architecture.
      Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Noam Camus <noamc@ezchip.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: virtualization@lists.linux-foundation.org
      Cc: xen-devel@lists.xenproject.org
      Cc: <linux-arch@vger.kernel.org>
      Link: http://lkml.kernel.org/r/1477386195-32736-6-git-send-email-borntraeger@de.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      5bd0b85b
    • C
      locking/core: Introduce cpu_relax_yield() · 79ab11cd
      Christian Borntraeger 提交于
      For spinning loops people do often use barrier() or cpu_relax().
      For most architectures cpu_relax and barrier are the same, but on
      some architectures cpu_relax can add some latency.
      For example on power,sparc64 and arc, cpu_relax can shift the CPU
      towards other hardware threads in an SMT environment.
      On s390 cpu_relax does even more, it uses an hypercall to the
      hypervisor to give up the timeslice.
      In contrast to the SMT yielding this can result in larger latencies.
      In some places this latency is unwanted, so another variant
      "cpu_relax_lowlatency" was introduced. Before this is used in more
      and more places, lets revert the logic and provide a cpu_relax_yield
      that can be called in places where yielding is more important than
      latency. By default this is the same as cpu_relax on all architectures.
      Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Noam Camus <noamc@ezchip.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: virtualization@lists.linux-foundation.org
      Cc: xen-devel@lists.xenproject.org
      Link: http://lkml.kernel.org/r/1477386195-32736-2-git-send-email-borntraeger@de.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      79ab11cd
  8. 15 11月, 2016 4 次提交
  9. 25 10月, 2016 1 次提交
  10. 08 10月, 2016 1 次提交
  11. 28 9月, 2016 1 次提交
  12. 14 9月, 2016 2 次提交
  13. 13 9月, 2016 1 次提交
  14. 24 8月, 2016 1 次提交
    • J
      ftrace: Add return address pointer to ftrace_ret_stack · 9a7c348b
      Josh Poimboeuf 提交于
      Storing this value will help prevent unwinders from getting out of sync
      with the function graph tracer ret_stack.  Now instead of needing a
      stateful iterator, they can compare the return address pointer to find
      the right ret_stack entry.
      
      Note that an array of 50 ftrace_ret_stack structs is allocated for every
      task.  So when an arch implements this, it will add either 200 or 400
      bytes of memory usage per task (depending on whether it's a 32-bit or
      64-bit platform).
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nilay Vaish <nilayvaish@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/a95cfcc39e8f26b89a430c56926af0bb217bc0a1.1471607358.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      9a7c348b
  15. 04 8月, 2016 1 次提交
    • K
      dma-mapping: use unsigned long for dma_attrs · 00085f1e
      Krzysztof Kozlowski 提交于
      The dma-mapping core and the implementations do not change the DMA
      attributes passed by pointer.  Thus the pointer can point to const data.
      However the attributes do not have to be a bitfield.  Instead unsigned
      long will do fine:
      
      1. This is just simpler.  Both in terms of reading the code and setting
         attributes.  Instead of initializing local attributes on the stack
         and passing pointer to it to dma_set_attr(), just set the bits.
      
      2. It brings safeness and checking for const correctness because the
         attributes are passed by value.
      
      Semantic patches for this change (at least most of them):
      
          virtual patch
          virtual context
      
          @r@
          identifier f, attrs;
      
          @@
          f(...,
          - struct dma_attrs *attrs
          + unsigned long attrs
          , ...)
          {
          ...
          }
      
          @@
          identifier r.f;
          @@
          f(...,
          - NULL
          + 0
           )
      
      and
      
          // Options: --all-includes
          virtual patch
          virtual context
      
          @r@
          identifier f, attrs;
          type t;
      
          @@
          t f(..., struct dma_attrs *attrs);
      
          @@
          identifier r.f;
          @@
          f(...,
          - NULL
          + 0
           )
      
      Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.comSigned-off-by: NKrzysztof Kozlowski <k.kozlowski@samsung.com>
      Acked-by: NVineet Gupta <vgupta@synopsys.com>
      Acked-by: NRobin Murphy <robin.murphy@arm.com>
      Acked-by: NHans-Christian Noren Egtvedt <egtvedt@samfundet.no>
      Acked-by: Mark Salter <msalter@redhat.com> [c6x]
      Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris]
      Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm]
      Reviewed-by: NBart Van Assche <bart.vanassche@sandisk.com>
      Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
      Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp]
      Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core]
      Acked-by: David Vrabel <david.vrabel@citrix.com> [xen]
      Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb]
      Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
      Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon]
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
      Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390]
      Acked-by: NBjorn Andersson <bjorn.andersson@linaro.org>
      Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32]
      Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc]
      Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      00085f1e
  16. 03 8月, 2016 2 次提交
    • A
      signal: consolidate {TS,TLF}_RESTORE_SIGMASK code · 7e781418
      Andy Lutomirski 提交于
      In general, there's no need for the "restore sigmask" flag to live in
      ti->flags.  alpha, ia64, microblaze, powerpc, sh, sparc (64-bit only),
      tile, and x86 use essentially identical alternative implementations,
      placing the flag in ti->status.
      
      Replace those optimized implementations with an equally good common
      implementation that stores it in a bitfield in struct task_struct and
      drop the custom implementations.
      
      Additional architectures can opt in by removing their
      TIF_RESTORE_SIGMASK defines.
      
      Link: http://lkml.kernel.org/r/8a14321d64a28e40adfddc90e18a96c086a6d6f9.1468522723.git.luto@kernel.orgSigned-off-by: NAndy Lutomirski <luto@kernel.org>
      Tested-by: Michael Ellerman <mpe@ellerman.id.au>	[powerpc]
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Rich Felker <dalias@libc.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7e781418
    • F
      treewide: replace obsolete _refok by __ref · bd721ea7
      Fabian Frederick 提交于
      There was only one use of __initdata_refok and __exit_refok
      
      __init_refok was used 46 times against 82 for __ref.
      
      Those definitions are obsolete since commit 312b1485 ("Introduce new
      section reference annotations tags: __ref, __refdata, __refconst")
      
      This patch removes the following compatibility definitions and replaces
      them treewide.
      
      /* compatibility defines */
      #define __init_refok     __ref
      #define __initdata_refok __refdata
      #define __exit_refok     __ref
      
      I can also provide separate patches if necessary.
      (One patch per tree and check in 1 month or 2 to remove old definitions)
      
      [akpm@linux-foundation.org: coding-style fixes]
      Link: http://lkml.kernel.org/r/1466796271-3043-1-git-send-email-fabf@skynet.beSigned-off-by: NFabian Frederick <fabf@skynet.be>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      bd721ea7
  17. 27 7月, 2016 1 次提交
  18. 28 6月, 2016 2 次提交
  19. 25 6月, 2016 1 次提交
    • M
      tree wide: get rid of __GFP_REPEAT for order-0 allocations part I · 32d6bd90
      Michal Hocko 提交于
      This is the third version of the patchset previously sent [1].  I have
      basically only rebased it on top of 4.7-rc1 tree and dropped "dm: get
      rid of superfluous gfp flags" which went through dm tree.  I am sending
      it now because it is tree wide and chances for conflicts are reduced
      considerably when we want to target rc2.  I plan to send the next step
      and rename the flag and move to a better semantic later during this
      release cycle so we will have a new semantic ready for 4.8 merge window
      hopefully.
      
      Motivation:
      
      While working on something unrelated I've checked the current usage of
      __GFP_REPEAT in the tree.  It seems that a majority of the usage is and
      always has been bogus because __GFP_REPEAT has always been about costly
      high order allocations while we are using it for order-0 or very small
      orders very often.  It seems that a big pile of them is just a
      copy&paste when a code has been adopted from one arch to another.
      
      I think it makes some sense to get rid of them because they are just
      making the semantic more unclear.  Please note that GFP_REPEAT is
      documented as
      
      * __GFP_REPEAT: Try hard to allocate the memory, but the allocation attempt
      
      * _might_ fail.  This depends upon the particular VM implementation.
        while !costly requests have basically nofail semantic.  So one could
        reasonably expect that order-0 request with __GFP_REPEAT will not loop
        for ever.  This is not implemented right now though.
      
      I would like to move on with __GFP_REPEAT and define a better semantic
      for it.
      
        $ git grep __GFP_REPEAT origin/master | wc -l
        111
        $ git grep __GFP_REPEAT | wc -l
        36
      
      So we are down to the third after this patch series.  The remaining
      places really seem to be relying on __GFP_REPEAT due to large allocation
      requests.  This still needs some double checking which I will do later
      after all the simple ones are sorted out.
      
      I am touching a lot of arch specific code here and I hope I got it right
      but as a matter of fact I even didn't compile test for some archs as I
      do not have cross compiler for them.  Patches should be quite trivial to
      review for stupid compile mistakes though.  The tricky parts are usually
      hidden by macro definitions and thats where I would appreciate help from
      arch maintainers.
      
      [1] http://lkml.kernel.org/r/1461849846-27209-1-git-send-email-mhocko@kernel.org
      
      This patch (of 19):
      
      __GFP_REPEAT has a rather weak semantic but since it has been introduced
      around 2.6.12 it has been ignored for low order allocations.  Yet we
      have the full kernel tree with its usage for apparently order-0
      allocations.  This is really confusing because __GFP_REPEAT is
      explicitly documented to allow allocation failures which is a weaker
      semantic than the current order-0 has (basically nofail).
      
      Let's simply drop __GFP_REPEAT from those places.  This would allow to
      identify place which really need allocator to retry harder and formulate
      a more specific semantic for what the flag is supposed to do actually.
      
      Link: http://lkml.kernel.org/r/1464599699-30131-2-git-send-email-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chen Liqin <liqin.linux@gmail.com>
      Cc: Chris Metcalf <cmetcalf@mellanox.com> [for tile]
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: John Crispin <blogic@openwrt.org>
      Cc: Lennox Wu <lennox.wu@gmail.com>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      32d6bd90
  20. 18 6月, 2016 3 次提交
    • B
      microblaze/PCI: Implement pci_resource_to_user() with pcibios_resource_to_bus() · 0ad8f06d
      Bjorn Helgaas 提交于
      "User" addresses are shown in /sys/devices/pci.../.../resource and
      /proc/bus/pci/devices and used as mmap offsets for /proc/bus/pci/BB/DD.F
      files.  For I/O port resources on microblaze, these are PCI bus addresses,
      i.e., raw BAR values.
      
      Previously pci_resource_to_user() computed the user address by subtracting
      "hose->io_base_virt - _IO_BASE" from the resource start:
      
        pci_resource_to_user()
          if (IO)
            offset = (unsigned long)hose->io_base_virt - _IO_BASE;
          *start = rsrc->start - offset;
      
      We've already told the PCI core about that "hose->io_base_virt - _IO_BASE"
      offset:
      
        pcibios_setup_phb_resources()
          res = &hose->io_resource;
          pci_add_resource_offset(resources, res, hose->io_base_virt - _IO_BASE);
      
      so pcibios_resource_to_bus() knows how to do that translation.
      
      No functional change intended.
      Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
      Acked-by: NYinghai Lu <yinghai@kernel.org>
      0ad8f06d
    • B
      PCI: Unify pci_resource_to_user() declarations · 8221a013
      Bjorn Helgaas 提交于
      Replace the pci_resource_to_user() declarations in each arch that defines
      HAVE_ARCH_PCI_RESOURCE_TO_USER with a single one in linux/pci.h.
      
      Change the MIPS static inline implementation to a non-inline version so the
      static inline doesn't conflict with the new non-static linux/pci.h
      declaration.
      
      No functional change intended.
      Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
      8221a013
    • B
      microblaze/PCI: Remove useless __pci_mmap_set_pgprot() · c444a2be
      Bjorn Helgaas 提交于
      The microblaze __pci_mmap_set_pgprot() was apparently copied from powerpc,
      where it computes either an uncacheable pgprot_t or a write-combining one.
      But on microblaze, we always use the regular uncacheable pgprot_t.
      
      Remove the useless code in __pci_mmap_set_pgprot() and inline it at the
      only call site.
      Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
      Acked-by: NYinghai Lu <yinghai@kernel.org>
      c444a2be
  21. 08 6月, 2016 1 次提交
  22. 29 5月, 2016 1 次提交
    • G
      microblaze: Add <asm/hash.h> · 7b13277b
      George Spelvin 提交于
      Microblaze is an FPGA soft core that can be configured various ways.
      
      If it is configured without a multiplier, the standard __hash_32()
      will require a call to __mulsi3, which is a slow software loop.
      
      Instead, use a shift-and-add sequence for the constant multiply.
      GCC knows how to do this, but it's not as clever as some.
      Signed-off-by: NGeorge Spelvin <linux@sciencehorizons.net>
      Cc: Alistair Francis <alistair.francis@xilinx.com>
      Cc: Michal Simek <michal.simek@xilinx.com>
      7b13277b
  23. 21 5月, 2016 2 次提交
    • Z
      lib/GCD.c: use binary GCD algorithm instead of Euclidean · fff7fb0b
      Zhaoxiu Zeng 提交于
      The binary GCD algorithm is based on the following facts:
      	1. If a and b are all evens, then gcd(a,b) = 2 * gcd(a/2, b/2)
      	2. If a is even and b is odd, then gcd(a,b) = gcd(a/2, b)
      	3. If a and b are all odds, then gcd(a,b) = gcd((a-b)/2, b) = gcd((a+b)/2, b)
      
      Even on x86 machines with reasonable division hardware, the binary
      algorithm runs about 25% faster (80% the execution time) than the
      division-based Euclidian algorithm.
      
      On platforms like Alpha and ARMv6 where division is a function call to
      emulation code, it's even more significant.
      
      There are two variants of the code here, depending on whether a fast
      __ffs (find least significant set bit) instruction is available.  This
      allows the unpredictable branches in the bit-at-a-time shifting loop to
      be eliminated.
      
      If fast __ffs is not available, the "even/odd" GCD variant is used.
      
      I use the following code to benchmark:
      
      	#include <stdio.h>
      	#include <stdlib.h>
      	#include <stdint.h>
      	#include <string.h>
      	#include <time.h>
      	#include <unistd.h>
      
      	#define swap(a, b) \
      		do { \
      			a ^= b; \
      			b ^= a; \
      			a ^= b; \
      		} while (0)
      
      	unsigned long gcd0(unsigned long a, unsigned long b)
      	{
      		unsigned long r;
      
      		if (a < b) {
      			swap(a, b);
      		}
      
      		if (b == 0)
      			return a;
      
      		while ((r = a % b) != 0) {
      			a = b;
      			b = r;
      		}
      
      		return b;
      	}
      
      	unsigned long gcd1(unsigned long a, unsigned long b)
      	{
      		unsigned long r = a | b;
      
      		if (!a || !b)
      			return r;
      
      		b >>= __builtin_ctzl(b);
      
      		for (;;) {
      			a >>= __builtin_ctzl(a);
      			if (a == b)
      				return a << __builtin_ctzl(r);
      
      			if (a < b)
      				swap(a, b);
      			a -= b;
      		}
      	}
      
      	unsigned long gcd2(unsigned long a, unsigned long b)
      	{
      		unsigned long r = a | b;
      
      		if (!a || !b)
      			return r;
      
      		r &= -r;
      
      		while (!(b & r))
      			b >>= 1;
      
      		for (;;) {
      			while (!(a & r))
      				a >>= 1;
      			if (a == b)
      				return a;
      
      			if (a < b)
      				swap(a, b);
      			a -= b;
      			a >>= 1;
      			if (a & r)
      				a += b;
      			a >>= 1;
      		}
      	}
      
      	unsigned long gcd3(unsigned long a, unsigned long b)
      	{
      		unsigned long r = a | b;
      
      		if (!a || !b)
      			return r;
      
      		b >>= __builtin_ctzl(b);
      		if (b == 1)
      			return r & -r;
      
      		for (;;) {
      			a >>= __builtin_ctzl(a);
      			if (a == 1)
      				return r & -r;
      			if (a == b)
      				return a << __builtin_ctzl(r);
      
      			if (a < b)
      				swap(a, b);
      			a -= b;
      		}
      	}
      
      	unsigned long gcd4(unsigned long a, unsigned long b)
      	{
      		unsigned long r = a | b;
      
      		if (!a || !b)
      			return r;
      
      		r &= -r;
      
      		while (!(b & r))
      			b >>= 1;
      		if (b == r)
      			return r;
      
      		for (;;) {
      			while (!(a & r))
      				a >>= 1;
      			if (a == r)
      				return r;
      			if (a == b)
      				return a;
      
      			if (a < b)
      				swap(a, b);
      			a -= b;
      			a >>= 1;
      			if (a & r)
      				a += b;
      			a >>= 1;
      		}
      	}
      
      	static unsigned long (*gcd_func[])(unsigned long a, unsigned long b) = {
      		gcd0, gcd1, gcd2, gcd3, gcd4,
      	};
      
      	#define TEST_ENTRIES (sizeof(gcd_func) / sizeof(gcd_func[0]))
      
      	#if defined(__x86_64__)
      
      	#define rdtscll(val) do { \
      		unsigned long __a,__d; \
      		__asm__ __volatile__("rdtsc" : "=a" (__a), "=d" (__d)); \
      		(val) = ((unsigned long long)__a) | (((unsigned long long)__d)<<32); \
      	} while(0)
      
      	static unsigned long long benchmark_gcd_func(unsigned long (*gcd)(unsigned long, unsigned long),
      								unsigned long a, unsigned long b, unsigned long *res)
      	{
      		unsigned long long start, end;
      		unsigned long long ret;
      		unsigned long gcd_res;
      
      		rdtscll(start);
      		gcd_res = gcd(a, b);
      		rdtscll(end);
      
      		if (end >= start)
      			ret = end - start;
      		else
      			ret = ~0ULL - start + 1 + end;
      
      		*res = gcd_res;
      		return ret;
      	}
      
      	#else
      
      	static inline struct timespec read_time(void)
      	{
      		struct timespec time;
      		clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &time);
      		return time;
      	}
      
      	static inline unsigned long long diff_time(struct timespec start, struct timespec end)
      	{
      		struct timespec temp;
      
      		if ((end.tv_nsec - start.tv_nsec) < 0) {
      			temp.tv_sec = end.tv_sec - start.tv_sec - 1;
      			temp.tv_nsec = 1000000000ULL + end.tv_nsec - start.tv_nsec;
      		} else {
      			temp.tv_sec = end.tv_sec - start.tv_sec;
      			temp.tv_nsec = end.tv_nsec - start.tv_nsec;
      		}
      
      		return temp.tv_sec * 1000000000ULL + temp.tv_nsec;
      	}
      
      	static unsigned long long benchmark_gcd_func(unsigned long (*gcd)(unsigned long, unsigned long),
      								unsigned long a, unsigned long b, unsigned long *res)
      	{
      		struct timespec start, end;
      		unsigned long gcd_res;
      
      		start = read_time();
      		gcd_res = gcd(a, b);
      		end = read_time();
      
      		*res = gcd_res;
      		return diff_time(start, end);
      	}
      
      	#endif
      
      	static inline unsigned long get_rand()
      	{
      		if (sizeof(long) == 8)
      			return (unsigned long)rand() << 32 | rand();
      		else
      			return rand();
      	}
      
      	int main(int argc, char **argv)
      	{
      		unsigned int seed = time(0);
      		int loops = 100;
      		int repeats = 1000;
      		unsigned long (*res)[TEST_ENTRIES];
      		unsigned long long elapsed[TEST_ENTRIES];
      		int i, j, k;
      
      		for (;;) {
      			int opt = getopt(argc, argv, "n:r:s:");
      			/* End condition always first */
      			if (opt == -1)
      				break;
      
      			switch (opt) {
      			case 'n':
      				loops = atoi(optarg);
      				break;
      			case 'r':
      				repeats = atoi(optarg);
      				break;
      			case 's':
      				seed = strtoul(optarg, NULL, 10);
      				break;
      			default:
      				/* You won't actually get here. */
      				break;
      			}
      		}
      
      		res = malloc(sizeof(unsigned long) * TEST_ENTRIES * loops);
      		memset(elapsed, 0, sizeof(elapsed));
      
      		srand(seed);
      		for (j = 0; j < loops; j++) {
      			unsigned long a = get_rand();
      			/* Do we have args? */
      			unsigned long b = argc > optind ? strtoul(argv[optind], NULL, 10) : get_rand();
      			unsigned long long min_elapsed[TEST_ENTRIES];
      			for (k = 0; k < repeats; k++) {
      				for (i = 0; i < TEST_ENTRIES; i++) {
      					unsigned long long tmp = benchmark_gcd_func(gcd_func[i], a, b, &res[j][i]);
      					if (k == 0 || min_elapsed[i] > tmp)
      						min_elapsed[i] = tmp;
      				}
      			}
      			for (i = 0; i < TEST_ENTRIES; i++)
      				elapsed[i] += min_elapsed[i];
      		}
      
      		for (i = 0; i < TEST_ENTRIES; i++)
      			printf("gcd%d: elapsed %llu\n", i, elapsed[i]);
      
      		k = 0;
      		srand(seed);
      		for (j = 0; j < loops; j++) {
      			unsigned long a = get_rand();
      			unsigned long b = argc > optind ? strtoul(argv[optind], NULL, 10) : get_rand();
      			for (i = 1; i < TEST_ENTRIES; i++) {
      				if (res[j][i] != res[j][0])
      					break;
      			}
      			if (i < TEST_ENTRIES) {
      				if (k == 0) {
      					k = 1;
      					fprintf(stderr, "Error:\n");
      				}
      				fprintf(stderr, "gcd(%lu, %lu): ", a, b);
      				for (i = 0; i < TEST_ENTRIES; i++)
      					fprintf(stderr, "%ld%s", res[j][i], i < TEST_ENTRIES - 1 ? ", " : "\n");
      			}
      		}
      
      		if (k == 0)
      			fprintf(stderr, "PASS\n");
      
      		free(res);
      
      		return 0;
      	}
      
      Compiled with "-O2", on "VirtualBox 4.4.0-22-generic #38-Ubuntu x86_64" got:
      
        zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
        gcd0: elapsed 10174
        gcd1: elapsed 2120
        gcd2: elapsed 2902
        gcd3: elapsed 2039
        gcd4: elapsed 2812
        PASS
        zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
        gcd0: elapsed 9309
        gcd1: elapsed 2280
        gcd2: elapsed 2822
        gcd3: elapsed 2217
        gcd4: elapsed 2710
        PASS
        zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
        gcd0: elapsed 9589
        gcd1: elapsed 2098
        gcd2: elapsed 2815
        gcd3: elapsed 2030
        gcd4: elapsed 2718
        PASS
        zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
        gcd0: elapsed 9914
        gcd1: elapsed 2309
        gcd2: elapsed 2779
        gcd3: elapsed 2228
        gcd4: elapsed 2709
        PASS
      
      [akpm@linux-foundation.org: avoid #defining a CONFIG_ variable]
      Signed-off-by: NZhaoxiu Zeng <zhaoxiu.zeng@gmail.com>
      Signed-off-by: NGeorge Spelvin <linux@horizon.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      fff7fb0b
    • J
      exit_thread: remove empty bodies · 5f56a5df
      Jiri Slaby 提交于
      Define HAVE_EXIT_THREAD for archs which want to do something in
      exit_thread. For others, let's define exit_thread as an empty inline.
      
      This is a cleanup before we change the prototype of exit_thread to
      accept a task parameter.
      
      [akpm@linux-foundation.org: fix mips]
      Signed-off-by: NJiri Slaby <jslaby@suse.cz>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chen Liqin <liqin.linux@gmail.com>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
      Cc: Lennox Wu <lennox.wu@gmail.com>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Steven Miao <realmz6@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5f56a5df
  24. 16 5月, 2016 2 次提交
    • F
      microblaze: pci: export isa_io_base to fix link errors · 52e9e6e0
      Fengguang Wu 提交于
      ERROR: "isa_io_base" [sound/pci/vx222/snd-vx222.ko] undefined!
      ERROR: "isa_io_base" [sound/pci/trident/snd-trident.ko] undefined!
      ERROR: "isa_io_base" [sound/pci/snd-via82xx.ko] undefined!
      ...
      ERROR: "isa_io_base" [drivers/watchdog/wdt_pci.ko] undefined!
      ERROR: "isa_io_base" [drivers/watchdog/pcwd_pci.ko] undefined!
      ERROR: "isa_io_base" [drivers/video/vgastate.ko] undefined!
      ...
      ERROR: "isa_io_base" [drivers/video/fbdev/cirrusfb.ko] undefined!
      ERROR: "isa_io_base" [drivers/video/fbdev/arkfb.ko] undefined!
      ERROR: "isa_io_base" [drivers/usb/host/uhci-hcd.ko] undefined!
      ERROR: "isa_io_base" [drivers/usb/host/isp1362-hcd.ko] undefined!
      ERROR: "isa_io_base" [drivers/tty/serial/jsm/jsm.ko] undefined!
      ERROR: "isa_io_base" [drivers/tty/serial/8250/8250_pci.ko] undefined!
      ...
      ERROR: "isa_io_base" [drivers/scsi/qla2xxx/qla2xxx.ko] undefined!
      ERROR: "isa_io_base" [drivers/scsi/ppa.ko] undefined!
      Signed-off-by: NFengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: NMichal Simek <michal.simek@xilinx.com>
      52e9e6e0
    • M
      microblaze: Wire up userfaultfd, membarrier, mlock2 syscalls · fbce3bef
      Michal Simek 提交于
      Wire up new syscalls userfaultfd, membarrier and mlock2.
      Signed-off-by: NMichal Simek <michal.simek@xilinx.com>
      fbce3bef
  25. 26 3月, 2016 1 次提交
  26. 14 3月, 2016 1 次提交
    • A
      ipv4: Update parameters for csum_tcpudp_magic to their original types · 01cfbad7
      Alexander Duyck 提交于
      This patch updates all instances of csum_tcpudp_magic and
      csum_tcpudp_nofold to reflect the types that are usually used as the source
      inputs.  For example the protocol field is populated based on nexthdr which
      is actually an unsigned 8 bit value.  The length is usually populated based
      on skb->len which is an unsigned integer.
      
      This addresses an issue in which the IPv6 function csum_ipv6_magic was
      generating a checksum using the full 32b of skb->len while
      csum_tcpudp_magic was only using the lower 16 bits.  As a result we could
      run into issues when attempting to adjust the checksum as there was no
      protocol agnostic way to update it.
      
      With this change the value is still truncated as many architectures use
      "(len + proto) << 8", however this truncation only occurs for values
      greater than 16776960 in length and as such is unlikely to occur as we stop
      the inner headers at ~64K in size.
      
      I did have to make a few minor changes in the arm, mn10300, nios2, and
      score versions of the function in order to support these changes as they
      were either using things such as an OR to combine the protocol and length,
      or were using ntohs to convert the length which would have truncated the
      value.
      
      I also updated a few spots in terms of whitespace and type differences for
      the addresses.  Most of this was just to make sure all of the definitions
      were in sync going forward.
      Signed-off-by: NAlexander Duyck <aduyck@mirantis.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      01cfbad7
  27. 09 3月, 2016 1 次提交
  28. 08 3月, 2016 1 次提交