1. 07 Oct, 2015 1 commit
  2. 23 Sep, 2015 2 commits
  3. 18 Sep, 2015 1 commit
    • lib/string_helpers.c: fix infinite loop in string_get_size() · 62bef58a
      Committed by Vitaly Kuznetsov
      Some string_get_size() calls (e.g.:
       string_get_size(1, 512, STRING_UNITS_10, ..., ...)
       string_get_size(15, 64, STRING_UNITS_10, ..., ...)
      ) result in an infinite loop. The problem is that if size is equal to
      divisor[units]/blk_size and is smaller than divisor[units] we'll end
      up with size == 0 when we start doing sf_cap calculations:
      
      For string_get_size(1, 512, STRING_UNITS_10, ..., ...) case:
         ...
         remainder = do_div(size, divisor[units]); -> size is 0, remainder is 1
         remainder *= blk_size; -> remainder is 512
         ...
         size *= blk_size; -> size is still 0
         size += remainder / divisor[units]; -> size is still 0
      
      The caller causing the issue is sd_read_capacity().  The problem was
      noticed on Hyper-V, where the host reported such a weird size when
      scanning collided with device removal.  That is probably a separate
      issue worth fixing; this patch is only intended to prevent the library
      routine from looping forever.
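
The arithmetic quoted above can be replayed in userspace. The following is a minimal sketch, not the kernel code: do_div() is replaced by plain / and %, the function names are hypothetical, and the shape of the scaling loop (multiply by 10 until the value reaches 1000) is an assumption about the sf_cap step that the message only mentions by name.

```c
#include <stdint.h>

/* Replays the steps listed in the commit message for one call.
 * divisor stands for divisor[units], e.g. 1000 for STRING_UNITS_10. */
static uint64_t pre_fix_size(uint64_t size, uint32_t blk_size, uint32_t divisor)
{
	uint64_t remainder;

	remainder = size % divisor;      /* do_div() returns the remainder ... */
	size /= divisor;                 /* ... and leaves the quotient in size */
	remainder *= blk_size;

	size *= blk_size;
	size += remainder / divisor;
	return size;
}

/* Assumed shape of the significant-figure scaling: multiply by 10 until
 * the value reaches 1000. Bounded by max_iter so this demo cannot hang.
 * Returns 1 if the loop would finish, 0 if it made no progress. */
static int scaling_terminates(uint64_t sf_cap, int max_iter)
{
	int j;

	for (j = 0; j < max_iter; j++) {
		if (sf_cap * 10 >= 1000)
			return 1;
		sf_cap *= 10;
	}
	return 0;	/* zero multiplied by 10 stays zero */
}
```

With size 1 and block size 512, or size 15 and block size 64, pre_fix_size() yields 0, and a scaling loop of the assumed shape never leaves 0.
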
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Acked-by: James Bottomley <JBottomley@Odin.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 11 Sep, 2015 12 commits
  5. 09 Sep, 2015 1 commit
    • lib/show_mem.c: correct reserved memory calculation · 156408c0
      Committed by Vishnu Pratap Singh
      CMA reserved memory is not part of total reserved memory.  Currently,
      when we print the total reserved memory, the code treats CMA as part of
      reserved memory and subtracts totalcma_pages from the reserved count,
      which is wrong.  When total reserved is less than CMA reserved, the
      result is negative, and because it is printed as unsigned we get a very
      large value.
      
      Below is the show_mem() output on an x86 Ubuntu-based system where CMA
      reserved is 100MB (25600 pages) and total reserved is ~40MB (10316 pages).
      Reserved memory shows a large value because of this bug.
      
      Before:
      [  127.066430] 898908 pages RAM
      [  127.066432] 671682 pages HighMem/MovableOnly
      [  127.066434] 4294952012 pages reserved
      [  127.066436] 25600 pages cma reserved
      
      After:
      [   44.663129] 898908 pages RAM
      [   44.663130] 671682 pages HighMem/MovableOnly
      [   44.663130] 10316 pages reserved
      [   44.663131] 25600 pages cma reserved
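
The large value in the "Before" output is consistent with a 32-bit unsigned wraparound. The sketch below reproduces the numbers in userspace; it assumes a 32-bit build (the HighMem line suggests one) and uses uint32_t as a stand-in for the kernel's counter type. The function names are hypothetical.

```c
#include <stdint.h>

/* Pre-fix behaviour as described: CMA pages are subtracted from the
 * reserved count even when the reserved count is the smaller of the two.
 * Unsigned subtraction wraps modulo 2^32 instead of going negative. */
static uint32_t reserved_before_fix(uint32_t reserved, uint32_t totalcma_pages)
{
	return reserved - totalcma_pages;
}

/* Post-fix behaviour as described: the reserved count is reported as-is,
 * and the CMA count is printed on its own line. */
static uint32_t reserved_after_fix(uint32_t reserved, uint32_t totalcma_pages)
{
	(void)totalcma_pages;
	return reserved;
}
```

For 10316 reserved pages and 25600 CMA pages, the first function returns 4294952012, which is 2^32 - 15284 and matches the log line above; the second returns 10316.
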
      Signed-off-by: Vishnu Pratap Singh <vishnu.ps@samsung.com>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Danesh Petigara <dpetigara@broadcom.com>
      Cc: Laura Abbott <lauraa@codeaurora.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  6. 05 Sep, 2015 2 commits
    • genalloc: add support of multiple gen_pools per device · c98c3635
      Committed by Vladimir Zapolskiy
      This change fills in the "name" argument stub of
      devm_gen_pool_create()/gen_pool_get() and extends of_gen_pool_get()
      functionality on this basis.
      
      If there is no platform device associated with the device node passed to
      of_gen_pool_get(), the function attempts to get a label property or the
      device node name (mirroring the MTD OF partition standard) and looks for
      a named gen_pool registered by the device of the parent device node.
      
      The main idea of the change is to allow registration of independent
      gen_pools under the same umbrella device, say "partitions" on a "storage
      device".  The original functionality of one "partition" per "storage
      device" is untouched.
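
The idea of several named pools under one device can be modelled in userspace. This is only a sketch: struct pool, pool_matches() and find_pool() are hypothetical names, and the rule that a NULL name matches only an unnamed pool is an assumption chosen so that callers passing NULL keep their old behaviour. It is not taken from the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a pool registered under one device. */
struct pool {
	const char *name;	/* NULL for an unnamed pool */
};

/* Assumed matching rule: NULL matches only an unnamed pool,
 * otherwise names are compared as strings. */
static int pool_matches(const struct pool *p, const char *wanted)
{
	if (!wanted && !p->name)
		return 1;
	if (!wanted || !p->name)
		return 0;
	return strcmp(p->name, wanted) == 0;
}

/* Return the first pool of the device that matches, or NULL. */
static const struct pool *find_pool(const struct pool *pools, size_t n,
				    const char *wanted)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (pool_matches(&pools[i], wanted))
			return &pools[i];
	return NULL;
}
```
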
      
      [akpm@linux-foundation.org: fix constness in devres_find()]
      [dan.carpenter@oracle.com: freeing const data pointers]
      Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
      Cc: Philipp Zabel <p.zabel@pengutronix.de>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
      Cc: Alexandre Belloni <alexandre.belloni@free-electrons.com>
      Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
      Cc: Shawn Guo <shawnguo@kernel.org>
      Cc: Sascha Hauer <kernel@pengutronix.de>
      Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • genalloc: add name arg to gen_pool_get() and devm_gen_pool_create() · 73858173
      Committed by Vladimir Zapolskiy
      This change modifies the gen_pool_get() and devm_gen_pool_create() client
      interfaces, adding one more argument, "name", of a gen_pool object.
      
      Due to its implementation, gen_pool_get() can retrieve only one gen_pool
      associated with a device even if multiple gen_pools are created.
      Fortunately this is currently sufficient for the clients, hence NULL is
      provided as a valid argument on both the producer devm_gen_pool_create()
      and consumer gen_pool_get() sides.
      
      Because only one created gen_pool per device is addressable, explicitly
      add a restriction to devm_gen_pool_create() to create only one gen_pool
      per device.  This implies two possible error codes returned by the
      function, which are accounted for on the client side (only misc/sram).
      This completes the client side changes related to genalloc updates.
      
      [akpm@linux-foundation.org: gen_pool_get() cleanup]
      Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
      Cc: Philipp Zabel <p.zabel@pengutronix.de>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
      Cc: Alexandre Belloni <alexandre.belloni@free-electrons.com>
      Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
      Cc: Shawn Guo <shawnguo@kernel.org>
      Cc: Sascha Hauer <kernel@pengutronix.de>
      Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  7. 01 Sep, 2015 2 commits
    • lib: move strncpy_from_unsafe() into mm/maccess.c · dbb7ee0e
      Committed by Alexei Starovoitov
      To fix build errors:
      kernel/built-in.o: In function `bpf_trace_printk':
      bpf_trace.c:(.text+0x11a254): undefined reference to `strncpy_from_unsafe'
      kernel/built-in.o: In function `fetch_memory_string':
      trace_kprobe.c:(.text+0x11acf8): undefined reference to `strncpy_from_unsafe'
      
      move strncpy_from_unsafe() next to probe_kernel_read/write()
      which use the same memory access style.
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Reported-by: Guenter Roeck <linux@roeck-us.net>
      Fixes: 1a6877b9 ("lib: introduce strncpy_from_unsafe()")
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • md/raid6: delta syndrome for ARM NEON · 0e833e69
      Committed by Ard Biesheuvel
      This implements XOR syndrome calculation using NEON intrinsics.
      As before, the module can be built for ARM and arm64 from the
      same source.
      
      Relative performance on a Cortex-A57 based system:
      
        raid6: int64x1  gen()   905 MB/s
        raid6: int64x1  xor()   881 MB/s
        raid6: int64x2  gen()  1343 MB/s
        raid6: int64x2  xor()  1286 MB/s
        raid6: int64x4  gen()  1896 MB/s
        raid6: int64x4  xor()  1321 MB/s
        raid6: int64x8  gen()  1773 MB/s
        raid6: int64x8  xor()  1165 MB/s
        raid6: neonx1   gen()  1834 MB/s
        raid6: neonx1   xor()  1278 MB/s
        raid6: neonx2   gen()  2528 MB/s
        raid6: neonx2   xor()  1942 MB/s
        raid6: neonx4   gen()  2888 MB/s
        raid6: neonx4   xor()  2334 MB/s
        raid6: neonx8   gen()  2957 MB/s
        raid6: neonx8   xor()  2232 MB/s
        raid6: using algorithm neonx8 gen() 2957 MB/s
        raid6: .... xor() 2232 MB/s, rmw enabled
      
      Cc: Markus Stockhausen <stockhausen@collogia.de>
      Cc: Neil Brown <neilb@suse.de>
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NeilBrown <neilb@suse.com>
  8. 29 Aug, 2015 1 commit
  9. 28 Aug, 2015 1 commit
    • nd_blk: change aperture mapping from WC to WB · 67a3e8fe
      Committed by Ross Zwisler
      This should result in a pretty sizeable performance gain for reads.  For
      rough comparison I did some simple read testing using PMEM to compare
      reads of write combining (WC) mappings vs write-back (WB).  This was
      done on a random lab machine.
      
      PMEM reads from a write combining mapping:
      	# dd of=/dev/null if=/dev/pmem0 bs=4096 count=100000
      	100000+0 records in
      	100000+0 records out
      	409600000 bytes (410 MB) copied, 9.2855 s, 44.1 MB/s
      
      PMEM reads from a write-back mapping:
      	# dd of=/dev/null if=/dev/pmem0 bs=4096 count=1000000
      	1000000+0 records in
      	1000000+0 records out
      	4096000000 bytes (4.1 GB) copied, 3.44034 s, 1.2 GB/s
      
      To be able to safely support a write-back aperture I needed to add
      support for the "read flush" _DSM flag, as outlined in the DSM spec:
      
      http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
      
      This flag tells the ND BLK driver that it needs to flush the cache lines
      associated with the aperture after the aperture is moved but before any
      new data is read.  This ensures that any stale cache lines from the
      previous contents of the aperture will be discarded from the processor
      cache, and the new data will be read properly from the DIMM.  We know
      that the cache lines are clean and will be discarded without any
      writeback because either a) the previous aperture operation was a read,
      and we never modified the contents of the aperture, or b) the previous
      aperture operation was a write and we must have written back the dirtied
      contents of the aperture to the DIMM before the I/O was completed.
      
      In order to add support for the "read flush" flag I needed to add a
      generic routine to invalidate cache lines, mmio_flush_range().  This is
      protected by the ARCH_HAS_MMIO_FLUSH Kconfig variable, and is currently
      only supported on x86.
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  10. 27 Aug, 2015 1 commit
  11. 25 Aug, 2015 3 commits
    • MPI: Fix mpi_read_buffer · 0f74fbf7
      Committed by Tadeusz Struk
      Change mpi_read_buffer to return a number without leading zeros
      so that mpi_read_buffer and mpi_get_buffer return the same thing.
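
What "without leading zeros" means for a byte buffer can be sketched in userspace. The helper below is hypothetical and is not the MPI code; it assumes the number is stored most significant byte first.

```c
#include <stddef.h>

/* Hypothetical helper: skip the leading zero bytes of a big-endian number.
 * Returns a pointer to the first significant byte and stores the number
 * of remaining bytes in *out_len. An all-zero input yields length 0. */
static const unsigned char *skip_leading_zeros(const unsigned char *buf,
					       size_t len, size_t *out_len)
{
	while (len > 0 && *buf == 0) {
		buf++;
		len--;
	}
	*out_len = len;
	return buf;
}
```

Only leading zero bytes are dropped; zero bytes that follow a significant byte are part of the value and stay.
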
      Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • PCI: Add pci_iomap_wc() variants · 1b3d4200
      Committed by Luis R. Rodriguez
      PCI BARs tell us whether prefetching is safe, but they don't say
      anything about write combining (WC). WC changes ordering rules
      and allows writes to be collapsed, so it's not safe in general
      to use it on a prefetchable region.
      
      Add pci_iomap_wc() and pci_iomap_wc_range() so drivers can take
      advantage of write combining when they know it's safe.
      
      On architectures that don't fully support WC, e.g., x86 without
      PAT, drivers for legacy framebuffers may get some of the benefit
      by using arch_phys_wc_add() in addition to pci_iomap_wc().  But
      arch_phys_wc_add() is unreliable and should be avoided in
      general.  On x86, it uses MTRRs, which are limited in number and
      size, so the results will vary based on driver loading order.
      
      The goals of adding pci_iomap_wc() are to:
      
      - Give drivers an architecture-independent way to use WC so they can stop
        using interfaces like mtrr_add() (on x86, pci_iomap_wc() uses
        PAT when available).
      
      - Move toward using _PAGE_CACHE_MODE_UC, not _PAGE_CACHE_MODE_UC_MINUS,
        on x86 on ioremap_nocache() (see de33c442 ("x86 PAT: fix
        performance drop for glx, use UC minus for ioremap(), ioremap_nocache()
        and pci_mmap_page_range()")).
      Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
      [ Move IORESOURCE_IO check up, space out statements for better readability. ]
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Cc: <roger.pau@citrix.com>
      Cc: <syrjala@sci.fi>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Antonino Daplas <adaplas@gmail.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dave Airlie <airlied@redhat.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Davidlohr Bueso <dbueso@suse.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Roger Pau Monné <roger.pau@citrix.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Stefan Bader <stefan.bader@canonical.com>
      Cc: Suresh Siddha <sbsiddha@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: Ville Syrjälä <syrjala@sci.fi>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: airlied@linux.ie
      Cc: benh@kernel.crashing.org
      Cc: dan.j.williams@intel.com
      Cc: david.vrabel@citrix.com
      Cc: jbeulich@suse.com
      Cc: konrad.wilk@oracle.com
      Cc: linux-arch@vger.kernel.org
      Cc: linux-fbdev@vger.kernel.org
      Cc: linux-pci@vger.kernel.org
      Cc: venkatesh.pallipadi@intel.com
      Cc: vinod.koul@intel.com
      Cc: xen-devel@lists.xensource.com
      Link: http://lkml.kernel.org/r/1440443613-13696-6-git-send-email-mcgrof@do-not-panic.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • lib: scatterlist: add sg splitting function · f8bcbe62
      Committed by Robert Jarzmik
      Sometimes a scatter-gather has to be split into several chunks, or sub
      scatter lists. This happens for example if a scatter list will be
      handled by multiple DMA channels, each one filling a part of it.
      
      A concrete example comes with the media V4L2 API, where the scatter list
      is allocated from userspace to hold an image, regardless of the
      knowledge of how many DMAs will fill it :
       - in a simple RGB565 case, one DMA will pump data from the camera ISP
         to memory
       - in the trickier YUV422 case, 3 DMAs will pump data from the camera
         ISP pipes, one for pipe Y, one for pipe U and one for pipe V
      
      For these cases, it is necessary to split the original scatter list into
      multiple scatter lists, which is the purpose of this patch.
      
      The guarantees that are required for this patch are:
       - the intersection of spans of any pair of resulting scatter lists is
         empty.
       - the union of spans of all resulting scatter lists is a subrange of
         the span of the original scatter list.
       - streaming DMA API operations (mapping, unmapping) should not happen
         on both the resulting and the original scatter list. Use either
         the original list or the resulting ones, not both.
       - the caller is responsible for calling kfree() on the resulting
         scatterlists.
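
The first two guarantees can be illustrated with a userspace model that works on byte ranges instead of real scatterlists. This is a sketch under assumptions: struct span and split_spans() are hypothetical names, and laying the chunks out back to back after an initial skip is an assumed policy, picked because it makes the spans disjoint and keeps them inside the original range.

```c
#include <stddef.h>

/* Byte range inside the original list. */
struct span {
	size_t start;	/* offset from the start of the original list */
	size_t len;
};

/* Carve nb_splits consecutive spans out of [skip, total).
 * Returns 0 on success, -1 if the requested sizes do not fit. */
static int split_spans(size_t total, size_t skip, const size_t *sizes,
		       size_t nb_splits, struct span *out)
{
	size_t pos = skip;
	size_t i;

	if (skip > total)
		return -1;
	for (i = 0; i < nb_splits; i++) {
		if (sizes[i] > total - pos)
			return -1;	/* would run past the original span */
		out[i].start = pos;
		out[i].len = sizes[i];
		pos += sizes[i];	/* next span starts where this one ends */
	}
	return 0;
}
```

For the YUV422 case in the message, three sizes (one per pipe) give three spans that do not overlap and together stay within the original buffer.
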
      Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  12. 21 Aug, 2015 1 commit
  13. 18 Aug, 2015 1 commit
    • rhashtable-test: extend to test concurrency · f4a3e90b
      Committed by Phil Sutter
      After having tested insertion, lookup, table walk and removal, spawn a
      number of threads running operations on the same rhashtable. Each of
      them will:
      
      1) insert its own set of objects,
      2) look up every successfully inserted object and finally
      3) remove objects in several rounds until all of them have been removed,
         making sure the remaining ones are still found after each round.
      
      This should put a good amount of load onto the system and due to
      synchronising thread startup via two semaphores also extensive
      concurrent table access.
      
      The default number of ten threads returned within half a second on my
      local VM with two cores. Running 200 threads took about four seconds. If
      slow systems suffer too much from this though, the default could be
      lowered or even set to zero so this extended test does not run at all by
      default.
      Signed-off-by: Phil Sutter <phil@nwl.cc>
      Acked-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  14. 17 Aug, 2015 1 commit
  15. 15 Aug, 2015 1 commit
  16. 12 Aug, 2015 1 commit
  17. 11 Aug, 2015 1 commit
    • cleanup IORESOURCE_CACHEABLE vs ioremap() · 92b19ff5
      Committed by Dan Williams
      Quoting Arnd:
          I was thinking the opposite approach and basically removing all uses
          of IORESOURCE_CACHEABLE from the kernel. There are only a handful of
          them, and we can probably replace them all with hardcoded
          ioremap_cached() calls in the cases they are actually useful.
      
      All existing usages of IORESOURCE_CACHEABLE call ioremap() instead of
      ioremap_nocache() if the resource is cacheable, however ioremap() is
      uncached by default. Clearly none of the existing usages care about the
      cacheability. Particularly devm_ioremap_resource() never worked as
      advertised since it always fell back to plain ioremap().
      
      Clean this up as the new direction we want is to convert
      ioremap_<type>() usages to memremap(..., flags).
      Suggested-by: Arnd Bergmann <arnd@arndb.de>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  18. 07 Aug, 2015 7 commits