1. 07 8月, 2014 1 次提交
    • R
      lib: bitmap: make nbits parameter of bitmap_empty unsigned · 0679cc48
      Rasmus Villemoes 提交于
      Many functions in lib/bitmap.c start with an expression such as lim =
      bits/BITS_PER_LONG.  Since bits has type (signed) int, and since gcc
      cannot know that it is in fact non-negative, it generates worse code
      than it could.  These patches, mostly consisting of changing various
      parameters to unsigned, gives a slight overall code reduction:
      
        add/remove: 1/1 grow/shrink: 8/16 up/down: 251/-414 (-163)
        function                                     old     new   delta
        tick_device_uses_broadcast                   335     425     +90
        __irq_alloc_descs                            498     554     +56
        __bitmap_andnot                               73     115     +42
        __bitmap_and                                  70     101     +31
        bitmap_weight                                  -      11     +11
        copy_hugetlb_page_range                      752     762     +10
        follow_hugetlb_page                          846     854      +8
        hugetlb_init                                1415    1417      +2
        hugetlb_nrpages_setup                        130     131      +1
        hugetlb_add_hstate                           377     376      -1
        bitmap_allocate_region                        82      80      -2
        select_task_rq_fair                         2202    2191     -11
        hweight_long                                  66      55     -11
        __reg_op                                     230     219     -11
        dm_stats_message                            2849    2833     -16
        bitmap_parselist                              92      74     -18
        __bitmap_weight                              115      97     -18
        __bitmap_subset                              153     129     -24
        __bitmap_full                                128     104     -24
        __bitmap_empty                               120      96     -24
        bitmap_set                                   179     149     -30
        bitmap_clear                                 185     155     -30
        __bitmap_equal                               136     105     -31
        __bitmap_intersects                          148     108     -40
        __bitmap_complement                          109      67     -42
        tick_device_setup_broadcast_func.isra         81       -     -81
      
      [The increases in __bitmap_and{,not} are due to bug fixes 17/18,18/18.
      No idea why bitmap_weight suddenly appears.] While 163 bytes treewide is
      insignificant, I believe the bitmap functions are often called with
      locks held, so saving even a few cycles might be worth it.
      
      While making these changes, I found a few other things that might be
      worth including.  16,17,18 are actual bug fixes.  The rest shouldn't
      change the behaviour of any of the functions, provided no-one passed
      negative nbits values.  If something should come up, it should be fairly
      bisectable.
      
      A few issues I thought about, but didn't know what to do with:
      
      * Many of the functions misbehave if nbits is compile-time 0; the
        out-of-line functions generally handle 0 correctly.  bitmap_fill() is
        particularly bad, whether the 0 is known at compile time or not.  It
        would probably be nice to add detection of at least compile-time 0 and
        handle that appropriately.
      
      * I didn't change __bitmap_shift_{left,right} to use unsigned because I
        want to fully understand why the algorithm works before making that
        change.  However, AFAICT, they behave correctly for all (positive) shift
        amounts.  This is not the case for the small_const_nbits versions.  If
        for example nbits = n = BITS_PER_LONG, the shift operators turn into
        no-ops (at least on x86), so one get *dst = *src, whereas one would
        expect to get *dst=0.  That difference in behaviour is somewhat
        annoying.
      
      This patch (of 18):
      
      The compiler can generate slightly smaller and simpler code when it
      knows that "nbits" is non-negative.  Since no-one passes a negative
      bit-count, this shouldn't affect the semantics.
      Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0679cc48
  2. 03 8月, 2011 1 次提交
    • H
      lib, Make gen_pool memory allocator lockless · 7f184275
      Huang Ying 提交于
      This version of the gen_pool memory allocator supports lockless
      operation.
      
      This makes it safe to use in NMI handlers and other special
      unblockable contexts that could otherwise deadlock on locks.  This is
      implemented by using atomic operations and retries on any conflicts.
      The disadvantage is that there may be livelocks in extreme cases.  For
      better scalability, one gen_pool allocator can be used for each CPU.
      
      The lockless operation only works if there is enough memory available.
      If new memory is added to the pool a lock has to be still taken.  So
      any user relying on locklessness has to ensure that sufficient memory
      is preallocated.
      
      The basic atomic operation of this allocator is cmpxchg on long.  On
      architectures that don't have NMI-safe cmpxchg implementation, the
      allocator can NOT be used in NMI handler.  So code uses the allocator
      in NMI handler should depend on CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG.
      Signed-off-by: NHuang Ying <ying.huang@intel.com>
      Reviewed-by: NAndi Kleen <ak@linux.intel.com>
      Reviewed-by: NMathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLen Brown <len.brown@intel.com>
      7f184275
  3. 27 7月, 2011 1 次提交
    • M
      cpusets: randomize node rotor used in cpuset_mem_spread_node() · 778d3b0f
      Michal Hocko 提交于
      [ This patch has already been accepted as commit 0ac0c0d0 but later
        reverted (commit 35926ff5) because it itroduced arch specific
        __node_random which was defined only for x86 code so it broke other
        archs.  This is a followup without any arch specific code.  Other than
        that there are no functional changes.]
      
      Some workloads that create a large number of small files tend to assign
      too many pages to node 0 (multi-node systems).  Part of the reason is
      that the rotor (in cpuset_mem_spread_node()) used to assign nodes starts
      at node 0 for newly created tasks.
      
      This patch changes the rotor to be initialized to a random node number
      of the cpuset.
      
      [akpm@linux-foundation.org: fix layout]
      [Lee.Schermerhorn@hp.com: Define stub numa_random() for !NUMA configuration]
      [mhocko@suse.cz: Make it arch independent]
      [akpm@linux-foundation.org: fix CONFIG_NUMA=y, MAX_NUMNODES>1 build]
      Signed-off-by: NJack Steiner <steiner@sgi.com>
      Signed-off-by: NLee Schermerhorn <lee.schermerhorn@hp.com>
      Signed-off-by: NMichal Hocko <mhocko@suse.cz>
      Reviewed-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Paul Menage <menage@google.com>
      Cc: Jack Steiner <steiner@sgi.com>
      Cc: Robin Holt <holt@sgi.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Jack Steiner <steiner@sgi.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Paul Menage <menage@google.com>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Robin Holt <holt@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      778d3b0f
  4. 25 5月, 2011 1 次提交
    • M
      bitmap, irq: add smp_affinity_list interface to /proc/irq · 4b060420
      Mike Travis 提交于
      Manually adjusting the smp_affinity for IRQ's becomes unwieldy when the
      cpu count is large.
      
      Setting smp affinity to cpus 256 to 263 would be:
      
      	echo 000000ff,00000000,00000000,00000000,00000000,00000000,00000000,00000000 > smp_affinity
      
      instead of:
      
      	echo 256-263 > smp_affinity_list
      
      Think about what it looks like for cpus around say, 4088 to 4095.
      
      We already have many alternate "list" interfaces:
      
      /sys/devices/system/cpu/cpuX/indexY/shared_cpu_list
      /sys/devices/system/cpu/cpuX/topology/thread_siblings_list
      /sys/devices/system/cpu/cpuX/topology/core_siblings_list
      /sys/devices/system/node/nodeX/cpulist
      /sys/devices/pci***/***/local_cpulist
      
      Add a companion interface, smp_affinity_list to use cpu lists instead of
      cpu maps.  This conforms to other companion interfaces where both a map
      and a list interface exists.
      
      This required adding a bitmap_parselist_user() function in a manner
      similar to the bitmap_parse_user() function.
      
      [akpm@linux-foundation.org: make __bitmap_parselist() static]
      Signed-off-by: NMike Travis <travis@sgi.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Jack Steiner <steiner@sgi.com>
      Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4b060420
  5. 31 5月, 2010 1 次提交
  6. 28 5月, 2010 1 次提交
  7. 16 12月, 2009 1 次提交
    • A
      bitmap: introduce bitmap_set, bitmap_clear, bitmap_find_next_zero_area · c1a2a962
      Akinobu Mita 提交于
      This introduces new bitmap functions:
      
      bitmap_set: Set specified bit area
      bitmap_clear: Clear specified bit area
      bitmap_find_next_zero_area: Find free bit area
      
      These are mostly stolen from iommu helper. The differences are:
      
      - Use find_next_bit instead of doing test_bit for each bit
      
      - Rewrite bitmap_set and bitmap_clear
      
        Instead of setting or clearing for each bit.
      
      - Check the last bit of the limit
      
        iommu-helper doesn't want to find such area
      
      - The return value if there is no zero area
      
        find_next_zero_area in iommu helper: returns -1
        bitmap_find_next_zero_area: return >= bitmap size
      Signed-off-by: NAkinobu Mita <akinobu.mita@gmail.com>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Greg Kroah-Hartman <gregkh@suse.de>
      Cc: Lothar Wassmann <LW@KARO-electronics.de>
      Cc: Roland Dreier <rolandd@cisco.com>
      Cc: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Joerg Roedel <joerg.roedel@amd.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c1a2a962
  8. 22 8月, 2009 1 次提交
    • L
      Make bitmask 'and' operators return a result code · f4b0373b
      Linus Torvalds 提交于
      When 'and'ing two bitmasks (where 'andnot' is a variation on it), some
      cases want to know whether the result is the empty set or not.  In
      particular, the TLB IPI sending code wants to do cpumask operations and
      determine if there are any CPU's left in the final set.
      
      So this just makes the bitmask (and cpumask) functions return a boolean
      for whether the result has any bits set.
      
      Cc: stable@kernel.org (2.6.30, needed by TLB shootdown fix)
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f4b0373b
  9. 30 12月, 2008 1 次提交
    • R
      bitmap: test for constant as well as small size for inline versions · 4b0bc0bc
      Rusty Russell 提交于
      Impact: reduce text size
      
      bitmap_zero et al have a fastpath for nbits <= BITS_PER_LONG, but this
      should really only apply where the nbits is known at compile time.
      
      This only saves about 1200 bytes on an allyesconfig kernel, but with
      cpumasks going variable that number will increase.
      
         text		data	bss	dec		hex	filename
      35327852        5035607 6782976 47146435        2cf65c3 vmlinux-before
      35326640        5035607 6782976 47145223        2cf6107 vmlinux-after
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      4b0bc0bc
  10. 20 10月, 2008 1 次提交
  11. 17 9月, 2008 1 次提交
    • D
      bitmap: add bitmap_copy_le() · ccbe329b
      David Vrabel 提交于
      bitmap_copy_le() copies a bitmap, putting the bits into little-endian
      order (i.e., each unsigned long word in the bitmap is put into
      little-endian order).
      
      The UWB stack used bitmaps to manage Medium Access Slot availability,
      and these bitmaps need to be written to the hardware in LE order.
      Signed-off-by: NDavid Vrabel <david.vrabel@csr.com>
      ccbe329b
  12. 13 8月, 2008 1 次提交
  13. 13 5月, 2008 1 次提交
  14. 28 4月, 2008 1 次提交
    • P
      mempolicy: add bitmap_onto() and bitmap_fold() operations · 7ea931c9
      Paul Jackson 提交于
      The following adds two more bitmap operators, bitmap_onto() and bitmap_fold(),
      with the usual cpumask and nodemask wrappers.
      
      The bitmap_onto() operator computes one bitmap relative to another.  If the
      n-th bit in the origin mask is set, then the m-th bit of the destination mask
      will be set, where m is the position of the n-th set bit in the relative mask.
      
      The bitmap_fold() operator folds a bitmap into a second that has bit m set iff
      the input bitmap has some bit n set, where m == n mod sz, for the specified sz
      value.
      
      There are two substantive changes between this patch and its
      predecessor bitmap_relative:
       1) Renamed bitmap_relative() to be bitmap_onto().
       2) Added bitmap_fold().
      
      The essential motivation for bitmap_onto() is to provide a mechanism for
      converting a cpuset-relative CPU or Node mask to an absolute mask.  Cpuset
      relative masks are written as if the current task were in a cpuset whose CPUs
      or Nodes were just the consecutive ones numbered 0..N-1, for some N.  The
      bitmap_onto() operator is provided in anticipation of adding support for the
      first such cpuset relative mask, by the mbind() and set_mempolicy() system
      calls, using a planned flag of MPOL_F_RELATIVE_NODES.  These bitmap operators
      (and their nodemask wrappers, in particular) will be used in code that
      converts the user specified cpuset relative memory policy to a specific system
      node numbered policy, given the current mems_allowed of the tasks cpuset.
      
      Such cpuset relative mempolicies will address two deficiencies
      of the existing interface between cpusets and mempolicies:
       1) A task cannot at present reliably establish a cpuset
          relative mempolicy because there is an essential race
          condition, in that the tasks cpuset may be changed in
          between the time the task can query its cpuset placement,
          and the time the task can issue the applicable mbind or
          set_memplicy system call.
       2) A task cannot at present establish what cpuset relative
          mempolicy it would like to have, if it is in a smaller
          cpuset than it might have mempolicy preferences for,
          because the existing interface only allows specifying
          mempolicies for nodes currently allowed by the cpuset.
      
      Cpuset relative mempolicies are useful for tasks that don't distinguish
      particularly between one CPU or Node and another, but only between how many of
      each are allowed, and the proper placement of threads and memory pages on the
      various CPUs and Nodes available.
      
      The motivation for the added bitmap_fold() can be seen in the following
      example.
      
      Let's say an application has specified some mempolicies that presume 16 memory
      nodes, including say a mempolicy that specified MPOL_F_RELATIVE_NODES (cpuset
      relative) nodes 12-15.  Then lets say that application is crammed into a
      cpuset that only has 8 memory nodes, 0-7.  If one just uses bitmap_onto(),
      this mempolicy, mapped to that cpuset, would ignore the requested relative
      nodes above 7, leaving it empty of nodes.  That's not good; better to fold the
      higher nodes down, so that some nodes are included in the resulting mapped
      mempolicy.  In this case, the mempolicy nodes 12-15 are taken modulo 8 (the
      weight of the mems_allowed of the confining cpuset), resulting in a mempolicy
      specifying nodes 4-7.
      Signed-off-by: NPaul Jackson <pj@sgi.com>
      Signed-off-by: NDavid Rientjes <rientjes@google.com>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: <kosaki.motohiro@jp.fujitsu.com>
      Cc: <ray-lk@madrabbit.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7ea931c9
  15. 20 4月, 2008 1 次提交
    • M
      cpumask: add cpumask_scnprintf_len function · 30ca60c1
      Mike Travis 提交于
      Add a new function cpumask_scnprintf_len() to return the number of
      characters needed to display "len" cpumask bits.  The current method
      of allocating NR_CPUS bytes is incorrect as what's really needed is
      9 characters per 32-bit word of cpumask bits (8 hex digits plus the
      seperator [','] or the terminating NULL.)  This function provides the
      caller the means to allocate the correct string length.
      
      Cc: Paul Jackson <pj@sgi.com>
      Signed-off-by: NMike Travis <travis@sgi.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      30ca60c1
  16. 20 10月, 2007 1 次提交
  17. 12 10月, 2006 1 次提交
    • R
      [PATCH] bitmap: parse input from kernel and user buffers · 01a3ee2b
      Reinette Chatre 提交于
      lib/bitmap.c:bitmap_parse() is a library function that received as input a
      user buffer.  This seemed to have originated from the way the write_proc
      function of the /proc filesystem operates.
      
      This has been reworked to not use kmalloc and eliminates a lot of
      get_user() overhead by performing one access_ok before using __get_user().
      
      We need to test if we are in kernel or user space (is_user) and access the
      buffer differently.  We cannot use __get_user() to access kernel addresses
      in all cases, for example in architectures with separate address space for
      kernel and user.
      
      This function will be useful for other uses as well; for example, taking
      input for /sysfs instead of /proc, so it was changed to accept kernel
      buffers.  We have this use for the Linux UWB project, as part as the
      upcoming bandwidth allocator code.
      
      Only a few routines used this function and they were changed too.
      Signed-off-by: NReinette Chatre <reinette.chatre@linux.intel.com>
      Signed-off-by: NInaky Perez-Gonzalez <inaky@linux.intel.com>
      Cc: Paul Jackson <pj@sgi.com>
      Cc: Joe Korty <joe.korty@ccur.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      01a3ee2b
  18. 27 6月, 2006 1 次提交
  19. 24 3月, 2006 1 次提交
    • P
      [PATCH] bitmap: region cleanup · 87e24802
      Paul Jackson 提交于
      Paul Mundt <lethal@linux-sh.org> says:
      
      This patch set implements a number of patches to clean up and restructure the
      bitmap region code, in addition to extending the interface to support
      multiword spanning allocations.
      
      The current implementation (before this patch set) is limited by only being
      able to allocate pages <= BITS_PER_LONG, as noted by the strategically
      positioned BUG_ON() at lib/bitmap.c:752:
      
              /* We don't do regions of pages > BITS_PER_LONG.  The
      	 * algorithm would be a simple look for multiple zeros in the
      	 * array, but there's no driver today that needs this.  If you
      	 * trip this BUG(), you get to code it... */
              BUG_ON(pages > BITS_PER_LONG);
      
      As I seem to have been the first person to trigger this, the result ends up
      being the following patch set with the help of Paul Jackson.
      
      The final patch in the series eliminates quite a bit of code duplication, so
      the bitmap code size ends up being smaller than the current implementation as
      an added bonus.
      
      After these are applied, it should already be possible to do multiword
      allocations with dma_alloc_coherent() out of ranges established by
      dma_declare_coherent_memory() on x86 without having to change any of the code,
      and the SH store queue API will follow up on this as the other user that needs
      support for this.
      
      This patch:
      
      Some code cleanup on the lib/bitmap.c bitmap_*_region() routines:
      
       * spacing
       * variable names
       * comments
      
      Has no change to code function.
      Signed-off-by: NPaul Mundt <lethal@linux-sh.org>
      Signed-off-by: NPaul Jackson <pj@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      87e24802
  20. 31 10月, 2005 1 次提交
    • P
      [PATCH] cpusets: bitmap and mask remap operators · fb5eeeee
      Paul Jackson 提交于
      In the forthcoming task migration support, a key calculation will be
      mapping cpu and node numbers from the old set to the new set while
      preserving cpuset-relative offset.
      
      For example, if a task and its pages on nodes 8-11 are being migrated to
      nodes 24-27, then pages on node 9 (the 2nd node in the old set) should be
      moved to node 25 (the 2nd node in the new set.)
      
      As with other bitmap operations, the proper way to code this is to provide
      the underlying calculation in lib/bitmap.c, and then to provide the usual
      cpumask and nodemask wrappers.
      
      This patch provides that.  These operations are termed 'remap' operations.
      Both remapping a single bit and a set of bits is supported.
      Signed-off-by: NPaul Jackson <pj@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      fb5eeeee
  21. 17 4月, 2005 1 次提交
    • L
      Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds 提交于
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!
      1da177e4