Commits · 0ead0f84e81a41c3e98aeceab04af8ab1bb08d1f · openanolis / cloud-kernel

16 Dec, 2009 14 commits

mm: slab-allocate memory section nodemask for large systems · 9ae49fab

Committed by David Rientjes on Dec 14, 2009

Nodemasks should not be allocated on the stack for large systems (when a
nodemask is larger than 256 bytes) since there is a risk of stack overflow.

This patch causes the unregister_mem_sect_under_nodes() nodemask to be
allocated on the stack for smaller systems and be allocated by slab for
larger systems.

GFP_KERNEL is used since remove_memory_block() can block.

Cc: Gary Hade <garyhade@us.ibm.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Alex Chiang <achiang@hp.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: add numa node symlink for cpu devices in sysfs · 1830794a

Committed by Alex Chiang on Dec 14, 2009

You can discover which CPUs belong to a NUMA node by examining
/sys/devices/system/node/node#/

However, it's not convenient to go in the other direction, when looking at
/sys/devices/system/cpu/cpu#/

Yes, you can muck about in sysfs, but adding these symlinks makes life a
lot more convenient.
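
As an illustration of the lookup these symlinks make possible, here is a minimal userspace sketch in Python. It is not code from the patch: it assumes that each cpu directory gains an entry named `node<N>`, and the helper names `node_from_entries` and `node_of_cpu` are hypothetical.

```python
import os
import re

# Assumption: the symlink inside a cpu directory is named "node<N>".
NODE_LINK = re.compile(r"^node(\d+)$")


def node_from_entries(entries):
    """Return the NUMA node id named by a 'node<N>' entry, or None."""
    for name in entries:
        match = NODE_LINK.match(name)
        if match:
            return int(match.group(1))
    return None


def node_of_cpu(cpu, sysfs_root="/sys/devices/system/cpu"):
    """Look up the node of one CPU by listing its sysfs directory."""
    cpu_dir = os.path.join(sysfs_root, f"cpu{cpu}")
    try:
        return node_from_entries(os.listdir(cpu_dir))
    except OSError:
        return None  # no such CPU directory, or sysfs is not available
```

Calling `node_of_cpu(0)` on a NUMA machine would return the node id if such a link exists, and `None` otherwise. Keeping the name matching in a separate function makes it testable without a real sysfs tree.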
Signed-off-by: Alex Chiang <achiang@hp.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Gary Hade <garyhade@us.ibm.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Rientjes <rientjes@google.com>
Cc: Greg KH <greg@kroah.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: David Rientjes <rientjes@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: refactor unregister_cpu_under_node() · b9d52dad

Committed by Alex Chiang on Dec 14, 2009

By returning early if the node is not online, we can unindent the
interesting code by two levels.

No functional change.

Signed-off-by: Alex Chiang <achiang@hp.com>
Cc: Gary Hade <garyhade@us.ibm.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Rientjes <rientjes@google.com>
Cc: Greg KH <greg@kroah.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: David Rientjes <rientjes@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: refactor register_cpu_under_node() · f8246f31

Committed by Alex Chiang on Dec 14, 2009

By returning early if the node is not online, we can unindent the
interesting code by one level.

No functional change.

Signed-off-by: Alex Chiang <achiang@hp.com>
Cc: Gary Hade <garyhade@us.ibm.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Rientjes <rientjes@google.com>
Cc: Greg KH <greg@kroah.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: David Rientjes <rientjes@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: add numa node symlink for memory section in sysfs · dee5d0d5

Committed by Alex Chiang on Dec 14, 2009

Commit c04fc586 (mm: show node to memory section relationship with
symlinks in sysfs) created symlinks from nodes to memory sections, e.g.

/sys/devices/system/node/node1/memory135 -> ../../memory/memory135

If you're examining the memory section though and are wondering what node
it might belong to, you can find it by grovelling around in sysfs, but
it's a little cumbersome.

Add a reverse symlink for each memory section that points back to the
node to which it belongs.

Signed-off-by: Alex Chiang <achiang@hp.com>
Cc: Gary Hade <garyhade@us.ibm.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Greg KH <greg@kroah.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: David Rientjes <rientjes@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

hugetlb: offload per node attribute registrations · 39da08cb

Committed by Lee Schermerhorn on Dec 14, 2009

Offload the registration and unregistration of per node hstate sysfs
attributes to a worker thread rather than attempt the
allocation/attachment or detachment/freeing of the attributes in the
context of the memory hotplug handler.

I don't know that this is absolutely required, but the registration can
sleep in allocations and other mem hot plug handlers do it this way.  If
it turns out this is NOT required, we can drop this patch.

N.B.,  Only tested build, boot, libhugetlbfs regression.
       i.e., no memory hotplug testing.

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Reviewed-by: Andi Kleen <andi@firstfloor.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Eric Whitney <eric.whitney@hp.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

hugetlb: handle memory hot-plug events · 4faf8d95

Committed by Lee Schermerhorn on Dec 14, 2009

Register per node hstate attributes only for nodes with memory.  As
suggested by David Rientjes.

With Memory Hotplug, memory can be added to a memoryless node and a node
with memory can become memoryless.  Therefore, add a memory on/off-line
notifier callback to [un]register a node's attributes on transition
to/from memoryless state.

N.B.,  Only tested build, boot, libhugetlbfs regression.
       i.e., no memory hotplug testing.

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Reviewed-by: Andi Kleen <andi@firstfloor.org>
Acked-by: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Eric Whitney <eric.whitney@hp.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

hugetlb: add per node hstate attributes · 9a305230

Committed by Lee Schermerhorn on Dec 14, 2009

Add the per huge page size control/query attributes to the per node
sysdevs:

/sys/devices/system/node/node<ID>/hugepages/hugepages-<size>/
	nr_hugepages       - r/w
	free_hugepages     - r/o
	surplus_hugepages  - r/o

The patch attempts to re-use/share as much of the existing global hstate
attribute initialization and handling, and the "nodes_allowed" constraint
processing as possible.

Calling set_max_huge_pages() with no node indicates a change to global
hstate parameters.  In this case, any non-default task mempolicy will be
used to generate the nodes_allowed mask.  A valid node id indicates an
update to that node's hstate parameters, and the count argument specifies
the target count for the specified node.  From this info, we compute the
target global count for the hstate and construct a nodes_allowed node mask
containing only the specified node.

Setting the node specific nr_hugepages via the per node attribute
effectively ignores any task mempolicy or cpuset constraints.
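
The per-node arithmetic described above can be sketched as follows. This is an illustrative userspace model in Python, not the kernel code: `hstate_targets` and its arguments are hypothetical names, and it assumes the global target is the current global count with the chosen node's share swapped for the requested count.

```python
def hstate_targets(global_count, node_counts, nid, count):
    """Return (global_target, nodes_allowed) for a huge page pool request.

    global_count: current number of huge pages across all nodes
    node_counts:  current number of huge pages on each node
    nid:          node id, or None for a global request
    count:        requested target (per node if nid is given, else global)
    """
    if nid is None:
        # Global request: the count is already the global target, and the
        # allowed nodes would come from the task mempolicy instead.
        return count, None
    # Per-node request: replace this node's current share with the target.
    global_target = global_count - node_counts[nid] + count
    return global_target, {nid}
```

With the numbers used in this message, `hstate_targets(0, [0, 0, 0, 0], 2, 16)` gives a global target of 16 restricted to node 2.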

With this patch:

(me):ls /sys/devices/system/node/node0/hugepages/hugepages-2048kB
./  ../  free_hugepages  nr_hugepages  surplus_hugepages

Starting from:
Node 0 HugePages_Total:     0
Node 0 HugePages_Free:      0
Node 0 HugePages_Surp:      0
Node 1 HugePages_Total:     0
Node 1 HugePages_Free:      0
Node 1 HugePages_Surp:      0
Node 2 HugePages_Total:     0
Node 2 HugePages_Free:      0
Node 2 HugePages_Surp:      0
Node 3 HugePages_Total:     0
Node 3 HugePages_Free:      0
Node 3 HugePages_Surp:      0
vm.nr_hugepages = 0

Allocate 16 persistent huge pages on node 2:
(me):echo 16 >/sys/devices/system/node/node2/hugepages/hugepages-2048kB/nr_hugepages

[Note that this is equivalent to:
	numactl -m 2 hugeadm --pool-pages-min 2M:+16
]

Yields:
Node 0 HugePages_Total:     0
Node 0 HugePages_Free:      0
Node 0 HugePages_Surp:      0
Node 1 HugePages_Total:     0
Node 1 HugePages_Free:      0
Node 1 HugePages_Surp:      0
Node 2 HugePages_Total:    16
Node 2 HugePages_Free:     16
Node 2 HugePages_Surp:      0
Node 3 HugePages_Total:     0
Node 3 HugePages_Free:      0
Node 3 HugePages_Surp:      0
vm.nr_hugepages = 16

Global controls work as expected--reduce pool to 8 persistent huge pages:
(me):echo 8 >/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages

Node 0 HugePages_Total:     0
Node 0 HugePages_Free:      0
Node 0 HugePages_Surp:      0
Node 1 HugePages_Total:     0
Node 1 HugePages_Free:      0
Node 1 HugePages_Surp:      0
Node 2 HugePages_Total:     8
Node 2 HugePages_Free:      8
Node 2 HugePages_Surp:      0
Node 3 HugePages_Total:     0
Node 3 HugePages_Free:      0
Node 3 HugePages_Surp:      0

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Andi Kleen <andi@firstfloor.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Eric Whitney <eric.whitney@hp.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

/dev/mem: remove redundant parameter from do_write_kmem() · ee32398f

Committed by Wu Fengguang on Dec 14, 2009

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Avi Kivity <avi@qumranet.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

/dev/mem: remove the "written" variable in write_kmem() · 80ad89a0

Committed by Wu Fengguang on Dec 14, 2009

Also rename "len" to "sz". No behavior change.

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Avi Kivity <avi@qumranet.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

/dev/mem: make size_inside_page() logic straight · 7fabaddd

Committed by Wu Fengguang on Dec 14, 2009

Also convert more size_inside_page() users.

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Avi Kivity <avi@qumranet.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

/dev/mem: cleanup unxlate_dev_mem_ptr() calls · fa29e97b

Committed by Wu Fengguang on Dec 14, 2009

No behaviour change.

[akpm@linux-foundation.org: cleanuplets]
[akpm@linux-foundation.org: remove unused `ret']

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

/dev/mem: introduce size_inside_page() · f222318e

Committed by Wu Fengguang on Dec 14, 2009

Introduce size_inside_page() to replace duplicate /dev/mem code.

Also apply it to /dev/kmem, whose alignment logic was buggy.
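
To illustrate the idea behind a helper like size_inside_page(), here is a hedged userspace model in Python. It is a sketch of the technique (clamp a transfer so it never crosses a page boundary), not the kernel implementation; the 4096-byte `PAGE_SIZE` and the `split_by_page` helper are assumptions made for the example.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


def size_inside_page(start, size):
    """Bytes that can be handled from `start` without leaving its page."""
    room = PAGE_SIZE - (start & (PAGE_SIZE - 1))  # bytes left in this page
    return min(room, size)


def split_by_page(start, count):
    """Split [start, start + count) into pieces that each stay in one page."""
    pieces = []
    while count > 0:
        sz = size_inside_page(start, count)
        pieces.append((start, sz))
        start += sz
        count -= sz
    return pieces
```

A read or write loop can then process one piece per iteration, which is the kind of duplicated alignment logic a single helper replaces.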
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

/dev/mem: remove redundant test on len · 4ea2f43f

Committed by Wu Fengguang on Dec 14, 2009

The len test in write_kmem() is always true, so it can be removed.

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

15 Dec, 2009 5 commits

i2c-core: i2c bus should support PM entries in struct dev_pm_ops · 54067ee2

Committed by sonic zhang on Dec 14, 2009

struct dev_pm_ops is not configured in the current i2c bus type. i2c drivers
that depend only on the suspend/resume entries in struct dev_pm_ops are
therefore not informed of PM suspend and resume events by the i2c framework.

Signed-off-by: Sonic Zhang <sonic.zhang@analog.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>

i2c: Drop I2C_CLIENT_INSMOD_2 to 8 · e5e9f44c

Committed by Jean Delvare on Dec 14, 2009

These macros simply declare an enum, so drivers might as well declare
it themselves. This puts an end to the arbitrary limit of 8 chip types
per i2c driver.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Tested-by: Wolfram Sang <w.sang@pengutronix.de>

i2c: Drop I2C_CLIENT_INSMOD_1 · 1f86df49

Committed by Jean Delvare on Dec 14, 2009

This macro simply declares an enum, so drivers might as well declare
it themselves.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Tested-by: Wolfram Sang <w.sang@pengutronix.de>

i2c: Get rid of struct i2c_client_address_data · c3813d6a

Committed by Jean Delvare on Dec 14, 2009

Struct i2c_client_address_data only contains one field at this point,
which makes its usefulness questionable. Get rid of it and pass simple
address lists around instead.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Tested-by: Wolfram Sang <w.sang@pengutronix.de>

i2c: Drop the kind parameter from detect callbacks · 310ec792

Committed by Jean Delvare on Dec 14, 2009

The "kind" parameter always has value -1, and nobody is using it any
longer, so we can remove it.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Tested-by: Wolfram Sang <w.sang@pengutronix.de>

14 Dec, 2009 21 commits

md: add 'recovery_start' per-device sysfs attribute · 06e3c817

Committed by Dan Williams on Dec 12, 2009

Enable external metadata arrays to manage rebuild checkpointing via a
md/dev-XXX/recovery_start attribute which reflects rdev->recovery_offset.

Also update resync_start_store to allow 'none' to be written, for
consistency.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>

md: rcu_read_lock() walk of mddev->disks in md_do_sync() · 4e59ca7d

Committed by Dan Williams on Dec 12, 2009

Other walks of this list are either under rcu_read_lock() or the list
mutation lock (mddev_lock()). This protects against the improbable case of a
disk being removed from the array at the start of md_do_sync().

Signed-off-by: Dan Williams <dan.j.williams@intel.com>

md: integrate spares into array at earliest opportunity. · 93be75ff

Committed by NeilBrown on Dec 14, 2009

As v1.x metadata can record that a member of the array is
not completely recovered, it makes sense to record that a
spare has become a regular member of the array at the earliest
opportunity.
So remove the tests on "recovery_offset > 0" in super_1_sync
as they really aren't needed, and schedule a metadata update
immediately after adding spares to a degraded array.

This means that if a crash happens immediately after a recovery
starts, the new device will be included in the array and recovery will
continue from wherever it was up to.  Previously this didn't happen
unless recovery was at least 1/16 of the way through.

Signed-off-by: NeilBrown <neilb@suse.de>

md: move compat_ioctl handling into md.c · aa98aa31

Committed by Arnd Bergmann on Dec 14, 2009

The RAID ioctls are only implemented in md.c, so the
handling for them should also be moved there from
fs/compat_ioctl.c.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Neil Brown <neilb@suse.de>
Cc: Andre Noll <maan@systemlinux.org>
Cc: linux-raid@vger.kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>

md: revise Kconfig help for MD_MULTIPATH · 93bd89a6

Committed by NeilBrown on Dec 14, 2009

Make it clear in the config message that MD_MULTIPATH is not under
active development.

Cc: Oren Held <orenhe@il.ibm.com>
Signed-off-by: NeilBrown <neilb@suse.de>

md: add MODULE_DESCRIPTION for all md related modules. · 0efb9e61

Committed by NeilBrown on Dec 14, 2009

Suggested by Oren Held <orenhe@il.ibm.com>

Signed-off-by: NeilBrown <neilb@suse.de>

raid: improve MD/raid10 handling of correctable read errors. · 1e50915f

Committed by Robert Becker on Dec 14, 2009

We've noticed severe lasting performance degradation of our raid
arrays when we have drives that yield large amounts of media errors.
The raid10 module will queue each failed read for retry, and also
will attempt to call fix_read_error() to perform the read recovery.
Read recovery is performed while the array is frozen, so repeated
recovery attempts can degrade the performance of the array for
extended periods of time.

With this patch I propose adding a per md device max number of
corrected read attempts.  Each rdev will maintain a count of
read correction attempts in the rdev->read_errors field (not
used currently for raid10). When we enter fix_read_error()
we'll check to see when the last read error occurred, and
divide the read error count by 2 for every hour since the
last read error. If at that point our read error count
exceeds the read error threshold, we'll fail the raid device.
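
The decay rule described above can be modelled with a short userspace sketch in Python. This is an illustration under stated assumptions, not the raid10 code: the function names are hypothetical, times are plain seconds, and counting the current error before comparing against the threshold is an assumption of the sketch.

```python
SECONDS_PER_HOUR = 3600


def decayed_read_errors(read_errors, hours_since_last):
    """Halve the error count once for every full hour without errors."""
    if hours_since_last < 0:
        raise ValueError("hours_since_last must not be negative")
    return read_errors >> hours_since_last  # each shift is one halving


def on_read_error(read_errors, last_error_time, now, max_read_errors):
    """Return (new_count, fail_device) for a read error seen at `now`."""
    hours = int(now - last_error_time) // SECONDS_PER_HOUR
    # Assumption: the error being handled is counted after the decay.
    count = decayed_read_errors(read_errors, hours) + 1
    return count, count > max_read_errors
```

For example, a device with 40 recorded errors whose last error was two hours ago decays to 10 before the new error is counted, so it stays below a threshold of 20.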

In addition, this patch adds sysfs nodes (get/set) for
the per md max_read_errors attribute and the rdev->read_errors
attribute, and adds some printk's to indicate when
fix_read_error fails to repair an rdev.

For testing I used debugfs->fail_make_request to inject
IO errors to the rdev while doing IO to the raid array.

Signed-off-by: Robert Becker <Rob.Becker@riverbed.com>
Signed-off-by: NeilBrown <neilb@suse.de>

md/raid10: print more useful messages on device failure. · 67b8dc4b

Committed by Robert Becker on Dec 14, 2009

When we get a read error on a device in a RAID10, and attempting to
repair the error fails, print more useful messages about why it
failed.

Signed-off-by: Robert Becker <Rob.Becker@riverbed.com>
Signed-off-by: NeilBrown <neilb@suse.de>

md/bitmap: update dirty flag when bitmap bits are explicitly set. · ffa23322

Committed by NeilBrown on Dec 14, 2009

There is a sysfs file which allows bits in the write-intent
bitmap to be explicitly set, indicating that the block is thought
to be 'dirty'.
When this happens we should really set recovery_cp backwards
to include the block to reflect this dirtiness.

In particular, a 'resync' process will refuse to start if
recovery_cp is beyond the end of the array, so this is needed
to allow a resync to be triggered.

Signed-off-by: NeilBrown <neilb@suse.de>

md: Support write-intent bitmaps with externally managed metadata. · ece5cff0

Committed by NeilBrown on Dec 14, 2009

In this case, the metadata needs to not be in the same
sector as the bitmap.
md will not read/write any bitmap metadata.  Config must be
done via sysfs and when a recovery makes the array non-degraded
again, writing 'true' to 'bitmap/can_clear' will allow bits in
the bitmap to be cleared again.

Signed-off-by: NeilBrown <neilb@suse.de>

md/bitmap: move setting of daemon_lastrun out of bitmap_read_sb · 624ce4f5

Committed by NeilBrown on Dec 14, 2009

Setting daemon_lastrun really has nothing to do with reading
the bitmap superblock, it just happens to be needed at the same time.
bitmap_read_sb is about to become optional, so move that code out
to after the call to bitmap_read_sb.

Signed-off-by: NeilBrown <neilb@suse.de>

md: support updating bitmap parameters via sysfs. · 43a70507

Committed by NeilBrown on Dec 14, 2009

A new attribute directory 'bitmap' in 'md' is created which
contains files for configuring the bitmap.
'location' identifies where the bitmap is, either 'none',
or 'file' or 'sector offset from metadata'.
Writing 'location' can create or remove a bitmap.
Adding a 'file' bitmap this way is not yet supported.
'chunksize' and 'time_base' must be set before 'location'
can be set.

'chunksize' can be set before creating a bitmap, but is
currently always over-ridden by the bitmap superblock.

'time_base' and 'backlog' can be updated at any time.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Andre Noll <maan@systemlinux.org>

md: factor out parsing of fixed-point numbers · 72e02075

Committed by NeilBrown on Dec 14, 2009

safe_delay_store can parse fixed point numbers (for fractions
of a second).  We will want to do that for another sysfs
file soon, so factor out the code.
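
As a rough illustration of what parsing a fixed-point number for a sysfs file involves, here is a userspace sketch in Python. It is not the factored-out kernel helper: `parse_scaled`, its regular expression, and the choice to truncate extra fractional digits are assumptions made for this example.

```python
import re

# Optional integer part, optional fractional part, ASCII digits only.
_DECIMAL = re.compile(r"([0-9]*)(?:\.([0-9]*))?")


def parse_scaled(text, scale):
    """Parse a decimal string into an integer scaled by 10**scale.

    With scale=3, "0.2" (seconds) becomes 200 (milliseconds). Fractional
    digits beyond `scale` are dropped rather than rounded.
    """
    match = _DECIMAL.fullmatch(text.strip())
    if match is None or not (match.group(1) or match.group(2)):
        raise ValueError(f"not a fixed-point number: {text!r}")
    whole = match.group(1) or "0"
    frac = (match.group(2) or "")[:scale].ljust(scale, "0")
    return int(whole + frac)
```

Working on the digit strings avoids floating point entirely, so a value such as "0.2" maps to exactly 200 with a scale of 3.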
Signed-off-by: NeilBrown <neilb@suse.de>

md: support bitmap offset appropriate for external-metadata arrays. · f6af949c

Committed by NeilBrown on Dec 14, 2009

For md arrays where metadata is managed externally, the kernel does not
know about a superblock so the superblock offset is 0.
If we want to have a write-intent-bitmap near the end of the
devices of such an array, we should support sector_t sized offset.
We need the offset to be possibly negative for when the bitmap is before
the metadata, so use loff_t instead.

Also add a sanity check that the bitmap does not overlap with data.

Signed-off-by: NeilBrown <neilb@suse.de>

md: remove needless setting of thread->timeout in raid10_quiesce · 9cd30fdc

Committed by NeilBrown on Dec 14, 2009

As bitmap_create and bitmap_destroy already set thread->timeout
as appropriate, there is no need to do it in raid10_quiesce.
There is a possible need to wake the thread after the timeout
has been set low, but it is better to do that where the timeout
is actually set low, in bitmap_create.

Signed-off-by: NeilBrown <neilb@suse.de>

md: change daemon_sleep to be in 'jiffies' rather than 'seconds'. · 1b04be96

Committed by NeilBrown on Dec 14, 2009

This removes a lot of multiplications by HZ.

Signed-off-by: NeilBrown <neilb@suse.de>

md: move offset, daemon_sleep and chunksize out of bitmap structure · 42a04b50

Committed by NeilBrown on Dec 14, 2009

... and into bitmap_info.  These are all configuration parameters
that need to be set before the bitmap is created.

Signed-off-by: NeilBrown <neilb@suse.de>

md: collect bitmap-specific fields into one structure. · c3d9714e

Committed by NeilBrown on Dec 14, 2009

In preparation for making bitmap fields configurable via sysfs,
start tidying up by making a single structure to contain the
configuration fields.

Signed-off-by: NeilBrown <neilb@suse.de>

md/raid1: add takeover support for raid5->raid1 · 709ae487

Committed by NeilBrown on Dec 14, 2009

A 2-device raid5 array can now be converted to raid1.

Signed-off-by: NeilBrown <neilb@suse.de>

md: add honouring of suspend_{lo,hi} to raid1. · 6eef4b21

Committed by NeilBrown on Dec 14, 2009

This will allow us to stop writeout to portions of the array
while they are resynced by someone else - e.g. another node in
a cluster.
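
The check that honouring a suspended range implies can be sketched in userspace Python. This is only a model of the idea, not the raid1 code: the name `write_must_wait` is hypothetical, and treating the suspended region as the half-open sector range from suspend_lo up to (but not including) suspend_hi is an assumption of the sketch.

```python
def write_must_wait(sector, nr_sectors, suspend_lo, suspend_hi):
    """True if a write overlaps the suspended range [suspend_lo, suspend_hi).

    The write covers sectors [sector, sector + nr_sectors).
    """
    return sector < suspend_hi and sector + nr_sectors > suspend_lo
```

With suspend_lo equal to suspend_hi the range is empty and no write is held back, which matches the case where nothing is suspended.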
Signed-off-by: NeilBrown <neilb@suse.de>

md/raid5: don't complete make_request on barrier until writes are scheduled · 729a1866

Committed by NeilBrown on Dec 14, 2009

The post-barrier-flush is sent by md as soon as make_request on the
barrier write completes.  For raid5, the data might not be in the
per-device queues yet.  So for barrier requests, wait for any
pre-reading to be done so that the request will be in the per-device
queues.

We use the 'preread_active' count to check that nothing is still in
the preread phase, and delay the decrement of this count until after
write requests have been submitted to the underlying devices.

Signed-off-by: NeilBrown <neilb@suse.de>

openanolis / cloud-kernel synced successfully about 1 year ago