1. 01 Nov 2011, 1 commit
    •
      dma-mapping: fix sync_single_range_* DMA debugging · 07a72309
      Committed by Clemens Ladisch
      Commit 5fd75a78 (dma-mapping: remove unnecessary sync_single_range_*
      in dma_map_ops) unified not only the dma_map_ops but also the
      corresponding debug_dma_sync_* calls.  This led to spurious WARN()ings
      like the following because the DMA debug code was no longer able to detect
      the DMA buffer base address without the separate offset parameter:
      
      WARNING: at lib/dma-debug.c:911 check_sync+0xce/0x446()
      firewire_ohci 0000:04:00.0: DMA-API: device driver tries to sync DMA memory it has not allocated [device address=0x00000000cedaa400] [size=1024 bytes]
      Call Trace: ...
       [<ffffffff811326a5>] check_sync+0xce/0x446
       [<ffffffff81132ad9>] debug_dma_sync_single_for_device+0x39/0x3b
       [<ffffffffa01d6e6a>] ohci_queue_iso+0x4f3/0x77d [firewire_ohci]
       ...
      
      To fix this, unshare the sync_single_* and sync_single_range_*
      implementations so that we are able to call the correct debug_dma_sync_*
      functions.
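
      A condensed sketch of the resulting shape (simplified from the
      actual patch to include/asm-generic/dma-mapping-common.h; only the
      _for_cpu pair is shown, the _for_device pair is symmetric):

      static inline void dma_sync_single_for_cpu(struct device *dev,
                      dma_addr_t addr, size_t size,
                      enum dma_data_direction dir)
      {
              struct dma_map_ops *ops = get_dma_ops(dev);

              BUG_ON(!valid_dma_direction(dir));
              ops->sync_single_for_cpu(dev, addr, size, dir);
              /* the mapping's own base address is reported */
              debug_dma_sync_single_for_cpu(dev, addr, size, dir);
      }

      static inline void dma_sync_single_range_for_cpu(struct device *dev,
                      dma_addr_t addr, unsigned long offset, size_t size,
                      enum dma_data_direction dir)
      {
              struct dma_map_ops *ops = get_dma_ops(dev);

              BUG_ON(!valid_dma_direction(dir));
              ops->sync_single_for_cpu(dev, addr + offset, size, dir);
              /* base and offset are passed separately, so dma-debug
               * can match the sync against the original mapping */
              debug_dma_sync_single_range_for_cpu(dev, addr, offset,
                                                  size, dir);
      }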
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 30 Oct 2011, 3 commits
  3. 29 Oct 2011, 1 commit
  4. 28 Oct 2011, 6 commits
    •
      vfs: add generic_file_llseek_size · 5760495a
      Committed by Andi Kleen
      Add a generic_file_llseek variant to the VFS that allows passing in
      the maximum file size of the file system, instead of always
      using maxbytes from the superblock.
      
      This can be used to eliminate some cut'n'paste seek code in ext4.
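
      As a hedged illustration (loosely modeled on the ext4 caller this
      was written for; the ext4 details are illustrative, not the exact
      conversion), a filesystem with two different file-size limits can
      now do:

      loff_t ext4_llseek(struct file *file, loff_t offset, int origin)
      {
              struct inode *inode = file->f_mapping->host;
              loff_t maxbytes;

              /* bitmap-mapped files have a smaller limit than extent files */
              if (!(ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS)))
                      maxbytes = EXT4_SB(inode->i_sb)->s_bitmap_maxbytes;
              else
                      maxbytes = inode->i_sb->s_maxbytes;

              return generic_file_llseek_size(file, offset, origin, maxbytes);
      }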
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    •
      vfs: do (nearly) lockless generic_file_llseek · ef3d0fd2
      Committed by Andi Kleen
      The i_mutex lock use of generic_file_llseek hurts.  Independent processes
      accessing the same file synchronize over a single lock, even though
      they have no need for synchronization at all.
      
      Under high utilization this can cause llseek to scale very poorly on larger
      systems.
      
      This patch does some rethinking of the llseek locking model:
      
      First, the 64bit f_pos is not necessarily atomic without locks
      on 32bit systems. This can already cause races with read() today.
      This was discussed on linux-kernel in the past and deemed acceptable.
      The patch does not change that.
      
      Let's look at the different seek variants:
      
      SEEK_SET: Doesn't really need any locking.
      If there's a race one writer wins, the other loses.
      
      For 32bit the non-atomic update races against read()
      stay the same. Without a lock they can also happen
      against write() now.  The read() race was deemed
      acceptable in past discussions, and I think if it's
      ok for read it's ok for write too.
      
      => Don't need a lock.
      
      SEEK_END: This behaves like SEEK_SET plus it reads
      the maximum size too. Reading the maximum size would have the
      32bit atomic problem. But luckily we already have a way to read
      the maximum size without locking (i_size_read), so we
      can just use that instead.
      
      Without i_mutex there is no synchronization with write() anymore,
      however since the write() update is atomic on 64bit it just behaves
      like another racy SEEK_SET.  On non-atomic 32bit it's the same
      as SEEK_SET.
      
      => Don't need a lock, but need to use i_size_read()
      
      SEEK_CUR: This has a read-modify-write race window
      on the same file. One could argue that any application
      doing unsynchronized seeks on the same file is already broken.
      But for the sake of not adding a regression here I'm
      using the file->f_lock to synchronize this. Using this
      lock is much better than the inode mutex because it doesn't
      synchronize between processes.
      
      => So we still need a lock, but can use f_lock.
      
      This patch implements this new scheme in generic_file_llseek.
      I dropped generic_file_llseek_unlocked and changed all callers.
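
      A condensed sketch of the resulting function (lseek_execute stands
      for the small helper that validates the new position against
      maxsize and stores f_pos; error handling trimmed):

      loff_t generic_file_llseek(struct file *file, loff_t offset, int origin)
      {
              struct inode *inode = file->f_mapping->host;
              loff_t maxsize = inode->i_sb->s_maxbytes;

              switch (origin) {
              case SEEK_END:
                      /* i_size_read() is safe without i_mutex, even on 32bit */
                      offset += i_size_read(inode);
                      break;
              case SEEK_CUR:
                      if (offset == 0)
                              return file->f_pos;
                      /*
                       * f_lock closes the read-modify-write window on
                       * f_pos without serializing independent openers.
                       */
                      spin_lock(&file->f_lock);
                      offset = lseek_execute(file, inode,
                                             file->f_pos + offset, maxsize);
                      spin_unlock(&file->f_lock);
                      return offset;
              }
              /* SEEK_SET: unlocked; a race is just "last writer wins" */
              return lseek_execute(file, inode, offset, maxsize);
      }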
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    •
      vfs: add hex format for MAY_* flag values · 8522ca58
      Committed by Aneesh Kumar K.V
      We are going to add more flags, and having them in hex format
      makes it simpler; see the sketch below.
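
      For reference, this is the shape of the result (bit values from
      include/linux/fs.h of this period; a new flag is simply the next
      free bit):

      #define MAY_EXEC        0x00000001
      #define MAY_WRITE       0x00000002
      #define MAY_READ        0x00000004
      #define MAY_APPEND      0x00000008
      #define MAY_ACCESS      0x00000010
      #define MAY_OPEN        0x00000020
      #define MAY_CHDIR       0x00000040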
      Acked-by: J. Bruce Fields <bfields@redhat.com>
      Acked-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    •
      Fix build break when freezer not configured · e0c8ea1a
      Committed by Steve French
      fs/cifs/transport.c: In function 'wait_for_response':
      fs/cifs/transport.c:328: error: implicit declaration of function 'wait_event_freezekillable'
      
      Caused by commit f06ac72e ("cifs, freezer: add
      wait_event_freezekillable and have cifs use it").  In this config,
      CONFIG_FREEZER is not set.
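
      The fix amounts to giving the helper a !CONFIG_FREEZER fallback,
      roughly like this sketch of the include/linux/freezer.h
      definitions (the freezer branch brackets an ordinary killable
      wait with freezer bookkeeping):

      #ifdef CONFIG_FREEZER
      #define wait_event_freezekillable(wq, condition)                \
      ({                                                              \
              int __retval;                                           \
              freezer_do_not_count();                                 \
              __retval = wait_event_killable(wq, condition);          \
              freezer_count();                                        \
              __retval;                                               \
      })
      #else /* !CONFIG_FREEZER */
      #define wait_event_freezekillable(wq, condition)                \
                      wait_event_killable(wq, condition)
      #endif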
      Reviewed-by: Shirish Pargaonkar <shirishp@us.ibm.com>
      CC: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Steve French <smfrench@gmail.com>
    •
      Revert "drm/ttm: add a way to bo_wait for either the last read or last write" · 1717c0e2
      Committed by Dave Airlie
      This reverts commit dfadbbdb.
      
      Further upstream discussion between Marek and Thomas decided this wasn't
      fully baked and needed further work, so revert it before it hits mainline.
      Signed-off-by: Dave Airlie <airlied@redhat.com>
    •
      Revert "drm/radeon/kms: add a new gem_wait ioctl with read/write flags" · 83f30d0e
      Committed by Dave Airlie
      This reverts commit d3ed7402.
      
      Further upstream discussion between Thomas and Marek decided this needed
      more work and driver specifics. So revert before it goes upstream.
      Signed-off-by: Dave Airlie <airlied@redhat.com>
  5. 27 Oct 2011, 25 commits
  6. 26 Oct 2011, 3 commits
  7. 25 Oct 2011, 1 commit
    •
      ore: RAID5 Write · 769ba8d9
      Committed by Boaz Harrosh
      This is finally the RAID5 Write support.
      
      The bigger part of this patch is not the XOR engine itself, but the
      read4write logic, which is a complete mini prepare_for_striping
      reading engine that can read scattered pages of a stripe into cache
      so they can be used for XOR calculation, that is, when the write is
      not stripe aligned.
      
      The main algorithm behind the XOR engine is the two-dimensional array
      	struct __stripe_pages_2d.
      A drawing might save 1000 words:
      ---
      
      __stripe_pages_2d
             |
       n = pages_in_stripe_unit;
       w = group_width - parity;
             |                            pages array presented to the XOR lib
             |                                                |
             V                                                |
       __1_page_stripe[0].pages --> [c0][c1]..[cw][c_par] <---|
             |                                                |
       __1_page_stripe[1].pages --> [c0][c1]..[cw][c_par] <---
             |
      ...    |                         ...
             |
       __1_page_stripe[n].pages --> [c0][c1]..[cw][c_par]
                                     ^
                                     |
                 data added columns first then row
      
      ---
      The pages are put on this array columns first, i.e.:
      	p0-of-c0, p1-of-c0, ... pn-of-c0, p0-of-c1, ...
      So we are doing a corner turn of the pages.
      
      Note that pages will zigzag down and left, but are put sequentially
      in growing order. So when the time comes to XOR the stripe, only the
      beginning and end of the array need be checked. We scan the array and
      any NULL spot will be filled by pages-to-be-read, as sketched below.
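
      In code, the drawing maps to roughly this shape (a hedged sketch;
      the field names follow the drawing rather than the exact ore_raid
      definitions):

      struct __1_page_stripe {
              /* one row: data columns c0..cw plus the parity column,
               * exactly the page array handed to the XOR library */
              struct page **pages;
      };

      struct __stripe_pages_2d {
              unsigned pages_in_unit;        /* n = pages_in_stripe_unit */
              unsigned data_devs;            /* w = group_width - parity */
              /* n rows, filled columns-first as described above */
              struct __1_page_stripe _1p_stripes[];
      };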
      
      The FS that wants to support RAID5 needs to supply an
      operations-vector that searches for a given page in cache and says
      whether the page is uptodate or needs reading. All these
      pages-to-be-read are put on a slave ore_io_state and read
      synchronously. All the pages of a stripe are read in one IO, using
      the scatter-gather mechanism.
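
      That per-FS hook can be pictured as a small operations vector along
      these lines (an illustrative sketch, assuming a get/put pair keyed
      by page index; the real vector lives in the ORE headers):

      struct _ore_r4w_op {
              /* look up @page_index in the FS page cache; on return
               * *uptodate says whether the page still needs reading */
              struct page *(*get_page)(void *priv, u64 page_index,
                                       bool *uptodate);
              void (*put_page)(void *priv, struct page *page);
      };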
      
      On writes we constrain our IO to be incomplete on at most a single
      stripe: either the complete IO is within one stripe, so we might
      have pages to read at both the beginning and the end of the stripe,
      or we have some reading to do at the beginning but end on a stripe
      boundary. The leftover pages are pushed to the next IO by the API
      already established by previous work, where an IO offset/length
      combination presented to the ORE might get the length truncated and
      the user must re-submit the leftover pages. (Both exofs and NFS
      support this.)
      
      But any ORE user should make its best effort to align its IO
      beforehand and avoid complications. A cached ore_layout->stripe_size
      member can be used for that calculation. (NOTE: ORE demands that
      stripe_size not be bigger than 32 bits.)
      
      What else? Well, read it and tell me.
      Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>