    riscv: Fix memmove and optimise memcpy when misalign · 703b84ec
    Committed by Bin Meng
    At present, U-Boot SPL fails to boot on the SiFive Unleashed board
    because a load address misaligned exception occurs when loading the
    FIT image in spl_load_simple_fit(). The exception is raised in
    memmove(), which is called by fdt_splice_().
    
    Commit 8f0dc4cf introduced an assembly version of memmove, but it
    does not take misalignment into account: it checks whether the length
    is a multiple of the machine word size, but the pointers also need to
    be aligned. As a result, it generates misaligned loads/stores in the
    majority of cases and causes a significant performance regression on
    hardware that traps misaligned loads/stores and emulates them in
    firmware.
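
    The gap between the two checks can be illustrated in C. This is only a
    sketch, not the assembly from commit 8f0dc4cf: the function names are
    hypothetical, and SZ_REG is assumed to equal sizeof(unsigned long).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: SZ_REG is the machine word size in bytes. */
#define SZ_REG sizeof(unsigned long)

/* The insufficient test described above: only the length is examined. */
static bool length_only_check(size_t n)
{
    return n % SZ_REG == 0;
}

/* A sufficient test: dest, src and the length must all be word multiples. */
static bool full_alignment_check(const void *dest, const void *src, size_t n)
{
    return (((uintptr_t)dest | (uintptr_t)src | n) % SZ_REG) == 0;
}

/* Shows an input that the length-only test accepts although it is misaligned. */
static void demo_alignment_checks(void)
{
    unsigned long words[4];                       /* word-aligned backing store */
    unsigned char *buf = (unsigned char *)words;

    assert(length_only_check(SZ_REG));
    assert(!full_alignment_check(buf + 1, buf + 1 + SZ_REG, SZ_REG));
    assert(full_alignment_check(buf, buf + SZ_REG, SZ_REG));
}
```

    A word-sized access through buf + 1 is exactly the kind of access that
    traps on hardware without misaligned load/store support.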
    
    The current behaviour of memcpy is that it checks whether the src and
    dest pointers are co-aligned (i.e. congruent modulo SZ_REG). If they
    are, it copies data word by word after first aligning the pointers to
    a word boundary. If src and dest are not co-aligned, however, a
    byte-wise copy is performed.
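
    The behaviour described above can be sketched in C roughly as follows.
    This is an illustration, not the existing assembly: the names
    memcpy_coaligned_sketch and word_t are hypothetical, SZ_REG is assumed
    to equal sizeof(unsigned long), and the may_alias attribute is a
    GCC/Clang extension used so that word accesses to byte buffers are
    permitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: SZ_REG is the machine word size in bytes. */
#define SZ_REG sizeof(unsigned long)

/* GCC/Clang extension: a word type that may alias other object types. */
typedef unsigned long __attribute__((may_alias)) word_t;

static void *memcpy_coaligned_sketch(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    /* Co-aligned: both addresses leave the same remainder modulo SZ_REG. */
    if ((((uintptr_t)d ^ (uintptr_t)s) % SZ_REG) == 0) {
        /* Byte copy until both pointers reach a word boundary. */
        while (n > 0 && (uintptr_t)d % SZ_REG != 0) {
            *d++ = *s++;
            n--;
        }
        /* Word copy for the aligned middle part. */
        while (n >= SZ_REG) {
            *(word_t *)d = *(const word_t *)s;
            d += SZ_REG;
            s += SZ_REG;
            n -= SZ_REG;
        }
    }

    /* Tail bytes, or the whole buffer when the pointers are not co-aligned. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dest;
}

/* Exercises both the word path and the byte-wise fallback. */
static void demo_memcpy_coaligned(void)
{
    unsigned long src_words[8], dst_words[8];
    unsigned char *sb = (unsigned char *)src_words;
    unsigned char *db = (unsigned char *)dst_words;

    for (size_t i = 0; i < sizeof(src_words); i++)
        sb[i] = (unsigned char)i;

    memcpy_coaligned_sketch(db + 1, sb + 1, 20);  /* co-aligned: word path */
    assert(memcmp(db + 1, sb + 1, 20) == 0);

    memcpy_coaligned_sketch(db + 2, sb + 1, 20);  /* not co-aligned: byte path */
    assert(memcmp(db + 2, sb + 1, 20) == 0);
}
```

    The second call in the demo shows the slow case: a one-byte difference
    in alignment forces the whole buffer through the byte loop.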
    
    This patch is taken from the Linux kernel patch [1], which has not
    been applied at the time of writing. It fixes memmove and optimises
    memcpy for misaligned cases. It first aligns the destination pointer
    to a word boundary, regardless of whether src and dest are co-aligned.
    If they are, a word-wise copy is performed. If they are not, it loads
    two adjacent words from src and uses shifts to assemble a full machine
    word. Some additional assembly-level micro-optimisation is also
    performed so that more instructions can be compressed (e.g. prefer a0
    to t6).
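
    The shift-based copy can be sketched in C as follows. This is a rough
    illustration under stated assumptions, not the patch's assembly: the
    names memcpy_shift_sketch and word_t are hypothetical, SZ_REG is
    assumed to equal sizeof(unsigned long), byte order is assumed to be
    little-endian (as on RISC-V), and may_alias is a GCC/Clang extension.
    The sketch reads whole aligned words that each contain at least one
    source byte, so it can touch up to SZ_REG - 1 bytes on either side of
    the source range. Aligned word loads do not cross a page boundary, but
    this is outside strict C object bounds, so the demo keeps the source
    inside a larger buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: SZ_REG is the machine word size in bytes. */
#define SZ_REG sizeof(unsigned long)

/* GCC/Clang extension: a word type that may alias other object types. */
typedef unsigned long __attribute__((may_alias)) word_t;

static void *memcpy_shift_sketch(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    /* Step 1: byte copy until dest is word aligned, co-aligned or not. */
    while (n > 0 && (uintptr_t)d % SZ_REG != 0) {
        *d++ = *s++;
        n--;
    }

    size_t off = (uintptr_t)s % SZ_REG;  /* src misalignment once dest is aligned */

    if (off == 0) {
        /* Step 2a: co-aligned, plain word copy. */
        while (n >= SZ_REG) {
            *(word_t *)d = *(const word_t *)s;
            d += SZ_REG;
            s += SZ_REG;
            n -= SZ_REG;
        }
    } else if (n >= SZ_REG) {
        /* Step 2b: aligned loads only; merge two adjacent words with shifts. */
        const word_t *ws = (const word_t *)(s - off);  /* src rounded down */
        unsigned int lo = 8 * (unsigned int)off;       /* bits taken from next */
        unsigned int hi = 8 * (unsigned int)SZ_REG - lo;
        unsigned long cur = *ws++;

        while (n >= SZ_REG) {
            unsigned long next = *ws++;
            /* Little-endian: high bytes of cur, then low bytes of next. */
            *(word_t *)d = (cur >> lo) | (next << hi);
            cur = next;
            d += SZ_REG;
            s += SZ_REG;
            n -= SZ_REG;
        }
    }

    /* Step 3: remaining tail bytes. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dest;
}

/* Copies from a source that is 3 bytes past a word boundary. */
static void demo_memcpy_shift(void)
{
    unsigned long src_words[8], dst_words[8];
    unsigned char *sb = (unsigned char *)src_words;
    unsigned char *db = (unsigned char *)dst_words;

    for (size_t i = 0; i < sizeof(src_words); i++)
        sb[i] = (unsigned char)(i + 1);
    memset(db, 0, sizeof(dst_words));

    memcpy_shift_sketch(db + SZ_REG, sb + SZ_REG + 3, 2 * SZ_REG + 1);
    assert(memcmp(db + SZ_REG, sb + SZ_REG + 3, 2 * SZ_REG + 1) == 0);
}
```

    Both shift amounts stay strictly between 0 and the word width because
    the shift path is only taken when off is non-zero, so no shift by the
    full word size can occur.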
    
    With this patch, U-Boot boots again on the SiFive Unleashed board.
    
    [1] https://patchwork.kernel.org/project/linux-riscv/patch/20210216225555.4976-1-gary@garyguo.net/
    
    Fixes: 8f0dc4cf ("riscv: assembler versions of memcpy, memmove, memset")
    Signed-off-by: Bin Meng <bmeng.cn@gmail.com>