    perf: Paper over the hw.target problems · 03804742
    Alexander Shishkin committed
    euler inclusion
    category: bugfix
    bugzilla: 9513/11006/11050
    CVE: NA
    --------------------------------------------------
    
    [ Cheng Jian:
    HULK-Syzkaller reported a problem that syzbot had already reported to
    mainline (lkml); this patch comes from the reply on lkml.
    v1	https://lkml.org/lkml/2019/2/28/529
    v2	https://lkml.org/lkml/2019/3/8/206
    We merged v1 first, but it caused bugzilla #11050, because we also use
    perf_remove_from_context() in perf_event_open() when we move events from
    a SW context to a HW context, so we can't destroy the event here.
    v2 does not exhibit that warning.
    It is equivalent to another patch at https://lkml.org/lkml/2019/3/8/536,
    but clearer. ]
    
    First, we have a race between perf_event_release_kernel() and
    perf_free_event(), which happens when parent's event is released while the
    child's fork fails (because of a fatal signal, for example), that looks
    like this:
    
    cpu X                            cpu Y
    -----                            -----
                                     copy_process() error path
    perf_release(parent)             +->perf_event_free_task()
    +-> lock(child_ctx->mutex)       |  |
    +-> remove_from_context(child)   |  |
    +-> unlock(child_ctx->mutex)     |  |
    |                                |  +-> lock(child_ctx->mutex)
    |                                |  +-> unlock(child_ctx->mutex)
    |                                +-> free_task(child_task)
    +-> put_task_struct(child_task)
    
    Technically, we're still holding a reference to the task via
    parent->hw.target, but that's not stopping free_task(), so we end up
    poking at freed memory, as pointed out by KASAN in the syzkaller report
    (see Link below). The straightforward fix is to drop the hw.target
    reference while the task is still around.
    
    Therein lies the second problem: the users of hw.target (uprobe) assume
    that it's around at ->destroy() callback time, where they use it for
    context. So, in order to not break the uprobe teardown and avoid leaking
    stuff, we need to call ->destroy() at the same time.
    
    This patch fixes the race and the subsequent fallout by doing both these
    things at remove_from_context time.
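
The ordering described above can be illustrated with a small user-space model. This is only a sketch, not the kernel patch: every `toy_*` name is hypothetical, the structs are simplified stand-ins for task_struct and perf_event, and locking is left out. It assumes that removal runs the destroy callback and drops the target reference in one step, so a later release has nothing left that points at the freed task.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for task_struct: only a reference count. */
struct toy_task {
    int usage;
};

struct toy_event;
typedef void (*toy_destroy_fn)(struct toy_event *ev);

/* Hypothetical stand-in for perf_event; target plays the role of hw.target. */
struct toy_event {
    struct toy_task *target;   /* counted reference to the task */
    toy_destroy_fn destroy;    /* teardown callback that needs target */
    bool attached;             /* still on its context list? */
    bool destroy_saw_target;   /* did destroy run with a live target? */
};

static struct toy_task *toy_task_alloc(void)
{
    struct toy_task *t = malloc(sizeof(*t));

    if (t)
        t->usage = 1;          /* the owner's reference */
    return t;
}

static void toy_get_task(struct toy_task *t)
{
    t->usage++;
}

static void toy_put_task(struct toy_task *t)
{
    assert(t->usage > 0);
    if (--t->usage == 0)
        free(t);               /* in this toy, the last put frees the task */
}

/* Teardown callback: records whether the target was still alive. */
static void toy_destroy(struct toy_event *ev)
{
    ev->destroy_saw_target = ev->target != NULL && ev->target->usage > 0;
}

/* Removal does both things at once: call destroy, then drop the target. */
static void toy_remove_from_context(struct toy_event *ev)
{
    ev->attached = false;
    if (ev->destroy) {
        ev->destroy(ev);
        ev->destroy = NULL;
    }
    if (ev->target) {
        toy_put_task(ev->target);
        ev->target = NULL;
    }
}

/* Late release: a no-op for the target if removal already ran. */
static void toy_release(struct toy_event *ev)
{
    toy_remove_from_context(ev);
}

/* Replays the interleaving from the diagram; returns 0 if every check holds. */
int toy_scenario(void)
{
    struct toy_task *child = toy_task_alloc();
    struct toy_event ev = { 0 };

    if (!child)
        return -1;
    toy_get_task(child);       /* the event takes its target reference */
    ev.target = child;
    ev.destroy = toy_destroy;
    ev.attached = true;

    toy_remove_from_context(&ev);   /* cpu X: task is still alive here */
    if (!ev.destroy_saw_target)
        return 1;
    if (ev.target != NULL)
        return 2;
    if (child->usage != 1)
        return 3;

    toy_put_task(child);       /* cpu Y: the error path frees the task */
    toy_release(&ev);          /* late release never touches the task */
    return ev.target == NULL ? 0 : 4;
}
```

Calling `toy_scenario()` from a test harness walks through the "cpu X removes, cpu Y frees, cpu X releases" sequence; a nonzero return value names the first check that failed.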

    Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
    Link: https://syzkaller.appspot.com/bug?extid=a24c397a29ad22d86c98
    Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
    Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
    Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
core.c 279.9 KB