    skb: Propagate pfmemalloc on skb from head page only · cca7af38
    Committed by Pavel Emelyanov
    Hi.
    
    I'm trying to send big chunks of memory from application address space via
    TCP socket using vmsplice + splice like this
    
       mem = mmap(128Mb);
       vmsplice(pipe[1], mem); /* splice memory into pipe */
       splice(pipe[0], tcp_socket); /* send it into network */
    
    When I'm lucky and a huge page gets spliced into the pipe and then into the
    socket, _and_ the client and server ends of the TCP connection are on the same
    host, communicating via lo, the whole connection gets stuck! The send queue
    fills up and the app stops writing/splicing more into it, but the receive
    queue remains empty. Here is why.
    
    __skb_fill_page_desc observes a tail page of a huge page and erroneously
    propagates its page->pfmemalloc value onto the skb (the pfmemalloc field on
    tail pages contains garbage). This skb->pfmemalloc then leaks through lo, and
    due to the
    
        tcp_v4_rcv
        sk_filter
            if (skb->pfmemalloc && !sock_flag(sk, SOCK_MEMALLOC)) /* true */
                return -ENOMEM
            goto release_and_discard;
    
    no packets reach the socket. Even TCP re-transmits are dropped by this, as skb
    cloning clones the pfmemalloc flag as well.
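
    The drop described above can be modelled in plain userspace C. This is only
    a sketch: toy_skb, toy_sock, toy_skb_clone and toy_sk_filter are hypothetical
    stand-ins invented for illustration, not the kernel's structures or functions.
    It shows why a wrongly set flag starves a socket that lacks SOCK_MEMALLOC,
    and why a retransmitted clone fares no better.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins, not the kernel's sk_buff / sock. */
struct toy_skb  { bool pfmemalloc; };
struct toy_sock { bool memalloc; /* models sock_flag(sk, SOCK_MEMALLOC) */ };

/* Cloning copies the flag, so a retransmitted clone carries it too. */
static struct toy_skb toy_skb_clone(const struct toy_skb *skb)
{
    return *skb;
}

/* Models the quoted check: 0 means deliver, -ENOMEM means drop. */
static int toy_sk_filter(const struct toy_sock *sk, const struct toy_skb *skb)
{
    if (skb->pfmemalloc && !sk->memalloc)
        return -ENOMEM;
    return 0;
}
```

    In this model an ordinary socket rejects both the flagged skb and every
    clone of it, while a socket with the memalloc flag set would accept them.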
    
    That said, here's the proper page->pfmemalloc propagation onto the skb: we
    must check the huge page's head page only; the pfmemalloc and mapping values
    of the other pages do not contain what is expected in this place. However,
    I'm not sure whether this fix is _complete_, since pfmemalloc propagation via
    lo also doesn't look great.
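
    The idea of consulting only the head page can be sketched as a userspace
    model. All names here (toy_page, toy_skb, toy_compound_head, toy_fill_buggy,
    toy_fill_fixed) are hypothetical and exist only for illustration; the real
    change lives in __skb_fill_page_desc in skbuff.h. The model assumes the
    check combines pfmemalloc with an empty mapping, as the text above suggests.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a page that may be part of a huge (compound) page. */
struct toy_page {
    struct toy_page *head;  /* NULL for a head page or an ordinary page */
    bool pfmemalloc;        /* assumed meaningful on the head page only */
    void *mapping;          /* non-NULL when the page has a mapping */
};

struct toy_skb { bool pfmemalloc; };

/* Resolve a tail page to its head; other pages map to themselves. */
static struct toy_page *toy_compound_head(struct toy_page *page)
{
    return page->head ? page->head : page;
}

/* Behaviour described as buggy: trusts whatever page it is handed. */
static void toy_fill_buggy(struct toy_skb *skb, struct toy_page *page)
{
    if (page->pfmemalloc && !page->mapping)
        skb->pfmemalloc = true;
}

/* Behaviour described as the fix: look at the head page only. */
static void toy_fill_fixed(struct toy_skb *skb, struct toy_page *page)
{
    page = toy_compound_head(page);
    if (page->pfmemalloc && !page->mapping)
        skb->pfmemalloc = true;
}
```

    With a tail page whose pfmemalloc field holds garbage and a clean head page,
    toy_fill_buggy marks the skb while toy_fill_fixed leaves it unmarked. A
    genuine pfmemalloc head page still marks the skb in both variants.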
    
    Both the bit propagation from page to skb and this check in sk_filter were
    introduced by c48a11c7 (netvm: propagate page->pfmemalloc to skb) in v3.5, so
    Mel and stable@ are in Cc.

    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
    Acked-by: Eric Dumazet <edumazet@google.com>
    Acked-by: Mel Gorman <mgorman@suse.de>
    Signed-off-by: David S. Miller <davem@davemloft.net>
skbuff.h 77.8 KB