    mm/mempolicy.c: convert the shared_policy lock to a rwlock · 4a8c7bb5
    Committed by Nathan Zimmer
    When running the SPECint_rate gcc benchmark on some very large boxes it
    was noticed that the system was spending a lot of time in
    mpol_shared_policy_lookup().  The gamess benchmark also shows the
    problem, and it is what I mostly used to chase down the issue since I
    found its setup easier.
    
    To be clear, the binaries were on tmpfs because of disk I/O requirements.
    We then used text replication to avoid icache misses and having all the
    copies banging on the memory where the instruction code resides.  This
    results in us hitting a bottleneck in mpol_shared_policy_lookup() since
    lookup is serialised by the shared_policy lock.
    
    I have only reproduced this on very large (3k+ cores) boxes.  The
    problem starts showing up at just a few hundred ranks and gets worse
    until it threatens to livelock once the rank count is large enough.  For
    example, on the gamess benchmark at 128 ranks this area consumes only
    ~1% of time, at 512 ranks it consumes nearly 13%, and at 2k ranks it is
    over 90%.
    
    To alleviate the contention in this area I converted the spinlock to an
    rwlock.  This allows a large number of lookups to happen simultaneously.
    The results were quite good, reducing this consumption at max ranks to
    around 2%.
    
    [akpm@linux-foundation.org: tidy up code comments]
    Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
    Acked-by: David Rientjes <rientjes@google.com>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: Nadia Yvette Chambers <nyc@holomorphy.com>
    Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
    Cc: Mel Gorman <mgorman@suse.de>
    Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>