• C
    mm: use raw_cpu ops for determining current NUMA node · dc322a99
    Christoph Lameter 提交于
    With the preempt checking logic for __this_cpu_ops we will get false
    positives from locations in the code that use numa_node_id.
    
    Before the __this_cpu ops where introduced there were no checks for
    preemption present either.  smp_raw_processor_id() was used.  See
    
      http://www.spinics.net/lists/linux-numa/msg00641.html
    
    Therefore we need to use raw_cpu_read here to avoid false postives.
    
    Note that this issue has been discussed in prior years.  If the process
    changes nodes after retrieving the current numa node then that is
    acceptable since most uses of numa_node etc are for optimization and not
    for correctness.
    
    There were suggestions to implement a raw_numa_node_id in order to do
    preempt checks for numa_node_id as well.  But I think we better defer
    that to another patch since that would mean investigating how
    numa_node_id() is used throughout the kernel which would increase the
    scope of this patchset significantly.  After all preemption was never
    checked before when numa_node_id() was used.
    
    Some sample traces:
    
    __this_cpu_read operation in preemptible [00000000] code: login/1456
    caller is __this_cpu_preempt_check+0x2b/0x2d
    CPU: 0 PID: 1456 Comm: login Not tainted 3.12.0-rc4-cl-00062-g2fe80d3b-dirty #185
    Call Trace:
      dump_stack+0x4e/0x82
      check_preemption_disabled+0xc5/0xe0
      __this_cpu_preempt_check+0x2b/0x2d
      get_task_policy+0x1d/0x49
      get_vma_policy+0x14/0x76
      alloc_pages_vma+0x35/0xff
      handle_mm_fault+0x290/0x73b
      __do_page_fault+0x3fe/0x44d
      do_page_fault+0x9/0xc
      page_fault+0x22/0x30
      generic_file_aio_read+0x38e/0x624
      do_sync_read+0x54/0x73
      vfs_read+0x9d/0x12a
      SyS_read+0x47/0x7e
      cstar_dispatch+0x7/0x23
    
    caller is __this_cpu_preempt_check+0x2b/0x2d
    CPU: 0 PID: 1456 Comm: login Not tainted 3.12.0-rc4-cl-00062-g2fe80d3b-dirty #185
    Call Trace:
      dump_stack+0x4e/0x82
      check_preemption_disabled+0xc5/0xe0
      __this_cpu_preempt_check+0x2b/0x2d
      alloc_pages_current+0x8f/0xbc
      __page_cache_alloc+0xb/0xd
      __do_page_cache_readahead+0xf4/0x219
      ra_submit+0x1c/0x20
      ondemand_readahead+0x28c/0x2b4
      page_cache_sync_readahead+0x38/0x3a
      generic_file_aio_read+0x261/0x624
      do_sync_read+0x54/0x73
      vfs_read+0x9d/0x12a
      SyS_read+0x47/0x7e
      cstar_dispatch+0x7/0x23
    Signed-off-by: NChristoph Lameter <cl@linux.com>
    Acked-by: NIngo Molnar <mingo@kernel.org>
    Cc: Alex Shi <alex.shi@intel.com>
    Cc: Tejun Heo <tj@kernel.org>
    Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
    dc322a99
topology.h 7.5 KB