1. 09 Nov, 2012 1 commit
    • A
      revert "epoll: support for disabling items, and a self-test app" · a80a6b85
      Andrew Morton committed
      Revert commit 03a7beb5 ("epoll: support for disabling items, and a
      self-test app") pending resolution of the issues identified by Michael
      Kerrisk, copied below.
      
      We'll revisit this for 3.8.
      
      : I've taken a look at this patch as it currently stands in 3.7-rc1, and
      : done a bit of testing. (By the way, the test program
      : tools/testing/selftests/epoll/test_epoll.c does not compile...)
      :
      : There are one or two places where the behavior seems a little strange,
      : so I have a question or two at the end of this mail. But other than
      : that, I want to check my understanding so that the interface can be
      : correctly documented.
      :
      : Just to go through my understanding, the problem is the following
      : scenario in a multithreaded application:
      :
      : 1. Multiple threads are performing epoll_wait() operations,
      :    and maintaining a user-space cache that contains information
      :    corresponding to each file descriptor being monitored by
      :    epoll_wait().
      :
      : 2. At some point, a thread wants to delete (EPOLL_CTL_DEL)
      :    a file descriptor from the epoll interest list, and
      :    delete the corresponding record from the user-space cache.
      :
      : 3. The problem with (2) is that some other thread may have
      :    previously done an epoll_wait() that retrieved information
      :    about the fd in question, and may be in the middle of using
      :    information in the cache that relates to that fd. Thus,
      :    there is a potential race.
      :
      : 4. The race can't be solved purely in user space, because doing
      :    so would require applying a mutex across the epoll_wait()
      :    call, which would of course blow thread concurrency.
      :
      : Right?
      :
      : Your solution is the EPOLL_CTL_DISABLE operation. I want to
      : confirm my understanding about how to use this flag, since
      : the description that has accompanied the patches so far
      : has been a bit sparse.
      :
      : 0. In the scenario you're concerned about, deleting a file
      :    descriptor means (safely) doing the following:
      :    (a) Deleting the file descriptor from the epoll interest list
      :        using EPOLL_CTL_DEL
      :    (b) Deleting the corresponding record in the user-space cache
      :
      : 1. It's only meaningful to use this EPOLL_CTL_DISABLE in
      :    conjunction with EPOLLONESHOT.
      :
      : 2. Using EPOLL_CTL_DISABLE without using EPOLLONESHOT in
      :    conjunction is a logical error.
      :
      : 3. The correct way to code multithreaded applications using
      :    EPOLL_CTL_DISABLE and EPOLLONESHOT is as follows:
      :
      :    a. All EPOLL_CTL_ADD and EPOLL_CTL_MOD operations should
      :       specify EPOLLONESHOT.
      :
      :    b. When a thread wants to delete a file descriptor, it
      :       should do the following:
      :
      :       [1] Call epoll_ctl(EPOLL_CTL_DISABLE)
      :       [2] If the return status from epoll_ctl(EPOLL_CTL_DISABLE)
      :           was zero, then the file descriptor can be safely
      :           deleted by the thread that made this call.
      :       [3] If the epoll_ctl(EPOLL_CTL_DISABLE) fails with EBUSY,
      :           then the descriptor is in use. In this case, the calling
      :           thread should set a flag in the user-space cache to
      :           indicate that the thread that is using the descriptor
      :           should perform the deletion operation.
      :
      : Is all of the above correct?
      :
      : The implementation depends on checking on whether
      : (events & ~EP_PRIVATE_BITS) == 0
      : This relies on the fact that EPOLL_CTL_ADD and EPOLL_CTL_MOD always
      : set EPOLLHUP and EPOLLERR in the 'events' mask, and EPOLLONESHOT
      : causes those flags (as well as all others in ~EP_PRIVATE_BITS) to be
      : cleared.
      :
      : A corollary to the previous paragraph is that using EPOLL_CTL_DISABLE
      : is only useful in conjunction with EPOLLONESHOT. However, as things
      : stand, one can use EPOLL_CTL_DISABLE on a file descriptor that does
      : not have EPOLLONESHOT set in 'events'. This results in the following
      : (slightly surprising) behavior:
      :
      : (a) The first call to epoll_ctl(EPOLL_CTL_DISABLE) returns 0
      :     (the indicator that the file descriptor can be safely deleted).
      : (b) The next call to epoll_ctl(EPOLL_CTL_DISABLE) fails with EBUSY.
      :
      : This doesn't seem particularly useful, and in fact is probably an
      : indication that the user made a logic error: they should only be using
      : epoll_ctl(EPOLL_CTL_DISABLE) on a file descriptor for which
      : EPOLLONESHOT was set in 'events'. If that is correct, then would it
      : not make sense to return an error to user space for this case?
      
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: "Paton J. Lewis" <palewis@adobe.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a80a6b85
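The mask check that the quoted mail describes can be modelled in user space. The following is a minimal sketch, not the reverted kernel code: `MODEL_PRIVATE_BITS` is an assumed stand-in for the kernel-internal `EP_PRIVATE_BITS` (the real mask may contain further bits), and `struct model_item` with its three helpers is hypothetical. It reproduces the two cases discussed above: a one-shot item that has already fired reports busy, and an item armed without EPOLLONESHOT returns 0 on the first disable and busy on the second.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/epoll.h>

/* Assumption: stand-in for the kernel-internal EP_PRIVATE_BITS mask. */
#define MODEL_PRIVATE_BITS ((uint32_t)(EPOLLONESHOT | EPOLLET))

/* Hypothetical user-space model of one epoll item; not a kernel structure. */
struct model_item {
    uint32_t events;
};

/* Models EPOLL_CTL_ADD / EPOLL_CTL_MOD: EPOLLERR and EPOLLHUP are always set. */
void model_arm(struct model_item *it, uint32_t requested)
{
    it->events = requested | EPOLLERR | EPOLLHUP;
}

/* Models a one-shot delivery: every bit outside the private mask is cleared. */
void model_deliver(struct model_item *it)
{
    if (it->events & EPOLLONESHOT)
        it->events &= MODEL_PRIVATE_BITS;
}

/*
 * Models the disable check from the mail:
 * (events & ~EP_PRIVATE_BITS) == 0 means the item is already disarmed,
 * so the caller gets -EBUSY; otherwise the item is disarmed and 0 is returned.
 */
int model_disable(struct model_item *it)
{
    if ((it->events & ~MODEL_PRIVATE_BITS) == 0)
        return -EBUSY;
    it->events &= MODEL_PRIVATE_BITS;
    return 0;
}
```

In this model, calling `model_disable()` twice on an item armed with plain `EPOLLIN` yields 0 and then `-EBUSY`, which is the "slightly surprising" sequence the mail asks about.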
  2. 26 Oct, 2012 3 commits
  3. 23 Oct, 2012 1 commit
  4. 22 Oct, 2012 4 commits
    • L
      perf tools: do not flush maps on COMM for perf report · 9fdbf671
      Luigi Semenzato committed
      This fixes a long-standing bug caused by the lack of separate COMM and EXEC
      record types, which makes "perf report" lose track of symbols when a process
      renames itself.
      
      With this fix (suggested by Stephane Eranian), a COMM (rename) no longer
      flushes the maps, which is the correct behavior.  An EXEC also no longer
      flushes the maps, but this doesn't matter because as new mappings are created
      (for the executable and the libraries) the old mappings are automatically
      removed.  This is not by accident: the functionality is necessary because DLLs
      can be explicitly loaded at any time with dlopen(), possibly on top of existing
      text, so "perf report" handles correctly the clobbering of new mappings on top
      of old ones.
      
      An alternative patch (which I proposed earlier) would be to introduce a
      separate PERF_RECORD_EXEC type, but it is a much larger change (about 300
      lines) and is not necessary.
      Signed-off-by: Luigi Semenzato <semenzato@chromium.org>
      Tested-by: Stephane Eranian <eranian@google.com>
      Acked-by: Stephane Eranian <eranian@google.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Lucas De Marchi <lucas.demarchi@profusion.mobi>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Olof Johansson <olofj@chromium.org>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Sonny Rao <sonnyrao@chromium.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Stephen Wilson <wilsons@start.ca>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Vasiliy Kulikov <segoon@openwall.com>
      Link: http://lkml.kernel.org/r/1345585940-6497-1-git-send-email-semenzato@chromium.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      9fdbf671
    • N
      perf help: Fix --help for builtins · 670ab5d2
      Namhyung Kim committed
      It seems that commit cc584821 ("perf help: Remove use of die and
      handle errors") caused the problem - it changed the initial value of
      'help_format' from HELP_FORMAT_MAN to HELP_FORMAT_NONE.
      
      This broke the --help option for all builtins, which would produce no
      output, while 'man perf-top' would still work if MANPATH is properly set up.
      Reported-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: David Ahern <dsahern@gmail.com>
      Acked-by: David Ahern <dsahern@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/87r4orj7zc.fsf@sejong.aot.lge.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      670ab5d2
    • A
      perf trace: Check if sample raw_data field is set · fc551f8d
      Arnaldo Carvalho de Melo committed
      Sometimes we're segfaulting because we were expecting that the
      perf_sample.raw_data field was set as requested, but in some cases
      that need further investigation, that field can be NULL, leading
      to segfaults.
      
      Make the tool more robust by checking that before calling any per event
      handlers that may try to use that field.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-g1fmodl6ys4lq8honbj1igoi@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      fc551f8d
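The guard described in this commit can be illustrated with a small sketch. It is not the perf source: `struct sample`, `sample_handler_t`, `dispatch_sample()` and `first_byte_handler()` are hypothetical names, and the struct only mirrors the one field the message mentions. The point is simply that the NULL check happens once, before any per-event handler runs.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the relevant part of perf's perf_sample. */
struct sample {
    void *raw_data;
    unsigned int raw_size;
};

/* Hypothetical per-event handler type; handlers may dereference raw_data. */
typedef int (*sample_handler_t)(const struct sample *s);

/*
 * Checks raw_data before dispatching.
 * Returns -1 when the sample is skipped, otherwise the handler's result.
 */
int dispatch_sample(const struct sample *s, sample_handler_t handler)
{
    if (s->raw_data == NULL) {
        fprintf(stderr, "sample has no raw_data, skipping\n");
        return -1;
    }
    return handler(s);
}

/* Example handler that reads the first byte of raw_data without checking it. */
int first_byte_handler(const struct sample *s)
{
    return ((const unsigned char *)s->raw_data)[0];
}
```

Placing the check in the dispatcher rather than in each handler keeps the handlers unchanged, which matches the "before calling any per event handlers" wording of the message.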
    • A
      perf trace: Validate syscall id before growing syscall table · 3a531260
      Arnaldo Carvalho de Melo committed
      In some cases the ID for a syscall read thru the raw_syscalls tracepoint
      is bogus; why still needs to be investigated. To make the tool more
      robust, first try to resolve the ID to a name via libaudit and, if that
      fails, don't grow the table.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-0lsokw3xor7c4ijo45u6bauh@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3a531260
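A minimal sketch of the "resolve first, grow second" ordering follows. The names are assumptions: `struct syscall_table`, `table_add()` and `toy_resolver()` are hypothetical, and the resolver callback stands in for the libaudit lookup that the commit message refers to. A bogus ID fails name resolution and therefore never triggers a reallocation.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical resolver type; stands in for an ID-to-name lookup via libaudit. */
typedef const char *(*name_resolver_t)(int id);

/* Hypothetical table indexed by syscall ID. */
struct syscall_table {
    const char **names; /* names[id], NULL if not yet resolved */
    int max_id;         /* highest valid index, -1 when empty */
};

/*
 * Adds one ID to the table. The name is resolved before any allocation,
 * so an unresolvable ID leaves the table untouched.
 * Returns 0 on success, -1 on a bogus ID or allocation failure.
 */
int table_add(struct syscall_table *t, int id, name_resolver_t resolve)
{
    const char *name;

    if (id < 0)
        return -1;

    name = resolve(id);
    if (name == NULL)
        return -1; /* bogus ID: do not grow the table */

    if (id > t->max_id) {
        const char **grown = realloc(t->names, ((size_t)id + 1) * sizeof(*grown));

        if (grown == NULL)
            return -1;
        /* Clear only the newly added slots. */
        memset(grown + t->max_id + 1, 0, (size_t)(id - t->max_id) * sizeof(*grown));
        t->names = grown;
        t->max_id = id;
    }
    t->names[id] = name;
    return 0;
}

/* Toy resolver for illustration: knows only two IDs. */
const char *toy_resolver(int id)
{
    switch (id) {
    case 0: return "read";
    case 1: return "write";
    default: return NULL;
    }
}
```

With this ordering, a garbage ID such as 1000000 costs one failed lookup instead of a multi-megabyte table.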
  5. 20 Oct, 2012 1 commit
  6. 18 Oct, 2012 1 commit
  7. 17 Oct, 2012 8 commits
  8. 16 Oct, 2012 1 commit
    • D
      perf tool: Precise mode requires exclude_guest · 1342798c
      David Ahern committed
      Summary of events per Peter:
      
        "Intel PEBS in VT-x context uses the DS address as a guest linear address,
        even though it's programmed by the host as a host linear address. This
        either results in guest memory corruption and/or the hardware faulting and
        'crashing' the virtual machine.  Therefore we have to disable PEBS on VT-x
        enter and re-enable on VT-x exit, enforcing a strict exclude_guest.
      
        AMD IBS does work but doesn't currently support exclude_* at all,
        setting an exclude_* bit will make it fail."
      
      This patch handles the userspace perf command, setting the exclude_guest
      attribute if precise mode is requested, but only if a user has not
      specified a request for guest or host only profiling (G or H attribute).
      
      Kernel side AMD currently ignores all exclude_* bits, so there is no impact
      to existing IBS code paths. Robert Richter has a patch where IBS code will
      return EINVAL if an exclude_* bit is set. When this goes in it means use
      of :p on AMD with IBS will first fail with EINVAL (because exclude_guest
      will be set). Then the existing fallback code within perf will unset
      exclude_guest and try again. The second attempt will succeed if the CPU
      supports IBS profiling.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Acked-by: Robert Richter <robert.richter@amd.com>
      Tested-by: Robert Richter <robert.richter@amd.com>
      Reviewed-by: Robert Richter <robert.richter@amd.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <robert.richter@amd.com>
      Link: http://lkml.kernel.org/r/1347569955-54626-2-git-send-email-dsahern@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1342798c
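The default-then-fallback behavior described in this commit can be sketched as below. This is an illustration under assumptions, not the perf tool's code: `struct event_attr` is a hypothetical reduction of the real attribute structure, `open_fn_t` abstracts the event-open call, and `toy_ibs_open()` models a backend that rejects any exclude_* bit with EINVAL, as the message says the pending IBS patch would.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, reduced attribute set; the real structure has many more fields. */
struct event_attr {
    bool precise;       /* precise mode (:p) requested */
    bool exclude_guest;
    bool exclude_host;
};

/* Hypothetical open callback: returns a descriptor >= 0 or a negative errno. */
typedef int (*open_fn_t)(const struct event_attr *attr);

/* Sets exclude_guest for precise events unless the user chose G or H explicitly. */
void apply_precise_default(struct event_attr *attr, bool user_set_guest_or_host)
{
    if (attr->precise && !user_set_guest_or_host)
        attr->exclude_guest = true;
}

/* First attempt as configured; on EINVAL, clear exclude_guest and retry once. */
int open_with_fallback(struct event_attr *attr, open_fn_t open_event)
{
    int fd = open_event(attr);

    if (fd == -EINVAL && attr->exclude_guest) {
        attr->exclude_guest = false;
        fd = open_event(attr);
    }
    return fd;
}

/* Toy backend: rejects any exclude_* bit, otherwise returns descriptor 3. */
int toy_ibs_open(const struct event_attr *attr)
{
    return (attr->exclude_guest || attr->exclude_host) ? -EINVAL : 3;
}
```

Against the toy backend, a precise event first fails with EINVAL and then succeeds on the retry with `exclude_guest` cleared, which is the sequence the message predicts for `:p` on AMD with IBS.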
  9. 15 Oct, 2012 1 commit
  10. 13 Oct, 2012 2 commits
  11. 11 Oct, 2012 1 commit
  12. 09 Oct, 2012 2 commits
  13. 07 Oct, 2012 1 commit
  14. 06 Oct, 2012 1 commit
  15. 05 Oct, 2012 9 commits
  16. 04 Oct, 2012 2 commits
  17. 03 Oct, 2012 1 commit