1. 09 Jul, 2009 1 commit
  2. 07 Jul, 2009 2 commits
  3. 01 Jul, 2009 2 commits
  4. 30 Jun, 2009 2 commits
  5. 26 Jun, 2009 8 commits
  6. 25 Jun, 2009 5 commits
    • futex: request only one page from get_user_pages() · aa715284
      Committed by Thomas Gleixner
      Yanmin noticed that fault_in_user_writeable() requests 4 pages instead
      of one.
      
      That's the result of blindly trusting Linus' proposal :) I even looked
      up the prototype to verify the correctness: the argument in question
      is confusingly enough named "len" while in reality it means number of
      pages.
      Pointed-out-by: Yanmin Zhang <yanmin_zhang@linux.intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • ring-buffer: Make it generally available · 1155de47
      Committed by Paul Mundt
      In hunting down the cause for the hwlat_detector ring buffer spew in
      my failed -next builds it became obvious that folks are now treating
      ring_buffer as something generic, independent of tracing, and thus
      suitable for public driver consumption.
      
      Given that there are only a few minor areas in ring_buffer that have any
      reliance on CONFIG_TRACING or CONFIG_FUNCTION_TRACER, provide stubs for
      those and make it generally available.
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
      Cc: Jon Masters <jcm@jonmasters.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <20090625053012.GB19944@linux-sh.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ftrace: Remove duplicate newline · 00e54d08
      Committed by Li Zefan
      Before:
        # echo 'sys_open:traceon:' > set_ftrace_filter
        # echo 'sys_close:traceoff:5' > set_ftrace_filter
        # cat set_ftrace_filter
        #### all functions enabled ####
        sys_open:traceon:unlimited
      
        sys_close:traceoff:count=0
      
      After:
        # cat set_ftrace_filter
        #### all functions enabled ####
        sys_open:traceon:unlimited
        sys_close:traceoff:count=0
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A4313A7.7030105@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • audit: inode watches depend on CONFIG_AUDIT not CONFIG_AUDIT_SYSCALL · 3a6a6c16
      Committed by Eric Paris
      Even though one cannot make use of the audit watch code without
      CONFIG_AUDIT_SYSCALL the spaghetti nature of the audit code means that
      the audit rule filtering requires that it at least be compiled.
      
      Thus build the audit_watch code when we build auditfilter, as was done
      before cfcad62c.
      
      Clearly this is a point of potential future cleanup.
      Reported-by: Frans Pop <elendil@planet.nl>
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • futex: Fix the write access fault problem for real · d0725992
      Committed by Thomas Gleixner
      commit 64d1304a (futex: setup writeable mapping for futex ops which
      modify user space data) did address only half of the problem of write
      access faults.
      
      The patch was made on two wrong assumptions:
      
      1) access_ok(VERIFY_WRITE,...) would actually check write access.
      
         On x86 it does _NOT_. It's a pure address range check.
      
      2) a RW mapped region can not go away under us.
      
         That's wrong as well. Nobody can prevent another thread from calling
         mprotect(PROT_READ) on the region where the futex resides. If that
         call hits between the get_user_pages_fast() verification and the
         actual write access in the atomic region we are toast again.
      
      The solution is to not rely on access_ok and get_user() for any write
      access related fault on private and shared futexes. Instead we need to
      fault it in with verification of write access.
      
      There is no generic non-destructive write mechanism which would fault
      the user page in through a #PF, but as we already know that we will
      fault we can as well call get_user_pages() directly and avoid the #PF
      overhead.
      
      If get_user_pages() returns -EFAULT we know that we can not fix it
      anymore and need to bail out to user space.
      
      Remove a bunch of confusing comments on this issue as well.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@kernel.org
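The two wrong assumptions listed in the commit message above can be observed from user space. The following is only a sketch, not kernel code: `range_ok()`, `probe_write()` and `demo_range_check_is_not_write_check()` are hypothetical helpers invented for illustration. `range_ok()` mimics a pure address range check, and `probe_write()` asks the kernel to store one byte at an address by reading from a pipe into it, which fails with EFAULT when the page is not writable. Note that the probe overwrites the byte, so it is a destructive check, which is exactly the kind of mechanism the message says cannot be used generically.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: a pure address range check. It knows nothing
 * about page protections, so it cannot tell whether a write would fault. */
static bool range_ok(const void *addr, size_t len, uintptr_t limit)
{
	uintptr_t a = (uintptr_t)addr;

	return len <= limit && a <= limit - len;
}

/* Hypothetical helper: let the kernel store one byte at addr by read()ing
 * from a pipe into it. Returns 0 on success or an errno value (EFAULT when
 * the page is not writable). Destructive: the byte at addr is overwritten. */
static int probe_write(void *addr)
{
	int fds[2];
	char byte = 0;
	int err = 0;

	assert(addr != NULL);
	if (pipe(fds) != 0)
		return errno;
	if (write(fds[1], &byte, 1) != 1 || read(fds[0], addr, 1) != 1)
		err = errno;
	close(fds[0]);
	close(fds[1]);
	return err;
}

/* Returns 1 if a page that still passes the range check stops being
 * writable after mprotect(PROT_READ), 0 if not, -1 on setup failure. */
static int demo_range_check_is_not_write_check(void)
{
	long page = sysconf(_SC_PAGESIZE);
	void *p;
	int before, after;
	bool in_range;

	if (page <= 0)
		return -1;
	p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	before = probe_write(p);		/* RW mapping: expected to succeed */
	if (mprotect(p, (size_t)page, PROT_READ) != 0) {
		munmap(p, (size_t)page);
		return -1;
	}
	in_range = range_ok(p, sizeof(uint32_t), UINTPTR_MAX);
	after = probe_write(p);			/* read-only now: expected EFAULT */

	munmap(p, (size_t)page);
	return before == 0 && in_range && after == EFAULT;
}
```

In a multi-threaded program the mprotect() call could come from another thread at any time, which is why a check made earlier says nothing about the later write.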
  7. 24 Jun, 2009 19 commits
    • tracing: Fix trace_buf_size boot option · 9d612bef
      Committed by Li Zefan
      We should be able to specify [KMG] when setting the trace_buf_size
      boot option, as documented in kernel-parameters.txt.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A41F2DB.4020102@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
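For readers unfamiliar with the [KMG] convention, here is a minimal user-space sketch of how such a suffix can be parsed. The function name `parse_size()` is an assumption made for this example. The kernel has its own helper for this purpose, memparse(); the commit message above does not say how the fix is implemented, so this should not be read as the patch itself.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical user-space helper: parse a number with an optional
 * K, M or G suffix (powers of 1024). If end is non-NULL it receives a
 * pointer to the first character that was not consumed. */
static unsigned long long parse_size(const char *s, char **end)
{
	char *p;
	unsigned long long v;

	assert(s != NULL);
	v = strtoull(s, &p, 0);

	switch (*p) {
	case 'G': case 'g':
		v <<= 10;
		/* fall through */
	case 'M': case 'm':
		v <<= 10;
		/* fall through */
	case 'K': case 'k':
		v <<= 10;
		p++;		/* consume the suffix character */
		break;
	default:
		break;
	}

	if (end)
		*end = p;
	return v;
}
```

With this sketch, a value such as `16K` yields 16384 and `2M` yields 2097152, while a plain number is returned unchanged.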
    • timer stats: Optimize by adding quick check to avoid function calls · 507e1231
      Committed by Heiko Carstens
      When the kernel is configured with CONFIG_TIMER_STATS but timer
      stats are runtime disabled we still get calls to
      __timer_stats_timer_set_start_info which initializes some
      fields in the corresponding struct timer_list.
      
      So add some quick checks in the timer stats setup functions
      to avoid function calls to __timer_stats_timer_set_start_info
      when timer stats are disabled.
      
      In an artificial workload that does nothing but play ping
      pong with a single tcp packet via loopback this decreases cpu
      consumption by 1 - 1.5%.
      
      This is part of a modified function trace output on SLES11:
      
       perl-2497  [00] 28630647177732388 [+  125]: sk_reset_timer <-tcp_v4_rcv
       perl-2497  [00] 28630647177732513 [+  125]: mod_timer <-sk_reset_timer
       perl-2497  [00] 28630647177732638 [+  125]: __timer_stats_timer_set_start_info <-mod_timer
       perl-2497  [00] 28630647177732763 [+  125]: __mod_timer <-mod_timer
       perl-2497  [00] 28630647177732888 [+  125]: __timer_stats_timer_set_start_info <-__mod_timer
       perl-2497  [00] 28630647177733013 [+   93]: lock_timer_base <-__mod_timer
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Mustafa Mesanovic <mustafa.mesanovic@de.ibm.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      LKML-Reference: <20090623153811.GA4641@osiris.boeblingen.de.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
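The idea of a quick check in front of an out-of-line function can be sketched as follows. This is a simplified model, not the kernel code: `struct demo_timer`, `demo_stats_active`, `demo_slow_calls` and both functions are names invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the timer structure. */
struct demo_timer {
	const void *start_site;	/* who armed the timer */
};

static int demo_stats_active;		/* runtime on/off switch */
static unsigned long demo_slow_calls;	/* counts out-of-line calls */

/* The comparatively expensive out-of-line function. */
static void demo_slow_set_start_info(struct demo_timer *t, const void *site)
{
	demo_slow_calls++;
	t->start_site = site;
}

/* Inline wrapper: when statistics are disabled at runtime, the cost is
 * one flag test and no function call. */
static inline void demo_set_start_info(struct demo_timer *t, const void *site)
{
	assert(t != NULL);
	if (!demo_stats_active)
		return;
	demo_slow_set_start_info(t, site);
}
```

Callers always go through the inline wrapper, so the out-of-line function is reached only while statistics are actually enabled.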
    • ftrace: Fix t_hash_start() · d82d6244
      Committed by Li Zefan
      When the output of set_ftrace_filter is larger than PAGE_SIZE,
      t_hash_start() will be called a 2nd time, and then we start
      from the head of a hlist, which is wrong and causes some entries
      to be output twice.
      
      Worse, if the hlist is large enough, reading set_ftrace_filter
      never stops and spins in an endless loop.
      Reviewed-by: Liming Wang <liming.wang@windriver.com>
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A41876E.2060407@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ftrace: Don't manipulate @pos in t_start() · 694ce0a5
      Committed by Li Zefan
      It's rather confusing that in t_start(), in some cases @pos is
      incremented, and in some cases it's decremented and then incremented.
      
      This patch rewrites t_start() in a much more general way.
      
      Thus we fix a bug that if ftrace_filtered == 1, functions that have tracer
      hooks won't be printed, because the branch is always unreachable:
      
      static void *t_start(...)
      {
      	...
      	if (!p)
      		return t_hash_start(m, pos);
      	return p;
      }
      
      Before:
        # echo 'sys_open' > /mnt/tracing/set_ftrace_filter
        # echo 'sys_write:traceon:4' >> /mnt/tracing/set_ftrace_filter
        sys_open
      
      After:
        # echo 'sys_open' > /mnt/tracing/set_ftrace_filter
        # echo 'sys_write:traceon:4' >> /mnt/tracing/set_ftrace_filter
        sys_open
        sys_write:traceon:count=4
      Reviewed-by: Liming Wang <liming.wang@windriver.com>
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A41874B.4090507@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ftrace: Don't increment @pos in g_start() · 85951842
      Committed by Li Zefan
      It's wrong to increment @pos in g_start(). It causes some entries
      to be lost when reading set_graph_function, if the output of the file
      is larger than PAGE_SIZE.
      Reviewed-by: Liming Wang <liming.wang@windriver.com>
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A418738.7090401@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: Reset iterator in t_start() · f129e965
      Committed by Li Zefan
      The iterator is m->private, but it's not reset to trace_types in
      t_start(). If the output is larger than PAGE_SIZE and t_start()
      is called a 2nd time, things will go wrong.
      Reviewed-by: Liming Wang <liming.wang@windriver.com>
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A418728.5020506@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • trace_stat: Don't increment @pos in seq start() · 2961bf34
      Committed by Li Zefan
      It's wrong to increment @pos in stat_seq_start(). It causes some
      stat entries to be lost when reading the stat file, if the output of the file
      is larger than PAGE_SIZE.
      Reviewed-by: Liming Wang <liming.wang@windriver.com>
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A418716.90209@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing_bprintk: Don't increment @pos in t_start() · c8961ec6
      Committed by Li Zefan
      It's wrong to increment @pos in t_start(), otherwise we'll lose
      some entries when reading printk_formats, if the output is larger
      than PAGE_SIZE.
      Reported-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Reviewed-by: Liming Wang <liming.wang@windriver.com>
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A4186FA.1020106@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/events: Don't increment @pos in s_start() · e1c7e2a6
      Committed by Li Zefan
      While testing syscall tracepoints posted by Jason, I found 3 entries
      were missing when reading available_events. The output size of
      available_events is < 4 pages, which means we lost 1 entry per page.
      
      The cause is, it's wrong to increment @pos in s_start().
      
      Actually there's another bug here -- reading available_events/set_events
      can race with module unload:
      
        # cat available_events               |
            s_start()                        |
            s_stop()                         |
                                             | # rmmod foo.ko
            s_start()                        |
              call = list_entry(m->private)  |
      
      @call might be freed and accessing it will lead to a crash.
      Reviewed-by: Liming Wang <liming.wang@windriver.com>
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A4186DD.6090405@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
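The "one lost entry per page" effect described in this and the preceding @pos commits can be reproduced with a small user-space model of a start()/next() iterator. This is a simplified sketch under stated assumptions, not the kernel's seq_file code: `struct demo_iter`, `demo_start()`, `demo_next()`, `demo_read_all()` and the `buggy` flag are all invented for illustration, and a "page" is modelled as a fixed number of entries.

```c
#include <assert.h>

#define DEMO_ENTRIES 10L	/* total number of entries in the "file" */

struct demo_iter {
	long cursor;		/* index of the entry the iterator points at */
};

/* start(): seek to *pos and return that entry's index, or -1 at the end.
 * With buggy != 0 it also increments *pos, which is the mistake the
 * patches above remove. */
static long demo_start(struct demo_iter *it, long *pos, int buggy)
{
	it->cursor = *pos;
	if (buggy)
		(*pos)++;
	return it->cursor < DEMO_ENTRIES ? it->cursor : -1;
}

/* next(): advance both the iterator and *pos by one entry. */
static long demo_next(struct demo_iter *it, long *pos)
{
	(*pos)++;
	it->cursor++;
	return it->cursor < DEMO_ENTRIES ? it->cursor : -1;
}

/* Read the whole "file" with a buffer that holds per_page entries.
 * When the buffer is full the reader stops and later restarts from
 * *pos. Returns the number of entries that were emitted. */
static int demo_read_all(int per_page, int buggy)
{
	struct demo_iter it;
	long pos = 0;
	int emitted = 0;

	assert(per_page > 0);
	for (;;) {
		long idx = demo_start(&it, &pos, buggy);
		int in_page = 0;

		while (idx >= 0 && in_page < per_page) {
			emitted++;	/* show() the entry at idx */
			in_page++;
			idx = demo_next(&it, &pos);
		}
		if (idx < 0)
			break;		/* end of data */
		/* buffer full: stop(), then restart from pos */
	}
	return emitted;
}
```

In this model, 10 entries read 4 per page come out complete with the correct start(), while the buggy variant emits only 8: *pos runs one ahead of the iterator, so each restart skips the entry that did not fit. If everything fits in a single page, nothing is lost, which matches the "larger than PAGE_SIZE" condition in the commit messages.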
    • Fix rule eviction order for AUDIT_DIR · 916d7576
      Committed by Al Viro
      If a syscall removes the root of a subtree being watched, we
      definitely do not want the rules referring to that subtree
      to be destroyed without the syscall in question having
      a chance to match them.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • Audit: clean up all op= output to include string quoting · 9d960985
      Committed by Eric Paris
      In a number of places in the audit system we send an op= followed by a string
      that includes spaces.  Somehow this works but it's just wrong.  This patch
      moves all of those that I could find to be quoted.
      
      Example:
      
      Change From: type=CONFIG_CHANGE msg=audit(1244666690.117:31): auid=0 ses=1
      subj=unconfined_u:unconfined_r:auditctl_t:s0-s0:c0.c1023 op=remove rule
      key="number2" list=4 res=0
      
      Change To: type=CONFIG_CHANGE msg=audit(1244666690.117:31): auid=0 ses=1
      subj=unconfined_u:unconfined_r:auditctl_t:s0-s0:c0.c1023 op="remove rule"
      key="number2" list=4 res=0
      Signed-off-by: Eric Paris <eparis@redhat.com>
    • Audit: move audit_get_nd completely into audit_watch · 35fe4d0b
      Committed by Eric Paris
      audit_get_nd() is only used by audit_watch and could be more cleanly
      implemented by having the audit watch functions call it when needed rather
      than making the generic audit rule parsing code deal with those objects.
      Signed-off-by: Eric Paris <eparis@redhat.com>
    • audit: seperate audit inode watches into a subfile · cfcad62c
      Committed by Eric Paris
      In preparation for converting audit to use fsnotify instead of inotify we
      separate the inode watching code into its own file.  This is similar to
      how the audit tree watching code is already separated into audit_tree.c
      Signed-off-by: Eric Paris <eparis@redhat.com>
    • Audit: clean up audit_receive_skb · ea7ae60b
      Committed by Eric Paris
      It is hard to tell what audit_receive_skb is doing to the netlink
      message.  Clean the function up so it is easy and clear to see what is going
      on.
      Signed-off-by: Eric Paris <eparis@redhat.com>
    • Audit: cleanup netlink mesg handling · ee080e6c
      Committed by Eric Paris
      The audit handling of netlink messages is all over the place.  Clean things
      up, use predetermined macros, generally make it more readable.
      Signed-off-by: Eric Paris <eparis@redhat.com>
    • Audit: unify the printk of an skb when auditd not around · 038cbcf6
      Committed by Eric Paris
      Remove code duplication of skb printk when auditd is not around in userspace
      to deal with this message.
      Signed-off-by: Eric Paris <eparis@redhat.com>
    • Audit: dereferencing krule as if it were an audit_watch · e85188f4
      Committed by Eric Paris
      audit_update_watch() runs all of the rules for a given watch and duplicates
      them, attaches a new watch to them, and then when it finishes that process
      and has called free on all of the old rules (ok maybe still inside the rcu
      grace period) it proceeds to use the last element from list_for_each_entry_safe()
      as if it were a krule rather than being the audit_watch which was anchoring
      the list to output a message about audit rules changing.
      
      This patch unifies the audit message from two different places into a helper
      function and calls it from the correct location in audit_update_rules().  We
      will now get an audit message about the config changing for each rule (with
      each rule's filterkey) rather than the previous garbage.
      Signed-off-by: Eric Paris <eparis@redhat.com>
    • Audit: better estimation of execve record length · b87ce6e4
      Committed by Eric Paris
      The audit execve record splitting code estimates the length of the message
      generated.  But it forgot to include the "" that wrap each string in its
      estimation.  This means that execve messages with lots of tiny (1-2 byte)
      arguments could still cause records greater than 8k to be emitted.  Simply
      fix the estimate.
      Signed-off-by: Eric Paris <eparis@redhat.com>
    • Audit: fix audit watch use after free · 35aa901c
      Committed by Eric Paris
      When an audit watch is added to a parent the temporary watch inside the
      original krule from userspace is freed.  Yet the original watch is used after
      the real watch was created in audit_add_rules().
      Signed-off-by: Eric Paris <eparis@redhat.com>
  8. 23 Jun, 2009 1 commit