1. 15 3月, 2013 9 次提交
    • S
      tracing: Clear all trace buffers when unloaded module event was used · 873c642f
      Steven Rostedt (Red Hat) 提交于
      Currently we do not know what buffer a module event was enabled in.
      On unload, it is safest to clear all buffer instances, not just the
      top level buffer.
      
      Todo: Clear only the buffer that the event was used in. The
      infrastructure is there to do this, but it makes the code a bit
      more complex. Lets get the current code vetted before we add that.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      873c642f
    • S
      tracing: Only clear trace buffer on module unload if event was traced · 575380da
      Steven Rostedt (Red Hat) 提交于
      Currently, when a module with events is unloaded, the trace buffer is
      cleared. This is just a safety net in case the module might have some
      strange callback when its event is outputted. But there's no reason
      to reset the buffer if the module didn't have any of its events traced.
      
      Add a flag to the event "call" structure called WAS_ENABLED and gets set
      when the event is ever enabled, and this flag never gets cleared. When a
      module gets unloaded, if any of its events have this flag set, then the
      trace buffer will get cleared.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      575380da
    • S
      tracing: Fix trace events build without modules · 315326c1
      Steven Rostedt (Red Hat) 提交于
      The new multi-buffers added a descriptor that kept track of module
      events, and the directories they use, with struct ftace_module_file_ops.
      This is used to add a ref count to keep modules from unloading while
      their files are being accessed.
      
      As the descriptor is only needed when CONFIG_MODULES is enabled, it
      is only declared when the config is enabled. But that struct is
      dereferenced in a few areas outside the #ifdef CONFIG_MODULES.
      
      By adding some helper routines and moving code around a little,
      events can be compiled again without modules.
      Reported-by: NFengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      315326c1
    • S
      tracing: Use direct field, type and system names · 92edca07
      Steven Rostedt 提交于
      The names used to display the field and type in the event format
      files are copied, as well as the system name that is displayed.
      
      All these names are created by constant values passed in.
      If one of theses values were to be removed by a module, the module
      would also be required to remove any event it created.
      
      By using the strings directly, we can save over 100K of memory.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      92edca07
    • S
      tracing: Use kmem_cache_alloc instead of kmalloc in trace_events.c · d1a29143
      Steven Rostedt 提交于
      The event structures used by the trace events are mostly persistent,
      but they are also allocated by kmalloc, which is not the best at
      allocating space for what is used. By converting these kmallocs
      into kmem_cache_allocs, we can save over 50K of space that is
      permanently allocated.
      
      After boot we have:
      
       slab name          active allocated size
       ---------          ------ --------- ----
      ftrace_event_file    979   1005     56   67    1
      ftrace_event_field   2301   2310     48   77    1
      
      The ftrace_event_file has at boot up 979 active objects out of
      1005 allocated in the slabs. Each object is 56 bytes. In a normal
      kmalloc, that would allocate 64 bytes for each object.
      
       1005 - 979  = 26 objects not used
       26 * 56 = 1456 bytes wasted
      
      But if we used kmalloc:
      
       64 - 56 = 8 bytes unused per allocation
       8 * 979 = 7832 bytes wasted
      
       7832 - 1456 = 6376 bytes in savings
      
      Doing the same for ftrace_event_field where there's 2301 objects
      allocated in a slab that can hold 2310 with 48 bytes each we have:
      
       2310 - 2301 = 9 objects not used
       9 * 48 = 432 bytes wasted
      
      A kmalloc would also use 64 bytes per object:
      
       64 - 48 = 16 bytes unused per allocation
       16 * 2301 = 36816 bytes wasted!
      
       36816 - 432 = 36384 bytes in savings
      
      This change gives us a total of 42760 bytes in savings. At least
      on my machine, but as there's a lot of these persistent objects
      for all configurations that use trace points, this is a net win.
      
      Thanks to Ezequiel Garcia for his trace_analyze presentation which
      pointed out the wasted space in my code.
      
      Cc: Ezequiel Garcia <elezegarcia@gmail.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      d1a29143
    • S
      tracing: Get trace_events kernel command line working again · 77248221
      Steven Rostedt 提交于
      With the new descriptors used to allow multiple buffers in the
      tracing directory added, the kernel command line parameter
      trace_events=... no longer works. This is because the top level
      (global) trace array now has a list of descriptors associated
      with the events and the files in the debugfs directory. But in
      early bootup, when the command line is processed and the events
      enabled, the trace array list of events has not been set up yet.
      
      Without the list of events in the trace array, the setting of
      events to record will fail because it would not match any events.
      
      The solution is to set up the top level array in two stages.
      The first is to just add the ftrace file descriptors that just point
      to the events. This will allow events to be enabled and start tracing.
      The second stage is called after the filesystem is set up, and this
      stage will create the debugfs event files and directories associated
      with the trace array events.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      77248221
    • S
      tracing: Add rmdir to remove multibuffer instances · 0c8916c3
      Steven Rostedt 提交于
      Add a method to the hijacked dentry descriptor of the
      "instances" directory to allow for rmdir to remove an
      instance of a multibuffer.
      
      Example:
      
        cd /debug/tracing/instances
        mkdir hello
        ls
      hello/
        rmdir hello
        ls
      
      Like the mkdir method, the i_mutex is dropped for the instances
      directory. The instances directory is created at boot up and can
      not be renamed or removed. The trace_types_lock mutex is used to
      synchronize adding and removing of instances.
      
      I've run several stress tests with different threads trying to
      create and delete directories of the same name, and it has stood
      up fine.
      
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      0c8916c3
    • S
      tracing: Add interface to allow multiple trace buffers · 277ba044
      Steven Rostedt 提交于
      Add the interface ("instances" directory) to add multiple buffers
      to ftrace. To create a new instance, simply do a mkdir in the
      instances directory:
      
      This will create a directory with the following:
      
       # cd instances
       # mkdir foo
       # ls foo
      buffer_size_kb        free_buffer  trace_clock    trace_pipe
      buffer_total_size_kb  set_event    trace_marker   tracing_enabled
      events/               trace        trace_options  tracing_on
      
      Currently only events are able to be set, and there isn't a way
      to delete a buffer when one is created (yet).
      
      Note, the i_mutex lock is dropped from the parent "instances"
      directory during the mkdir operation. As the "instances" directory
      can not be renamed or deleted (created on boot), I do not see
      any harm in dropping the lock. The creation of the sub directories
      is protected by trace_types_lock mutex, which only lets one
      instance get into the code path at a time. If two tasks try to
      create or delete directories of the same name, only one will occur
      and the other will fail with -EEXIST.
      
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      277ba044
    • S
      tracing: Separate out trace events from global variables · ae63b31e
      Steven Rostedt 提交于
      The trace events for ftrace are all defined via global variables.
      The arrays of events and event systems are linked to a global list.
      This prevents multiple users of the event system (what to enable and
      what not to).
      
      By adding descriptors to represent the event/file relation, as well
      as to which trace_array descriptor they are associated with, allows
      for more than one set of events to be defined. Once the trace events
      files have a link between the trace event and the trace_array they
      are associated with, we can create multiple trace_arrays that can
      record separate events in separate buffers.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      ae63b31e
  2. 22 1月, 2013 1 次提交
    • S
      tracing: Remove the extra 4 bytes of padding in events · b000c806
      Steven Rostedt 提交于
      Due to a userspace issue with PowerTop v2beta, which hardcoded
      the offset of event fields that it was using, it broke when
      we removed the Big Kernel Lock counter from the event header.
      
       (commit e6e1e259 "tracing: Remove lock_depth from event entry")
      
      Because this broke userspace, it was determined that we must
      keep those 4 bytes around.
      
       (commit a3a4a5ac "Regression: partial revert "tracing: Remove lock_depth from event entry"")
      
      This unfortunately wastes space in the ring buffer. 4 bytes per
      event, where a lot of events are just 24 bytes. That's 16% of the
      buffer wasted. A million events will add 4 megs of white space
      into the buffer.
      
      It was later noticed that PowerTop v2beta could not work on systems
      where the kernel was 64 bit but the userspace was 32 bits.
      The reason was because the offsets are different between the
      two and the hard coded offset of one would not work with the other.
      
      With PowerTop v2 final, it implemented the same interface that both
      perf and trace-cmd use. That is, it reads the format file of
      the event to find the offsets of the fields it needs. This fixes
      the problem with running powertop on a 32 bit userspace running
      on a 64 bit kernel. It also no longer requires the 4 byte padding.
      
      As PowerTop v2 has been out for a while, and is included in all
      major distributions, it is time that we can safely remove the
      4 bytes of padding. Users of PowerTop v2beta should upgrade to
      PowerTop v2 final.
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Acked-by: NArjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      b000c806
  3. 02 11月, 2012 2 次提交
    • S
      tracing: Use irq_work for wake ups and remove *_nowake_*() functions · 0d5c6e1c
      Steven Rostedt 提交于
      Have the ring buffer commit function use the irq_work infrastructure to
      wake up any waiters waiting on the ring buffer for new data. The irq_work
      was created for such a purpose, where doing the actual wake up at the
      time of adding data is too dangerous, as an event or function trace may
      be in the midst of the work queue locks and cause deadlocks. The irq_work
      will either delay the action to the next timer interrupt, or trigger an IPI
      to itself forcing an interrupt to do the work (in a safe location).
      
      With irq_work, all ring buffer commits can safely do wakeups, removing
      the need for the ring buffer commit "nowake" variants, which were used
      by events and function tracing. All commits can now safely use the
      normal commit, and the "nowake" variants can be removed.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      0d5c6e1c
    • S
      tracing: Separate open function from set_event and available_events · 15075cac
      Steven Rostedt 提交于
      The open function used by available_events is the same as set_event even
      though it uses different seq functions. This causes a side effect of
      writing into available_events clearing all events, even though
      available_events is suppose to be read only.
      
      There's no reason to keep a single function for just the open and have
      both use different functions for everything else. It is a little
      confusing and causes strange behavior. Just have each have their own
      function.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      15075cac
  4. 01 11月, 2012 1 次提交
    • S
      tracing: Enable comm recording if trace_printk() is used · 81698831
      Steven Rostedt 提交于
      If comm recording is not enabled when trace_printk() is used then
      you just get this type of output:
      
      [ adding trace_printk("hello! %d", irq); in do_IRQ ]
      
                 <...>-2843  [001] d.h.    80.812300: do_IRQ: hello! 14
                 <...>-2734  [002] d.h2    80.824664: do_IRQ: hello! 14
                 <...>-2713  [003] d.h.    80.829971: do_IRQ: hello! 14
                 <...>-2814  [000] d.h.    80.833026: do_IRQ: hello! 14
      
      By enabling the comm recorder when trace_printk is enabled:
      
             hackbench-6715  [001] d.h.   193.233776: do_IRQ: hello! 21
                  sshd-2659  [001] d.h.   193.665862: do_IRQ: hello! 21
                <idle>-0     [001] d.h1   193.665996: do_IRQ: hello! 21
      Suggested-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      81698831
  5. 25 9月, 2012 1 次提交
  6. 14 9月, 2012 1 次提交
  7. 31 7月, 2012 1 次提交
    • S
      ftrace: Add default recursion protection for function tracing · 4740974a
      Steven Rostedt 提交于
      As more users of the function tracer utility are being added, they do
      not always add the necessary recursion protection. To protect from
      function recursion due to tracing, if the callback ftrace_ops does not
      specifically specify that it protects against recursion (by setting
      the FTRACE_OPS_FL_RECURSION_SAFE flag), the list operation will be
      called by the mcount trampoline which adds recursion protection.
      
      If the flag is set, then the function will be called directly with no
      extra protection.
      
      Note, the list operation is called if more than one function callback
      is registered, or if the arch does not support all of the function
      tracer features.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      4740974a
  8. 20 7月, 2012 2 次提交
  9. 11 5月, 2012 1 次提交
    • S
      tracing: Do not enable function event with enable · 9b63776f
      Steven Rostedt 提交于
      With the adding of function tracing event to perf, it caused a
      side effect that produces the following warning when enabling all
      events in ftrace:
      
       # echo 1 > /sys/kernel/debug/tracing/events/enable
      
      [console]
      event trace: Could not enable event function
      
      This is because when enabling all events via the debugfs system
      it ignores events that do not have a ->reg() function assigned.
      This was to skip over the ftrace internal events (as they are
      not TRACE_EVENTs). But as the ftrace function event now has
      a ->reg() function attached to it for use with perf, it is no
      longer ignored.
      
      Worse yet, this ->reg() function is being called when it should
      not be. It returns an error and causes the above warning to
      be printed.
      
      By adding a new event_call flag (TRACE_EVENT_FL_IGNORE_ENABLE)
      and have all ftrace internel event structures have it set,
      setting the events/enable will no longe try to incorrectly enable
      the function event and does not warn.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      9b63776f
  10. 22 2月, 2012 2 次提交
  11. 06 12月, 2011 1 次提交
  12. 01 11月, 2011 1 次提交
  13. 07 7月, 2011 2 次提交
    • S
      tracing: Have "enable" file use refcounts like the "filter" file · 40ee4dff
      Steven Rostedt 提交于
      The "enable" file for the event system can be removed when a module
      is unloaded and the event system only has events from that module.
      As the event system nr_events count goes to zero, it may be freed
      if its ref_count is also set to zero.
      
      Like the "filter" file, the "enable" file may be opened by a task and
      referenced later, after a module has been unloaded and the events for
      that event system have been removed.
      
      Although the "filter" file referenced the event system structure,
      the "enable" file only references a pointer to the event system
      name. Since the name is freed when the event system is removed,
      it is possible that an access to the "enable" file may reference
      a freed pointer.
      
      Update the "enable" file to use the subsystem_open() routine that
      the "filter" file uses, to keep a reference to the event system
      structure while the "enable" file is opened.
      
      Cc: <stable@kernel.org>
      Reported-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      40ee4dff
    • S
      tracing: Fix bug when reading system filters on module removal · e9dbfae5
      Steven Rostedt 提交于
      The event system is freed when its nr_events is set to zero. This happens
      when a module created an event system and then later the module is
      removed. Modules may share systems, so the system is allocated when
      it is created and freed when the modules are unloaded and all the
      events under the system are removed (nr_events set to zero).
      
      The problem arises when a task opened the "filter" file for the
      system. If the module is unloaded and it removed the last event for
      that system, the system structure is freed. If the task that opened
      the filter file accesses the "filter" file after the system has
      been freed, the system will access an invalid pointer.
      
      By adding a ref_count, and using it to keep track of what
      is using the event system, we can free it after all users
      are finished with the event system.
      
      Cc: <stable@kernel.org>
      Reported-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      e9dbfae5
  14. 15 6月, 2011 1 次提交
  15. 26 5月, 2011 1 次提交
  16. 07 5月, 2011 1 次提交
  17. 10 3月, 2011 2 次提交
    • Y
      tracing: Export trace_set_clr_event() · 56355b83
      Yuanhan Liu 提交于
      Trace events belonging to a module only exists when the module is
      loaded. Well, we can use trace_set_clr_event funtion to enable some
      trace event at the module init routine, so that we will not miss
      something while loading then module.
      
      So, Export the trace_set_clr_event function so that module can use it.
      Signed-off-by: NYuanhan Liu <yuanhan.liu@linux.intel.com>
      LKML-Reference: <1289196312-25323-1-git-send-email-yuanhan.liu@linux.intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      56355b83
    • S
      tracing: Remove lock_depth from event entry · e6e1e259
      Steven Rostedt 提交于
      The lock_depth field in the event headers was added as a temporary
      data point for help in removing the BKL. Now that the BKL is pretty
      much been removed, we can remove this field.
      
      This in turn changes the header from 12 bytes to 8 bytes,
      removing the 4 byte buffer that gcc would insert if the first field
      in the data load was 8 bytes in size.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      e6e1e259
  18. 03 2月, 2011 1 次提交
    • S
      tracing: Replace trace_event struct array with pointer array · e4a9ea5e
      Steven Rostedt 提交于
      Currently the trace_event structures are placed in the _ftrace_events
      section, and at link time, the linker makes one large array of all
      the trace_event structures. On boot up, this array is read (much like
      the initcall sections) and the events are processed.
      
      The problem is that there is no guarantee that gcc will place complex
      structures nicely together in an array format. Two structures in the
      same file may be placed awkwardly, because gcc has no clue that they
      are suppose to be in an array.
      
      A hack was used previous to force the alignment to 4, to pack the
      structures together. But this caused alignment issues with other
      architectures (sparc).
      
      Instead of packing the structures into an array, the structures' addresses
      are now put into the _ftrace_event section. As pointers are always the
      natural alignment, gcc should always pack them tightly together
      (otherwise initcall, extable, etc would also fail).
      
      By having the pointers to the structures in the section, we can still
      iterate the trace_events without causing unnecessary alignment problems
      with other architectures, or depending on the current behaviour of
      gcc that will likely change in the future just to tick us kernel developers
      off a little more.
      
      The _ftrace_event section is also moved into the .init.data section
      as it is now only needed at boot up.
      Suggested-by: NDavid Miller <davem@davemloft.net>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      e4a9ea5e
  19. 19 11月, 2010 1 次提交
    • S
      tracing/events: Show real number in array fields · 04295780
      Steven Rostedt 提交于
      Currently we have in something like the sched_switch event:
      
        field:char prev_comm[TASK_COMM_LEN];	offset:12;	size:16;	signed:1;
      
      When a userspace tool such as perf tries to parse this, the
      TASK_COMM_LEN is meaningless. This is done because the TRACE_EVENT() macro
      simply uses a #len to show the string of the length. When the length is
      an enum, we get a string that means nothing for tools.
      
      By adding a static buffer and a mutex to protect it, we can store the
      string into that buffer with snprintf and show the actual number.
      Now we get:
      
        field:char prev_comm[16];       offset:12;      size:16;        signed:1;
      
      Something much more useful.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      04295780
  20. 15 10月, 2010 1 次提交
    • A
      llseek: automatically add .llseek fop · 6038f373
      Arnd Bergmann 提交于
      All file_operations should get a .llseek operation so we can make
      nonseekable_open the default for future file operations without a
      .llseek pointer.
      
      The three cases that we can automatically detect are no_llseek, seq_lseek
      and default_llseek. For cases where we can we can automatically prove that
      the file offset is always ignored, we use noop_llseek, which maintains
      the current behavior of not returning an error from a seek.
      
      New drivers should normally not use noop_llseek but instead use no_llseek
      and call nonseekable_open at open time.  Existing drivers can be converted
      to do the same when the maintainer knows for certain that no user code
      relies on calling seek on the device file.
      
      The generated code is often incorrectly indented and right now contains
      comments that clarify for each added line why a specific variant was
      chosen. In the version that gets submitted upstream, the comments will
      be gone and I will manually fix the indentation, because there does not
      seem to be a way to do that using coccinelle.
      
      Some amount of new code is currently sitting in linux-next that should get
      the same modifications, which I will do at the end of the merge window.
      
      Many thanks to Julia Lawall for helping me learn to write a semantic
      patch that does all this.
      
      ===== begin semantic patch =====
      // This adds an llseek= method to all file operations,
      // as a preparation for making no_llseek the default.
      //
      // The rules are
      // - use no_llseek explicitly if we do nonseekable_open
      // - use seq_lseek for sequential files
      // - use default_llseek if we know we access f_pos
      // - use noop_llseek if we know we don't access f_pos,
      //   but we still want to allow users to call lseek
      //
      @ open1 exists @
      identifier nested_open;
      @@
      nested_open(...)
      {
      <+...
      nonseekable_open(...)
      ...+>
      }
      
      @ open exists@
      identifier open_f;
      identifier i, f;
      identifier open1.nested_open;
      @@
      int open_f(struct inode *i, struct file *f)
      {
      <+...
      (
      nonseekable_open(...)
      |
      nested_open(...)
      )
      ...+>
      }
      
      @ read disable optional_qualifier exists @
      identifier read_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      expression E;
      identifier func;
      @@
      ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
      {
      <+...
      (
         *off = E
      |
         *off += E
      |
         func(..., off, ...)
      |
         E = *off
      )
      ...+>
      }
      
      @ read_no_fpos disable optional_qualifier exists @
      identifier read_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      @@
      ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
      {
      ... when != off
      }
      
      @ write @
      identifier write_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      expression E;
      identifier func;
      @@
      ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
      {
      <+...
      (
        *off = E
      |
        *off += E
      |
        func(..., off, ...)
      |
        E = *off
      )
      ...+>
      }
      
      @ write_no_fpos @
      identifier write_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      @@
      ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
      {
      ... when != off
      }
      
      @ fops0 @
      identifier fops;
      @@
      struct file_operations fops = {
       ...
      };
      
      @ has_llseek depends on fops0 @
      identifier fops0.fops;
      identifier llseek_f;
      @@
      struct file_operations fops = {
      ...
       .llseek = llseek_f,
      ...
      };
      
      @ has_read depends on fops0 @
      identifier fops0.fops;
      identifier read_f;
      @@
      struct file_operations fops = {
      ...
       .read = read_f,
      ...
      };
      
      @ has_write depends on fops0 @
      identifier fops0.fops;
      identifier write_f;
      @@
      struct file_operations fops = {
      ...
       .write = write_f,
      ...
      };
      
      @ has_open depends on fops0 @
      identifier fops0.fops;
      identifier open_f;
      @@
      struct file_operations fops = {
      ...
       .open = open_f,
      ...
      };
      
      // use no_llseek if we call nonseekable_open
      ////////////////////////////////////////////
      @ nonseekable1 depends on !has_llseek && has_open @
      identifier fops0.fops;
      identifier nso ~= "nonseekable_open";
      @@
      struct file_operations fops = {
      ...  .open = nso, ...
      +.llseek = no_llseek, /* nonseekable */
      };
      
      @ nonseekable2 depends on !has_llseek @
      identifier fops0.fops;
      identifier open.open_f;
      @@
      struct file_operations fops = {
      ...  .open = open_f, ...
      +.llseek = no_llseek, /* open uses nonseekable */
      };
      
      // use seq_lseek for sequential files
      /////////////////////////////////////
      @ seq depends on !has_llseek @
      identifier fops0.fops;
      identifier sr ~= "seq_read";
      @@
      struct file_operations fops = {
      ...  .read = sr, ...
      +.llseek = seq_lseek, /* we have seq_read */
      };
      
      // use default_llseek if there is a readdir
      ///////////////////////////////////////////
      @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier readdir_e;
      @@
      // any other fop is used that changes pos
      struct file_operations fops = {
      ... .readdir = readdir_e, ...
      +.llseek = default_llseek, /* readdir is present */
      };
      
      // use default_llseek if at least one of read/write touches f_pos
      /////////////////////////////////////////////////////////////////
      @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read.read_f;
      @@
      // read fops use offset
      struct file_operations fops = {
      ... .read = read_f, ...
      +.llseek = default_llseek, /* read accesses f_pos */
      };
      
      @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier write.write_f;
      @@
      // write fops use offset
      struct file_operations fops = {
      ... .write = write_f, ...
      +	.llseek = default_llseek, /* write accesses f_pos */
      };
      
      // Use noop_llseek if neither read nor write accesses f_pos
      ///////////////////////////////////////////////////////////
      
      @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read_no_fpos.read_f;
      identifier write_no_fpos.write_f;
      @@
      // write fops use offset
      struct file_operations fops = {
      ...
       .write = write_f,
       .read = read_f,
      ...
      +.llseek = noop_llseek, /* read and write both use no f_pos */
      };
      
      @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier write_no_fpos.write_f;
      @@
      struct file_operations fops = {
      ... .write = write_f, ...
      +.llseek = noop_llseek, /* write uses no f_pos */
      };
      
      @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read_no_fpos.read_f;
      @@
      struct file_operations fops = {
      ... .read = read_f, ...
      +.llseek = noop_llseek, /* read uses no f_pos */
      };
      
      @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      @@
      struct file_operations fops = {
      ...
      +.llseek = noop_llseek, /* no read or write fn */
      };
      ===== End semantic patch =====
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Cc: Julia Lawall <julia@diku.dk>
      Cc: Christoph Hellwig <hch@infradead.org>
      6038f373
  21. 18 8月, 2010 1 次提交
  22. 13 8月, 2010 1 次提交
    • S
      tracing/events: Convert format output to seq_file · 2a37a3df
      Steven Rostedt 提交于
      Two new events were added that broke the current format output.
      
      Both from the SCSI system: scsi_dispatch_cmd_done and scsi_dispatch_cmd_timeout
      
      The reason is that their print_fmt exceeded a page size. Since the output
      of the format used simple_read_from_buffer and trace_seq, it was limited
      to a page size in output.
      
      This patch converts the printing of the format of an event into seq_file,
      which allows greater than a page size to be shown.
      
      I diffed all event formats comparing the output with and without this
      patch. All matched except for the above two, which showed just:
      
        FORMAT TOO BIG
      
      without this patch, but now properly displays the output with this patch.
      
      v2: Remove updating *pos in seq start function.
         [ Thanks to Li Zefan for pointing that out ]
      Reviewed-by: NLi Zefan <lizf@cn.fujitsu.com>
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Cc: Kei Tokunaga <tokunaga.keiich@jp.fujitsu.com>
      Cc: James Bottomley <James.Bottomley@suse.de>
      Cc: Tomohiro Kusumi <kusumi.tomohiro@jp.fujitsu.com>
      Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      2a37a3df
  23. 21 7月, 2010 1 次提交
    • L
      tracing: Allow to disable cmdline recording · e870e9a1
      Li Zefan 提交于
      We found that even enabling a single trace event that will rarely be
      triggered can add big overhead to context switch.
      
      (lmbench context switch test)
       -------------------------------------------------
       2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
       ctxsw  ctxsw  ctxsw ctxsw  ctxsw   ctxsw   ctxsw
      ------ ------ ------ ------ ------ ------- -------
        2.19   2.3   2.21   2.56   2.13     2.54    2.07
        2.39   2.51  2.35   2.75   2.27     2.81    2.24
      
      The overhead is 6% ~ 11%.
      
      It's because when a trace event is enabled 3 tracepoints (sched_switch,
      sched_wakeup, sched_wakeup_new) will be activated to map pid to cmdname.
      
      We'd like to avoid this overhead, so add a trace option '(no)record-cmd'
      to allow to disable cmdline recording.
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <4C2D57F4.2050204@cn.fujitsu.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      e870e9a1
  24. 29 6月, 2010 4 次提交