1. 06 8月, 2010 1 次提交
  2. 22 5月, 2010 1 次提交
    • E
      sysfs: Implement sysfs tagged directory support. · 3ff195b0
      Eric W. Biederman 提交于
      The problem.  When implementing a network namespace I need to be able
      to have multiple network devices with the same name.  Currently this
      is a problem for /sys/class/net/*, /sys/devices/virtual/net/*, and
      potentially a few other directories of the form /sys/ ... /net/*.
      
      What this patch does is to add an additional tag field to the
      sysfs dirent structure.  For directories that should show different
      contents depending on the context such as /sys/class/net/, and
      /sys/devices/virtual/net/ this tag field is used to specify the
      context in which those directories should be visible.  Effectively
      this is the same as creating multiple distinct directories with
      the same name but internally to sysfs the result is nicer.
      
      I am calling the concept of a single directory that looks like multiple
      directories all at the same path in the filesystem tagged directories.
      
      For the networking namespace the set of directories whose contents I need
      to filter with tags can depend on the presence or absence of hotplug
      hardware or which modules are currently loaded.  Which means I need
      a simple race free way to setup those directories as tagged.
      
      To achieve a reace free design all tagged directories are created
      and managed by sysfs itself.
      
      Users of this interface:
      - define a type in the sysfs_tag_type enumeration.
      - call sysfs_register_ns_types with the type and it's operations
      - sysfs_exit_ns when an individual tag is no longer valid
      
      - Implement mount_ns() which returns the ns of the calling process
        so we can attach it to a sysfs superblock.
      - Implement ktype.namespace() which returns the ns of a syfs kobject.
      
      Everything else is left up to sysfs and the driver layer.
      
      For the network namespace mount_ns and namespace() are essentially
      one line functions, and look to remain that.
      
      Tags are currently represented a const void * pointers as that is
      both generic, prevides enough information for equality comparisons,
      and is trivial to create for current users, as it is just the
      existing namespace pointer.
      
      The work needed in sysfs is more extensive.  At each directory
      or symlink creating I need to check if the directory it is being
      created in is a tagged directory and if so generate the appropriate
      tag to place on the sysfs_dirent.  Likewise at each symlink or
      directory removal I need to check if the sysfs directory it is
      being removed from is a tagged directory and if so figure out
      which tag goes along with the name I am deleting.
      
      Currently only directories which hold kobjects, and
      symlinks are supported.  There is not enough information
      in the current file attribute interfaces to give us anything
      to discriminate on which makes it useless, and there are
      no potential users which makes it an uninteresting problem
      to solve.
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NBenjamin Thery <benjamin.thery@bull.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
      3ff195b0
  3. 08 3月, 2010 4 次提交
  4. 12 12月, 2009 2 次提交
  5. 15 10月, 2009 1 次提交
    • N
      sysfs: Allow sysfs_notify_dirent to be called from interrupt context. · 83db93f4
      Neil Brown 提交于
      sysfs_notify_dirent is a simple atomic operation that can be used to
      alert user-space that new data can be read from a sysfs attribute.
      
      Unfortunately it cannot currently be called from non-process context
      because of its use of spin_lock which is sometimes taken with
      interrupts enabled.
      
      So change all lockers of sysfs_open_dirent_lock to disable interrupts,
      thus making sysfs_notify_dirent safe to be called from non-process
      context (as drivers/md does in md_safemode_timeout).
      
      sysfs_get_open_dirent is (documented as being) only called from
      process context, so it uses spin_lock_irq.  Other places
      use spin_lock_irqsave.
      
      The usage for sysfs_notify_dirent in md_safemode_timeout was
      introduced in 2.6.28, so this patch is suitable for that and more
      recent kernels.
      Reported-by: NJoel Andres Granados <jgranado@redhat.com>
      Signed-off-by: NNeilBrown <neilb@suse.de>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      Cc: stable <stable@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
      83db93f4
  6. 29 5月, 2009 1 次提交
  7. 17 4月, 2009 2 次提交
    • K
      sysfs: sysfs poll keep the poll rule of regular file. · 1af3557a
      KOSAKI Motohiro 提交于
      Currently, following test programs don't finished.
      
      % ruby -e '
      Thread.new { sleep }
      File.read("/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies")
      '
      
      strace expose the reason.
      
      ...
      open("/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies", O_RDONLY|O_LARGEFILE) = 3
      ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbf9fa6b8) = -1 ENOTTY (Inappropriate ioctl for device)
      fstat64(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0
      _llseek(3, 0, [0], SEEK_CUR)            = 0
      select(4, [3], NULL, NULL, NULL)        = 1 (in [3])
      read(3, "1400000 1300000 1200000 1100000 1"..., 4096) = 62
      select(4, [3], NULL, NULL, NULL
      
      
      Because Ruby (the scripting language) VM assume select system-call
      against regular file don't block.  it because SUSv3 says "Regular files
      shall always poll TRUE for reading and writing".  see
      http://www.opengroup.org/onlinepubs/009695399/functions/poll.html it
      seems valid assumption.
      
      But sysfs_poll() don't keep this rule although sysfs file can read and
      write always.
      
      This patch restore proper poll behavior to sysfs.
      /sys/block/md*/md/sync_action polling application and another sysfs
      updating sensitive application still can use POLLERR and POLLPRI.
      
      Cc: Neil Brown <neilb@suse.de>
      Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
      1af3557a
    • A
      sysfs: don't use global workqueue in sysfs_schedule_callback() · d110271e
      Alex Chiang 提交于
      A sysfs attribute using sysfs_schedule_callback() to commit suicide
      may end up calling device_unregister(), which will eventually call
      a driver's ->remove function.
      
      Drivers may call flush_scheduled_work() in their shutdown routines,
      in which case lockdep will complain with something like the following:
      
        =============================================
        [ INFO: possible recursive locking detected ]
        2.6.29-rc8-kk #1
        ---------------------------------------------
        events/4/56 is trying to acquire lock:
        (events){--..}, at: [<ffffffff80257fc0>] flush_workqueue+0x0/0xa0
      
        but task is already holding lock:
        (events){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
      
        other info that might help us debug this:
        3 locks held by events/4/56:
        #0:  (events){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
        #1:  (&ss->work){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
        #2:  (pci_remove_rescan_mutex){--..}, at: [<ffffffff803c10d1>] remove_callback+0x21/0x40
      
        stack backtrace:
        Pid: 56, comm: events/4 Not tainted 2.6.29-rc8-kk #1
        Call Trace:
        [<ffffffff8026dfcd>] validate_chain+0xb7d/0x1260
        [<ffffffff8026eade>] __lock_acquire+0x42e/0xa40
        [<ffffffff8026f148>] lock_acquire+0x58/0x80
        [<ffffffff80257fc0>] ? flush_workqueue+0x0/0xa0
        [<ffffffff8025800d>] flush_workqueue+0x4d/0xa0
        [<ffffffff80257fc0>] ? flush_workqueue+0x0/0xa0
        [<ffffffff80258070>] flush_scheduled_work+0x10/0x20
        [<ffffffffa0144065>] e1000_remove+0x55/0xfe [e1000e]
        [<ffffffff8033ee30>] ? sysfs_schedule_callback_work+0x0/0x50
        [<ffffffff803bfeb2>] pci_device_remove+0x32/0x70
        [<ffffffff80441da9>] __device_release_driver+0x59/0x90
        [<ffffffff80441edb>] device_release_driver+0x2b/0x40
        [<ffffffff804419d6>] bus_remove_device+0xa6/0x120
        [<ffffffff8043e46b>] device_del+0x12b/0x190
        [<ffffffff8043e4f6>] device_unregister+0x26/0x70
        [<ffffffff803ba969>] pci_stop_dev+0x49/0x60
        [<ffffffff803baab0>] pci_remove_bus_device+0x40/0xc0
        [<ffffffff803c10d9>] remove_callback+0x29/0x40
        [<ffffffff8033ee4f>] sysfs_schedule_callback_work+0x1f/0x50
        [<ffffffff8025769a>] run_workqueue+0x15a/0x230
        [<ffffffff80257648>] ? run_workqueue+0x108/0x230
        [<ffffffff8025846f>] worker_thread+0x9f/0x100
        [<ffffffff8025bce0>] ? autoremove_wake_function+0x0/0x40
        [<ffffffff802583d0>] ? worker_thread+0x0/0x100
        [<ffffffff8025b89d>] kthread+0x4d/0x80
        [<ffffffff8020d4ba>] child_rip+0xa/0x20
        [<ffffffff8020cebc>] ? restore_args+0x0/0x30
        [<ffffffff8025b850>] ? kthread+0x0/0x80
        [<ffffffff8020d4b0>] ? child_rip+0x0/0x20
      
      Although we know that the device_unregister path will never acquire
      a lock that a driver might try to acquire in its ->remove, in general
      we should never attempt to flush a workqueue from within the same
      workqueue, and lockdep rightly complains.
      
      So as long as sysfs attributes cannot commit suicide directly and we
      are stuck with this callback mechanism, put the sysfs callbacks on
      their own workqueue instead of the global one.
      
      This has the side benefit that if a suicidal sysfs attribute kicks
      off a long chain of ->remove callbacks, we no longer induce a long
      delay on the global queue.
      
      This also fixes a missing module_put in the error path introduced
      by sysfs-only-allow-one-scheduled-removal-callback-per-kobj.patch.
      
      We never destroy the workqueue, but I'm not sure that's a
      problem.
      Reported-by: NKenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
      Tested-by: NKenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
      Signed-off-by: NAlex Chiang <achiang@hp.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
      d110271e
  8. 25 3月, 2009 1 次提交
    • A
      sysfs: only allow one scheduled removal callback per kobj · 66942064
      Alex Chiang 提交于
      The only way for a sysfs attribute to remove itself (without
      deadlock) is to use the sysfs_schedule_callback() interface.
      
      Vegard Nossum discovered that a poorly written sysfs ->store
      callback can repeatedly schedule remove callbacks on the same
      device over and over, e.g.
      
      	$ while true ; do echo 1 > /sys/devices/.../remove ; done
      
      If the 'remove' attribute uses the sysfs_schedule_callback API
      and also does not protect itself from concurrent accesses, its
      callback handler will be called multiple times, and will
      eventually attempt to perform operations on a freed kobject,
      leading to many problems.
      
      Instead of requiring all callers of sysfs_schedule_callback to
      implement their own synchronization, provide the protection in
      the infrastructure.
      
      Now, sysfs_schedule_callback will only allow one scheduled
      callback per kobject. On subsequent calls with the same kobject,
      return -EAGAIN.
      
      This is a short term fix. The long term fix is to allow sysfs
      attributes to remove themselves directly, without any of this
      callback hokey pokey.
      
      [cornelia.huck@de.ibm.com: s390 ccwgroup bits]
      
      Reported-by: vegard.nossum@gmail.com
      Signed-off-by: NAlex Chiang <achiang@hp.com>
      Acked-by: NCornelia Huck <cornelia.huck@de.ibm.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
      66942064
  9. 17 10月, 2008 3 次提交
  10. 27 7月, 2008 1 次提交
  11. 22 7月, 2008 1 次提交
  12. 30 4月, 2008 1 次提交
  13. 23 4月, 2008 1 次提交
  14. 20 4月, 2008 2 次提交
  15. 25 3月, 2008 1 次提交
  16. 25 1月, 2008 3 次提交
  17. 24 1月, 2008 1 次提交
  18. 29 11月, 2007 1 次提交
    • M
      sysfs: fix off-by-one error in fill_read_buffer() · 8118a859
      Miao Xie 提交于
      I found that there is a off-by-one problem in the following code.
      
      Version:	2.6.24-rc2
      File:		fs/sysfs/file.c:118-122
      Function:	fill_read_buffer
      --------------------------------------------------------------------
      	count = ops->show(kobj, attr_sd->s_attr.attr, buffer->page);
      
      	sysfs_put_active_two(attr_sd);
      
      	BUG_ON(count > (ssize_t)PAGE_SIZE);
      --------------------------------------------------------------------
      
      Because according to the specification of the sysfs and the implement of
      the show methods, the show methods return the number of bytes which would
      be generated for the given input, excluding the trailing null.So if the
      return value of the show methods equals PAGE_SIZE - 1, the buffer is full
      in fact.  And if the return value equals PAGE_SIZE, the resulting string
      was already truncated,or buffer overflow occurred.
      
      This patch fixes an off-by-one error in fill_read_buffer.
      Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Acked-by: NTejun Heo <teheo@suse.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
      8118a859
  19. 20 10月, 2007 1 次提交
  20. 13 10月, 2007 11 次提交