1. 17 Dec, 2010 5 commits
    • sr: implement sr_check_events() · 93aae17a
      Tejun Heo committed
      Replace sr_media_change() with sr_check_events().  It normally only
      uses GET_EVENT_STATUS_NOTIFICATION to check both media change and
      eject request.  If @clearing includes DISK_EVENT_MEDIA_CHANGE, it
      issues TUR and compares whether media presence has changed.  The SCSI
      specific media change uevent is kept for compatibility.
      
      sr_media_change() was doing both media change check and revalidation.
      The revalidation part is split into sr_block_revalidate_disk().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • cdrom: add ->check_events() support · 2d921729
      Tejun Heo committed
      In principle, cdrom just needs to pass through ->check_events() but
      CDROM_MEDIA_CHANGED ioctl makes things a bit more complex.  Just as
      with ->media_changed() support, cdrom code needs to buffer the events
      and serve them to ioctl and vfs as requested.
      
      As the code has to deal with both ->check_events() and
      ->media_changed(), and vfs and ioctl event buffering, this patch adds
      check_events caching on top of the existing cdi->mc_flags buffering.
      
      It may be a good idea to deprecate CDROM_MEDIA_CHANGED ioctl and
      remove all this mess.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • implement in-kernel gendisk events handling · 77ea887e
      Tejun Heo committed
      Currently, media presence polling for removable block devices is done
      from userland.  There are several issues with this.
      
      * Polling is done by periodically opening the device.  For SCSI
        devices, the command sequence generated by such action involves a
        few different commands including TEST_UNIT_READY.  This behavior,
        while perfectly legal, is different from Windows which only issues
        a single command, GET_EVENT_STATUS_NOTIFICATION.  Unfortunately,
        some ATAPI devices lock up after being periodically queried with
        such command sequences.
      
      * There is no reliable and unintrusive way for a userland program to
        tell whether the target device is safe for media presence polling.
        For example, polling for media presence during an on-going burning
        session can make it fail.  The polling program can avoid this by
        opening the device with O_EXCL but then it risks making a valid
        exclusive user of the device fail w/ -EBUSY.
      
      * Userland polling is unnecessarily heavy and in-kernel implementation
        is lighter and better coordinated (workqueue, timer slack).
      
      This patch implements a framework for in-kernel disk event handling,
      which includes media presence polling.
      
      * bdops->check_events() is added, which supersedes ->media_changed().
        It should check whether there's any pending event and return it if
        so.  Currently, two events are defined - DISK_EVENT_MEDIA_CHANGE and
        DISK_EVENT_EJECT_REQUEST.  ->check_events() is guaranteed not to be
        called in parallel.
      
      * gendisk->events and ->async_events are added.  These should be
        initialized by block driver before passing the device to add_disk().
        The former contains the mask of all supported events and the latter
        the mask of all events which the device can report without polling.
        /sys/block/*/events[_async] export these to userland.
      
      * Kernel parameter block.events_dfl_poll_msecs controls the system
        polling interval (default is 0 which means disable) and
        /sys/block/*/events_poll_msecs control polling intervals for
        individual devices (default is -1 meaning use system setting).  Note
        that if a device can report all supported events asynchronously and
        its polling interval isn't explicitly set, the device won't be
        polled regardless of the system polling interval.
      
      * If a device is opened exclusively with write access, event checking
        is automatically disabled until all write exclusive accesses are
        released.
      
      * There are event 'clearing' events.  For example, both of currently
        defined events are cleared after the device has been successfully
        opened.  This information is passed to ->check_events() callback
        using @clearing argument as a hint.
      
      * Event checking is always performed from system_nrt_wq and timer
        slack is set to 25% for polling.
      
      * Nothing changes for drivers which implement ->media_changed() but
        not ->check_events().  Going forward, all drivers will be converted
        to ->check_events() and ->media_changed() will be dropped.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
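The polling-interval rules described in the message above can be sketched as a small userspace model. This is only an illustration under stated assumptions: `struct disk_model` and `effective_poll_msecs()` are hypothetical names, not kernel symbols, and the logic follows the prose of the commit message rather than the patch itself.

```c
#include <assert.h>

/* Hypothetical model of one disk's event configuration (not a kernel struct). */
struct disk_model {
	unsigned int events;       /* mask of all supported events */
	unsigned int async_events; /* mask of events reported without polling */
	long poll_msecs;           /* per-device interval; -1 means "use system setting" */
};

/* Returns the effective polling interval in ms; 0 means "do not poll". */
static long effective_poll_msecs(const struct disk_model *d,
				 long system_dfl_msecs)
{
	assert(d->poll_msecs >= -1 && system_dfl_msecs >= 0);

	/* An explicit per-device setting always wins. */
	if (d->poll_msecs >= 0)
		return d->poll_msecs;

	/* Every supported event can arrive asynchronously: no polling. */
	if ((d->events & ~d->async_events) == 0)
		return 0;

	/* Otherwise fall back to the system-wide default (0 disables). */
	return system_dfl_msecs;
}
```

With a system default of 2000 ms, a device whose `async_events` covers all of `events` is not polled unless its own interval is set explicitly.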
    • block: move register_disk() and del_gendisk() to block/genhd.c · d2bf1b67
      Tejun Heo committed
      There's no reason for register_disk() and del_gendisk() to be in
      fs/partitions/check.c.  Move both to genhd.c.  While at it, collapse
      unlink_gendisk(), which was artificially in a separate function due to
      genhd.c / check.c split, into del_gendisk().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • block: kill genhd_media_change_notify() · dddd9dc3
      Tejun Heo committed
      There's no user of the facility.  Kill it.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
  2. 16 Nov, 2010 5 commits
  3. 15 Nov, 2010 1 commit
  4. 13 Nov, 2010 4 commits
    • block: clean up blkdev_get() wrappers and their users · d4d77629
      Tejun Heo committed
      After recent blkdev_get() modifications, open_by_devnum() and
      open_bdev_exclusive() are simple wrappers around blkdev_get().
      Replace them with blkdev_get_by_dev() and blkdev_get_by_path().
      
      blkdev_get_by_dev() is identical to open_by_devnum().
      blkdev_get_by_path() is slightly different in that it doesn't
      automatically add %FMODE_EXCL to @mode.
      
      All users are converted.  Most conversions are mechanical and don't
      introduce any behavior difference.  There are several exceptions.
      
      * btrfs now sets FMODE_EXCL in btrfs_device->mode, so there's no
        reason to OR it explicitly on blkdev_put().
      
      * gfs2, nilfs2 and the generic mount_bdev() now set FMODE_EXCL in
        sb->s_mode.
      
      * With the above changes, sb->s_mode now always should contain
        FMODE_EXCL.  WARN_ON_ONCE() added to kill_block_super() to detect
        errors.
      
      The new blkdev_get_*() functions have proper docbook comments.
      While at it, add function description to blkdev_get() too.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Philipp Reisner <philipp.reisner@linbit.com>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: Joern Engel <joern@lazybastard.org>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp>
      Cc: reiserfs-devel@vger.kernel.org
      Cc: xfs-masters@oss.sgi.com
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
    • block: make blkdev_get/put() handle exclusive access · e525fd89
      Tejun Heo committed
      Over time, block layer has accumulated a set of APIs dealing with bdev
      open, close, claim and release.
      
      * blkdev_get/put() are the primary open and close functions.
      
      * bd_claim/release() deal with exclusive open.
      
      * open/close_bdev_exclusive() are combinations of open and claim and
        the other way around, respectively.
      
      * bd_link/unlink_disk_holder() create and remove holder/slave
        symlinks.
      
      * open_by_devnum() wraps bdget() + blkdev_get().
      
      The interface is a bit confusing and the decoupling of open and claim
      makes it impossible to properly guarantee exclusive access as
      in-kernel open + claim sequence can disturb the existing exclusive
      open even before the block layer knows the current open is for another
      exclusive access.  Reorganize the interface such that,
      
      * blkdev_get() is extended to include exclusive access management.
        @holder argument is added and, if @FMODE_EXCL is specified, it will
        gain exclusive access atomically w.r.t. other exclusive accesses.
      
      * blkdev_put() is similarly extended.  It now takes @mode argument and
        if @FMODE_EXCL is set, it releases an exclusive access.  Also, when
        the last exclusive claim is released, the holder/slave symlinks are
        removed automatically.
      
      * bd_claim/release() and close_bdev_exclusive() are no longer
        necessary and either made static or removed.
      
      * bd_link_disk_holder() remains the same but bd_unlink_disk_holder()
        is no longer necessary and removed.
      
      * open_bdev_exclusive() becomes a simple wrapper around lookup_bdev()
        and blkdev_get().  It also has an unexpected extra bdev_read_only()
        test which probably should be moved into blkdev_get().
      
      * open_by_devnum() is modified to take @holder argument and pass it to
        blkdev_get().
      
      Most of bdev open/close operations are unified into blkdev_get/put()
      and most exclusive accesses are tested atomically at the open time (as
      it should).  This cleans up code and removes some, both valid and
      invalid, but unnecessary all the same, corner cases.
      
      open_bdev_exclusive() and open_by_devnum() can use further cleanup -
      rename to blkdev_get_by_path() and blkdev_get_by_devt() and drop
      special features.  Well, let's leave them for another day.
      
      Most conversions are straight-forward.  drbd conversion is a bit more
      involved as there was some reordering, but the logic should stay the
      same.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Neil Brown <neilb@suse.de>
      Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
      Acked-by: Mike Snitzer <snitzer@redhat.com>
      Acked-by: Philipp Reisner <philipp.reisner@linbit.com>
      Cc: Peter Osterlund <petero2@telia.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <joel.becker@oracle.com>
      Cc: Alex Elder <aelder@sgi.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: dm-devel@redhat.com
      Cc: drbd-dev@lists.linbit.com
      Cc: Leo Chen <leochen@broadcom.com>
      Cc: Scott Branden <sbranden@broadcom.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
      Cc: Joern Engel <joern@logfs.org>
      Cc: reiserfs-devel@vger.kernel.org
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
    • block: simplify holder symlink handling · e09b457b
      Tejun Heo committed
      Code to manage symlinks in /sys/block/*/{holders|slaves} is overly
      complex with multiple holder considerations, redundant extra
      references to all involved kobjects, unused generic kobject holder
      support and unnecessary mixup with bd_claim/release functionalities.
      
      Strip it down to what's necessary (single gendisk holder) and make it
      use a separate interface.  This is a step for cleaning up
      bd_claim/release.  This patch makes dm-table slightly more complex but
      it will be simplified again with further changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Neil Brown <neilb@suse.de>
      Acked-by: Mike Snitzer <snitzer@redhat.com>
      Cc: dm-devel@redhat.com
    • vlan: Add function to retrieve EtherType from vlan packets. · 0a85df00
      Hao Zheng committed
      Depending on how a packet is vlan tagged (i.e. hardware accelerated or
      not), the encapsulated protocol is stored in different locations.  This
      provides a consistent method of accessing that protocol, which is needed
      by drivers, security checks, etc.
      Signed-off-by: Hao Zheng <hzheng@nicira.com>
      Signed-off-by: Jesse Gross <jesse@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 12 Nov, 2010 10 commits
    • backlight: add low threshold to pwm backlight · fef7764f
      Arun Murthy committed
      The intensity of the backlight can be varied over a range from zero to
      max_brightness.  However, most, if not all, pwm based backlight devices
      start flickering at lower brightness values.  Also, for each device there
      exists a brightness value below which the backlight appears to be turned
      off even though the value is not equal to zero.

      If the range of brightness for a device is from zero to max_brightness,
      a graph of brightness vs. intensity for the pwm based backlight device
      should be linear.
      
      intensity
      	  |   /
      	  |  /
      	  | /
      	  |/
      	  ---------
      	 0	max_brightness
      
      But practically, on measuring the above, we note that the intensity of
      the backlight goes to zero (OFF) when the value is not zero but close to
      zero (some x%).  So the graph looks like
      
      intensity
      	  |    /
      	  |   /
      	  |  /
      	  |  |
      	  ------------
      	 0   x	 max_brightness
      
      To overcome this drawback, given this x%, i.e. the low threshold below
      which the backlight is off and further changes have no effect, the
      brightness value is offset by the low threshold value (retaining the
      linearity of the graph).  Now the graph becomes
      
      intensity
      	  |     /
      	  |    /
      	  |   /
      	  |  /
      	  -------------
      	   0	  max_brightness
      
      With this, every unit increment in brightness from zero produces a change
      in the intensity of the backlight.  Devices having this behaviour can set
      the low threshold brightness (lth_brightness) and pass it as platform
      data; others can leave it as zero.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Arun Murthy <arun.murthy@stericsson.com>
      Acked-by: Linus Walleij <linus.walleij@stericsson.com>
      Acked-by: Richard Purdie <rpurdie@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
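The offsetting described in the backlight message can be sketched as a linear remap of the requested brightness onto the range above the low threshold. This is a minimal illustration, not the driver's code: `offset_brightness()` and its parameters are hypothetical, all three values are assumed to share one unit, and zero is assumed to keep meaning "off".

```c
#include <assert.h>

/*
 * Map a requested brightness in [0, max] onto the usable range
 * [lth, max], so that every step above zero changes the light output.
 * Zero is kept as "off".  All names are illustrative.
 */
static unsigned int offset_brightness(unsigned int brightness,
				      unsigned int max,
				      unsigned int lth)
{
	assert(max > 0 && lth <= max && brightness <= max);

	if (brightness == 0)
		return 0;

	/* Linear remap: the slope shrinks by (max - lth) / max. */
	return lth + brightness * (max - lth) / max;
}
```

With `lth` set to zero the function is the identity, which matches the message's note that devices without this behaviour can leave the threshold at zero.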
    • leds: driver for National Semiconductors LP5523 chip · 0efba16c
      Samu Onkalo committed
      The LP5523 chip is a nine-channel LED driver with programmable engines.  The
      driver provides support for that chip for direct access via the led class or
      via programmable engines.
      Signed-off-by: Samu Onkalo <samu.p.onkalo@nokia.com>
      Cc: Richard Purdie <rpurdie@rpsys.net>
      Cc: Jean Delvare <khali@linux-fr.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • leds: driver for National Semiconductor LP5521 chip · 500fe141
      Samu Onkalo committed
      This patchset provides support for LP5521 and LP5523 LED driver chips from
      National Semiconductor.  Both drivers support programmable engines and
      naturally LED class features.
      
      Documentation is provided as a part of the patchset.  I created "leds"
      subdirectory under Documentation.  Perhaps the rest of the leds*
      documentation should be moved there.
      
      Datasheets are freely available at National Semiconductor www pages.
      
      This patch:
      
      The LP5521 chip is a three-channel LED driver with programmable engines.  The
      driver provides support for that chip for direct access via the led class or
      via programmable engines.
      Signed-off-by: Samu Onkalo <samu.p.onkalo@nokia.com>
      Cc: Richard Purdie <rpurdie@rpsys.net>
      Cc: Jean Delvare <khali@linux-fr.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • led-class: always implement blinking · 5ada28bf
      Johannes Berg committed
      Currently, blinking LEDs can be awkward because it is not guaranteed that
      all LEDs implement blinking.  The trigger that wants it to blink then
      needs to implement its own timer solution.
      
      Rather than require that, add led_blink_set() API that triggers can use.
      This function will attempt to use hw blinking, but if that fails it
      implements a timer for it.  To stop blinking again, brightness_set() also
      needs to be wrapped into API that will stop the software blink.
      
      As a result of this, the timer trigger becomes a very trivial one, and
      hopefully we can finally see triggers using blinking as well because it's
      always easy to use.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Acked-by: Richard Purdie <rpurdie@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • radix-tree: fix RCU bug · 27d20fdd
      Nick Piggin committed
      Salman Qazi describes the following radix-tree bug:
      
      In the following case, we can get a deadlock:
      
      0.  The radix tree contains two items, one has the index 0.
      1.  The reader (in this case find_get_pages) takes the rcu_read_lock.
      2.  The reader acquires slot(s) for item(s) including the index 0 item.
      3.  The non-zero index item is deleted, and as a consequence the other item is
          moved to the root of the tree. The place where it used to be is queued for
          deletion after the readers finish.
      3b. The zero item is deleted, removing it from the direct slot, it remains in
          the rcu-delayed indirect node.
      4.  The reader looks at the index 0 slot, and finds that the page has 0 ref
          count
      5.  The reader looks at it again, hoping that the item will either be freed or
          the ref count will increase. This never happens, as the slot it is looking
          at will never be updated. Also, this slot can never be reclaimed because
          the reader is holding rcu_read_lock and is in an infinite loop.
      
      The fix is to re-use the same "indirect" pointer case that requires a slot
      lookup retry into a general "retry the lookup" bit.
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
      Reported-by: Salman Qazi <sqazi@google.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Restrict unprivileged access to kernel syslog · eaf06b24
      Dan Rosenberg committed
      The kernel syslog contains debugging information that is often useful
      during exploitation of other vulnerabilities, such as kernel heap
      addresses.  Rather than futilely attempt to sanitize hundreds (or
      thousands) of printk statements and simultaneously cripple useful
      debugging functionality, it is far simpler to create an option that
      prevents unprivileged users from reading the syslog.
      
      This patch, loosely based on grsecurity's GRKERNSEC_DMESG, creates the
      dmesg_restrict sysctl.  When set to "0", the default, no restrictions are
      enforced.  When set to "1", only users with CAP_SYS_ADMIN can read the
      kernel syslog via dmesg(8) or other mechanisms.
      
      [akpm@linux-foundation.org: explain the config option in kernel.txt]
      Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Acked-by: Eugene Teo <eugeneteo@kernel.org>
      Acked-by: Kees Cook <kees.cook@canonical.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
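The policy described in the syslog message reduces to one check, modelled here in userspace. This is a sketch only: `syslog_read_permitted()` is a hypothetical name, and `has_cap_sys_admin` stands in for whatever capability test the kernel performs.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Userspace model of the check the message describes.
 * dmesg_restrict: 0 = no restriction, 1 = CAP_SYS_ADMIN required.
 * has_cap_sys_admin is an illustrative stand-in for the capability test.
 */
static int syslog_read_permitted(int dmesg_restrict, bool has_cap_sys_admin)
{
	assert(dmesg_restrict == 0 || dmesg_restrict == 1);

	if (dmesg_restrict && !has_cap_sys_admin)
		return -EPERM;

	return 0;
}
```

Only the combination of a set sysctl and a missing capability is refused, so the default of "0" leaves existing behaviour unchanged.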
    • include/linux/highmem.h needs hardirq.h · 43b3a0c7
      Catalin Marinas committed
      Commit 3e4d3af5 ("mm: stack based kmap_atomic()") introduced the
      kmap_atomic_idx_push() function which warns on in_irq() with
      CONFIG_DEBUG_HIGHMEM enabled.  This patch includes linux/hardirq.h for
      the in_irq definition.
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • atomic: add atomic_inc_not_zero_hint() · 3f9d35b9
      Eric Dumazet committed
      Followup of perf tools session in Netfilter WorkShop 2010
      
      In the network stack we make heavy use of atomic_inc_not_zero() in
      contexts where we know the probable value of the atomic before the
      increment (2 for udp sockets, for example).

      Using a special version of atomic_inc_not_zero() that takes this hint can
      help the processor use fewer bus transactions.
      
      On x86 (MESI protocol) for example, this avoids entering Shared state,
      because "lock cmpxchg" issues an RFO (Read For Ownership)
      
      akpm: Adds a new include/linux/atomic.h.  This means that new code should
      henceforth include linux/atomic.h and not asm/atomic.h.  The presence of
      include/linux/atomic.h will in fact cause checkpatch.pl to warn about use
      of asm/atomic.h.  The new include/linux/atomic.h becomes the place where
      arch-neutral atomic_t code should be placed.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Reviewed-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
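The idea behind the hinted increment can be sketched in userspace with C11 atomics. This is an illustration, not the kernel implementation: `inc_not_zero_hint()` is a hypothetical name, and the point is only that the first compare-and-swap uses the caller's guess instead of a prior load.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Increment *v unless it is zero.  'hint' is the value the caller
 * expects to find, so the first compare-and-swap can be attempted
 * without a separate load of *v.
 */
static bool inc_not_zero_hint(atomic_int *v, int hint)
{
	int expected = hint;

	assert(v);

	/* A zero hint carries no information: read the real value. */
	if (expected == 0)
		expected = atomic_load(v);

	while (expected != 0) {
		/* On failure, 'expected' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &expected, expected + 1))
			return true;
	}

	return false;
}
```

A wrong hint costs one failed compare-and-swap, after which the loop continues with the value the hardware reported.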
    • include/linux/resource.h needs types.h · 8705a1ba
      Jean Delvare committed
      Fix the following warning:
      usr/include/linux/resource.h:49: found __[us]{8,16,32,64} type without #include <linux/types.h>
      Signed-off-by: Jean Delvare <khali@linux-fr.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • netfilter: NF_HOOK_COND has wrong conditional · ac5aa2e3
      Eric Paris committed
      The NF_HOOK_COND returns 0 when it shouldn't due to what I believe to be an
      error in the code as the order of operations is not what was intended.  C will
      evaluate == before =.  Which means ret is getting set to the bool result,
      rather than the return value of the function call.  The code says
      
      if (ret = function() == 1)
      when it meant to say:
      if ((ret = function()) == 1)
      
      Normally the compiler would warn, but it doesn't notice it because it's
      actually a complex conditional and so the wrong code is wrapped in an explicit
      set of () [exactly what the compiler wants you to do if this was intentional].
      Fixing this means that errors when netfilter denies a packet get propagated
      back up the stack rather than lost.
      
      Problem introduced by commit 2249065f (netfilter: get rid of the grossness
      in netfilter.h).
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Cc: stable@kernel.org
      Signed-off-by: Patrick McHardy <kaber@trash.net>
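The precedence problem described in the netfilter message can be reproduced in isolation. This is a sketch, not the netfilter macro: `hook_result()`, `buggy()` and `fixed()` are hypothetical, and `hook_result()` simply returns its argument so both forms can be compared.

```c
#include <assert.h>

/* Hypothetical stand-in for the hook call; returns whatever it is given. */
static int hook_result(int value)
{
	return value;
}

/* Buggy form: '==' binds tighter than '=', so ret receives the comparison. */
static int buggy(int value)
{
	int ret;

	if ((ret = hook_result(value) == 1))
		return 1;

	/* The comparison was false, so ret is 0 and the real value is lost. */
	assert(ret == 0);
	return ret;
}

/* Fixed form: assign first, then compare. */
static int fixed(int value)
{
	int ret;

	if ((ret = hook_result(value)) == 1)
		return 1;

	/* ret holds the hook's own return value, e.g. a negative errno. */
	return ret;
}
```

For a hook result of -1, `buggy()` reports 0 while `fixed()` reports -1, which is the lost error the message refers to.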
  6. 11 Nov, 2010 3 commits
    • block: remove unused copy_io_context() · cedb4a7d
      Jens Axboe committed
      Reported-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • perf_events: Fix time tracking in samples · eed01528
      Stephane Eranian committed
      This patch corrects time tracking in samples. Without this patch
      both time_enabled and time_running are bogus when user asks for
      PERF_SAMPLE_READ.
      
      One uses PERF_SAMPLE_READ to sample the values of other counters
      in each sample. Because of multiplexing, it is necessary to know
      both time_enabled, time_running to be able to scale counts correctly.
      
      In this second version of the patch, we maintain a shadow
      copy of ctx->time which allows us to compute ctx->time without
      calling update_context_time() from NMI context. We avoid the
      issue that update_context_time() must always be called with
      ctx->lock held.
      
      We do not keep shadow copies of the other event timings
      because if the lead event is overflowing then it is active
      and thus it's been scheduled in via event_sched_in() in
      which case neither tstamp_stopped, tstamp_running can be modified.
      
      This timing logic only applies to samples when PERF_SAMPLE_READ
      is used.
      
      Note that this patch does not address timing issues related
      to sampling inheritance between tasks. This will be addressed
      in a future patch.
      
      With this patch, the libpfm4 example task_smpl now reports
      correct counts (shown on 2.4GHz Core 2):
      
      $ task_smpl -p 2400000000 -e unhalted_core_cycles:u,instructions_retired:u,baclears  noploop 5
      noploop for 5 seconds
      IIP:0x000000004006d6 PID:5596 TID:5596 TIME:466,210,211,430 STREAM_ID:33 PERIOD:2,400,000,000 ENA=1,010,157,814 RUN=1,010,157,814 NR=3
      	2,400,000,254 unhalted_core_cycles:u (33)
      	2,399,273,744 instructions_retired:u (34)
      	53,340 baclears (35)
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <4cc6e14b.1e07e30a.256e.5190@mx.google.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
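The message says both time_enabled and time_running are needed to scale multiplexed counts. A sketch of the usual scaling follows, assuming the conventional estimate raw * time_enabled / time_running; `scale_count()` is a hypothetical helper and is not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Estimate what a multiplexed counter would have read had it been
 * running for the whole time it was enabled.  Uses double arithmetic
 * to avoid 64-bit overflow in the product, at the cost of rounding
 * for very large values.
 */
static uint64_t scale_count(uint64_t raw, uint64_t time_enabled,
			    uint64_t time_running)
{
	assert(time_running <= time_enabled);

	if (time_running == 0)
		return 0;   /* never scheduled: nothing to extrapolate */

	if (time_running == time_enabled)
		return raw; /* no multiplexing: the count is exact */

	return (uint64_t)((double)raw * (double)time_enabled /
			  (double)time_running);
}
```

In the sample output above, ENA and RUN are equal, so under this estimate the reported counts would need no scaling.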
    • net: avoid limits overflow · 8d987e5c
      Eric Dumazet committed
      Robin Holt tried to boot a 16TB machine and found some limits were
      reached: sysctl_tcp_mem[2], sysctl_udp_mem[2]

      We can switch the infrastructure to use "long" instead of "int", now
      that atomic_long_t primitives are available for free.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Reported-by: Robin Holt <holt@sgi.com>
      Reviewed-by: Robin Holt <holt@sgi.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
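A back-of-the-envelope sketch of why "int" runs out on such a machine, assuming the limits are counted in pages and a 4 KiB page size (both are assumptions for illustration; the message states neither). `pages_in()` and `overflows_int()` are hypothetical helpers.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Number of pages of a given size in 'bytes' of memory. */
static int64_t pages_in(int64_t bytes, int64_t page_size)
{
	assert(bytes >= 0 && page_size > 0);
	return bytes / page_size;
}

/* Non-zero if a page count no longer fits in a plain int. */
static int overflows_int(int64_t pages)
{
	return pages > INT_MAX;
}
```

Under these assumptions 16 TB is 2^44 bytes, which is 2^32 pages of 4 KiB, more than a 32-bit int can hold, so any limit derived from total memory needs a wider type.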
  7. 10 Nov, 2010 3 commits
  8. 09 Nov, 2010 6 commits
  9. 08 Nov, 2010 2 commits
    • net dst: need linux/cache.h for ____cacheline_aligned_in_smp. · 43b81f85
      Paul Mundt committed
      Presently the b43legacy build fails on an sh randconfig:
      
      In file included from include/net/dst.h:12,
                       from drivers/net/wireless/b43legacy/xmit.c:32:
      include/net/dst_ops.h:28: error: expected ':', ',', ';', '}' or '__attribute__' before '____cacheline_aligned_in_smp'
      include/net/dst_ops.h: In function 'dst_entries_get_fast':
      include/net/dst_ops.h:33: error: 'struct dst_ops' has no member named 'pcpuc_entries'
      include/net/dst_ops.h: In function 'dst_entries_get_slow':
      include/net/dst_ops.h:41: error: 'struct dst_ops' has no member named 'pcpuc_entries'
      include/net/dst_ops.h: In function 'dst_entries_add':
      include/net/dst_ops.h:49: error: 'struct dst_ops' has no member named 'pcpuc_entries'
      include/net/dst_ops.h: In function 'dst_entries_init':
      include/net/dst_ops.h:55: error: 'struct dst_ops' has no member named 'pcpuc_entries'
      include/net/dst_ops.h: In function 'dst_entries_destroy':
      include/net/dst_ops.h:60: error: 'struct dst_ops' has no member named 'pcpuc_entries'
      make[5]: *** [drivers/net/wireless/b43legacy/xmit.o] Error 1
      make[5]: *** Waiting for unfinished jobs....
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sh: add clk_round_parent() to optimize parent clock rate · 6af26c6c
      Guennadi Liakhovetski committed
      Sometimes it is possible and reasonable to adjust the parent clock rate to
      improve precision of the child clock, e.g., if the child clock has no siblings.
      clk_round_parent() is a new addition to the SH clock-framework API, that
      implements such an optimization for child clocks with divisors, taking all
      integer values in a range.
      Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
  10. 05 Nov, 2010 1 commit