1. 26 4月, 2011 5 次提交
    • T
      eCryptfs: Flush dirty pages in setattr · 5be79de2
      Tyler Hicks 提交于
      After 57db4e8d changed eCryptfs to
      write-back caching, eCryptfs page writeback updates the lower inode
      times due to the use of vfs_write() on the lower file.
      
      To preserve inode metadata changes, such as 'cp -p' does with
      utimensat(), we need to flush all dirty pages early in
      ecryptfs_setattr() so that the user-updated lower inode metadata isn't
      clobbered later in writeback.
      
      https://bugzilla.kernel.org/show_bug.cgi?id=33372Reported-by: NRocko <rockorequin@hotmail.com>
      Signed-off-by: NTyler Hicks <tyhicks@linux.vnet.ibm.com>
      5be79de2
    • T
      eCryptfs: Handle failed metadata read in lookup · 3aeb86ea
      Tyler Hicks 提交于
      When failing to read the lower file's crypto metadata during a lookup,
      eCryptfs must continue on without throwing an error. For example, there
      may be a plaintext file in the lower mount point that the user wants to
      delete through the eCryptfs mount.
      
      If an error is encountered while reading the metadata in lookup(), the
      eCryptfs inode's size could be incorrect. We must be sure to reread the
      plaintext inode size from the metadata when performing an open() or
      setattr(). The metadata is already being read in those paths, so this
      adds minimal performance overhead.
      
      This patch introduces a flag which will track whether or not the
      plaintext inode size has been read so that an incorrect i_size can be
      fixed in the open() or setattr() paths.
      
      https://bugs.launchpad.net/bugs/509180
      
      Cc: <stable@kernel.org>
      Signed-off-by: NTyler Hicks <tyhicks@linux.vnet.ibm.com>
      3aeb86ea
    • T
      eCryptfs: Add reference counting to lower files · 332ab16f
      Tyler Hicks 提交于
      For any given lower inode, eCryptfs keeps only one lower file open and
      multiplexes all eCryptfs file operations through that lower file. The
      lower file was considered "persistent" and stayed open from the first
      lookup through the lifetime of the inode.
      
      This patch keeps the notion of a single, per-inode lower file, but adds
      reference counting around the lower file so that it is closed when not
      currently in use. If the reference count is at 0 when an operation (such
      as open, create, etc.) needs to use the lower file, a new lower file is
      opened. Since the file is no longer persistent, all references to the
      term persistent file are changed to lower file.
      
      Locking is added around the sections of code that opens the lower file
      and assign the pointer in the inode info, as well as the code the fputs
      the lower file when all eCryptfs users are done with it.
      
      This patch is needed to fix issues, when mounted on top of the NFSv3
      client, where the lower file is left silly renamed until the eCryptfs
      inode is destroyed.
      Signed-off-by: NTyler Hicks <tyhicks@linux.vnet.ibm.com>
      332ab16f
    • T
      eCryptfs: dput dentries returned from dget_parent · dd55c898
      Tyler Hicks 提交于
      Call dput on the dentries previously returned by dget_parent() in
      ecryptfs_rename(). This is needed for supported eCryptfs mounts on top
      of the NFSv3 client.
      Signed-off-by: NTyler Hicks <tyhicks@linux.vnet.ibm.com>
      dd55c898
    • T
      eCryptfs: Remove extra d_delete in ecryptfs_rmdir · 35ffa948
      Tyler Hicks 提交于
      vfs_rmdir() already calls d_delete() on the lower dentry. That was being
      duplicated in ecryptfs_rmdir() and caused a NULL pointer dereference
      when NFSv3 was the lower filesystem.
      Signed-off-by: NTyler Hicks <tyhicks@linux.vnet.ibm.com>
      35ffa948
  2. 24 4月, 2011 17 次提交
  3. 23 4月, 2011 5 次提交
  4. 22 4月, 2011 13 次提交
    • P
      perf, x86: Update/fix Intel Nehalem cache events · f4929bd3
      Peter Zijlstra 提交于
      Change the Nehalem cache events to use retired memory instruction counters
      (similar to Westmere), this greatly improves the provided stats.
      
      Using:
      
      main ()
      {
              int i;
      
              for (i = 0; i < 1000000000; i++) {
                      asm("mov (%%rsp), %%rbx;"
                          "mov %%rbx, (%%rsp);" : : : "rbx");
              }
      }
      
      We find:
      
       $ perf stat --repeat 10 -e instructions:u -e l1-dcache-loads:u -e l1-dcache-stores:u ./loop_1b_loads+stores
        Performance counter stats for './loop_1b_loads+stores' (10 runs):
            4,000,081,056 instructions:u           #      0.000 IPC ( +-   0.000% )
            4,999,502,846 l1-dcache-loads:u          ( +-   0.008% )
            1,000,034,832 l1-dcache-stores:u         ( +-   0.000% )
               1.565184942  seconds time elapsed   ( +-   0.005% )
      
      The 5b is surprising - we'd expect 1b:
      
       $ perf stat --repeat 10 -e instructions:u -e r10b:u -e l1-dcache-stores:u ./loop_1b_loads+stores
        Performance counter stats for './loop_1b_loads+stores' (10 runs):
            4,000,081,054 instructions:u           #      0.000 IPC ( +-   0.000% )
            1,000,021,961 r10b:u                     ( +-   0.000% )
            1,000,030,951 l1-dcache-stores:u         ( +-   0.000% )
               1.565055422  seconds time elapsed   ( +-   0.003% )
      
      Which this patch thus fixes.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Link: http://lkml.kernel.org/n/tip-q9rtru7b7840tws75xzboapv@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
      f4929bd3
    • C
      perf, x86: P4 PMU - Don't forget to clear cpuc->active_mask on overflow · 1ea5a6af
      Cyrill Gorcunov 提交于
      It's not enough to simply disable event on overflow the
      cpuc->active_mask should be cleared as well otherwise counter
      may stall in "active" even in real being already disabled (which
      potentially may lead to the situation that user may not use this
      counter further).
      
      Don pointed out that:
      
       " I also noticed this patch fixed some unknown NMIs
         on a P4 when I stressed the box".
      Tested-by: NLin Ming <ming.m.lin@intel.com>
      Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Acked-by: NDon Zickus <dzickus@redhat.com>
      Signed-off-by: NDon Zickus <dzickus@redhat.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Link: http://lkml.kernel.org/r/1303398203-2918-3-git-send-email-dzickus@redhat.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
      1ea5a6af
    • I
      x86, perf event: Turn off unstructured raw event access to offcore registers · b52c55c6
      Ingo Molnar 提交于
      Andi Kleen pointed out that the Intel offcore support patches were merged
      without user-space tool support to the functionality:
      
       |
       | The offcore_msr perf kernel code was merged into 2.6.39-rc*, but the
       | user space bits were not. This made it impossible to set the extra mask
       | and actually do the OFFCORE profiling
       |
      
      Andi submitted a preliminary patch for user-space support, as an
      extension to perf's raw event syntax:
      
       |
       | Some raw events -- like the Intel OFFCORE events -- support additional
       | parameters. These can be appended after a ':'.
       |
       | For example on a multi socket Intel Nehalem:
       |
       |    perf stat -e r1b7:20ff -a sleep 1
       |
       | Profile the OFFCORE_RESPONSE.ANY_REQUEST with event mask REMOTE_DRAM_0
       | that measures any access to DRAM on another socket.
       |
      
      But this kind of usability is absolutely unacceptable - users should not
      be expected to type in magic, CPU and model specific incantations to get
      access to useful hardware functionality.
      
      The proper solution is to expose useful offcore functionality via
      generalized events - that way users do not have to care which specific
      CPU model they are using, they can use the conceptual event and not some
      model specific quirky hexa number.
      
      We already have such generalization in place for CPU cache events,
      and it's all very extensible.
      
      "Offcore" events measure general DRAM access patters along various
      parameters. They are particularly useful in NUMA systems.
      
      We want to support them via generalized DRAM events: either as the
      fourth level of cache (after the last-level cache), or as a separate
      generalization category.
      
      That way user-space support would be very obvious, memory access
      profiling could be done via self-explanatory commands like:
      
        perf record -e dram ./myapp
        perf record -e dram-remote ./myapp
      
      ... to measure DRAM accesses or more expensive cross-node NUMA DRAM
      accesses.
      
      These generalized events would work on all CPUs and architectures that
      have comparable PMU features.
      
      ( Note, these are just examples: actual implementation could have more
        sophistication and more parameter - as long as they center around
        similarly simple usecases. )
      
      Now we do not want to revert *all* of the current offcore bits, as they
      are still somewhat useful for generic last-level-cache events, implemented
      in this commit:
      
        e994d7d2: perf: Fix LLC-* events on Intel Nehalem/Westmere
      
      But we definitely do not yet want to expose the unstructured raw events
      to user-space, until better generalization and usability is implemented
      for these hardware event features.
      
      ( Note: after generalization has been implemented raw offcore events can be
        supported as well: there can always be an odd event that is marginally
        useful but not useful enough to generalize. DRAM profiling is definitely
        *not* such a category so generalization must be done first. )
      
      Furthermore, PERF_TYPE_RAW access to these registers was not intended
      to go upstream without proper support - it was a side-effect of the above
      e994d7d2 commit, not mentioned in the changelog.
      
      As v2.6.39 is nearing release we go for the simplest approach: disable
      the PERF_TYPE_RAW offcore hack for now, before it escapes into a released
      kernel and becomes an ABI.
      
      Once proper structure is implemented for these hardware events and users
      are offered usable solutions we can revisit this issue.
      Reported-by: NAndi Kleen <ak@linux.intel.com>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1302658203-4239-1-git-send-email-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
      b52c55c6
    • A
      perf: Support Xeon E7's via the Westmere PMU driver · b2508e82
      Andi Kleen 提交于
      There's a new model number public, 47, for Xeon E7 (aka Westmere EX).
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Cc: a.p.zijlstra@chello.nl
      Link: http://lkml.kernel.org/r/1303429715-10202-1-git-send-email-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
      b2508e82
    • L
      Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block · 91e8549b
      Linus Torvalds 提交于
      * 'for-linus' of git://git.kernel.dk/linux-2.6-block:
        ide: unexport DISK_EVENT_MEDIA_CHANGE for ide-gd and ide-cd
        block: don't propagate unlisted DISK_EVENTs to userland
        elevator: check for ELEVATOR_INSERT_SORT_MERGE in !elvpriv case too
      91e8549b
    • T
      ide: unexport DISK_EVENT_MEDIA_CHANGE for ide-gd and ide-cd · 7eec77a1
      Tejun Heo 提交于
      check_events() implementations in both ide-gd and ide-cd are
      inadequate for in-kernel event polling.  Both generate media change
      events continuously when certain conditions are met causing infinite
      event loop between the driver and userland event handler.
      
      As disk event now supports suppression of unlisted events, simply
      de-listing DISK_EVENT_MEDIA_CHANGE from disk->events resolves the
      problem.  Internal handling around media revalidation will behave the
      same while userland will fall back to userland event polling after
      detecting the device doesn't support disk events.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reported-by: NJens Axboe <jaxboe@fusionio.com>
      Acked-by: N"David S. Miller" <davem@davemloft.net>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      7eec77a1
    • T
      block: don't propagate unlisted DISK_EVENTs to userland · 7c88a168
      Tejun Heo 提交于
      DISK_EVENT_MEDIA_CHANGE is used for both userland visible event and
      internal event for revalidation of removeable devices.  Some legacy
      drivers don't implement proper event detection and continuously
      generate events under certain circumstances.  For example, ide-cd
      generates media changed continuously if there's no media in the drive,
      which can lead to infinite loop of events jumping back and forth
      between the driver and userland event handler.
      
      This patch updates disk event infrastructure such that it never
      propagates events not listed in disk->events to userland.  Those
      events are processed the same for internal purposes but uevent
      generation is suppressed.
      
      This also ensures that userland only gets events which are advertised
      in the @events sysfs node lowering risk of confusion.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      7c88a168
    • J
      elevator: check for ELEVATOR_INSERT_SORT_MERGE in !elvpriv case too · 3aa72873
      Jens Axboe 提交于
      The sort insert is the one that goes to the IO scheduler. With
      the SORT_MERGE addition, we could bypass IO scheduler setup
      but still ask the IO scheduler to insert the request. This would
      cause an oops on switching IO schedulers through the sysfs
      interface, unless the disk just happened to be idle while it
      occured.
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      3aa72873
    • L
      Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs · 37fc67c9
      Linus Torvalds 提交于
      * 'for-linus' of git://oss.sgi.com/xfs/xfs:
        xfs: fix duplicate message output
      37fc67c9
    • L
      Merge branch 'x86-fixes-for-linus' of... · d20dc4d5
      Linus Torvalds 提交于
      Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
      
      * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
        x86, numa: Fix cpu nodemasks for NUMA emulation and CONFIG_DEBUG_PER_CPU_MAPS
        Revert "x86, NUMA: Fix fakenuma boot failure"
      d20dc4d5
    • R
      raid5: fix build error, sector_t usage · d76c8420
      Randy Dunlap 提交于
      Change <sectors> from unsigned long long to sector_t.
      This matches its source field.
      
        ERROR: "__udivdi3" [drivers/md/raid456.ko] undefined!
      Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d76c8420
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus · 83425eee
      Linus Torvalds 提交于
      * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
        virtio: console: Enable call to hvc_remove() on console port remove
        virtio_pci: Prevent double-free of pci regions after device hot-unplug
        virtio: Decrement avail idx on buffer detach
      83425eee
    • L
      Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 · 8ed54bd5
      Linus Torvalds 提交于
      * 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
        agp: fix arbitrary kernel memory writes
        agp: fix OOM and buffer overflow
        drm/radeon/kms: fix IH writeback on r6xx+ on big endian machines
      8ed54bd5