1. 18 6月, 2011 1 次提交
    • D
      KEYS/DNS: Fix ____call_usermodehelper() to not lose the session keyring · 87966996
      David Howells 提交于
      ____call_usermodehelper() now erases any credentials set by the
      subprocess_inf::init() function.  The problem is that commit
      17f60a7d ("capabilites: allow the application of capability limits
      to usermode helpers") creates and commits new credentials with
      prepare_kernel_cred() after the call to the init() function.  This wipes
      all keyrings after umh_keys_init() is called.
      
      The best way to deal with this is to put the init() call just prior to
      the commit_creds() call, and pass the cred pointer to init().  That
      means that umh_keys_init() and suchlike can modify the credentials
      _before_ they are published and potentially in use by the rest of the
      system.
      
      This prevents request_key() from working as it is prevented from passing
      the session keyring it set up with the authorisation token to
      /sbin/request-key, and so the latter can't assume the authority to
      instantiate the key.  This causes the in-kernel DNS resolver to fail
      with ENOKEY unconditionally.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Acked-by: NEric Paris <eparis@redhat.com>
      Tested-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      87966996
  2. 16 6月, 2011 11 次提交
  3. 15 6月, 2011 1 次提交
    • S
      rcu: Use softirq to address performance regression · 09223371
      Shaohua Li 提交于
      Commit a26ac245(rcu: move TREE_RCU from softirq to kthread)
      introduced performance regression. In an AIM7 test, this commit degraded
      performance by about 40%.
      
      The commit runs rcu callbacks in a kthread instead of softirq. We observed
      high rate of context switch which is caused by this. Out test system has
      64 CPUs and HZ is 1000, so we saw more than 64k context switch per second
      which is caused by RCU's per-CPU kthread.  A trace showed that most of
      the time the RCU per-CPU kthread doesn't actually handle any callbacks,
      but instead just does a very small amount of work handling grace periods.
      This means that RCU's per-CPU kthreads are making the scheduler do quite
      a bit of work in order to allow a very small amount of RCU-related
      processing to be done.
      
      Alex Shi's analysis determined that this slowdown is due to lock
      contention within the scheduler.  Unfortunately, as Peter Zijlstra points
      out, the scheduler's real-time semantics require global action, which
      means that this contention is inherent in real-time scheduling.  (Yes,
      perhaps someone will come up with a workaround -- otherwise, -rt is not
      going to do well on large SMP systems -- but this patch will work around
      this issue in the meantime.  And "the meantime" might well be forever.)
      
      This patch therefore re-introduces softirq processing to RCU, but only
      for core RCU work.  RCU callbacks are still executed in kthread context,
      so that only a small amount of RCU work runs in softirq context in the
      common case.  This should minimize ksoftirqd execution, allowing us to
      skip boosting of ksoftirqd for CONFIG_RCU_BOOST=y kernels.
      Signed-off-by: NShaohua Li <shaohua.li@intel.com>
      Tested-by: N"Alex,Shi" <alex.shi@intel.com>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      09223371
  4. 13 6月, 2011 1 次提交
    • A
      Delay struct net freeing while there's a sysfs instance refering to it · a685e089
      Al Viro 提交于
      	* new refcount in struct net, controlling actual freeing of the memory
      	* new method in kobj_ns_type_operations (->drop_ns())
      	* ->current_ns() semantics change - it's supposed to be followed by
      corresponding ->drop_ns().  For struct net in case of CONFIG_NET_NS it bumps
      the new refcount; net_drop_ns() decrements it and calls net_free() if the
      last reference has been dropped.  Method renamed to ->grab_current_ns().
      	* old net_free() callers call net_drop_ns() instead.
      	* sysfs_exit_ns() is gone, along with a large part of callchain
      leading to it; now that the references stored in ->ns[...] stay valid we
      do not need to hunt them down and replace them with NULL.  That fixes
      problems in sysfs_lookup() and sysfs_readdir(), along with getting rid
      of sb->s_instances abuse.
      
      	Note that struct net *shutdown* logics has not changed - net_cleanup()
      is called exactly when it used to be called.  The only thing postponed by
      having a sysfs instance refering to that struct net is actual freeing of
      memory occupied by struct net.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      a685e089
  5. 12 6月, 2011 2 次提交
    • J
      vlan: Fix the ingress VLAN_FLAG_REORDER_HDR check · 0b5c9db1
      Jiri Pirko 提交于
      Testing of VLAN_FLAG_REORDER_HDR does not belong in vlan_untag
      but rather in vlan_do_receive.  Otherwise the vlan header
      will not be properly put on the packet in the case of
      vlan header accelleration.
      
      As we remove the check from vlan_check_reorder_header
      rename it vlan_reorder_header to keep the naming clean.
      
      Fix up the skb->pkt_type early so we don't look at the packet
      after adding the vlan tag, which guarantees we don't goof
      and look at the wrong field.
      
      Use a simple if statement instead of a complicated switch
      statement to decided that we need to increment rx_stats
      for a multicast packet.
      
      Hopefully at somepoint we will just declare the case where
      VLAN_FLAG_REORDER_HDR is cleared as unsupported and remove
      the code.  Until then this keeps it working correctly.
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NJiri Pirko <jpirko@redhat.com>
      Acked-by: NChangli Gao <xiaosuo@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0b5c9db1
    • D
      linux/seqlock.h should #include asm/processor.h for cpu_relax() · 56a21052
      David Howells 提交于
      It uses cpu_relax(), and so needs <asm/processor.h>
      
      Without this patch, I see:
      
         CC      arch/mn10300/kernel/asm-offsets.s
        In file included from include/linux/time.h:8,
                         from include/linux/timex.h:56,
                         from include/linux/sched.h:57,
                         from arch/mn10300/kernel/asm-offsets.c:7:
        include/linux/seqlock.h: In function 'read_seqbegin':
        include/linux/seqlock.h:91: error: implicit declaration of function 'cpu_relax'
      
      whilst building asb2364_defconfig on MN10300.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      56a21052
  6. 10 6月, 2011 2 次提交
  7. 09 6月, 2011 1 次提交
    • L
      vfs: reorganize 'struct inode' layout a bit · 13e12d14
      Linus Torvalds 提交于
      This tries to make the 'struct inode' accesses denser in the data cache
      by moving a commonly accessed field (i_security) closer to other fields
      that are accessed often.
      
      It also makes 'i_state' just an 'unsigned int' rather than 'unsigned
      long', since we only use a few bits of that field, and moves it next to
      the existing 'i_flags' so that we potentially get better structure
      layout (although depending on config options, i_flags may already have
      packed in the same word as i_lock, so this improves packing only for the
      case of spinlock debugging)
      
      Out 'struct inode' is still way too big, and we should probably move
      some other fields around too (the acl fields in particular) for better
      data cache access density.  Other fields (like the inode hash) are
      likely to be entirely irrelevant under most loads.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      13e12d14
  8. 08 6月, 2011 2 次提交
    • G
      Staging: altera: move .h file to proper place · 85ab9ee9
      Greg Kroah-Hartman 提交于
      Staging drivers should be self-contained, without files in the include/
      directories.  So move the altera.h file back to the driver directory for
      now, until it moves out of the staging tree.
      
      Cc: Igor M. Liplianin <liplianin@netup.ru>
      Cc: Mauro Carvalho Chehab <mchehab@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
      85ab9ee9
    • A
      usb-storage: redo incorrect reads · 21c13a4f
      Alan Stern 提交于
      Some USB mass-storage devices have bugs that cause them not to handle
      the first READ(10) command they receive correctly.  The Corsair
      Padlock v2 returns completely bogus data for its first read (possibly
      it returns the data in encrypted form even though the device is
      supposed to be unlocked).  The Feiya SD/SDHC card reader fails to
      complete the first READ(10) command after it is plugged in or after a
      new card is inserted, returning a status code that indicates it thinks
      the command was invalid, which prevents the kernel from retrying the
      read.
      
      Since the first read of a new device or a new medium is for the
      partition sector, the kernel is unable to retrieve the device's
      partition table.  Users have to manually issue an "hdparm -z" or
      "blockdev --rereadpt" command before they can access the device.
      
      This patch (as1470) works around the problem.  It adds a new quirk
      flag, US_FL_INVALID_READ10, indicating that the first READ(10) should
      always be retried immediately, as should any failing READ(10) commands
      (provided the preceding READ(10) command succeeded, to avoid getting
      stuck in a loop).  The patch also adds appropriate unusual_devs
      entries containing the new flag.
      Signed-off-by: NAlan Stern <stern@rowland.harvard.edu>
      Tested-by: NSven Geggus <sven-usbst@geggus.net>
      Tested-by: NPaul Hartman <paul.hartman+linux@gmail.com>
      CC: Matthew Dharm <mdharm-usb@one-eyed-alien.net>
      CC: <stable@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
      21c13a4f
  9. 07 6月, 2011 3 次提交
  10. 06 6月, 2011 1 次提交
  11. 04 6月, 2011 4 次提交
    • V
      perf: Fix comments in include/linux/perf_event.h · d7ebe75b
      Vince Weaver 提交于
      Fix include/linux/perf_event.h comments to be consistent with
      the actual #define names. This is trivial, but it can be a bit
      confusing when first  reading through the file.
      Signed-off-by: NVince Weaver <vweaver1@eecs.utk.edu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: paulus@samba.org
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.00.1106031757090.29381@cl320.eecs.utk.eduSigned-off-by: NIngo Molnar <mingo@elte.hu>
      d7ebe75b
    • A
      more conservative S_NOSEC handling · 9e1f1de0
      Al Viro 提交于
      Caching "we have already removed suid/caps" was overenthusiastic as merged.
      On network filesystems we might have had suid/caps set on another client,
      silently picked by this client on revalidate, all of that *without* clearing
      the S_NOSEC flag.
      
      AFAICS, the only reasonably sane way to deal with that is
      	* new superblock flag; unless set, S_NOSEC is not going to be set.
      	* local block filesystems set it in their ->mount() (more accurately,
      mount_bdev() does, so does btrfs ->mount(), users of mount_bdev() other than
      local block ones clear it)
      	* if any network filesystem (or a cluster one) wants to use S_NOSEC,
      it'll need to set MS_NOSEC in sb->s_flags *AND* take care to clear S_NOSEC when
      inode attribute changes are picked from other clients.
      
      It's not an earth-shattering hole (anybody that can set suid on another client
      will almost certainly be able to write to the file before doing that anyway),
      but it's a bug that needs fixing.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      9e1f1de0
    • L
      Revert "tty: make receive_buf() return the amout of bytes received" · 55db4c64
      Linus Torvalds 提交于
      This reverts commit b1c43f82.
      
      It was broken in so many ways, and results in random odd pty issues.
      
      It re-introduced the buggy schedule_work() in flush_to_ldisc() that can
      cause endless work-loops (see commit a5660b41: "tty: fix endless
      work loop when the buffer fills up").
      
      It also used an "unsigned int" return value fo the ->receive_buf()
      function, but then made multiple functions return a negative error code,
      and didn't actually check for the error in the caller.
      
      And it didn't actually work at all.  BenH bisected down odd tty behavior
      to it:
        "It looks like the patch is causing some major malfunctions of the X
         server for me, possibly related to PTYs.  For example, cat'ing a
         large file in a gnome terminal hangs the kernel for -minutes- in a
         loop of what looks like flush_to_ldisc/workqueue code, (some ftrace
         data in the quoted bits further down).
      
         ...
      
         Some more data: It -looks- like what happens is that the
         flush_to_ldisc work queue entry constantly re-queues itself (because
         the PTY is full ?) and the workqueue thread will basically loop
         forver calling it without ever scheduling, thus starving the consumer
         process that could have emptied the PTY."
      
      which is pretty much exactly the problem we fixed in a5660b41.
      
      Milton Miller pointed out the 'unsigned int' issue.
      Reported-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Reported-by: NMilton Miller <miltonm@bga.com>
      Cc: Stefan Bigler <stefan.bigler@keymile.com>
      Cc: Toby Gray <toby.gray@realvnc.com>
      Cc: Felipe Balbi <balbi@ti.com>
      Cc: Greg Kroah-Hartman <gregkh@suse.de>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      55db4c64
    • C
      slub: always align cpu_slab to honor cmpxchg_double requirement · d4d84fef
      Chris Metcalf 提交于
      On an architecture without CMPXCHG_LOCAL but with DEBUG_VM enabled,
      the VM_BUG_ON() in __pcpu_double_call_return_bool() will cause an early
      panic during boot unless we always align cpu_slab properly.
      
      In principle we could remove the alignment-testing VM_BUG_ON() for
      architectures that don't have CMPXCHG_LOCAL, but leaving it in means
      that new code will tend not to break x86 even if it is introduced
      on another platform, and it's low cost to require alignment.
      Acked-by: NDavid Rientjes <rientjes@google.com>
      Acked-by: NChristoph Lameter <cl@linux.com>
      Signed-off-by: NChris Metcalf <cmetcalf@tilera.com>
      Signed-off-by: NPekka Enberg <penberg@kernel.org>
      d4d84fef
  12. 03 6月, 2011 3 次提交
  13. 02 6月, 2011 3 次提交
  14. 01 6月, 2011 4 次提交
    • L
      [media] v4l: Fix media_entity_to_video_device macro argument name · 6e3ea0e7
      Laurent Pinchart 提交于
      The name 'entity' is used twice in the macro body, once as the macro
      argument, and once as a structure field name. This breaks compilation if
      the macro is called with its argument not named 'entity'.
      
      Fix this by renaming the macro argument '__e'. This should avoid
      namespace clashes.
      Signed-off-by: NLaurent Pinchart <laurent.pinchart@ideasonboard.com>
      Signed-off-by: NMauro Carvalho Chehab <mchehab@redhat.com>
      6e3ea0e7
    • Y
      intel-iommu: Enable super page (2MiB, 1GiB, etc.) support · 6dd9a7c7
      Youquan Song 提交于
      There are no externally-visible changes with this. In the loop in the
      internal __domain_mapping() function, we simply detect if we are mapping:
        - size >= 2MiB, and
        - virtual address aligned to 2MiB, and
        - physical address aligned to 2MiB, and
        - on hardware that supports superpages.
      
      (and likewise for larger superpages).
      
      We automatically use a superpage for such mappings. We never have to
      worry about *breaking* superpages, since we trust that we will always
      *unmap* the same range that was mapped. So all we need to do is ensure
      that dma_pte_clear_range() will also cope with superpages.
      
      Adjust pfn_to_dma_pte() to take a superpage 'level' as an argument, so
      it can return a PTE at the appropriate level rather than always
      extending the page tables all the way down to level 1. Again, this is
      simplified by the fact that we should never encounter existing small
      pages when we're creating a mapping; any old mapping that used the same
      virtual range will have been entirely removed and its obsolete page
      tables freed.
      
      Provide an 'intel_iommu=sp_off' argument on the command line as a
      chicken bit. Not that it should ever be required.
      
      ==
      
      The original commit seen in the iommu-2.6.git was Youquan's
      implementation (and completion) of my own half-baked code which I'd
      typed into an email. Followed by half a dozen subsequent 'fixes'.
      
      I've taken the unusual step of rewriting history and collapsing the
      original commits in order to keep the main history simpler, and make
      life easier for the people who are going to have to backport this to
      older kernels. And also so I can give it a more coherent commit comment
      which (hopefully) gives a better explanation of what's going on.
      
      The original sequence of commits leading to identical code was:
      
      Youquan Song (3):
            intel-iommu: super page support
            intel-iommu: Fix superpage alignment calculation error
            intel-iommu: Fix superpage level calculation error in dma_pfn_level_pte()
      
      David Woodhouse (4):
            intel-iommu: Precalculate superpage support for dmar_domain
            intel-iommu: Fix hardware_largepage_caps()
            intel-iommu: Fix inappropriate use of superpages in __domain_mapping()
            intel-iommu: Fix phys_pfn in __domain_mapping for sglist pages
      Signed-off-by: NYouquan Song <youquan.song@intel.com>
      Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com>
      6dd9a7c7
    • R
      mtd: fix physmap.h warnings · 63da0290
      Randy Dunlap 提交于
      Fix build warnings in physmap.h:
      
      include/linux/mtd/physmap.h:25: warning: 'struct platform_device' declared inside parameter list
      include/linux/mtd/physmap.h:25: warning: its scope is only this definition or declaration, which is probably not what you want
      include/linux/mtd/physmap.h:26: warning: 'struct platform_device' declared inside parameter list
      include/linux/mtd/physmap.h:27: warning: 'struct platform_device' declared inside parameter list
      Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com>
      Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com>
      63da0290
    • W
      sctp: stop pending timers and purge queues when peer restart asoc · a000c01e
      Wei Yongjun 提交于
      If the peer restart the asoc, we should not only fail any unsent/unacked
      data, but also stop the T3-rtx, SACK, T4-rto timers, and teardown ASCONF
      queues.
      Signed-off-by: NWei Yongjun <yjwei@cn.fujitsu.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a000c01e
  15. 31 5月, 2011 1 次提交