1. 30 6月, 2010 2 次提交
    • M
      compiler-gcc.h: gcc-4.5 needs noclone and noinline on __naked functions · 9c695203
      Mikael Pettersson 提交于
      A __naked function is defined in C but with a body completely implemented
      by asm(), including any prologue and epilogue.  These asm() bodies expect
      standard calling conventions for parameter passing.  Older GCCs implement
      that correctly, but 4.[56] currently do not, see GCC PR44290.  In the
      Linux kernel this breaks ARM, causing most arch/arm/mm/copypage-*.c
      modules to get miscompiled, resulting in kernel crashes during bootup.
      
      Part of the kernel fix is to augment the __naked function attribute to
      also imply noinline and noclone.  This patch implements that, and has been
      verified to fix boot failures with gcc-4.5 compiled 2.6.34 and 2.6.35-rc1
      kernels.  The patch is a no-op with older GCCs.
      Signed-off-by: NMikael Pettersson <mikpe@it.uu.se>
      Signed-off-by: NKhem Raj <raj.khem@gmail.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9c695203
    • N
      fs: fix superblock iteration race · 57439f87
      npiggin@suse.de 提交于
      list_for_each_entry_safe is not suitable to protect against concurrent
      modification of the list. 6754af64 introduced a race in sb walking.
      
      list_for_each_entry can use the trick of pinning the current entry in
      the list before we drop and retake the lock because it subsequently
      follows cur->next. However list_for_each_entry_safe saves n=cur->next
      for following before entering the loop body, so when the lock is
      dropped, n may be deleted.
      Signed-off-by: NNick Piggin <npiggin@suse.de>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: John Stultz <johnstul@us.ibm.com>
      Cc: Frank Mayhar <fmayhar@google.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      57439f87
  2. 22 6月, 2010 1 次提交
  3. 14 6月, 2010 1 次提交
  4. 11 6月, 2010 2 次提交
    • C
      writeback: simplify and split bdi_start_writeback · c5444198
      Christoph Hellwig 提交于
      bdi_start_writeback now never gets a superblock passed, so we can just remove
      that case.  And to further untangle the code and flatten the call stack
      split it into two trivial helpers for it's two callers.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      c5444198
    • J
      net: deliver skbs on inactive slaves to exact matches · 597a264b
      John Fastabend 提交于
      Currently, the accelerated receive path for VLAN's will
      drop packets if the real device is an inactive slave and
      is not one of the special pkts tested for in
      skb_bond_should_drop().  This behavior is different then
      the non-accelerated path and for pkts over a bonded vlan.
      
      For example,
      
      vlanx -> bond0 -> ethx
      
      will be dropped in the vlan path and not delivered to any
      packet handlers at all.  However,
      
      bond0 -> vlanx -> ethx
      
      and
      
      bond0 -> ethx
      
      will be delivered to handlers that match the exact dev,
      because the VLAN path checks the real_dev which is not a
      slave and netif_recv_skb() doesn't drop frames but only
      delivers them to exact matches.
      
      This patch adds a sk_buff flag which is used for tagging
      skbs that would previously been dropped and allows the
      skb to continue to skb_netif_recv().  Here we add
      logic to check for the deliver_no_wcard flag and if it
      is set only deliver to handlers that match exactly.  This
      makes both paths above consistent and gives pkt handlers
      a way to identify skbs that come from inactive slaves.
      Without this patch in some configurations skbs will be
      delivered to handlers with exact matches and in others
      be dropped out right in the vlan path.
      
      I have tested the following 4 configurations in failover modes
      and load balancing modes.
      
      # bond0 -> ethx
      
      # vlanx -> bond0 -> ethx
      
      # bond0 -> vlanx -> ethx
      
      # bond0 -> ethx
                  |
        vlanx -> --
      Signed-off-by: NJohn Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      597a264b
  5. 10 6月, 2010 1 次提交
  6. 09 6月, 2010 3 次提交
    • A
      misc: Fix allocation 'borrowed' by vhost_net · 79907d89
      Alan Cox 提交于
      10, 233 is allocated officially to /dev/kmview which is shipping in
      Ubuntu and Debian distributions.  vhost_net seem to have borrowed it
      without making a proper request and this causes regressions in the other
      distributions.
      
      vhost_net can use a dynamic minor so use that instead.  Also update the
      file with a comment to try and avoid future misunderstandings.
      
      cc: stable@kernel.org
      Signed-off-by: NAlan Cox <device@lanana.org>
      [ We should have caught this before 2.6.34 got released.  - Linus ]
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      79907d89
    • D
      writeback: pay attention to wbc->nr_to_write in write_cache_pages · 0b564927
      Dave Chinner 提交于
      If a filesystem writes more than one page in ->writepage, write_cache_pages
      fails to notice this and continues to attempt writeback when wbc->nr_to_write
      has gone negative - this trace was captured from XFS:
      
          wbc_writeback_start: towrt=1024
          wbc_writepage: towrt=1024
          wbc_writepage: towrt=0
          wbc_writepage: towrt=-1
          wbc_writepage: towrt=-5
          wbc_writepage: towrt=-21
          wbc_writepage: towrt=-85
      
      This has adverse effects on filesystem writeback behaviour. write_cache_pages()
      needs to terminate after a certain number of pages are written, not after a
      certain number of calls to ->writepage are made.  This is a regression
      introduced by 17bc6c30 ("vfs: Add
      no_nrwrite_index_update writeback control flag"), but cannot be reverted
      directly due to subsequent bug fixes that have gone in on top of it.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0b564927
    • P
      sched: Fix PROVE_RCU vs cpu_cgroup · dc61b1d6
      Peter Zijlstra 提交于
      PROVE_RCU has a few issues with the cpu_cgroup because the scheduler
      typically holds rq->lock around the css rcu derefs but the generic
      cgroup code doesn't (and can't) know about that lock.
      
      Provide means to add extra checks to the css dereference and use that
      in the scheduler to annotate its users.
      
      The addition of rq->lock to these checks is correct because the
      cgroup_subsys::attach() method takes the rq->lock for each task it
      moves, therefore by holding that lock, we ensure the task is pinned to
      the current cgroup and the RCU derefence is valid.
      
      That leaves one genuine race in __sched_setscheduler() where we used
      task_group() without holding any of the required locks and thus raced
      with the cgroup code. Solve this by moving the check under the
      appropriate lock.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      dc61b1d6
  7. 08 6月, 2010 1 次提交
  8. 05 6月, 2010 5 次提交
  9. 03 6月, 2010 3 次提交
  10. 01 6月, 2010 7 次提交
  11. 31 5月, 2010 7 次提交
    • E
      netfilter: xtables: stackptr should be percpu · 7489aec8
      Eric Dumazet 提交于
      commit f3c5c1bf (netfilter: xtables: make ip_tables reentrant)
      introduced a performance regression, because stackptr array is shared by
      all cpus, adding cache line ping pongs. (16 cpus share a 64 bytes cache
      line)
      
      Fix this using alloc_percpu()
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Acked-By: NJan Engelhardt <jengelh@medozas.de>
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      7489aec8
    • P
      perf_events: Fix races in group composition · 8a49542c
      Peter Zijlstra 提交于
      Group siblings don't pin each-other or the parent, so when we destroy
      events we must make sure to clean up all cross referencing pointers.
      
      In particular, for destruction of a group leader we must be able to
      find all its siblings and remove their reference to it.
      
      This means that detaching an event from its context must not detach it
      from the group, otherwise we can end up failing to clear all pointers.
      
      Solve this by clearly separating the attachment to a context and
      attachment to a group, and keep the group composed until we destroy
      the events.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8a49542c
    • P
      perf_events: Fix races and clean up perf_event and perf_mmap_data interaction · ac9721f3
      Peter Zijlstra 提交于
      In order to move toward separate buffer objects, rework the whole
      perf_mmap_data construct to be a more self-sufficient entity, one
      with its own lifetime rules.
      
      This greatly sanitizes the whole output redirection code, which
      was riddled with bugs and races.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: <stable@kernel.org>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ac9721f3
    • M
      sh: add boot code to MMCIF driver header · 8a768952
      Magnus Damm 提交于
      This patch adds a set of MMCIF functions for the romImage
      boot loader that allows the kernel to be booted directly
      from an MMC card.
      
      Thanks to Jeremy Baker for the initial prototype.
      Signed-off-by: NMagnus Damm <damm@opensource.se>
      Signed-off-by: NPaul Mundt <lethal@linux-sh.org>
      8a768952
    • M
      sh: prepare MMCIF driver header file · 487d9fc5
      Magnus Damm 提交于
      Update the MMCIF driver to include register information
      and register access functions in the header file.
      The MMCIF boot code builds on top of this.
      Signed-off-by: NMagnus Damm <damm@opensource.se>
      Signed-off-by: NPaul Mundt <lethal@linux-sh.org>
      487d9fc5
    • R
      rapidio: fix new kernel-doc warnings · 97ef6f74
      Randy Dunlap 提交于
      Fix a bunch of new rapidio kernel-doc warnings:
      
      Warning(include/linux/rio.h:123): No description found for parameter 'comp_tag'
      Warning(include/linux/rio.h:123): No description found for parameter 'phys_efptr'
      Warning(include/linux/rio.h:123): No description found for parameter 'em_efptr'
      Warning(include/linux/rio.h:123): No description found for parameter 'pwcback'
      Warning(include/linux/rio.h:247): No description found for parameter 'set_domain'
      Warning(include/linux/rio.h:247): No description found for parameter 'get_domain'
      Warning(drivers/rapidio/rio-scan.c:1133): No description found for parameter 'rdev'
      Warning(drivers/rapidio/rio-scan.c:1133): Excess function parameter 'port' description in 'rio_init_em'
      Warning(drivers/rapidio/rio.c:349): No description found for parameter 'rdev'
      Warning(drivers/rapidio/rio.c:349): Excess function parameter 'mport' description in 'rio_request_inb_pwrite'
      Warning(drivers/rapidio/rio.c:393): No description found for parameter 'port'
      Warning(drivers/rapidio/rio.c:393): No description found for parameter 'local'
      Warning(drivers/rapidio/rio.c:393): No description found for parameter 'destid'
      Warning(drivers/rapidio/rio.c:393): No description found for parameter 'hopcount'
      Warning(drivers/rapidio/rio.c:393): Excess function parameter 'rdev' description in 'rio_mport_get_physefb'
      Warning(drivers/rapidio/rio.c:845): Excess function parameter 'local' description in 'rio_std_route_clr_table'
      Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com>
      Cc: Alexandre Bounine <alexandre.bounine@idt.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      97ef6f74
    • L
      Revert "cpusets: randomize node rotor used in cpuset_mem_spread_node()" · 35926ff5
      Linus Torvalds 提交于
      This reverts commit 0ac0c0d0, which
      caused cross-architecture build problems for all the wrong reasons.
      IA64 already added its own version of __node_random(), but the fact is,
      there is nothing architectural about the function, and the original
      commit was just badly done. Revert it, since no fix is forthcoming.
      Requested-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      35926ff5
  12. 30 5月, 2010 2 次提交
  13. 29 5月, 2010 1 次提交
  14. 28 5月, 2010 4 次提交
    • N
      fs: introduce new truncate sequence · 7bb46a67
      npiggin@suse.de 提交于
      Introduce a new truncate calling sequence into fs/mm subsystems. Rather than
      setattr > vmtruncate > truncate, have filesystems call their truncate sequence
      from ->setattr if filesystem specific operations are required. vmtruncate is
      deprecated, and truncate_pagecache and inode_newsize_ok helpers introduced
      previously should be used.
      
      simple_setattr is introduced for simple in-ram filesystems to implement
      the new truncate sequence. Eventually all filesystems should be converted
      to implement a setattr, and the default code in notify_change should go
      away.
      
      simple_setsize is also introduced to perform just the ATTR_SIZE portion
      of simple_setattr (ie. changing i_size and trimming pagecache).
      
      To implement the new truncate sequence:
      - filesystem specific manipulations (eg freeing blocks) must be done in
        the setattr method rather than ->truncate.
      - vmtruncate can not be used by core code to trim blocks past i_size in
        the event of write failure after allocation, so this must be performed
        in the fs code.
      - convert usage of helpers block_write_begin, nobh_write_begin,
        cont_write_begin, and *blockdev_direct_IO* to use _newtrunc postfixed
        variants. These avoid calling vmtruncate to trim blocks (see previous).
      - inode_setattr should not be used. generic_setattr is a new function
        to be used to copy simple attributes into the generic inode.
      - make use of the better opportunity to handle errors with the new sequence.
      
      Big problem with the previous calling sequence: the filesystem is not called
      until i_size has already changed.  This means it is not allowed to fail the
      call, and also it does not know what the previous i_size was. Also, generic
      code calling vmtruncate to truncate allocated blocks in case of error had
      no good way to return a meaningful error (or, for example, atomically handle
      block deallocation).
      
      Cc: Christoph Hellwig <hch@lst.de>
      Acked-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NNick Piggin <npiggin@suse.de>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      7bb46a67
    • C
      rename the generic fsync implementations · 1b061d92
      Christoph Hellwig 提交于
      We don't name our generic fsync implementations very well currently.
      The no-op implementation for in-memory filesystems currently is called
      simple_sync_file which doesn't make too much sense to start with,
      the the generic one for simple filesystems is called simple_fsync
      which can lead to some confusion.
      
      This patch renames the generic file fsync method to generic_file_fsync
      to match the other generic_file_* routines it is supposed to be used
      with, and the no-op implementation to noop_fsync to make it obvious
      what to expect.  In addition add some documentation for both methods.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      1b061d92
    • C
      drop unused dentry argument to ->fsync · 7ea80859
      Christoph Hellwig 提交于
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      7ea80859
    • A
      get rid of the magic around f_count in aio · d7065da0
      Al Viro 提交于
      __aio_put_req() plays sick games with file refcount.  What
      it wants is fput() from atomic context; it's almost always
      done with f_count > 1, so they only have to deal with delayed
      work in rare cases when their reference happens to be the
      last one.  Current code decrements f_count and if it hasn't
      hit 0, everything is fine.  Otherwise it keeps a pointer
      to struct file (with zero f_count!) around and has delayed
      work do __fput() on it.
      
      Better way to do it: use atomic_long_add_unless( , -1, 1)
      instead of !atomic_long_dec_and_test().  IOW, decrement it
      only if it's not the last reference, leave refcount alone
      if it was.  And use normal fput() in delayed work.
      
      I've made that atomic_long_add_unless call a new helper -
      fput_atomic().  Drops a reference to file if it's safe to
      do in atomic (i.e. if that's not the last one), tells if
      it had been able to do that.  aio.c converted to it, __fput()
      use is gone.  req->ki_file *always* contributes to refcount
      now.  And __fput() became static.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      d7065da0