1. 15 9月, 2010 1 次提交
    • H
      compat: Make compat_alloc_user_space() incorporate the access_ok() · c41d68a5
      H. Peter Anvin 提交于
      compat_alloc_user_space() expects the caller to independently call
      access_ok() to verify the returned area.  A missing call could
      introduce problems on some architectures.
      
      This patch incorporates the access_ok() check into
      compat_alloc_user_space() and also adds a sanity check on the length.
      The existing compat_alloc_user_space() implementations are renamed
      arch_compat_alloc_user_space() and are used as part of the
      implementation of the new global function.
      
      This patch assumes NULL will cause __get_user()/__put_user() to either
      fail or access userspace on all architectures.  This should be
      followed by checking the return value of compat_access_user_space()
      for NULL in the callers, at which time the access_ok() in the callers
      can also be removed.
      Reported-by: NBen Hawkes <hawkes@sota.gen.nz>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: NChris Metcalf <cmetcalf@tilera.com>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Acked-by: NIngo Molnar <mingo@elte.hu>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NTony Luck <tony.luck@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: James Bottomley <jejb@parisc-linux.org>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: <stable@kernel.org>
      c41d68a5
  2. 10 9月, 2010 13 次提交
  3. 09 9月, 2010 4 次提交
    • S
      dquot: do full inode dirty in allocating space · d530148a
      Shaohua Li 提交于
      Alex Shi found a regression when doing ffsb test. The test has several threads,
      and each thread creates a small file, write to it and then delete it. ffsb
      reports about 20% regression and Alex bisected it to 43d2932d. The test
      will call __mark_inode_dirty 3 times. without this commit, we only take
      inode_lock one time, while with it, we take the lock 3 times with flags (
      I_DIRTY_SYNC,I_DIRTY_PAGES,I_DIRTY). Perf shows the lock contention increased
      too much. Below proposed patch fixes it.
      
      fs is allocating blocks, which usually means file writes and the inode
      will be dirtied soon. We fully dirty the inode to reduce some inode_lock
      contention in several calls of __mark_inode_dirty.
      
      Jan Kara: Added comment.
      Signed-off-by: NShaohua Li <shaohua.li@intel.com>
      Signed-off-by: NAlex Shi <alex.shi@intel.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      d530148a
    • E
      udp: add rehash on connect() · 719f8358
      Eric Dumazet 提交于
      commit 30fff923 introduced in linux-2.6.33 (udp: bind() optimisation)
      added a secondary hash on UDP, hashed on (local addr, local port).
      
      Problem is that following sequence :
      
      fd = socket(...)
      connect(fd, &remote, ...)
      
      not only selects remote end point (address and port), but also sets
      local address, while UDP stack stored in secondary hash table the socket
      while its local address was INADDR_ANY (or ipv6 equivalent)
      
      Sequence is :
       - autobind() : choose a random local port, insert socket in hash tables
                    [while local address is INADDR_ANY]
       - connect() : set remote address and port, change local address to IP
                    given by a route lookup.
      
      When an incoming UDP frame comes, if more than 10 sockets are found in
      primary hash table, we switch to secondary table, and fail to find
      socket because its local address changed.
      
      One solution to this problem is to rehash datagram socket if needed.
      
      We add a new rehash(struct socket *) method in "struct proto", and
      implement this method for UDP v4 & v6, using a common helper.
      
      This rehashing only takes care of secondary hash table, since primary
      hash (based on local port only) is not changed.
      Reported-by: NKrzysztof Piotr Oledzki <ole@ans.pl>
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Tested-by: NKrzysztof Piotr Oledzki <ole@ans.pl>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      719f8358
    • J
      ipvs: fix active FTP · 6523ce15
      Julian Anastasov 提交于
      - Do not create expectation when forwarding the PORT
        command to avoid blocking the connection. The problem is that
        nf_conntrack_ftp.c:help() tries to create the same expectation later in
        POST_ROUTING and drops the packet with "dropping packet" message after
        failure in nf_ct_expect_related.
      
      - Change ip_vs_update_conntrack to alter the conntrack
        for related connections from real server. If we do not alter the reply in
        this direction the next packet from client sent to vport 20 comes as NEW
        connection. We alter it but may be some collision happens for both
        conntracks and the second conntrack gets destroyed immediately. The
        connection stucks too.
      Signed-off-by: NJulian Anastasov <ja@ssi.bg>
      Signed-off-by: NSimon Horman <horms@verge.net.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6523ce15
    • F
      spi/dw_spi: clean the cs_control code · e3e55ff5
      Feng Tang 提交于
      commit 052dc7c4i "spi/dw_spi: conditional transfer mode change"
      introduced cs_control code, which has a bug by using bit offset
      for spi mode to set transfer mode in control register. Also it
      forces devices who don't need cs_control to re-configure the
      control registers for each spi transfer. This patch will fix them
      Signed-off-by: NFeng Tang <feng.tang@intel.com>
      Signed-off-by: NGrant Likely <grant.likely@secretlab.ca>
      e3e55ff5
  4. 08 9月, 2010 1 次提交
    • T
      semaphore: Add DEFINE_SEMAPHORE · febc88c5
      Thomas Gleixner 提交于
      The full cleanup of init_MUTEX[_LOCKED] and DECLARE_MUTEX has not been
      done. Some of the users are real semaphores and we should name them as
      such instead of confusing everyone with "MUTEX".
      
      Provide the infrastructure to get finally rid of init_MUTEX[_LOCKED]
      and DECLARE_MUTEX.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      LKML-Reference: <20100907125054.795929962@linutronix.de>
      febc88c5
  5. 07 9月, 2010 1 次提交
  6. 05 9月, 2010 2 次提交
  7. 04 9月, 2010 2 次提交
  8. 03 9月, 2010 1 次提交
  9. 01 9月, 2010 2 次提交
  10. 29 8月, 2010 1 次提交
  11. 28 8月, 2010 1 次提交
  12. 27 8月, 2010 1 次提交
  13. 25 8月, 2010 6 次提交
    • D
      tcp: Combat per-cpu skew in orphan tests. · ad1af0fe
      David S. Miller 提交于
      As reported by Anton Blanchard when we use
      percpu_counter_read_positive() to make our orphan socket limit checks,
      the check can be off by up to num_cpus_online() * batch (which is 32
      by default) which on a 128 cpu machine can be as large as the default
      orphan limit itself.
      
      Fix this by doing the full expensive sum check if the optimized check
      triggers.
      Reported-by: NAnton Blanchard <anton@samba.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Acked-by: NEric Dumazet <eric.dumazet@gmail.com>
      ad1af0fe
    • T
      workqueue: fix cwq->nr_active underflow · 8a2e8e5d
      Tejun Heo 提交于
      cwq->nr_active is used to keep track of how many work items are active
      for the cpu workqueue, where 'active' is defined as either pending on
      global worklist or executing.  This is used to implement the
      max_active limit and workqueue freezing.  If a work item is queued
      after nr_active has already reached max_active, the work item doesn't
      increment nr_active and is put on the delayed queue and gets activated
      later as previous active work items retire.
      
      try_to_grab_pending() which is used in the cancellation path
      unconditionally decremented nr_active whether the work item being
      cancelled is currently active or delayed, so cancelling a delayed work
      item makes nr_active underflow.  This breaks max_active enforcement
      and triggers BUG_ON() in destroy_workqueue() later on.
      
      This patch fixes this bug by adding a flag WORK_STRUCT_DELAYED, which
      is set while a work item in on the delayed list and making
      try_to_grab_pending() decrement nr_active iff the work item is
      currently active.
      
      The addition of the flag enlarges cwq alignment to 256 bytes which is
      getting a bit too large.  It's scheduled to be reduced back to 128
      bytes by merging WORK_STRUCT_PENDING and WORK_STRUCT_CWQ in the next
      devel cycle.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reported-by: NJohannes Berg <johannes@sipsolutions.net>
      8a2e8e5d
    • R
      ACPI/PCI: Negotiate _OSC control bits before requesting them · 75fb60f2
      Rafael J. Wysocki 提交于
      It is possible that the BIOS will not grant control of all _OSC
      features requested via acpi_pci_osc_control_set(), so it is
      recommended to negotiate the final set of _OSC features with the
      query flag set before calling _OSC to request control of these
      features.
      
      To implement it, rework acpi_pci_osc_control_set() so that the caller
      can specify the mask of _OSC control bits to negotiate and the mask
      of _OSC control bits that are absolutely necessary to it.  Then,
      acpi_pci_osc_control_set() will run _OSC queries in a loop until
      the mask of _OSC control bits returned by the BIOS is equal to the
      mask passed to it.  Also, before running the _OSC request
      acpi_pci_osc_control_set() will check if the caller's required
      control bits are present in the final mask.
      
      Using this mechanism we will be able to avoid situations in which the
      BIOS doesn't grant control of certain _OSC features, because they
      depend on some other _OSC features that have not been requested.
      Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>
      Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      75fb60f2
    • R
      ACPI/PCI: Do not preserve _OSC control bits returned by a query · 2b8fd918
      Rafael J. Wysocki 提交于
      There is the assumption in acpi_pci_osc_control_set() that it is
      always sufficient to compare the mask of _OSC control bits to be
      requested with the result of an _OSC query where all of the known
      control bits have been checked.  However, in general, that need not
      be the case.  For example, if an _OSC feature A depends on an _OSC
      feature B and control of A, B plus another _OSC feature C is
      requested simultaneously, the BIOS may return A, B, C, while it would
      only return C if A and C were requested without B.
      
      That may result in passing a wrong mask of _OSC control bits to an
      _OSC control request, in which case the BIOS may only grant control
      of a subset of the requested features.  Moreover, acpi_pci_run_osc()
      will return error code if that happens and the caller of
      acpi_pci_osc_control_set() will not know that it's been granted
      control of some _OSC features.  Consequently, the system will
      generally not work as expected.
      
      Apart from this acpi_pci_osc_control_set() always uses the mask
      of _OSC control bits returned by the very first invocation of
      acpi_pci_query_osc(), but that is done with the second argument
      equal to OSC_PCI_SEGMENT_GROUPS_SUPPORT which generally happens
      to affect the returned _OSC control bits.
      
      For these reasons, make acpi_pci_osc_control_set() always check if
      control of the requested _OSC features will be granted before making
      the final control request.  As a result, the osc_control_qry and
      osc_queried members of struct acpi_pci_root are not necessary any
      more, so drop them and remove the remaining code referring to them.
      Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>
      Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      2b8fd918
    • L
      guard page for stacks that grow upwards · 8ca3eb08
      Luck, Tony 提交于
      pa-risc and ia64 have stacks that grow upwards. Check that
      they do not run into other mappings. By making VM_GROWSUP
      0x0 on architectures that do not ever use it, we can avoid
      some unpleasant #ifdefs in check_stack_guard_page().
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8ca3eb08
    • T
      workqueue: improve destroy_workqueue() debuggability · e41e704b
      Tejun Heo 提交于
      Now that the worklist is global, having works pending after wq
      destruction can easily lead to oops and destroy_workqueue() have
      several BUG_ON()s to catch these cases.  Unfortunately, BUG_ON()
      doesn't tell much about how the work became pending after the final
      flush_workqueue().
      
      This patch adds WQ_DYING which is set before the final flush begins.
      If a work is requested to be queued on a dying workqueue,
      WARN_ON_ONCE() is triggered and the request is ignored.  This clearly
      indicates which caller is trying to queue a work on a dying workqueue
      and keeps the system working in most cases.
      
      Locking rule comment is updated such that the 'I' rule includes
      modifying the field from destruction path.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      e41e704b
  14. 24 8月, 2010 2 次提交
  15. 23 8月, 2010 2 次提交