1. 27 Aug, 2012 1 commit
  2. 15 Aug, 2012 1 commit
    • A
      Bluetooth: Fix use-after-free bug in SMP · 61a0cfb0
      Andre Guedes committed
      If SMP fails, we should always cancel security_timer delayed work.
      Otherwise, security_timer function may run after l2cap_conn object
      has been freed.
      
      This patch fixes the following warning reported by ODEBUG:
      
      WARNING: at lib/debugobjects.c:261 debug_print_object+0x7c/0x8d()
      Hardware name: Bochs
      ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x27
      Modules linked in: btusb bluetooth
      Pid: 440, comm: kworker/u:2 Not tainted 3.5.0-rc1+ #4
      Call Trace:
       [<ffffffff81174600>] ? free_obj_work+0x4a/0x7f
       [<ffffffff81023eb8>] warn_slowpath_common+0x7e/0x97
       [<ffffffff81023f65>] warn_slowpath_fmt+0x41/0x43
       [<ffffffff811746b1>] debug_print_object+0x7c/0x8d
       [<ffffffff810394f0>] ? __queue_work+0x241/0x241
       [<ffffffff81174fdd>] debug_check_no_obj_freed+0x92/0x159
       [<ffffffff810ac08e>] slab_free_hook+0x6f/0x77
       [<ffffffffa0019145>] ? l2cap_conn_del+0x148/0x157 [bluetooth]
       [<ffffffff810ae408>] kfree+0x59/0xac
       [<ffffffffa0019145>] l2cap_conn_del+0x148/0x157 [bluetooth]
       [<ffffffffa001b9a2>] l2cap_recv_frame+0xa77/0xfa4 [bluetooth]
       [<ffffffff810592f9>] ? trace_hardirqs_on_caller+0x112/0x1ad
       [<ffffffffa001c86c>] l2cap_recv_acldata+0xe2/0x264 [bluetooth]
       [<ffffffffa0002b2f>] hci_rx_work+0x235/0x33c [bluetooth]
       [<ffffffff81038dc3>] ? process_one_work+0x126/0x2fe
       [<ffffffff81038e22>] process_one_work+0x185/0x2fe
       [<ffffffff81038dc3>] ? process_one_work+0x126/0x2fe
       [<ffffffff81059f2e>] ? lock_acquired+0x1b5/0x1cf
       [<ffffffffa00028fa>] ? le_scan_work+0x11d/0x11d [bluetooth]
       [<ffffffff81036fb6>] ? spin_lock_irq+0x9/0xb
       [<ffffffff81039209>] worker_thread+0xcf/0x175
       [<ffffffff8103913a>] ? rescuer_thread+0x175/0x175
       [<ffffffff8103cfe0>] kthread+0x95/0x9d
       [<ffffffff812c5054>] kernel_threadi_helper+0x4/0x10
       [<ffffffff812c36b0>] ? retint_restore_args+0x13/0x13
       [<ffffffff8103cf4b>] ? flush_kthread_worker+0xdb/0xdb
       [<ffffffff812c5050>] ? gs_change+0x13/0x13
      
      This bug can be reproduced using the hcitool lecc or l2test tools while
      bluetoothd is not running.
      Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
      Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
  3. 07 Aug, 2012 8 commits
    • D
      cfg80211: process pending events when unregistering net device · 1f6fc43e
      Daniel Drake committed
      libertas currently calls cfg80211_disconnected() when it is being
      brought down. This causes an event to be allocated, but since the
      wdev is already removed from the rdev by the time that the event
      processing work executes, the event is never processed or freed.
      http://article.gmane.org/gmane.linux.kernel.wireless.general/95666
      
      Fix this leak, and other possible situations, by processing the event
      queue when a device is being unregistered. Thanks to Johannes Berg for
      the suggestion.
      Signed-off-by: Daniel Drake <dsd@laptop.org>
      Cc: stable@vger.kernel.org
      Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
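The idea behind the cfg80211 fix above (drain the pending event queue as part of unregistering, so that queued events are neither lost nor leaked) can be sketched in user space. This is only a model under stated assumptions: `model_event`, `model_wdev`, `queue_event`, `process_pending_events` and `unregister_wdev` are hypothetical names invented for illustration and are not the cfg80211 API.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the real cfg80211 structures. */
struct model_event {
    int type;
    struct model_event *next;
};

struct model_wdev {
    struct model_event *events;   /* pending, not yet processed */
};

/* Queue an event; returns 0 on success, -1 if allocation fails. */
static int queue_event(struct model_wdev *wdev, int type)
{
    struct model_event *ev = malloc(sizeof(*ev));

    if (!ev)
        return -1;
    ev->type = type;
    ev->next = wdev->events;
    wdev->events = ev;
    return 0;
}

/* Process and free every pending event; returns how many were handled. */
static int process_pending_events(struct model_wdev *wdev)
{
    int handled = 0;

    while (wdev->events) {
        struct model_event *ev = wdev->events;

        wdev->events = ev->next;
        free(ev);   /* real code would act on ev->type before freeing */
        handled++;
    }
    return handled;
}

/* Unregister path: drain the queue first so nothing is leaked. */
static int unregister_wdev(struct model_wdev *wdev)
{
    return process_pending_events(wdev);
}
```

In this model, an event queued just before `unregister_wdev()` is still processed and freed, which is the property the commit message describes.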
    • J
      Bluetooth: Fix socket not getting freed if l2cap channel create fails · 49dfbb91
      Jaganath Kanakkassery committed
      If l2cap_chan_create() fails, l2cap_sock_kill() returns early without
      freeing the socket because the zapped flag of sk is reset.
      Signed-off-by: Jaganath Kanakkassery <jaganath.k@samsung.com>
      Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
    • A
      Bluetooth: smp: Fix possible NULL dereference · d08fd0e7
      Andrei Emeltchenko committed
      smp_chan_create() might return NULL, so we need to check before
      dereferencing smp.
      Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
      Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
    • R
      Bluetooth: Set name_state to unknown when entry name is empty · c3e7c0d9
      Ram Malovany committed
      When the name of the given entry is empty, the state needs to be
      updated accordingly.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Ram Malovany <ramm@ti.com>
      Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
    • R
      Bluetooth: Fix using a NULL inquiry cache entry · 7cc8380e
      Ram Malovany committed
      The device may not be found in the list of found devices whose names
      are pending. This may happen when an HCI Remote Name Request was sent
      as part of the incoming connection establishment procedure.
      In that case there is no need to continue resolving the next name, as
      that will be done upon receiving another Remote Name Request Complete
      event. This fixes a kernel crash when trying to use this entry to
      resolve the next name.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Ram Malovany <ramm@ti.com>
      Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
    • R
      Bluetooth: Fix using NULL inquiry entry · c810089c
      Ram Malovany committed
      If the entry was not found by hci_inquiry_cache_lookup_resolve(), do not
      resolve the name. This fixes a kernel crash caused by using a NULL
      pointer.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Ram Malovany <ramm@ti.com>
      Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
    • S
      Bluetooth: Fix legacy pairing with some devices · a9ea3ed9
      Szymon Janc committed
      Some devices, e.g. some Android-based phones, don't do an SDP search
      before pairing and cancel legacy pairing when the ACL is disconnected.
      
      The PIN Code Request event, which changes the ACL timeout to
      HCI_PAIRING_TIMEOUT, is only received after the remote user has entered
      the PIN.
      
      In that case no L2CAP is connected, so the default HCI_DISCONN_TIMEOUT
      (2 seconds) is used to time out the ACL connection. This results in
      problems with legacy pairing, as the remote user has only a few seconds
      to enter the PIN before the ACL is disconnected.
      
      Increase the disconnect timeout for incoming connections to
      HCI_PAIRING_TIMEOUT if SSP is disabled and no link key exists.
      
      To avoid keeping the ACL alive for too long after an SDP search, set the
      ACL timeout back to HCI_DISCONN_TIMEOUT when L2CAP is connected.
      
      2012-07-19 13:24:43.413521 < HCI Command: Create Connection (0x01|0x0005) plen 13
          bdaddr 00:02:72:D6:6A:3F ptype 0xcc18 rswitch 0x01 clkoffset 0x0000
          Packet type: DM1 DM3 DM5 DH1 DH3 DH5
      2012-07-19 13:24:43.425224 > HCI Event: Command Status (0x0f) plen 4
          Create Connection (0x01|0x0005) status 0x00 ncmd 1
      2012-07-19 13:24:43.885222 > HCI Event: Role Change (0x12) plen 8
          status 0x00 bdaddr 00:02:72:D6:6A:3F role 0x01
          Role: Slave
      2012-07-19 13:24:44.054221 > HCI Event: Connect Complete (0x03) plen 11
          status 0x00 handle 42 bdaddr 00:02:72:D6:6A:3F type ACL encrypt 0x00
      2012-07-19 13:24:44.054313 < HCI Command: Read Remote Supported Features (0x01|0x001b) plen 2
          handle 42
      2012-07-19 13:24:44.055176 > HCI Event: Page Scan Repetition Mode Change (0x20) plen 7
          bdaddr 00:02:72:D6:6A:3F mode 0
      2012-07-19 13:24:44.056217 > HCI Event: Max Slots Change (0x1b) plen 3
          handle 42 slots 5
      2012-07-19 13:24:44.059218 > HCI Event: Command Status (0x0f) plen 4
          Read Remote Supported Features (0x01|0x001b) status 0x00 ncmd 0
      2012-07-19 13:24:44.062192 > HCI Event: Command Status (0x0f) plen 4
          Unknown (0x00|0x0000) status 0x00 ncmd 1
      2012-07-19 13:24:44.067219 > HCI Event: Read Remote Supported Features (0x0b) plen 11
          status 0x00 handle 42
          Features: 0xbf 0xfe 0xcf 0xfe 0xdb 0xff 0x7b 0x87
      2012-07-19 13:24:44.067248 < HCI Command: Read Remote Extended Features (0x01|0x001c) plen 3
          handle 42 page 1
      2012-07-19 13:24:44.071217 > HCI Event: Command Status (0x0f) plen 4
          Read Remote Extended Features (0x01|0x001c) status 0x00 ncmd 1
      2012-07-19 13:24:44.076218 > HCI Event: Read Remote Extended Features (0x23) plen 13
          status 0x00 handle 42 page 1 max 1
          Features: 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00
      2012-07-19 13:24:44.076249 < HCI Command: Remote Name Request (0x01|0x0019) plen 10
          bdaddr 00:02:72:D6:6A:3F mode 2 clkoffset 0x0000
      2012-07-19 13:24:44.081218 > HCI Event: Command Status (0x0f) plen 4
          Remote Name Request (0x01|0x0019) status 0x00 ncmd 1
      2012-07-19 13:24:44.105214 > HCI Event: Remote Name Req Complete (0x07) plen 255
          status 0x00 bdaddr 00:02:72:D6:6A:3F name 'uw000951-0'
      2012-07-19 13:24:44.105284 < HCI Command: Authentication Requested (0x01|0x0011) plen 2
          handle 42
      2012-07-19 13:24:44.111207 > HCI Event: Command Status (0x0f) plen 4
          Authentication Requested (0x01|0x0011) status 0x00 ncmd 1
      2012-07-19 13:24:44.112220 > HCI Event: Link Key Request (0x17) plen 6
          bdaddr 00:02:72:D6:6A:3F
      2012-07-19 13:24:44.112249 < HCI Command: Link Key Request Negative Reply (0x01|0x000c) plen 6
          bdaddr 00:02:72:D6:6A:3F
      2012-07-19 13:24:44.115215 > HCI Event: Command Complete (0x0e) plen 10
          Link Key Request Negative Reply (0x01|0x000c) ncmd 1
          status 0x00 bdaddr 00:02:72:D6:6A:3F
      2012-07-19 13:24:44.116215 > HCI Event: PIN Code Request (0x16) plen 6
          bdaddr 00:02:72:D6:6A:3F
      2012-07-19 13:24:48.099184 > HCI Event: Auth Complete (0x06) plen 3
          status 0x13 handle 42
          Error: Remote User Terminated Connection
      2012-07-19 13:24:48.179182 > HCI Event: Disconn Complete (0x05) plen 4
          status 0x00 handle 42 reason 0x13
          Reason: Remote User Terminated Connection
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
      Acked-by: Johan Hedberg <johan.hedberg@intel.com>
      Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
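The timeout selection described in the legacy pairing commit can be modelled as a small decision function. This is a user-space sketch, not the kernel code: `struct acl_conn` and `acl_disc_timeout_ms()` are hypothetical names, the 2-second value is taken from the commit text, and the 60-second pairing timeout is an assumption made for illustration.

```c
#include <stdbool.h>

/* Timeouts in milliseconds. The 2-second value comes from the commit
 * text; the 60-second pairing value is an assumption of this sketch. */
#define DISCONN_TIMEOUT_MS 2000u
#define PAIRING_TIMEOUT_MS 60000u

/* Hypothetical, simplified view of an ACL connection. */
struct acl_conn {
    bool incoming;         /* connection was initiated by the remote side */
    bool ssp_enabled;      /* Secure Simple Pairing is enabled */
    bool has_link_key;     /* a stored link key exists for the peer */
    bool l2cap_connected;  /* an L2CAP channel is already up */
};

/* Pick the idle timeout after which the ACL link is dropped. */
static unsigned int acl_disc_timeout_ms(const struct acl_conn *c)
{
    /* Once L2CAP is connected, fall back to the short default. */
    if (c->l2cap_connected)
        return DISCONN_TIMEOUT_MS;

    /* Incoming legacy pairing: give the remote user time to enter a PIN. */
    if (c->incoming && !c->ssp_enabled && !c->has_link_key)
        return PAIRING_TIMEOUT_MS;

    return DISCONN_TIMEOUT_MS;
}
```

In the HCI trace above, the link was dropped about four seconds after the PIN Code Request; with the longer timeout the remote user would have had more time to type the PIN.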
    • G
      Bluetooth: Fix possible deadlock in SCO code · 269c4845
      Gustavo Padovan committed
      sco_chan_del() only has conn != NULL when called from sco_conn_del(), so
      just move the code that deals with conn from it to sco_conn_del().
      
      [  120.765529]
      [  120.765529] ======================================================
      [  120.766529] [ INFO: possible circular locking dependency detected ]
      [  120.766529] 3.5.0-rc1-10292-g3701f944-dirty #70 Tainted: G        W
      [  120.766529] -------------------------------------------------------
      [  120.766529] kworker/u:3/1497 is trying to acquire lock:
      [  120.766529]  (&(&conn->lock)->rlock#2){+.+...}, at: [<ffffffffa00b7ecc>] sco_chan_del+0x4c/0x170 [bluetooth]
      [  120.766529]
      [  120.766529] but task is already holding lock:
      [  120.766529]  (slock-AF_BLUETOOTH-BTPROTO_SCO){+.+...}, at: [<ffffffffa00b8401>] sco_conn_del+0x61/0xe0 [bluetooth]
      [  120.766529]
      [  120.766529] which lock already depends on the new lock.
      [  120.766529]
      [  120.766529]
      [  120.766529] the existing dependency chain (in reverse order) is:
      [  120.766529]
      [  120.766529] -> #1 (slock-AF_BLUETOOTH-BTPROTO_SCO){+.+...}:
      [  120.766529]        [<ffffffff8107980e>] lock_acquire+0x8e/0xb0
      [  120.766529]        [<ffffffff813c19e0>] _raw_spin_lock+0x40/0x80
      [  120.766529]        [<ffffffffa00b85e9>] sco_connect_cfm+0x79/0x300 [bluetooth]
      [  120.766529]        [<ffffffffa0094b13>] hci_sync_conn_complete_evt.isra.90+0x343/0x400 [bluetooth]
      [  120.766529]        [<ffffffffa009d447>] hci_event_packet+0x317/0xfb0 [bluetooth]
      [  120.766529]        [<ffffffffa008aa68>] hci_rx_work+0x2c8/0x890 [bluetooth]
      [  120.766529]        [<ffffffff81047db7>] process_one_work+0x197/0x460
      [  120.766529]        [<ffffffff810489d6>] worker_thread+0x126/0x2d0
      [  120.766529]        [<ffffffff8104ee4d>] kthread+0x9d/0xb0
      [  120.766529]        [<ffffffff813c4294>] kernel_thread_helper+0x4/0x10
      [  120.766529]
      [  120.766529] -> #0 (&(&conn->lock)->rlock#2){+.+...}:
      [  120.766529]        [<ffffffff81078a8a>] __lock_acquire+0x154a/0x1d30
      [  120.766529]        [<ffffffff8107980e>] lock_acquire+0x8e/0xb0
      [  120.766529]        [<ffffffff813c19e0>] _raw_spin_lock+0x40/0x80
      [  120.766529]        [<ffffffffa00b7ecc>] sco_chan_del+0x4c/0x170 [bluetooth]
      [  120.766529]        [<ffffffffa00b8414>] sco_conn_del+0x74/0xe0 [bluetooth]
      [  120.766529]        [<ffffffffa00b88a2>] sco_disconn_cfm+0x32/0x60 [bluetooth]
      [  120.766529]        [<ffffffffa0093a82>] hci_disconn_complete_evt.isra.53+0x242/0x390 [bluetooth]
      [  120.766529]        [<ffffffffa009d747>] hci_event_packet+0x617/0xfb0 [bluetooth]
      [  120.766529]        [<ffffffffa008aa68>] hci_rx_work+0x2c8/0x890 [bluetooth]
      [  120.766529]        [<ffffffff81047db7>] process_one_work+0x197/0x460
      [  120.766529]        [<ffffffff810489d6>] worker_thread+0x126/0x2d0
      [  120.766529]        [<ffffffff8104ee4d>] kthread+0x9d/0xb0
      [  120.766529]        [<ffffffff813c4294>] kernel_thread_helper+0x4/0x10
      [  120.766529]
      [  120.766529] other info that might help us debug this:
      [  120.766529]
      [  120.766529]  Possible unsafe locking scenario:
      [  120.766529]
      [  120.766529]        CPU0                    CPU1
      [  120.766529]        ----                    ----
      [  120.766529]   lock(slock-AF_BLUETOOTH-BTPROTO_SCO);
      [  120.766529]                                lock(&(&conn->lock)->rlock#2);
      [  120.766529]                                lock(slock-AF_BLUETOOTH-BTPROTO_SCO);
      [  120.766529]   lock(&(&conn->lock)->rlock#2);
      [  120.766529]
      [  120.766529]  *** DEADLOCK ***
      Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
  4. 02 Aug, 2012 8 commits
  5. 01 Aug, 2012 11 commits
    • M
      nfs: enable swap on NFS · a564b8f0
      Mel Gorman committed
      Implement the new swapfile a_ops for NFS and hook up ->direct_IO.  This
      will set the NFS socket to SOCK_MEMALLOC and run socket reconnect under
      PF_MEMALLOC as well as reset SOCK_MEMALLOC before engaging the protocol
      ->connect() method.
      
      PF_MEMALLOC should allow the allocation of struct socket and related
      objects and the early (re)setting of SOCK_MEMALLOC should allow us to
      receive the packets required for the TCP connection buildup.
      
      [jlayton@redhat.com: Restore PF_MEMALLOC task flags in all cases]
      [dfeng@redhat.com: Fix handling of multiple swap files]
      [a.p.zijlstra@chello.nl: Original patch]
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Acked-by: Rik van Riel <riel@redhat.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Eric B Munson <emunson@mgebm.net>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: James Morris <jmorris@namei.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Mike Christie <michaelc@cs.wisc.edu>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: Xiaotian Feng <dfeng@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • M
      netvm: prevent a stream-specific deadlock · c76562b6
      Mel Gorman committed
      This patch series is based on top of "Swap-over-NBD without deadlocking
      v15" as it depends on the same reservation of PF_MEMALLOC reserves logic.
      
      When a user or administrator requires swap for their application, they
      create a swap partition or file, format it with mkswap and activate it
      with swapon.  In diskless systems this is not an option, so if swap is
      required then swapping over the network is considered.  The two likely
      scenarios are when blade servers are used as part of a cluster where the
      form factor or maintenance costs do not allow the use of disks and thin
      clients.
      
      The Linux Terminal Server Project recommends the use of the Network Block
      Device (NBD) for swap but this is not always an option.  There is no
      guarantee that the network attached storage (NAS) device is running Linux
      or supports NBD.  However, it is likely that it supports NFS so there are
      users that want support for swapping over NFS despite any performance
      concern.  Some distributions currently carry patches that support swapping
      over NFS but it would be preferable to support it in the mainline kernel.
      
      Patch 1 avoids a stream-specific deadlock that potentially affects TCP.
      
      Patch 2 is a small modification to SELinux to avoid using PFMEMALLOC
      	reserves.
      
      Patch 3 adds three helpers for filesystems to handle swap cache pages.
      	For example, page_file_mapping() returns page->mapping for
      	file-backed pages and the address_space of the underlying
      	swap file for swap cache pages.
      
      Patch 4 adds two address_space_operations to allow a filesystem
      	to pin all metadata relevant to a swapfile in memory. Upon
      	successful activation, the swapfile is marked SWP_FILE and
      	the address space operation ->direct_IO is used for writing
      	and ->readpage for reading in swap pages.
      
      Patch 5 notes that patch 3 is bolting
      	filesystem-specific-swapfile-support onto the side and that
      	the default handlers have different information to what
      	is available to the filesystem. This patch refactors the
      	code so that there are generic handlers for each of the new
      	address_space operations.
      
      Patch 6 adds an API to allow a vector of kernel addresses to be
      	translated to struct pages and pinned for IO.
      
      Patch 7 adds support for using highmem pages for swap by kmapping
      	the pages before calling the direct_IO handler.
      
      Patch 8 updates NFS to use the helpers from patch 3 where necessary.
      
      Patch 9 avoids setting PG_private on PG_swapcache pages within NFS.
      
      Patch 10 implements the new swapfile-related address_space operations
      	for NFS and teaches the direct IO handler how to manage
      	kernel addresses.
      
      Patch 11 prevents page allocator recursions in NFS by using GFP_NOIO
      	where appropriate.
      
      Patch 12 fixes a NULL pointer dereference that occurs when using
      	swap-over-NFS.
      
      With the patches applied, it is possible to mount a swapfile that is on an
      NFS filesystem.  Swap performance is not great, with a swap stress test
      taking roughly twice as long to complete as when the swap device is
      backed by NBD.
      
      This patch: netvm: prevent a stream-specific deadlock
      
      It could happen that all !SOCK_MEMALLOC sockets have buffered so much data
      that we're over the global rmem limit.  This will prevent SOCK_MEMALLOC
      buffers from receiving data, which will prevent userspace from running,
      which is needed to reduce the buffered data.
      
      Fix this by exempting the SOCK_MEMALLOC sockets from the rmem limit.  Once
      this change is applied, it is important that sockets that set
      SOCK_MEMALLOC do not clear the flag until the socket is being torn down.
      If this happens, a warning is generated and the tokens reclaimed to avoid
      accounting errors until the bug is fixed.
      
      [davem@davemloft.net: Warning about clearing SOCK_MEMALLOC]
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Acked-by: David S. Miller <davem@davemloft.net>
      Acked-by: Rik van Riel <riel@redhat.com>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Mike Christie <michaelc@cs.wisc.edu>
      Cc: Eric B Munson <emunson@mgebm.net>
      Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
      Cc: Mel Gorman <mgorman@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
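The rmem exemption described in the netvm deadlock commit can be illustrated with a simplified accounting check. This is a user-space model under stated assumptions: `struct rmem_sock` and `rmem_charge_allowed()` are hypothetical names, and the single global counter is a deliberate simplification of the kernel's socket memory accounting.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket; `memalloc` models SOCK_MEMALLOC. */
struct rmem_sock {
    bool memalloc;
};

/* Decide whether a receive-side charge of `bytes` may proceed, given
 * how much receive memory is already in use globally. */
static bool rmem_charge_allowed(const struct rmem_sock *sk,
                                size_t global_used, size_t global_limit,
                                size_t bytes)
{
    /* Sockets that service swap are exempt from the global limit, so
     * buffered data on other sockets cannot starve them. */
    if (sk->memalloc)
        return true;

    /* Everyone else must fit under the limit (overflow-safe form of
     * global_used + bytes <= global_limit). */
    if (bytes > global_limit)
        return false;
    return global_used <= global_limit - bytes;
}
```

With the limit already reached, an ordinary socket is refused while a socket marked `memalloc` can still receive, which is what breaks the deadlock described above.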
    • M
      netvm: set PF_MEMALLOC as appropriate during SKB processing · b4b9e355
      Mel Gorman committed
      In order to make sure pfmemalloc packets receive all memory needed to
      proceed, ensure processing of pfmemalloc SKBs happens under PF_MEMALLOC.
      This is limited to a subset of protocols that are expected to be used for
      writing to swap.  Taps are not allowed to use PF_MEMALLOC as these are
      expected to communicate with userspace processes which could be paged out.
      
      [a.p.zijlstra@chello.nl: Ideas taken from various patches]
      [jslaby@suse.cz: Lock imbalance fix]
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Acked-by: David S. Miller <davem@davemloft.net>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Christie <michaelc@cs.wisc.edu>
      Cc: Eric B Munson <emunson@mgebm.net>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Christoph Lameter <cl@linux.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • M
      netvm: allow skb allocation to use PFMEMALLOC reserves · c93bdd0e
      Mel Gorman committed
      Change the skb allocation API to indicate RX usage and use this to fall
      back to the PFMEMALLOC reserve when needed.  SKBs allocated from the
      reserve are tagged in skb->pfmemalloc.  If an SKB is allocated from the
      reserve and the socket is later found to be unrelated to page reclaim, the
      packet is dropped so that the memory remains available for page reclaim.
      Network protocols are expected to recover from this packet loss.
      
      [a.p.zijlstra@chello.nl: Ideas taken from various patches]
      [davem@davemloft.net: Use static branches, coding style corrections]
      [sebastian@breakpoint.cc: Avoid unnecessary cast, fix !CONFIG_NET build]
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Acked-by: David S. Miller <davem@davemloft.net>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Christie <michaelc@cs.wisc.edu>
      Cc: Eric B Munson <emunson@mgebm.net>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Christoph Lameter <cl@linux.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
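The drop rule from the skb allocation commit (a packet built from reserve memory is discarded once its socket turns out not to be involved in page reclaim) reduces to a two-flag check. The sketch below is a user-space model: `struct rx_skb`, `struct rx_sock` and `rx_should_drop()` are hypothetical names, with `pfmemalloc` standing in for skb->pfmemalloc and `memalloc` for the SOCK_MEMALLOC flag.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an skb: was it allocated from the reserves? */
struct rx_skb {
    bool pfmemalloc;
};

/* Hypothetical stand-in for the receiving socket. */
struct rx_sock {
    bool memalloc;   /* models SOCK_MEMALLOC */
};

/* Return true when the packet should be dropped so that the reserve
 * memory it occupies goes back to page reclaim. */
static bool rx_should_drop(const struct rx_skb *skb, const struct rx_sock *sk)
{
    return skb->pfmemalloc && !sk->memalloc;
}
```

Packets from ordinary memory are never dropped by this rule, and reserve-backed packets are kept only for sockets that service swap; the protocol is expected to recover from the resulting loss, as the commit message notes.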
    • M
      netvm: allow the use of __GFP_MEMALLOC by specific sockets · 7cb02404
      Mel Gorman committed
      Allow specific sockets to be tagged SOCK_MEMALLOC and use __GFP_MEMALLOC
      for their allocations.  These sockets will be able to go below watermarks
      and allocate from the emergency reserve.  Such sockets are to be used to
      service the VM (i.e. to swap over).  They must be handled kernel side;
      exposing such a socket to user-space is a bug.
      
      There is a risk that the reserves will be depleted so, for now, the
      administrator is responsible for increasing min_free_kbytes as necessary
      to prevent deadlock for their workloads.
      
      [a.p.zijlstra@chello.nl: Original patches]
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Acked-by: David S. Miller <davem@davemloft.net>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Christie <michaelc@cs.wisc.edu>
      Cc: Eric B Munson <emunson@mgebm.net>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Christoph Lameter <cl@linux.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • M
      net: introduce sk_gfp_atomic() to allow addition of GFP flags depending on the individual socket · 99a1dec7
      Mel Gorman committed
      Introduce sk_gfp_atomic().  This function allows sock-specific flags to
      be injected into each sock-related allocation.  It is only used on
      allocation paths that may be required for writing pages back to network
      storage.
      
      [davem@davemloft.net: Use sk_gfp_atomic only when necessary]
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Acked-by: David S. Miller <davem@davemloft.net>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Mike Christie <michaelc@cs.wisc.edu>
      Cc: Eric B Munson <emunson@mgebm.net>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Christoph Lameter <cl@linux.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
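One way to picture what a helper like sk_gfp_atomic() does is as a mask operation: start from the caller's atomic allocation flags and carry over only the per-socket "may use reserves" bit. The following is a user-space sketch, not the kernel definition: the flag values, `struct gfp_sock` and `model_sk_gfp_atomic()` are all invented for illustration.

```c
/* Illustrative flag values only; the real GFP bits differ. */
typedef unsigned int gfp_model_t;

#define MODEL_GFP_ATOMIC    0x01u
#define MODEL_GFP_MEMALLOC  0x02u  /* models __GFP_MEMALLOC */
#define MODEL_GFP_KERNEL    0x04u

/* Hypothetical stand-in for a socket with a default allocation mask. */
struct gfp_sock {
    gfp_model_t allocation;
};

/* Combine the caller's base flags with the socket's MEMALLOC bit, and
 * nothing else from the socket's mask. */
static gfp_model_t model_sk_gfp_atomic(const struct gfp_sock *sk,
                                       gfp_model_t base)
{
    return base | (sk->allocation & MODEL_GFP_MEMALLOC);
}
```

Only sockets that were tagged to use the reserves pick up the extra bit; every other socket gets the base flags back unchanged, so ordinary allocation paths behave as before.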
    • A
      memcg: rename config variables · c255a458
      Andrew Morton committed
      Sanity:
      
      CONFIG_CGROUP_MEM_RES_CTLR -> CONFIG_MEMCG
      CONFIG_CGROUP_MEM_RES_CTLR_SWAP -> CONFIG_MEMCG_SWAP
      CONFIG_CGROUP_MEM_RES_CTLR_SWAP_ENABLED -> CONFIG_MEMCG_SWAP_ENABLED
      CONFIG_CGROUP_MEM_RES_CTLR_KMEM -> CONFIG_MEMCG_KMEM
      
      [mhocko@suse.cz: fix missed bits]
      Cc: Glauber Costa <glommer@parallels.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • D
      ipv4: Properly purge netdev references on uncached routes. · caacf05e
      David S. Miller committed
      When a device is unregistered, we have to purge all of the
      references to it that may exist in the entire system.
      
      If a route is uncached, we currently have no way of accomplishing
      this.
      
      So create a global list that is scanned when a network device goes
      down.  This mirrors the logic in net/core/dst.c's dst_ifdown().
      Signed-off-by: David S. Miller <davem@davemloft.net>
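The "global list that is scanned when a network device goes down" can be sketched as a singly linked list of routes plus a scan that releases references to the departing device. This is a user-space model under stated assumptions: all names are hypothetical, there is no locking or reference counting, and re-pointing matching routes at a fallback device is an assumption of the sketch rather than something the commit text spells out.

```c
#include <stddef.h>

/* Hypothetical stand-in for a network device. */
struct model_dev {
    const char *name;
};

/* Hypothetical stand-in for an uncached route that references a device. */
struct model_route {
    struct model_dev *dev;
    struct model_route *next;
};

/* Global list of uncached routes (a real implementation needs locking). */
static struct model_route *uncached_list;

/* Track a newly created uncached route. */
static void uncached_add(struct model_route *rt)
{
    rt->next = uncached_list;
    uncached_list = rt;
}

/* Device is going away: scan the list and move every route that still
 * references `dev` over to `fallback`. Returns the number of routes fixed. */
static int uncached_flush_dev(struct model_dev *dev, struct model_dev *fallback)
{
    int fixed = 0;

    for (struct model_route *rt = uncached_list; rt; rt = rt->next) {
        if (rt->dev == dev) {
            rt->dev = fallback;
            fixed++;
        }
    }
    return fixed;
}
```

After the scan no tracked route points at the unregistered device, so a second scan for the same device finds nothing to do.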
    • D
      c5038a83
    • E
      ipv4: percpu nh_rth_output cache · d26b3a7c
      Eric Dumazet committed
      The input path is mostly run under RCU and doesn't touch the dst refcnt.
      
      But the output path on forwarding or UDP workloads hits the dst
      refcount badly, and we have a lot of false sharing, for example
      in ipv4_mtu() when reading rt->rt_pmtu.
      
      Using a percpu cache for nh_rth_output gives a nice performance
      increase at a small cost.
      
      24 udpflood test on my 24 cpu machine (dummy0 output device)
      (each process sends 1.000.000 udp frames, 24 processes are started)
      
      before : 5.24 s
      after : 2.06 s
      For reference, time on linux-3.5 : 6.60 s
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Tested-by: Alexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • E
      ipv4: Restore old dst_free() behavior. · 54764bb6
      Eric Dumazet committed
      Commit 404e0a8b (net: ipv4: fix RCU races on dst refcounts) tried
      to solve a race but added a problem at device/fib dismantle time:
      
      We really want to call dst_free() as soon as possible, even if sockets
      still have dst in their cache.
      dst_release() calls in free_fib_info_rcu() are not welcomed.
      
      Root of the problem was that now we also cache output routes (in
      nh_rth_output), we must use call_rcu() instead of call_rcu_bh() in
      rt_free(), because output route lookups are done in process context.
      
      Based on feedback and initial patch from David Miller (adding another
      call_rcu_bh() call in fib, but it appears it was not the right fix)
      
      I left the inet_sk_rx_dst_set() helper and added __rcu attributes
      to nh_rth_output and nh_rth_input to better document what is going on in
      this code.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 31 Jul, 2012 11 commits