- 05 10月, 2009 1 次提交
-
-
由 Chris Mason 提交于
The btrfs async worker threads are used for a wide variety of things, including processing bio end_io functions. This means that when the endio threads aren't running, the rest of the FS isn't able to do the final processing required to clear PageWriteback. The endio threads also try to exit as they become idle and start more as the work piles up. The problem is that starting more threads means kthreadd may need to allocate ram, and that allocation may wait until the global number of writeback pages on the system is below a certain limit. The result of that throttling is that end IO threads wait on kthreadd, who is waiting on IO to end, which will never happen. This commit fixes the deadlock by handing off thread startup to a dedicated thread. It also fixes a bug where the on-demand thread creation was creating far too many threads because it didn't take into account threads being started by other procs. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
- 16 9月, 2009 3 次提交
-
-
由 Chris Mason 提交于
It was possible for an async worker thread to be selected to receive a new work item, but exit before the work item was actually placed into that thread's work list. This commit fixes the race by incrementing the num_pending counter earlier, and making sure to check the number of pending work items before a thread exits. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
The exit-on-idle code for async worker threads was incorrectly calling spin_lock_irq with interrupts already off. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
After a new worker thread starts, it is placed into the list of idle threads. But, this may race with a check for idle done by the worker thread itself, resulting in a double list_add operation. This fix adds a check to make sure the idle thread addition is done properly. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
- 12 9月, 2009 3 次提交
-
-
由 Chris Mason 提交于
This changes the btrfs worker threads to batch work items into a local list. It allows us to pull work items in large chunks and significantly reduces the number of times we need to take the worker thread spinlock. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
The btrfs worker thread spinlock was being used both for the queueing of IO and for the processing of ordered events. The ordered events never happen from end_io handlers, and so they don't need to use the _irq version of spinlocks. This adds a dedicated lock to the ordered lists so they don't have to run with irqs off. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
The Btrfs worker threads don't currently die off after they have been idle for a while, leading to a lot of threads sitting around doing nothing for each mount. Also, they are unable to start atomically (from end_io hanlders). This commit reworks the worker threads so they can be started from end_io handlers (just setting a flag that asks for a thread to be added at a later date) and so they can exit if they have been idle for a long time. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
- 23 7月, 2009 1 次提交
-
-
由 Julia Lawall 提交于
If spin_lock_irqsave is called twice in a row with the same second argument, the interrupt state at the point of the second call overwrites the value saved by the first call. Indeed, the second call does not need to save the interrupt state, so it is changed to a simple spin_lock. Signed-off-by: NJulia Lawall <julia@diku.dk> Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
- 03 7月, 2009 1 次提交
-
-
由 Jiri Slaby 提交于
worker memory is already freed on one fail path in btrfs_start_workers, but is still dereferenced. Switch the dereference and kfree. Signed-off-by: NJiri Slaby <jirislaby@gmail.com> Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
- 11 6月, 2009 1 次提交
-
-
由 Shin Hong 提交于
This patch fixes a bug which may result race condition between btrfs_start_workers() and worker_loop(). btrfs_start_workers() executed in a parent thread writes on workers->worker and worker_loop() in a child thread reads workers->worker. However, there is no synchronization enforcing the order of two operations. This patch makes btrfs_start_workers() fill workers->worker before it starts a child thread with worker_loop() Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
- 21 4月, 2009 1 次提交
-
-
由 Chris Mason 提交于
Btrfs is using WRITE_SYNC_PLUG to send down synchronous IOs with a higher priority. But, the checksumming helper threads prevent it from being fully effective. There are two problems. First, a big queue of pending checksumming will delay the synchronous IO behind other lower priority writes. Second, the checksumming uses an ordered async work queue. The ordering makes sure that IOs are sent to the block layer in the same order they are sent to the checksumming threads. Usually this gives us less seeky IO. But, when we start mixing IO priorities, the lower priority IO can delay the higher priority IO. This patch solves both problems by adding a high priority list to the async helper threads, and a new btrfs_set_work_high_prio(), which is used to make put a new async work item onto the higher priority list. The ordering is still done on high priority IO, but all of the high priority bios are ordered separately from the low priority bios. This ordering is purely an IO optimization, it is not involved in data or metadata integrity. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
- 03 4月, 2009 2 次提交
-
-
由 Jim Owens 提交于
Signed-off-by: Njim owens <jowens@hp.com> Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Amit Gud 提交于
Need to check kthread_should_stop after schedule_timeout() before calling schedule(). This causes threads to sleep with potentially no one to wake them up causing mount(2) to hang in btrfs_stop_workers waiting for threads to stop. Signed-off-by: NAmit Gud <gud@ksu.edu> Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
- 04 2月, 2009 2 次提交
-
-
由 Chris Mason 提交于
Tracing shows the delay between when an async thread goes to sleep and when more work is added is often very short. This commit adds a little bit of delay and extra checking to the code right before we schedule out. It allows more work to be added to the worker without requiring notifications from other procs. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
The async bio submission thread was missing some bios that were added after it had decided there was no work left to do. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
- 21 1月, 2009 1 次提交
-
-
由 Huang Weiyi 提交于
Removed unused #include <version.h>'s in btrfs Signed-off-by: NHuang Weiyi <weiyi.huang@gmail.com> Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
- 06 1月, 2009 1 次提交
-
-
由 Chris Mason 提交于
There were many, most are fixed now. struct-funcs.c generates some warnings but these are bogus. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
- 13 11月, 2008 1 次提交
-
-
由 yanhai zhu 提交于
In worker_loop(), the func should check whether it has been requested to stop before it decides to schedule out. Otherwise if the stop request(also the last wake_up()) sent by btrfs_stop_workers() happens when worker_loop() running after the "while" judgement and before schedule(), woker_loop() will schedule away and never be woken up, which will also cause btrfs_stop_workers() wait forever. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
- 07 11月, 2008 1 次提交
-
-
由 Chris Mason 提交于
Btrfs uses kernel threads to create async work queues for cpu intensive operations such as checksumming and decompression. These work well, but they make it difficult to keep IO order intact. A single writepages call from pdflush or fsync will turn into a number of bios, and each bio is checksummed in parallel. Once the checksum is computed, the bio is sent down to the disk, and since we don't control the order in which the parallel operations happen, they might go down to the disk in almost any order. The code deals with this somewhat by having deep work queues for a single kernel thread, making it very likely that a single thread will process all the bios for a single inode. This patch introduces an explicitly ordered work queue. As work structs are placed into the queue they are put onto the tail of a list. They have three callbacks: ->func (cpu intensive processing here) ->ordered_func (order sensitive processing here) ->ordered_free (free the work struct, all processing is done) The work struct has three callbacks. The func callback does the cpu intensive work, and when it completes the work struct is marked as done. Every time a work struct completes, the list is checked to see if the head is marked as done. If so the ordered_func callback is used to do the order sensitive processing and the ordered_free callback is used to do any cleanup. Then we loop back and check the head of the list again. This patch also changes the checksumming code to use the ordered workqueues. One a 4 drive array, it increases streaming writes from 280MB/s to 350MB/s. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
- 01 10月, 2008 1 次提交
-
-
由 Chris Mason 提交于
When reading in block groups, a global mask of the available raid policies should be adjusted based on the types of block groups found on disk. This global mask is then used to decide which raid policy to use for new block groups. The recent allocator changes dropped the call that updated the global mask, making all the block groups allocated at run time single striped onto a single drive. This also fixes the async worker threads to set any thread that uses the requeue mechanism as busy. This allows us to avoid blocking on get_request_wait for the async bio submission threads. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
- 30 9月, 2008 1 次提交
-
-
由 Chris Mason 提交于
This improves the comments at the top of many functions. It didn't dive into the guts of functions because I was trying to avoid merging problems with the new allocator and back reference work. extent-tree.c and volumes.c were both skipped, and there is definitely more work todo in cleaning and commenting the code. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
- 26 9月, 2008 1 次提交
-
-
由 Chris Mason 提交于
Btrfs had compatibility code for kernels back to 2.6.18. These have been removed, and will be maintained in a separate backport git tree from now on. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
- 25 9月, 2008 8 次提交
-
-
由 Chris Mason 提交于
This takes the csum mutex deeper in the call chain and releases it more often. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
Before this change, btrfs would use a bdi congestion function to make sure there weren't too many pending async checksum work items. This change makes the process creating async work items wait instead, leading to fewer congestion returns from the bdi. This improves pdflush background_writeout scanning. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
Large streaming reads make for large bios, which means each entry on the list async work queues represents a large amount of data. IO congestion throttling on the device was kicking in before the async worker threads decided a single thread was busy and needed some help. The end result was that a streaming read would result in a single CPU running at 100% instead of balancing the work off to other CPUs. This patch also changes the pre-IO checksum lookup done by reads to work on a per-bio basis instead of a per-page. This results in many extra btree lookups on large streaming reads. Doing the checksum lookup right before bio submit allows us to reuse searches while processing adjacent offsets. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Li Zefan 提交于
When kthread_run() returns failure, this worker hasn't been added to the list, so btrfs_stop_workers() won't free it. Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
This changes the worker thread pool to maintain a list of idle threads, avoiding a complex search for a good thread to wake up. Threads have two states: idle - we try to reuse the last thread used in hopes of improving the batching ratios busy - each time a new work item is added to a busy task, the task is rotated to the end of the line. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
Btrfs has been using workqueues to spread the checksumming load across other CPUs in the system. But, workqueues only schedule work on the same CPU that queued the work, giving them a limited benefit for systems with higher CPU counts. This code adds a generic facility to schedule work with pools of kthreads, and changes the bio submission code to queue bios up. The queueing is important to make sure large numbers of procs on the system don't turn streaming workloads into random workloads by sending IO down concurrently. The end result of all of this is much higher performance (and CPU usage) when doing checksumming on large machines. Two worker pools are created, one for writes and one for endio processing. The two could deadlock if we tried to service both from a single pool. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-