- 28 7月, 2010 26 次提交
-
-
由 Eric Paris 提交于
This is a new f_mode which can only be set by the kernel. It indicates that the fd was opened by fanotify and should not cause future fanotify events. This is needed to prevent fanotify livelock. An example of obvious livelock is from fanotify close events. Process A closes file1 This creates a close event for file1. fanotify opens file1 for Listener X Listener X deals with the event and closes its fd for file1. This creates a close event for file1. fanotify opens file1 for Listener X Listener X deals with the event and closes its fd for file1. This creates a close event for file1. fanotify opens file1 for Listener X Listener X deals with the event and closes its fd for file1. notice a pattern? The fix is to add the FMODE_NONOTIFY bit to the open filp done by the kernel for fanotify. Thus when that file is used it will not generate future events. This patch simply defines the bit. Signed-off-by: NEric Paris <eparis@redhat.com>
-
由 Eric Paris 提交于
previously I used mark_entry when talking about marks on inodes. The _entry is pretty useless. Just use "mark" instead. Signed-off-by: NEric Paris <eparis@redhat.com>
-
由 Eric Paris 提交于
the _entry portion of fsnotify functions is useless. Drop it. Signed-off-by: NEric Paris <eparis@redhat.com>
-
由 Eric Paris 提交于
The name is long and it serves no real purpose. So rename fsnotify_mark_entry to just fsnotify_mark. Signed-off-by: NEric Paris <eparis@redhat.com>
-
由 Andreas Gruenbacher 提交于
Some fsnotify operations send a struct file. This is more information than we technically need. We instead send a struct path in all cases instead of sometimes a path and sometimes a file. Signed-off-by: NAndreas Gruenbacher <agruen@suse.de> Signed-off-by: NEric Paris <eparis@redhat.com>
-
由 Eric Paris 提交于
To differentiate between inode and vfsmount (or other future) types of marks we add a flags field and set the inode bit on inode marks (the only currently supported type of mark) Signed-off-by: NEric Paris <eparis@redhat.com>
-
由 Eric Paris 提交于
vfsmount marks need mostly the same data as inode specific fields, but for consistency and understandability we put that data in a vfsmount specific struct inside a union with inode specific data. Signed-off-by: NEric Paris <eparis@redhat.com>
-
由 Eric Paris 提交于
The addition of marks on vfs mounts will be simplified if the inode specific parts of a mark and the vfsmnt specific parts of a mark are actually in a union so naming can be easy. This patch just implements the inode struct and the union. Signed-off-by: NEric Paris <eparis@redhat.com>
-
由 Eric Paris 提交于
To ensure that a group will not duplicate events when it receives it based on the vfsmount and the inode should_send_event test we should distinguish those two cases. We pass a vfsmount to this function so groups can make their own determinations. Signed-off-by: NEric Paris <eparis@redhat.com>
-
由 Eric Paris 提交于
currently all of the notification systems implemented select which inodes they care about and receive messages only about those inodes (or the children of those inodes.) This patch begins to flesh out fsnotify support for the concept of listeners that want to hear notification for an inode accessed below a given monut point. This patch implements a second list of fsnotify groups to hold these types of groups and a second global mask to hold the events of interest for this type of group. The reason we want a second group list and mask is because the inode based notification should_send_event support which makes each group look for a mark on the given inode. With one nfsmount listener that means that every group would have to take the inode->i_lock, look for their mark, not find one, and return for every operation. By seperating vfsmount from inode listeners only when there is a inode listener will the inode groups have to look for their mark and take the inode lock. vfsmount listeners will have to grab the lock and look for a mark but there should be fewer of them, and one vfsmount listener won't cause the i_lock to be grabbed and released for every fsnotify group on every io operation. Signed-off-by: NEric Paris <eparis@redhat.com>
-
由 Eric Paris 提交于
Simple renaming patch. fsnotify is about to support mount point listeners so I am renaming fsnotify_groups and fsnotify_mask to indicate these are lists used only for groups which have watches on inodes. Signed-off-by: NEric Paris <eparis@redhat.com>
-
由 Eric Paris 提交于
Nothing uses the mask argument to fsnotify_alloc_group. This patch drops that argument. Signed-off-by: NEric Paris <eparis@redhat.com>
-
由 Eric Paris 提交于
fsnotify_obtain_group was intended to be able to find an already existing group. Nothing uses that functionality. This just renames it to fsnotify_alloc_group so it is clear what it is doing. Signed-off-by: NEric Paris <eparis@redhat.com>
-
由 Eric Paris 提交于
The original fsnotify interface has a group-num which was intended to be able to find a group after it was added. I no longer think this is a necessary thing to do and so we remove the group_num. Signed-off-by: NEric Paris <eparis@redhat.com>
-
由 Eric Paris 提交于
fanotify would like to clone events already on its notification list, make changes to the new event, and then replace the old event on the list with the new event. This patch implements the replace functionality of that process. Signed-off-by: NEric Paris <eparis@redhat.com>
-
由 Eric Paris 提交于
fsnotify_clone_event will take an event, clone it, and return the cloned event to the caller. Since events may be in use by multiple fsnotify groups simultaneously certain event entries (such as the mask) cannot be changed after the event was created. Since fanotify would like to merge events happening on the same file it needs a new clean event to work with so it can change any fields it wishes. Signed-off-by: NEric Paris <eparis@redhat.com>
-
由 Eric Paris 提交于
inotify only wishes to merge a new event with the last event on the notification fifo. fanotify is willing to merge any events including by means of bitwise OR masks of multiple events together. This patch moves the inotify event merging logic out of the generic fsnotify notification.c and into the inotify code. This allows each use of fsnotify to provide their own merge functionality. Signed-off-by: NEric Paris <eparis@redhat.com>
-
由 Eric Paris 提交于
fanotify needs a path in order to open an fd to the object which changed. Currently notifications to inode's parents are done using only the inode. For some parental notification we have the entire file, send that so fanotify can use it. Signed-off-by: NEric Paris <eparis@redhat.com>
-
由 Eric Paris 提交于
fanotify, the upcoming notification system actually needs a struct path so it can do opens in the context of listeners, and it needs a file so it can get f_flags from the original process. Close was the only operation that already was passing a struct file to the notification hook. This patch passes a file for access, modify, and open as well as they are easily available to these hooks. Signed-off-by: NEric Paris <eparis@redhat.com>
-
由 Eric Paris 提交于
fanotify is going to need to look at file->private_data to know if an event should be sent or not. This passes the data (which might be a file, dentry, inode, or none) to the should_send function calls so fanotify can get that information when available Signed-off-by: NEric Paris <eparis@redhat.com>
-
由 Eric Paris 提交于
fanotify is only interested in event types which contain enough information to open the original file in the context of the fanotify listener. Since fanotify may not want to send events if that data isn't present we pass the data type to the should_send_event function call so fanotify can express its lack of interest. Signed-off-by: NEric Paris <eparis@redhat.com>
-
由 Eric Paris 提交于
nothing uses inotify in the kernel, drop it! Signed-off-by: NEric Paris <eparis@redhat.com>
-
由 Eric Paris 提交于
Simply switch audit_trees from using inotify to using fsnotify for it's inode pinning and disappearing act information. Signed-off-by: NEric Paris <eparis@redhat.com>
-
由 Eric Paris 提交于
This patch allows a task to add a second fsnotify mark to an inode for the same group. This mark will be added to the end of the inode's list and this will never be found by the stand fsnotify_find_mark() function. This is useful if a user wants to add a new mark before removing the old one. Signed-off-by: NEric Paris <eparis@redhat.com>
-
由 Eric Paris 提交于
Simple copy fsnotify information from one mark to another in preparation for the second mark to replace the first. Signed-off-by: NEric Paris <eparis@redhat.com>
-
由 Eric Paris 提交于
Audit currently uses inotify to pin inodes in core and to detect when watched inodes are deleted or unmounted. This patch uses fsnotify instead of inotify. Signed-off-by: NEric Paris <eparis@redhat.com>
-
- 25 7月, 2010 2 次提交
-
-
由 stephen hemminger 提交于
This fixes hang when target device of mirred packet classifier action is removed. If a mirror or redirection action is configured to cause packets to go to another device, the classifier holds a ref count, but was assuming the adminstrator cleaned up all redirections before removing. The fix is to add a notifier and cleanup during unregister. The new list is implicitly protected by RTNL mutex because it is held during filter add/delete as well as notifier. Signed-off-by: NStephen Hemminger <shemminger@vyatta.com> Acked-by: NJamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Rafael J. Wysocki 提交于
Commit 2a6b6976 (ACPI: Store NVS state even when entering suspend to RAM) caused the ACPI suspend code save the NVS area during suspend and restore it during resume unconditionally, although it is known that some systems need to use acpi_sleep=s4_nonvs for hibernation to work. To allow the affected systems to avoid saving and restoring the NVS area during suspend to RAM and resume, introduce kernel command line option acpi_sleep=nonvs and make acpi_sleep=s4_nonvs work as its alias temporarily (add acpi_sleep=s4_nonvs to the feature removal file). Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16396 . Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl> Reported-and-tested-by: Ntomas m <tmezzadra@gmail.com> Signed-off-by: NLen Brown <len.brown@intel.com>
-
- 23 7月, 2010 3 次提交
-
-
由 Sam Ravnborg 提交于
The .data..init_task output section was missing a load offset causing a popwerpc target to fail to boot. Sean MacLennan tracked it down to the definition of INIT_TASK_DATA_SECTION(). There are only two users of INIT_TASK_DATA_SECTION() in the kernel today: cris and popwerpc. cris do not support relocatable kernels and is thus not impacted by this change. Fix INIT_TASK_DATA_SECTION() to specify load offset like all other output sections. Reported-by: NSean MacLennan <smaclennan@pikatech.com> Signed-off-by: NSam Ravnborg <sam@ravnborg.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Len Brown 提交于
It turns out that there is a bit in the _CST for Intel FFH C3 that tells the OS if we should be checking BM_STS or not. Linux has been unconditionally checking BM_STS. If the chip-set is configured to enable BM_STS, it can retard or completely prevent entry into deep C-states -- as illustrated by turbostat: http://userweb.kernel.org/~lenb/acpi/utils/pmtools/turbostat/ ref: Intel Processor Vendor-Specific ACPI Interface Specification table 4 "_CST FFH GAS Field Encoding" Bit 1: Set to 1 if OSPM should use Bus Master avoidance for this C-state https://bugzilla.kernel.org/show_bug.cgi?id=15886Signed-off-by: NLen Brown <len.brown@intel.com>
-
由 Herbert Xu 提交于
Mark Wagner reported OOM symptoms when sending UDP traffic over a macvtap link to a kvm receiver. This appears to be caused by the fact that macvtap packet queues are unlimited in length. This means that if the receiver can't keep up with the rate of flow, then we will hit OOM. Of course it gets worse if the OOM killer then decides to kill the receiver. This patch imposes a cap on the packet queue length, in the same way as the tuntap driver, using the device TX queue length. Please note that macvtap currently has no way of giving congestion notification, that means the software device TX queue cannot be used and packets will always be dropped once the macvtap driver queue fills up. This shouldn't be a great problem for the scenario where macvtap is used to feed a kvm receiver, as the traffic is most likely external in origin so congestion notification can't be applied anyway. Of course, if anybody decides to complain about guest-to-guest UDP packet loss down the track, then we may have to revisit this. Incidentally, this patch also fixes a real memory leak when macvtap_get_queue fails. Chris Wright noticed that for this patch to work, we need a non-zero TX queue length. This patch includes his work to change the default macvtap TX queue length to 500. Reported-by: NMark Wagner <mwagner@redhat.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Acked-by: NChris Wright <chrisw@sous-sol.org> Acked-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 7月, 2010 1 次提交
-
-
由 Jason Wessel 提交于
The kdb code should not toggle the sysrq state in case an end user wants to try and resume the normal kernel execution. Signed-off-by: NJason Wessel <jason.wessel@windriver.com> Acked-by: NDmitry Torokhov <dmitry.torokhov@gmail.com>
-
- 21 7月, 2010 4 次提交
-
-
由 Mikael Pettersson 提交于
The kernel's math-emu code contains a macro _FP_FROM_INT() which is used to convert an integer to a raw normalized floating-point value. It does this basically in three steps: 1. Compute the exponent from the number of leading zero bits. 2. Downshift large fractions to put the MSB in the right position for normalized fractions. 3. Upshift small fractions to put the MSB in the right position. There is an boundary error in step 2, causing a fraction with its MSB exactly one bit above the normalized MSB position to not be downshifted. This results in a non-normalized raw float, which when packed becomes a massively inaccurate representation for that input. The impact of this depends on a number of arch-specific factors, but it is known to have broken emulation of FXTOD instructions on UltraSPARC III, which was originally reported as GCC bug 44631 <http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44631>. Any arch which uses math-emu to emulate conversions from integers to same-size floats may be affected. The fix is simple: the exponent comparison used to determine if the fraction should be downshifted must be "<=" not "<". I'm sending a kernel module to test this as a reply to this message. There are also SPARC user-space test cases in the GCC bug entry. Signed-off-by: NMikael Pettersson <mikpe@it.uu.se> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Doug Goldstein 提交于
vgaarb.h was missing the #define of the #ifndef at the top for the guard to prevent multiple #include's from causing re-define errors Signed-off-by: NDoug Goldstein <cardoe@gentoo.org> Cc: Dave Airlie <airlied@redhat.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NDave Airlie <airlied@redhat.com>
-
由 Paul E. McKenney 提交于
If a single-threaded process does a file-descriptor operation, and some other process accesses that same file descriptor via /proc, the current rcu_dereference_check_fdtable() can give a false-positive RCU-lockdep splat due to the reference count being increased by the /proc access after the reference-count check in fget_light() but before the check in rcu_dereference_check_fdtable(). This commit prevents this false positive by checking for a single-threaded process. To avoid #include hell, this commit uses the wrapper for thread_group_empty(current) defined by rcu_my_thread_group_empty() provided in a separate commit. Located-by: NMiles Lane <miles.lane@gmail.com> Located-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Sam Ravnborg 提交于
We define a number of symbols in the linker scipt like this: __start_syscalls_metadata = .; *(__syscalls_metadata) But we do not know the alignment of "." when we assign the __start_syscalls_metadata symbol. gcc started to uses bigger alignment for structs (32 bytes), so we saw situations where the linker due to alignment constraints increased the value of "." after the symbol assignment. This resulted in boot fails. Fix this by forcing a 32 byte alignment of "." before the assignment. This patch introduces the forced alignment for ftrace_events and syscalls_metadata. It may be required in more places. Reported-by: NZeev Tarantov <zeev.tarantov@gmail.com> Signed-off-by: NSam Ravnborg <sam@ravnborg.org> LKML-Reference: <20100710063459.GA14596@merkur.ravnborg.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 20 7月, 2010 1 次提交
-
-
由 Dan Carpenter 提交于
If the kzalloc() fails we should return NULL. All the places that call alloc_apertures() check for this already. Signed-off-by: NDan Carpenter <error27@gmail.com> Acked-by: NJames Simmons <jsimmons@infradead.org> Acked-by: NMarcin Slusarz <marcin.slusarz@gmail.com> Signed-off-by: NDave Airlie <airlied@redhat.com>
-
- 19 7月, 2010 1 次提交
-
-
由 Dave Chinner 提交于
The current shrinker implementation requires the registered callback to have global state to work from. This makes it difficult to shrink caches that are not global (e.g. per-filesystem caches). Pass the shrinker structure to the callback so that users can embed the shrinker structure in the context the shrinker needs to operate on and get back to it in the callback via container_of(). Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
- 17 7月, 2010 1 次提交
-
-
由 Bjorn Helgaas 提交于
If we fail to assign resources to a PCI BAR, this patch makes us try the original address from BIOS rather than leaving it disabled. Linux tries to make sure all PCI device BARs are inside the upstream PCI host bridge or P2P bridge apertures, reassigning BARs if necessary. Windows does similar reassignment. Before this patch, if we could not move a BAR into an aperture, we left the resource unassigned, i.e., at address zero. Windows leaves such BARs at the original BIOS addresses, and this patch makes Linux do the same. This is a bit ugly because we disable the resource long before we try to reassign it, so we have to keep track of the BIOS BAR address somewhere. For lack of a better place, I put it in the struct pci_dev. I think it would be cleaner to attempt the assignment immediately when the claim fails, so we could easily remember the original address. But we currently claim motherboard resources in the middle, after attempting to claim PCI resources and before assigning new PCI resources, and changing that is a fairly big job. Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16263Reported-by: NAndrew <nitr0@seti.kr.ua> Tested-by: NAndrew <nitr0@seti.kr.ua> Signed-off-by: NBjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
-
- 16 7月, 2010 1 次提交
-
-
由 Jan Kara 提交于
OCFS2 uses t_commit trigger to compute and store checksum of the just committed blocks. When a buffer has b_frozen_data, checksum is computed for it instead of b_data but this can result in an old checksum being written to the filesystem in the following scenario: 1) transaction1 is opened 2) handle1 is opened 3) journal_access(handle1, bh) - This sets jh->b_transaction to transaction1 4) modify(bh) 5) journal_dirty(handle1, bh) 6) handle1 is closed 7) start committing transaction1, opening transaction2 8) handle2 is opened 9) journal_access(handle2, bh) - This copies off b_frozen_data to make it safe for transaction1 to commit. jh->b_next_transaction is set to transaction2. 10) jbd2_journal_write_metadata() checksums b_frozen_data 11) the journal correctly writes b_frozen_data to the disk journal 12) handle2 is closed - There was no dirty call for the bh on handle2, so it is never queued for any more journal operation 13) Checkpointing finally happens, and it just spools the bh via normal buffer writeback. This will write b_data, which was never triggered on and thus contains a wrong (old) checksum. This patch fixes the problem by calling the trigger at the moment data is frozen for journal commit - i.e., either when b_frozen_data is created by do_get_write_access or just before we write a buffer to the log if b_frozen_data does not exist. We also rename the trigger to t_frozen as that better describes when it is called. Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NMark Fasheh <mfasheh@suse.com> Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-