- 18 1月, 2012 6 次提交
-
-
由 Eric Paris 提交于
The audit system previously expected arches calling to audit_syscall_exit to supply as arguments if the syscall was a success and what the return code was. Audit also provides a helper AUDITSC_RESULT which was supposed to simplify things by converting from negative retcodes to an audit internal magic value stating success or failure. This helper was wrong and could indicate that a valid pointer returned to userspace was a failed syscall. The fix is to fix the layering foolishness. We now pass audit_syscall_exit a struct pt_reg and it in turns calls back into arch code to collect the return value and to determine if the syscall was a success or failure. We also define a generic is_syscall_success() macro which determines success/failure based on if the value is < -MAX_ERRNO. This works for arches like x86 which do not use a separate mechanism to indicate syscall failure. We make both the is_syscall_success() and regs_return_value() static inlines instead of macros. The reason is because the audit function must take a void* for the regs. (uml calls theirs struct uml_pt_regs instead of just struct pt_regs so audit_syscall_exit can't take a struct pt_regs). Since the audit function takes a void* we need to use static inlines to cast it back to the arch correct structure to dereference it. The other major change is that on some arches, like ia64, MIPS and ppc, we change regs_return_value() to give us the negative value on syscall failure. THE only other user of this macro, kretprobe_example.c, won't notice and it makes the value signed consistently for the audit functions across all archs. In arch/sh/kernel/ptrace_64.c I see that we were using regs[9] in the old audit code as the return value. But the ptrace_64.h code defined the macro regs_return_value() as regs[3]. I have no idea which one is correct, but this patch now uses the regs_return_value() function, so it now uses regs[3]. For powerpc we previously used regs->result but now use the regs_return_value() function which uses regs->gprs[3]. regs->gprs[3] is always positive so the regs_return_value(), much like ia64 makes it negative before calling the audit code when appropriate. Signed-off-by: NEric Paris <eparis@redhat.com> Acked-by: H. Peter Anvin <hpa@zytor.com> [for x86 portion] Acked-by: Tony Luck <tony.luck@intel.com> [for ia64] Acked-by: Richard Weinberger <richard@nod.at> [for uml] Acked-by: David S. Miller <davem@davemloft.net> [for sparc] Acked-by: Ralf Baechle <ralf@linux-mips.org> [for mips] Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [for ppc]
-
由 Eric Paris 提交于
The audit system likes to collect information about processes that end abnormally (SIGSEGV) as this may me useful intrusion detection information. This patch adds audit support to collect information when seccomp forces a task to exit because of misbehavior in a similar way. Signed-off-by: NEric Paris <eparis@redhat.com>
-
由 Eric Paris 提交于
The audit system has the ability to filter on the major and minor number of the device containing the inode being operated upon. Lets say that /dev/sda1 has major,minor 8,1 and that we mount /dev/sda1 on /boot. Now lets say we add a watch with a filter on 8,1. If we proceed to open an inode inside /boot, such as /vboot/vmlinuz, we will match the major,minor filter. Lets instead assume that one were to use a tool like debugfs and were to open /dev/sda1 directly and to modify it's contents. We might hope that this would also be logged, but it isn't. The rules will check the major,minor of the device containing /dev/sda1. In other words the rule would match on the major/minor of the tmpfs mounted at /dev. I believe these rules should trigger on either device. The man page is devoid of useful information about the intended semantics. It only seems logical that if you want to know everything that happened on a major,minor that would include things that happened to the device itself... Signed-off-by: NEric Paris <eparis@redhat.com>
-
由 Eric Paris 提交于
userspace audit messages look like so: type=USER msg=audit(1271170549.415:24710): user pid=14722 uid=0 auid=500 ses=1 subj=unconfined_u:unconfined_r:auditctl_t:s0-s0:c0.c1023 msg='' That third field just says 'user'. That's useless and doesn't follow the key=value pair we are trying to enforce. We already know it came from the user based on the record type. Kill that word. Die. Signed-off-by: NEric Paris <eparis@redhat.com>
-
由 Eric Paris 提交于
This patch does 2 things. First it reduces the number of audit_names allocated in every audit context from 20 to 5. 5 should be enough for all 'normal' syscalls (rename being the worst). Some syscalls can still touch more the 5 inodes such as mount. When rpc filesystem is mounted it will create inodes and those can exceed 5. To handle that problem this patch will dynamically allocate audit_names if it needs more than 5. This should decrease the typicall memory usage while still supporting all the possible kernel operations. Signed-off-by: NEric Paris <eparis@redhat.com>
-
由 Eric Paris 提交于
Every other filter that matches part of the inodes list collected by audit will match against any of the inodes on that list. The filetype matching however had a strange way of doing things. It allowed userspace to indicated if it should match on the first of the second name collected by the kernel. Name collection ordering seems like a kernel internal and making userspace rules get that right just seems like a bad idea. As it turns out the userspace audit writers had no idea it was doing this and thus never overloaded the value field. The kernel always checked the first name collected which for the tested rules was always correct. This patch just makes the filetype matching like the major, minor, inode, and LSM rules in that it will match against any of the names collected. It also changes the rule validation to reject the old unused rule types. Noone knew it was there. Noone used it. Why keep around the extra code? Signed-off-by: NEric Paris <eparis@redhat.com>
-
- 12 1月, 2012 12 次提交
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip由 Linus Torvalds 提交于
* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/intel config: Fix the APB_TIMER selection x86/mrst: Add additional debug prints for pb_keys x86/intel config: Revamp configuration to allow for Moorestown and Medfield x86/intel/scu/ipc: Match the changes in the x86 configuration x86/apb: Fix configuration constraints x86: Fix INTEL_MID silly x86/Kconfig: Cyclone-timer depends on x86-summit x86: Reduce clock calibration time during slave cpu startup x86/config: Revamp configuration for MID devices x86/sfi: Kill the IRQ as id hack
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip由 Linus Torvalds 提交于
* 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, reboot: Fix typo in nmi reboot path x86, NMI: Add to_cpumask() to silence compile warning x86, NMI: NMI selftest depends on the local apic x86: Add stack top margin for stack overflow checking x86, NMI: NMI-selftest should handle the UP case properly x86: Fix the 32-bit stackoverflow-debug build x86, NMI: Add knob to disable using NMI IPIs to stop cpus x86, NMI: Add NMI IPI selftest x86, reboot: Use NMI instead of REBOOT_VECTOR to stop cpus x86: Clean up the range of stack overflow checking x86: Panic on detection of stack overflow x86: Check stack overflow in detail
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip由 Linus Torvalds 提交于
* 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, efi: Break up large initrd reads x86, efi: EFI boot stub support efi: Add EFI file I/O data types efi.h: Add boottime->locate_handle search types efi.h: Add graphics protocol guids efi.h: Add allocation types for boottime->allocate_pages() efi.h: Add efi_image_loaded_t efi.h: Add struct definition for boot time services x86: Don't use magic strings for EFI loader signature x86: Add missing bzImage fields to struct setup_header
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip由 Linus Torvalds 提交于
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/numa: Add constraints check for nid parameters mm, x86: Remove debug_pagealloc_enabled x86/mm: Initialize high mem before free_all_bootmem() arch/x86/kernel/e820.c: quiet sparse noise about plain integer as NULL pointer arch/x86/kernel/e820.c: Eliminate bubble sort from sanitize_e820_map() x86: Fix mmap random address range x86, mm: Unify zone_sizes_init() x86, mm: Prepare zone_sizes_init() for unification x86, mm: Use max_low_pfn for ZONE_NORMAL on 64-bit x86, mm: Wrap ZONE_DMA32 with CONFIG_ZONE_DMA32 x86, mm: Use max_pfn instead of highend_pfn x86, mm: Move zone init from paging_init() on 64-bit x86, mm: Use MAX_DMA_PFN for ZONE_DMA on 32-bit
-
git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq由 Linus Torvalds 提交于
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: (23 commits) [CPUFREQ] EXYNOS: Removed useless headers and codes [CPUFREQ] EXYNOS: Make EXYNOS common cpufreq driver [CPUFREQ] powernow-k8: Update copyright, maintainer and documentation information [CPUFREQ] powernow-k8: Fix indexing issue [CPUFREQ] powernow-k8: Avoid Pstate MSR accesses on systems supporting CPB [CPUFREQ] update lpj only if frequency has changed [CPUFREQ] cpufreq:userspace: fix cpu_cur_freq updation [CPUFREQ] Remove wall variable from cpufreq_gov_dbs_init() [CPUFREQ] EXYNOS4210: cpufreq code is changed for stable working [CPUFREQ] EXYNOS4210: Update frequency table for cpu divider [CPUFREQ] EXYNOS4210: Remove code about bus on cpufreq [CPUFREQ] s3c64xx: Use pr_fmt() for consistent log messages cpufreq: OMAP: fixup for omap_device changes, include <linux/module.h> cpufreq: OMAP: fix freq_table leak cpufreq: OMAP: put clk if cpu_init failed cpufreq: OMAP: only supports OPP library cpufreq: OMAP: dont support !freq_table cpufreq: OMAP: deny initialization if no mpudev cpufreq: OMAP: move clk name decision to init cpufreq: OMAP: notify even with bad boot frequency ...
-
git://git.infradead.org/battery-2.6由 Linus Torvalds 提交于
* git://git.infradead.org/battery-2.6: (68 commits) power_supply: Mark da9052 driver as broken power_supply: Drop usage of nowarn variant of sysfs_create_link() s3c_adc_battery: Average over more than one adc sample power_supply: Add DA9052 battery driver isp1704_charger: Fix missing check jz4740-battery: Fix signedness bug power_supply: Assume mains power by default sbs-battery: Fix devicetree match table ARM: rx51: Add bq27200 i2c board info sbs-battery: Change power supply name devicetree-bindings: Propagate bq20z75->sbs rename to dt bindings devicetree-bindings: Add vendor entry for Smart Battery Systems sbs-battery: Rename internals to new name bq20z75: Rename to sbs-battery wm97xx_battery: Use DEFINE_MUTEX() for work_lock max8997_charger: Remove duplicate module.h lp8727_charger: Some minor fixes for the header lp8727_charger: Add header file power_supply: Convert drivers/power/* to use module_platform_driver() power_supply: Add "unknown" in power supply type ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux由 Linus Torvalds 提交于
* 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux: slub: disallow changing cpu_partial from userspace for debug caches slub: add missed accounting slub: Extract get_freelist from __slab_alloc slub: Switch per cpu partial page support off for debugging slub: fix a possible memleak in __slab_alloc() slub: fix slub_max_order Documentation slub: add missed accounting slab: add taint flag outputting to debug paths. slub: add taint flag outputting to debug paths slab: introduce slab_max_order kernel parameter slab: rename slab_break_gfp_order to slab_max_order
-
git://neil.brown.name/md由 Linus Torvalds 提交于
Two bugfixes for md. One is a recently introduced regression that affects an unusual configuration with a guaranteed BUG_ON. Has been tagged for -stable. The other is minor missing functionality. * tag 'md-3.3-fixes' of git://neil.brown.name/md: md/raid1: perform bad-block tests for WriteMostly devices too. md: notify the 'degraded' sysfs attribute on failure.
-
git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci由 Linus Torvalds 提交于
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci: (80 commits) x86/PCI: Expand the x86_msi_ops to have a restore MSIs. PCI: Increase resource array mask bit size in pcim_iomap_regions() PCI: DEVICE_COUNT_RESOURCE should be equal to PCI_NUM_RESOURCES PCI: pci_ids: add device ids for STA2X11 device (aka ConneXT) PNP: work around Dell 1536/1546 BIOS MMCONFIG bug that breaks USB x86/PCI: amd: factor out MMCONFIG discovery PCI: Enable ATS at the device state restore PCI: msi: fix imbalanced refcount of msi irq sysfs objects PCI: kconfig: English typo in pci/pcie/Kconfig PCI/PM/Runtime: make PCI traces quieter PCI: remove pci_create_bus() xtensa/PCI: convert to pci_scan_root_bus() for correct root bus resources x86/PCI: convert to pci_create_root_bus() and pci_scan_root_bus() x86/PCI: use pci_scan_bus() instead of pci_scan_bus_parented() x86/PCI: read Broadcom CNB20LE host bridge info before PCI scan sparc32, leon/PCI: convert to pci_scan_root_bus() for correct root bus resources sparc/PCI: convert to pci_create_root_bus() sh/PCI: convert to pci_scan_root_bus() for correct root bus resources powerpc/PCI: convert to pci_create_root_bus() powerpc/PCI: split PHB part out of pcibios_map_io_space() ... Fix up conflicts in drivers/pci/msi.c and include/linux/pci_regs.h due to the same patches being applied in other branches.
-
由 Ben Hutchings 提交于
frv, h8300, m68k, microblaze, openrisc, score, um and xtensa currently do not register a CPU device. Add the config option GENERIC_CPU_DEVICES which causes a generic CPU device to be registered for each present CPU, and make all these architectures select it. Richard Weinberger <richard@nod.at> covered UML and suggested using per_cpu. Signed-off-by: NBen Hutchings <ben@decadent.org.uk> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ben Hutchings 提交于
cpu_dev_init() is only called from driver_init(), which does not check its return value. Therefore make cpu_dev_init() return void. We must register the CPU subsystem, so panic if this fails. If sched_create_sysfs_power_savings_entries() fails, the damage is contained, so ignore this (as before). Signed-off-by: NBen Hutchings <ben@decadent.org.uk> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Pekka Enberg 提交于
-
- 11 1月, 2012 22 次提交
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6由 Linus Torvalds 提交于
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (54 commits) crypto: gf128mul - remove leftover "(EXPERIMENTAL)" in Kconfig crypto: serpent-sse2 - remove unneeded LRW/XTS #ifdefs crypto: serpent-sse2 - select LRW and XTS crypto: twofish-x86_64-3way - remove unneeded LRW/XTS #ifdefs crypto: twofish-x86_64-3way - select LRW and XTS crypto: xts - remove dependency on EXPERIMENTAL crypto: lrw - remove dependency on EXPERIMENTAL crypto: picoxcell - fix boolean and / or confusion crypto: caam - remove DECO access initialization code crypto: caam - fix polarity of "propagate error" logic crypto: caam - more desc.h cleanups crypto: caam - desc.h - convert spaces to tabs crypto: talitos - convert talitos_error to struct device crypto: talitos - remove NO_IRQ references crypto: talitos - fix bad kfree crypto: convert drivers/crypto/* to use module_platform_driver() char: hw_random: convert drivers/char/hw_random/* to use module_platform_driver() crypto: serpent-sse2 - should select CRYPTO_CRYPTD crypto: serpent - rename serpent.c to serpent_generic.c crypto: serpent - cleanup checkpatch errors and warnings ...
-
git://selinuxproject.org/~jmorris/linux-security由 Linus Torvalds 提交于
* 'for-linus' of git://selinuxproject.org/~jmorris/linux-security: (32 commits) ima: fix invalid memory reference ima: free duplicate measurement memory security: update security_file_mmap() docs selinux: Casting (void *) value returned by kmalloc is useless apparmor: fix module parameter handling Security: tomoyo: add .gitignore file tomoyo: add missing rcu_dereference() apparmor: add missing rcu_dereference() evm: prevent racing during tfm allocation evm: key must be set once during initialization mpi/mpi-mpow: NULL dereference on allocation failure digsig: build dependency fix KEYS: Give key types their own lockdep class for key->sem TPM: fix transmit_cmd error logic TPM: NSC and TIS drivers X86 dependency fix TPM: Export wait_for_stat for other vendor specific drivers TPM: Use vendor specific function for status probe tpm_tis: add delay after aborting command tpm_tis: Check return code from getting timeouts/durations tpm: Introduce function to poll for result of self test ... Fix up trivial conflict in lib/Makefile due to addition of CONFIG_MPI and SIGSIG next to CONFIG_DQL addition.
-
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs由 Linus Torvalds 提交于
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: autofs4: deal with autofs4_write/autofs4_write races autofs4: catatonic_mode vs. notify_daemon race autofs4: autofs4_wait() vs. autofs4_catatonic_mode() race hfsplus: creation of hidden dir on mount can fail block_dev: Suppress bdev_cache_init() kmemleak warninig fix shrink_dcache_parent() livelock coda: switch coda_cnode_make() to sane API as well, clean coda_lookup() coda: deal correctly with allocation failure from coda_cnode_makectl() securityfs: fix object creation races
-
由 Al Viro 提交于
Just serialize the actual writing of packets into pipe on a new mutex, independent from everything else in the locking hierarchy. As soon as something has started feeding a piece of packet into the pipe to daemon, we *want* everything else about to try the same to wait until we are done. Acked-by: NIan Kent <raven@themaw.net> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
we need to hold ->wq_mutex while we are forming the packet to send, lest we have autofs4_catatonic_mode() setting wq->name.name to NULL just as autofs4_notify_daemon() decides to memcpy() from it... We do have check for catatonic mode immediately after that (under ->wq_mutex, as it ought to be) and packet won't be actually sent, but it'll be too late for us if we oops on that memcpy() from NULL... Fix is obvious - just extend the area covered by ->wq_mutex over that switch and check whether it's catatonic *before* doing anything else. Acked-by: NIan Kent <raven@themaw.net> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
We need to recheck ->catatonic after autofs4_wait() got ->wq_mutex for good, or we might end up with wq inserted into queue after autofs4_catatonic_mode() had done its thing. It will stick there forever, since there won't be anything to clear its ->name.name. A bit of a complication: validate_request() drops and regains ->wq_mutex. It actually ends up the most convenient place to stick the check into... Acked-by: NIan Kent <raven@themaw.net> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost由 Linus Torvalds 提交于
lib: use generic pci_iomap on all architectures Many architectures don't want to pull in iomap.c, so they ended up duplicating pci_iomap from that file. That function isn't trivial, and we are going to modify it https://lkml.org/lkml/2011/11/14/183 so the duplication hurts. This reduces the scope of the problem significantly, by moving pci_iomap to a separate file and referencing that from all architectures. * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: alpha: drop pci_iomap/pci_iounmap from pci-noop.c mn10300: switch to GENERIC_PCI_IOMAP mn10300: add missing __iomap markers frv: switch to GENERIC_PCI_IOMAP tile: switch to GENERIC_PCI_IOMAP tile: don't panic on iomap sparc: switch to GENERIC_PCI_IOMAP sh: switch to GENERIC_PCI_IOMAP powerpc: switch to GENERIC_PCI_IOMAP parisc: switch to GENERIC_PCI_IOMAP mips: switch to GENERIC_PCI_IOMAP microblaze: switch to GENERIC_PCI_IOMAP arm: switch to GENERIC_PCI_IOMAP alpha: switch to GENERIC_PCI_IOMAP lib: add GENERIC_PCI_IOMAP lib: move GENERIC_IOMAP to lib/Kconfig Fix up trivial conflicts due to changes nearby in arch/{m68k,score}/Kconfig
-
git://linux-c6x.org/git/projects/linux-c6x-upstreaming由 Linus Torvalds 提交于
* tag 'for-linux-3.3-merge-window' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming: (29 commits) C6X: replace tick_nohz_stop/restart_sched_tick calls C6X: add register_cpu call C6X: deal with memblock API changes C6X: fix timer64 initialization C6X: fix layout of EMIFA registers C6X: MAINTAINERS C6X: DSCR - Device State Configuration Registers C6X: EMIF - External Memory Interface C6X: general SoC support C6X: library code C6X: headers C6X: ptrace support C6X: loadable module support C6X: cache control C6X: clocks C6X: build infrastructure C6X: syscalls C6X: interrupt handling C6X: time management C6X: signal management ...
-
git://git.monstr.eu/linux-2.6-microblaze由 Linus Torvalds 提交于
* 'next' of git://git.monstr.eu/linux-2.6-microblaze: microblaze: Wire-up new system calls microblaze: Remove NO_IRQ from architecture input: xilinx_ps2: Don't use NO_IRQ block: xsysace: Don't use NO_IRQ microblaze: Trivial asm fix microblaze: Fix debug message in module microblaze: Remove eprintk macro microblaze: Send CR before LF for early console microblaze: Change NO_IRQ to 0 microblaze: Use irq_of_parse_and_map for timer microblaze: intc: Change variable name microblaze: Use of_find_compatible_node for timer and intc microblaze: Add __cmpdi2 microblaze: Synchronize __pa __va macros
-
git://github.com/gxt/linux由 Linus Torvalds 提交于
* 'unicore32' of git://github.com/gxt/linux: rtc-puv3: solve section mismatch in rtc-puv3.c rtc-puv3: using module_platform_driver() i2c-puv3: using module_platform_driver() rtc-puv3: irq: remove IRQF_DISABLED unicore32: Remove IRQF_DISABLED unicore32: Use set_current_blocked() unicore32: add ioremap_nocache definition unicore32: delete specified xlate_dev_mem_ptr of: add include asm/setup.h in drivers/of/fdt.c unicore32: standardize /proc/iomem "Kernel code" name
-
git://git.kernel.org/pub/scm/linux/kernel/git/lliubbo/blackfin由 Linus Torvalds 提交于
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lliubbo/blackfin: blackfin: bf561: add adv7183 capture support blackfin: bf537: add capture support blackfin: bf548: add capture support blackfin: time-ts: rm unused func broadcast_timer_setup() blackfin: i2c-lcd: change default clock rate blackfin: mac: dsa: add vlan mask in board file blackfin: bf537: change num_chipselect for spi-sport blackfin: serial: bfin-uart: remove unused field bf54x: get mem size: missing break in switch blackfin: smp: fix msg queue overflow issue blackfin: config: update macro SPI_BFIN in board file blackfin: config: update def config for all boards blackfin: smp: cleanup smp code blackfin: smp: add suspend and wakeup irq flags blackfin: bf533-stamp: add missed patches for new asoc driver blackfin: bf533-stamp: fix ad1836 name
-
git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux由 Linus Torvalds 提交于
* 'writeback-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux: writeback: move MIN_WRITEBACK_PAGES to fs-writeback.c writeback: balanced_rate cannot exceed write bandwidth writeback: do strict bdi dirty_exceeded writeback: avoid tiny dirty poll intervals writeback: max, min and target dirty pause time writeback: dirty ratelimit - think time compensation btrfs: fix dirtied pages accounting on sub-page writes writeback: fix dirtied pages accounting on redirty writeback: fix dirtied pages accounting on sub-page writes writeback: charge leaked page dirties to active tasks writeback: Include all dirty inodes in background writeback
-
由 Linus Torvalds 提交于
Andrew elucidates: - First installmeant of MM. We have a HUGE number of MM patches this time. It's crazy. - MAINTAINERS updates - backlight updates - leds - checkpatch updates - misc ELF stuff - rtc updates - reiserfs - procfs - some misc other bits * akpm: (124 commits) user namespace: make signal.c respect user namespaces workqueue: make alloc_workqueue() take printf fmt and args for name procfs: add hidepid= and gid= mount options procfs: parse mount options procfs: introduce the /proc/<pid>/map_files/ directory procfs: make proc_get_link to use dentry instead of inode signal: add block_sigmask() for adding sigmask to current->blocked sparc: make SA_NOMASK a synonym of SA_NODEFER reiserfs: don't lock root inode searching reiserfs: don't lock journal_init() reiserfs: delay reiserfs lock until journal initialization reiserfs: delete comments referring to the BKL drivers/rtc/interface.c: fix alarm rollover when day or month is out-of-range drivers/rtc/rtc-twl.c: add DT support for RTC inside twl4030/twl6030 drivers/rtc/: remove redundant spi driver bus initialization drivers/rtc/rtc-jz4740.c: make jz4740_rtc_driver static drivers/rtc/rtc-mc13xxx.c: make mc13xxx_rtc_idtable static rtc: convert drivers/rtc/* to use module_platform_driver() drivers/rtc/rtc-wm831x.c: convert to devm_kzalloc() drivers/rtc/rtc-wm831x.c: remove unused period IRQ handler ...
-
由 Serge E. Hallyn 提交于
ipc/mqueue.c: for __SI_MESQ, convert the uid being sent to recipient's user namespace. (new, thanks Oleg) __send_signal: convert current's uid to the recipient's user namespace for any siginfo which is not SI_FROMKERNEL (patch from Oleg, thanks again :) do_notify_parent and do_notify_parent_cldstop: map task's uid to parent's user namespace ptrace_signal maps parent's uid into current's user namespace before including in signal to current. IIUC Oleg has argued that this shouldn't matter as the debugger will play with it, but it seems like not converting the value currently being set is misleading. Changelog: Sep 20: Inspired by Oleg's suggestion, define map_cred_ns() helper to simplify callers and help make clear what we are translating (which uid into which namespace). Passing the target task would make callers even easier to read, but we pass in user_ns because current_user_ns() != task_cred_xxx(current, user_ns). Sep 20: As recommended by Oleg, also put task_pid_vnr() under rcu_read_lock in ptrace_signal(). Sep 23: In send_signal(), detect when (user) signal is coming from an ancestor or unrelated user namespace. Pass that on to __send_signal, which sets si_uid to 0 or overflowuid if needed. Oct 12: Base on Oleg's fixup_uid() patch. On top of that, handle all SI_FROMKERNEL cases at callers, because we can't assume sender is current in those cases. Nov 10: (mhelsley) rename fixup_uid to more meaningful usern_fixup_signal_uid Nov 10: (akpm) make the !CONFIG_USER_NS case clearer Signed-off-by: NSerge Hallyn <serge.hallyn@canonical.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> From: Serge Hallyn <serge.hallyn@canonical.com> Subject: __send_signal: pass q->info, not info, to userns_fixup_signal_uid (v2) Eric Biederman pointed out that passing info is a bug and could lead to a NULL pointer deref to boot. A collection of signal, securebits, filecaps, cap_bounds, and a few other ltp tests passed with this kernel. Changelog: Nov 18: previous patch missed a leading '&' Signed-off-by: NSerge Hallyn <serge.hallyn@canonical.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> From: Dan Carpenter <dan.carpenter@oracle.com> Subject: ipc/mqueue: lock() => unlock() typo There was a double lock typo introduced in b085f4bd6b21 "user namespace: make signal.c respect user namespaces" Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: NSerge Hallyn <serge@hallyn.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Tejun Heo 提交于
alloc_workqueue() currently expects the passed in @name pointer to remain accessible. This is inconvenient and a bit silly given that the whole wq is being dynamically allocated. This patch updates alloc_workqueue() and friends to take printf format string instead of opaque string and matching varargs at the end. The name is allocated together with the wq and formatted. alloc_ordered_workqueue() is converted to a macro to unify varargs handling with alloc_workqueue(), and, while at it, add comment to alloc_workqueue(). None of the current in-kernel users pass in string with '%' as constant name and this change shouldn't cause any problem. [akpm@linux-foundation.org: use __printf] Signed-off-by: NTejun Heo <tj@kernel.org> Suggested-by: NChristoph Hellwig <hch@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Vasiliy Kulikov 提交于
Add support for mount options to restrict access to /proc/PID/ directories. The default backward-compatible "relaxed" behaviour is left untouched. The first mount option is called "hidepid" and its value defines how much info about processes we want to be available for non-owners: hidepid=0 (default) means the old behavior - anybody may read all world-readable /proc/PID/* files. hidepid=1 means users may not access any /proc/<pid>/ directories, but their own. Sensitive files like cmdline, sched*, status are now protected against other users. As permission checking done in proc_pid_permission() and files' permissions are left untouched, programs expecting specific files' modes are not confused. hidepid=2 means hidepid=1 plus all /proc/PID/ will be invisible to other users. It doesn't mean that it hides whether a process exists (it can be learned by other means, e.g. by kill -0 $PID), but it hides process' euid and egid. It compicates intruder's task of gathering info about running processes, whether some daemon runs with elevated privileges, whether another user runs some sensitive program, whether other users run any program at all, etc. gid=XXX defines a group that will be able to gather all processes' info (as in hidepid=0 mode). This group should be used instead of putting nonroot user in sudoers file or something. However, untrusted users (like daemons, etc.) which are not supposed to monitor the tasks in the whole system should not be added to the group. hidepid=1 or higher is designed to restrict access to procfs files, which might reveal some sensitive private information like precise keystrokes timings: http://www.openwall.com/lists/oss-security/2011/11/05/3 hidepid=1/2 doesn't break monitoring userspace tools. ps, top, pgrep, and conky gracefully handle EPERM/ENOENT and behave as if the current user is the only user running processes. pstree shows the process subtree which contains "pstree" process. Note: the patch doesn't deal with setuid/setgid issues of keeping preopened descriptors of procfs files (like https://lkml.org/lkml/2011/2/7/368). We rely on that the leaked information like the scheduling counters of setuid apps doesn't threaten anybody's privacy - only the user started the setuid program may read the counters. Signed-off-by: NVasiliy Kulikov <segoon@openwall.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Greg KH <greg@kroah.com> Cc: Theodore Tso <tytso@MIT.EDU> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: James Morris <jmorris@namei.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Vasiliy Kulikov 提交于
Add support for procfs mount options. Actual mount options are coming in the next patches. Signed-off-by: NVasiliy Kulikov <segoon@openwall.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Greg KH <greg@kroah.com> Cc: Theodore Tso <tytso@MIT.EDU> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: James Morris <jmorris@namei.org> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Pavel Emelyanov 提交于
This one behaves similarly to the /proc/<pid>/fd/ one - it contains symlinks one for each mapping with file, the name of a symlink is "vma->vm_start-vma->vm_end", the target is the file. Opening a symlink results in a file that point exactly to the same inode as them vma's one. For example the ls -l of some arbitrary /proc/<pid>/map_files/ | lr-x------ 1 root root 64 Aug 26 06:40 7f8f80403000-7f8f80404000 -> /lib64/libc-2.5.so | lr-x------ 1 root root 64 Aug 26 06:40 7f8f8061e000-7f8f80620000 -> /lib64/libselinux.so.1 | lr-x------ 1 root root 64 Aug 26 06:40 7f8f80826000-7f8f80827000 -> /lib64/libacl.so.1.1.0 | lr-x------ 1 root root 64 Aug 26 06:40 7f8f80a2f000-7f8f80a30000 -> /lib64/librt-2.5.so | lr-x------ 1 root root 64 Aug 26 06:40 7f8f80a30000-7f8f80a4c000 -> /lib64/ld-2.5.so This *helps* checkpointing process in three ways: 1. When dumping a task mappings we do know exact file that is mapped by particular region. We do this by opening /proc/$pid/map_files/$address symlink the way we do with file descriptors. 2. This also helps in determining which anonymous shared mappings are shared with each other by comparing the inodes of them. 3. When restoring a set of processes in case two of them has a mapping shared, we map the memory by the 1st one and then open its /proc/$pid/map_files/$address file and map it by the 2nd task. Using /proc/$pid/maps for this is quite inconvenient since it brings repeatable re-reading and reparsing for this text file which slows down restore procedure significantly. Also as being pointed in (3) it is a way easier to use top level shared mapping in children as /proc/$pid/map_files/$address when needed. [akpm@linux-foundation.org: coding-style fixes] [gorcunov@openvz.org: make map_files depend on CHECKPOINT_RESTORE] Signed-off-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org> Reviewed-by: NVasiliy Kulikov <segoon@openwall.com> Reviewed-by: N"Kirill A. Shutemov" <kirill@shutemov.name> Cc: Tejun Heo <tj@kernel.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Pavel Machek <pavel@ucw.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Cyrill Gorcunov 提交于
Prepare the ground for the next "map_files" patch which needs a name of a link file to analyse. Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Tejun Heo <tj@kernel.org> Cc: Vasiliy Kulikov <segoon@openwall.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matt Fleming 提交于
Abstract the code sequence for adding a signal handler's sa_mask to current->blocked because the sequence is identical for all architectures. Furthermore, in the past some architectures actually got this code wrong, so introduce a wrapper that all architectures can use. Signed-off-by: NMatt Fleming <matt.fleming@intel.com> Signed-off-by: NOleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Tejun Heo <tj@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matt Fleming 提交于
Unlike other architectures, sparc currently has no SA_NODEFER definition but only the older SA_NOMASK. Since SA_NOMASK is the historical name for SA_NODEFER, add SA_NODEFER and copy what other architectures do by making SA_NOMASK a synonym for SA_NODEFER. Signed-off-by: NMatt Fleming <matt.fleming@intel.com> Acked-by: NOleg Nesterov <oleg@redhat.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Frederic Weisbecker 提交于
Nothing requires that we lock the filesystem until the root inode is provided. Also iget5_locked() triggers a warning because we are holding the filesystem lock while allocating the inode, which result in a lockdep suspicion that we have a lock inversion against the reclaim path: [ 1986.896979] ================================= [ 1986.896990] [ INFO: inconsistent lock state ] [ 1986.896997] 3.1.1-main #8 [ 1986.897001] --------------------------------- [ 1986.897007] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage. [ 1986.897016] kswapd0/16 [HC0[0]:SC0[0]:HE1:SE1] takes: [ 1986.897023] (&REISERFS_SB(s)->lock){+.+.?.}, at: [<c01f8bd4>] reiserfs_write_lock+0x20/0x2a [ 1986.897044] {RECLAIM_FS-ON-W} state was registered at: [ 1986.897050] [<c014a5b9>] mark_held_locks+0xae/0xd0 [ 1986.897060] [<c014aab3>] lockdep_trace_alloc+0x7d/0x91 [ 1986.897068] [<c0190ee0>] kmem_cache_alloc+0x1a/0x93 [ 1986.897078] [<c01e7728>] reiserfs_alloc_inode+0x13/0x3d [ 1986.897088] [<c01a5b06>] alloc_inode+0x14/0x5f [ 1986.897097] [<c01a5cb9>] iget5_locked+0x62/0x13a [ 1986.897106] [<c01e99e0>] reiserfs_fill_super+0x410/0x8b9 [ 1986.897114] [<c01953da>] mount_bdev+0x10b/0x159 [ 1986.897123] [<c01e764d>] get_super_block+0x10/0x12 [ 1986.897131] [<c0195b38>] mount_fs+0x59/0x12d [ 1986.897138] [<c01a80d1>] vfs_kern_mount+0x45/0x7a [ 1986.897147] [<c01a83e3>] do_kern_mount+0x2f/0xb0 [ 1986.897155] [<c01a987a>] do_mount+0x5c2/0x612 [ 1986.897163] [<c01a9a72>] sys_mount+0x61/0x8f [ 1986.897170] [<c044060c>] sysenter_do_call+0x12/0x32 [ 1986.897181] irq event stamp: 7509691 [ 1986.897186] hardirqs last enabled at (7509691): [<c0190f34>] kmem_cache_alloc+0x6e/0x93 [ 1986.897197] hardirqs last disabled at (7509690): [<c0190eea>] kmem_cache_alloc+0x24/0x93 [ 1986.897209] softirqs last enabled at (7508896): [<c01294bd>] __do_softirq+0xee/0xfd [ 1986.897222] softirqs last disabled at (7508859): [<c01030ed>] do_softirq+0x50/0x9d [ 1986.897234] [ 1986.897235] other info that might help us debug this: [ 1986.897242] Possible unsafe locking scenario: [ 1986.897244] [ 1986.897250] CPU0 [ 1986.897254] ---- [ 1986.897257] lock(&REISERFS_SB(s)->lock); [ 1986.897265] <Interrupt> [ 1986.897269] lock(&REISERFS_SB(s)->lock); [ 1986.897276] [ 1986.897277] *** DEADLOCK *** [ 1986.897278] [ 1986.897286] no locks held by kswapd0/16. [ 1986.897291] [ 1986.897292] stack backtrace: [ 1986.897299] Pid: 16, comm: kswapd0 Not tainted 3.1.1-main #8 [ 1986.897306] Call Trace: [ 1986.897314] [<c0439e76>] ? printk+0xf/0x11 [ 1986.897324] [<c01482d1>] print_usage_bug+0x20e/0x21a [ 1986.897332] [<c01479b8>] ? print_irq_inversion_bug+0x172/0x172 [ 1986.897341] [<c014855c>] mark_lock+0x27f/0x483 [ 1986.897349] [<c0148d88>] __lock_acquire+0x628/0x1472 [ 1986.897358] [<c0149fae>] lock_acquire+0x47/0x5e [ 1986.897366] [<c01f8bd4>] ? reiserfs_write_lock+0x20/0x2a [ 1986.897384] [<c01f8bd4>] ? reiserfs_write_lock+0x20/0x2a [ 1986.897397] [<c043b5ef>] mutex_lock_nested+0x35/0x26f [ 1986.897409] [<c01f8bd4>] ? reiserfs_write_lock+0x20/0x2a [ 1986.897421] [<c01f8bd4>] reiserfs_write_lock+0x20/0x2a [ 1986.897433] [<c01e2edd>] map_block_for_writepage+0xc9/0x590 [ 1986.897448] [<c01b1706>] ? create_empty_buffers+0x33/0x8f [ 1986.897461] [<c0121124>] ? get_parent_ip+0xb/0x31 [ 1986.897472] [<c043ef7f>] ? sub_preempt_count+0x81/0x8e [ 1986.897485] [<c043cae0>] ? _raw_spin_unlock+0x27/0x3d [ 1986.897496] [<c0121124>] ? get_parent_ip+0xb/0x31 [ 1986.897508] [<c01e355d>] reiserfs_writepage+0x1b9/0x3e7 [ 1986.897521] [<c0173b40>] ? clear_page_dirty_for_io+0xcb/0xde [ 1986.897533] [<c014a6e3>] ? trace_hardirqs_on_caller+0x108/0x138 [ 1986.897546] [<c014a71e>] ? trace_hardirqs_on+0xb/0xd [ 1986.897559] [<c0177b38>] shrink_page_list+0x34f/0x5e2 [ 1986.897572] [<c01780a7>] shrink_inactive_list+0x172/0x22c [ 1986.897585] [<c0178464>] shrink_zone+0x303/0x3b1 [ 1986.897597] [<c043cae0>] ? _raw_spin_unlock+0x27/0x3d [ 1986.897611] [<c01788c9>] kswapd+0x3b7/0x5f2 The deadlock shouldn't happen since we are doing that allocation in the mount path, the filesystem is not available for any reclaim. Still the warning is annoying. To solve this, acquire the lock later only where we need it, right before calling reiserfs_read_locked_inode() that wants to lock to walk the tree. Reported-by: NKnut Petersen <Knut_Petersen@t-online.de> Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Jeff Mahoney <jeffm@suse.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-