- 25 7月, 2008 40 次提交
-
-
由 Nathan Fontenot 提交于
Verify memory entitlement updates can be handled by vio. Signed-off-by: NNathan Fontenot <nfont@austin.ibm.com> Signed-off-by: NRobert Jennings <rcj@linux.vnet.ibm.com> Acked-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Robert Jennings 提交于
This is a large patch but the normal code path is not affected. For non-pSeries platforms the code is ifdef'ed out and for non-CMO enabled pSeries systems this does not affect the normal code path. Devices that do not perform DMA operations do not need modification with this patch. The function get_desired_dma was renamed from get_io_entitlement for clarity. Overview Cooperative Memory Overcommitment (CMO) allows for a set of OS partitions to be run with less RAM than the aggregate needs of the group of partitions. The firmware will balance memory between the partitions and page in/out memory as needed. Based on the number and type of IO adpaters preset each partition is allocated an amount of memory for DMA operations and this allocation will be guaranteed to the partition; this is referred to as the partition's 'entitlement'. Partitions running in a CMO environment can only have virtual IO devices present. The VIO bus layer will manage the IO entitlement for the system. Accounting, at a system and per-device level, is tracked in the VIO bus code and exposed via sysfs. A set of dma_ops functions are added to the bus to allow for this accounting. Bus initialization At initialization, the bus will calculate the minimum needs of the system based on providing each device present with a standard minimum entitlement along with a spare allocation for the bus to handle hotplug events. If the minimum needs can not be met the system boot will be halted. Device changes The significant changes for devices while running under CMO are that the devices must specify how much dedicated IO entitlement they desire and must also handle DMA mapping errors that can occur due to constrained IO memory. The virtual IO drivers are modified to silence errors when DMA mappings fail for CMO and handle these failures gracefully. Each devices will be guaranteed a minimum entitlement that can always be mapped. Devices will specify how much entitlement they desire and the VIO bus will attempt to provide for this. Devices can change their desired entitlement level at any point in time to address particular needs (via vio_cmo_set_dev_desired()), not just at device probe time. VIO bus changes The system will have a particular entitlement level available from which it can provide memory to the devices. The bus defines two pools of memory within this entitlement, the reserved and excess pools. Each device is provided with it's own entitlement no less than a system defined minimum entitlement and no greater than what the device has specified as it's desired entitlement. The entitlement provided to devices comes from the reserve pool. The reserve pool can also contain a spare allocation as large as the system defined minimum entitlement which is used for device hotplug events. Any entitlement not needed to fulfill the needs of a reserve pool is placed in the excess pool. Each device is guaranteed that it can map up to it's entitled level; additional mapping are possible as long as there is unmapped memory in the excess pool. Bus probe As the system starts, each device is given an entitlement equal only to the system defined minimum entitlement. The reserve pool is equal to the sum of these entitlements, plus a spare allocation. The VIO bus also tracks the aggregate desired entitlement of all the devices. If the system desired entitlement is greater than the size of the reserve pool, when devices unmap IO memory it will be reserved and a balance operation will be scheduled for some time in the future. Entitlement balancing The balance function tries to fairly distribute entitlement between the devices in the system with the goal of providing each device with it's desired amount of entitlement. Devices using more than what would be ideal will have their entitled set-point adjusted; this will effectively set a goal for lower IO memory usage as future mappings can fail and deallocations will trigger a balance operation to distribute the newly unmapped memory. A fair distribution of entitlement can take several balance operations to achieve. Entitlement changes and device DLPAR events will alter the state of CMO and will trigger balance operations. Hotplug events The VIO bus allows for changes in system entitlement at run-time via 'vio_cmo_entitlement_update()'. When devices are added the hotplug device event will be preceded by a system entitlement increase and this is reversed when devices are removed. The following changes are made that the VIO bus layer for CMO: * add IO memory accounting per device structure. * add IO memory entitlement query function to driver structure. * during vio bus probe, if CMO is enabled, check that driver has memory entitlement query function defined. Fail if function not defined. * fail to register driver if io entitlement function not defined. * create set of dma_ops at vio level for CMO that will track allocations and return DMA failures once entitlement is reached. Entitlement will limited by overall system entitlement. Devices will have a reserved quantity of memory that is guaranteed, the rest can be used as available. * expose entitlement, current allocation, desired allocation, and the allocation error counter for devices to the user through sysfs * provide mechanism for changing a device's desired entitlement at run time for devices as an exported function and sysfs tunable * track any DMA failures for entitled IO memory for each vio device. * check entitlement against available system entitlement on device add * track entitlement metrics (high water mark, current usage) * provide function to reset high water mark * provide minimum and desired entitlement numbers at a bus level * provide drivers with a minimum guaranteed entitlement * balance available entitlement between devices to satisfy their needs * handle system entitlement changes and device hotplug Signed-off-by: NRobert Jennings <rcj@linux.vnet.ibm.com> Acked-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Robert Jennings 提交于
To support Cooperative Memory Overcommitment (CMO), we need to check for failure from some of the tce hcalls. These changes for the pseries platform affect the powerpc architecture; patches for the other affected platforms are included in this patch. pSeries platform IOMMU code changes: * platform TCE functions must handle H_NOT_ENOUGH_RESOURCES errors and return an error. Architecture IOMMU code changes: * Calls to ppc_md.tce_build need to check return values and return DMA_MAPPING_ERROR for transient errors. Architecture changes: * struct machdep_calls for tce_build*_pSeriesLP functions need to change to indicate failure. * all other platforms will need updates to iommu functions to match the new calling semantics; they will return 0 on success. The other platforms default configs have been built, but no further testing was performed. Signed-off-by: NRobert Jennings <rcj@linux.vnet.ibm.com> Acked-by: NOlof Johansson <olof@lixom.net> Acked-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Brian King 提交于
With the addition of Cooperative Memory Overcommitment (CMO) support for IBM Power Systems, two fields have been added to the VPA to report paging statistics. Add support in lparcfg to report them to userspace. Signed-off-by: NBrian King <brking@linux.vnet.ibm.com> Signed-off-by: NRobert Jennings <rcj@linux.vnet.ibm.com> Acked-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Brian King 提交于
Adds a collaborative memory manager, which acts as a simple balloon driver for System p machines that support cooperative memory overcommitment (CMO). Adds a platform configuration option for CMO called PPC_SMLPAR. Signed-off-by: NBrian King <brking@linux.vnet.ibm.com> Signed-off-by: NRobert Jennings <rcj@linux.vnet.ibm.com> Acked-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Brian King 提交于
Newer versions of firmware support page states, which are used by the collaborative memory manager (future patch) to "loan" pages to the hypervisor for use by other partitions. Signed-off-by: NBrian King <brking@linux.vnet.ibm.com> Signed-off-by: NRobert Jennings <rcj@linux.vnet.ibm.com> Acked-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Robert Jennings 提交于
For Cooperative Memory Overcommitment (CMO), set the FW_FEATURE_CMO flag in powerpc_firmware_features from the rtas ibm,get-system-parameters table prior to calling iommu_init_early_pSeries. With this, any CMO specific functionality can be controlled by checking: firmware_has_feature(FW_FEATURE_CMO) Signed-off-by: NRobert Jennings <rcj@linux.vnet.ibm.com> Acked-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Robert Jennings 提交于
Split the retrieval of processor entitlement data returned in the H_GET_PPP hcall into its own helper routine. Signed-off-by: NNathan Fontenot <nfont@austin.ibm.com> Signed-off-by: NRobert Jennings <rcj@linux.vnet.ibm.com> Acked-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Nathan Fontenot 提交于
Update /proc/ppc64/lparcfg to display Cooperative Memory Overcommitment statistics as reported by the H_GET_MPP hcall. This also updates the lparcfg interface to allow setting memory entitlement and weight. Signed-off-by: NNathan Fontenot <nfont@austin.ibm.com> Signed-off-by: NRobert Jennings <rcj@linux.vnet.ibm.com> Acked-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Nathan Fotenot 提交于
Split the retrieval and setting of processor entitlement and weight into helper routines. This also removes the printing of the raw values returned from h_get_ppp, the values are already parsed and printed. Signed-off-by: NNathan Fontenot <nfont@austin.ibm.com> Signed-off-by: NRobert Jennings <rcj@linux.vnet.ibm.com> Acked-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Nathan Fontenot 提交于
Remove the extraneous error reporting used when a hcall made from lparcfg fails. Signed-off-by: NNathan Fontenot <nfont@austin.ibm.com> Signed-off-by: NRobert Jennings <rcj@linux.vnet.ibm.com> Acked-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Segher Boessenkool 提交于
My previous patch to fix compilation with binutils-2.17 causes a "file truncated" build error from ld with binutils 2.15 (and possibly older), and a warning with 2.16 and 2.17. This fixes it. Signed-off-by: NSegher Boessenkool <segher@kernel.crashing.org> Acked-by: NChuck Meade <chuckmeade@mindspring.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Mark Nelson 提交于
At the moment the fixed mapping is by default strongly ordered (the iommu_fixed=weak boot option must be used to make the fixed mapping weakly ordered). If we're on a setup where the southbridge is being used in endpoint mode (triblade and CAB boards) the default should be a weakly ordered fixed mapping. This adds a check so that if a node of type pcie-endpoint can be found in the device tree the fixed mapping is set to be weak by default (but can be overridden using iommu_fixed=strong). Signed-off-by: NMark Nelson <markn@au1.ibm.com> Acked-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Luis Machado 提交于
This patch implements support for HW based watchpoint via the DBSR_DAC (Data Address Compare) facility of the BookE processors. It does so by interfacing with the existing DABR breakpoint code and adding the necessary bits and pieces for the new bits to be properly set or cleared Signed-off-by: NLuis Machado <luisgpm@br.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Stephen Rothwell 提交于
A struct sysdev_attribute * parameter was added to the show routine by commit 4a0b2b4d "sysdev: Pass the attribute to the low level sysdev show/store function". This eliminates a warning: arch/powerpc/kernel/sysfs.c:538: warning: initialization from incompatible pointer type Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Nathan Lynch 提交于
Stash the first platform string matched by identify_cpu() in powerpc_base_platform, and supply that to the ELF loader for the value of AT_BASE_PLATFORM. Signed-off-by: NNathan Lynch <ntl@pobox.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Linus Torvalds 提交于
..otherwise oprofile will fall back on that poor timer interrupt. Also replace the unreadable chain of if-statements with a "switch()" statement instead. It generates better code, and is a lot clearer. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Linus Torvalds 提交于
Suresh Siddha wants to fix a possible FPU leakage in error conditions, but the fact that save/restore_i387() are inlines in a header file makes that harder to do than necessary. So start off with an obvious cleanup. This just moves the x86-64 version of save/restore_i387() out of the header file, and moves it to the only file that it is actually used in: arch/x86/kernel/signal_64.c. So exposing it in a header file was wrong to begin with. [ Side note: I'd like to fix up some of the games we play with the 32-bit version of these functions too, but that's a separate matter. The 32-bit versions are shared - under different names at that! - by both the native x86-32 code and the x86-64 32-bit compatibility code ] Acked-by: NSuresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Linus Torvalds 提交于
Commit 9d25d4db ("x86: BUILD_IRQ say .text to avoid .data.percpu") added a ".text" specifier to make sure that BUILD_IRQ() builds the irq trampoline in the text segment rather than in some random left-over segment that the compiler happened to leave the asm in. However, we should also make sure that we switch back by adding a ".previous" at the end, so that there are no subtle issues with subsequent compiler-generated code. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Brownell 提交于
This fixes kernel http://bugzilla.kernel.org/show_bug.cgi?id=11112 (bogus RTC update IRQs reported) for rtc-cmos, in two ways: - When HPET is stealing the IRQs, use the first IRQ to grab the seconds counter which will be monitored (instead of using whatever was previously in that memory); - In sane IRQ handling modes, scrub out old IRQ status before enabling IRQs. That latter is done by tightening up IRQ handling for rtc-cmos everywhere, also ensuring that when HPET is used it's the only thing triggering IRQ reports to userspace; net object shrink. Also fix a bogus HPET message related to its RTC emulation. Signed-off-by: NDavid Brownell <dbrownell@users.sourceforge.net> Report-by: NW Unruh <unruh@physics.ubc.ca> Cc: Andrew Victor <avictor.za@gmail.com> Cc: <stable@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ulrich Drepper 提交于
Remove the size parameter from the new epoll_create syscall and renames the syscall itself. The updated test program follows. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <stdio.h> #include <time.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_epoll_create2 # ifdef __x86_64__ # define __NR_epoll_create2 291 # elif defined __i386__ # define __NR_epoll_create2 329 # else # error "need __NR_epoll_create2" # endif #endif #define EPOLL_CLOEXEC O_CLOEXEC int main (void) { int fd = syscall (__NR_epoll_create2, 0); if (fd == -1) { puts ("epoll_create2(0) failed"); return 1; } int coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { puts ("epoll_create2(0) set close-on-exec flag"); return 1; } close (fd); fd = syscall (__NR_epoll_create2, EPOLL_CLOEXEC); if (fd == -1) { puts ("epoll_create2(EPOLL_CLOEXEC) failed"); return 1; } coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { puts ("epoll_create2(EPOLL_CLOEXEC) set close-on-exec flag"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: NUlrich Drepper <drepper@redhat.com> Acked-by: NDavide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ulrich Drepper 提交于
This patch introduces the new syscall inotify_init1 (note: the 1 stands for the one parameter the syscall takes, as opposed to no parameter before). The values accepted for this parameter are function-specific and defined in the inotify.h header. Here the values must match the O_* flags, though. In this patch CLOEXEC support is introduced. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <stdio.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_inotify_init1 # ifdef __x86_64__ # define __NR_inotify_init1 294 # elif defined __i386__ # define __NR_inotify_init1 332 # else # error "need __NR_inotify_init1" # endif #endif #define IN_CLOEXEC O_CLOEXEC int main (void) { int fd; fd = syscall (__NR_inotify_init1, 0); if (fd == -1) { puts ("inotify_init1(0) failed"); return 1; } int coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { puts ("inotify_init1(0) set close-on-exit"); return 1; } close (fd); fd = syscall (__NR_inotify_init1, IN_CLOEXEC); if (fd == -1) { puts ("inotify_init1(IN_CLOEXEC) failed"); return 1; } coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { puts ("inotify_init1(O_CLOEXEC) does not set close-on-exit"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ [akpm@linux-foundation.org: add sys_ni stub] Signed-off-by: NUlrich Drepper <drepper@redhat.com> Acked-by: NDavide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ulrich Drepper 提交于
This patch introduces the new syscall pipe2 which is like pipe but it also takes an additional parameter which takes a flag value. This patch implements the handling of O_CLOEXEC for the flag. I did not add support for the new syscall for the architectures which have a special sys_pipe implementation. I think the maintainers of those archs have the chance to go with the unified implementation but that's up to them. The implementation introduces do_pipe_flags. I did that instead of changing all callers of do_pipe because some of the callers are written in assembler. I would probably screw up changing the assembly code. To avoid breaking code do_pipe is now a small wrapper around do_pipe_flags. Once all callers are changed over to do_pipe_flags the old do_pipe function can be removed. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <stdio.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_pipe2 # ifdef __x86_64__ # define __NR_pipe2 293 # elif defined __i386__ # define __NR_pipe2 331 # else # error "need __NR_pipe2" # endif #endif int main (void) { int fd[2]; if (syscall (__NR_pipe2, fd, 0) != 0) { puts ("pipe2(0) failed"); return 1; } for (int i = 0; i < 2; ++i) { int coe = fcntl (fd[i], F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { printf ("pipe2(0) set close-on-exit for fd[%d]\n", i); return 1; } } close (fd[0]); close (fd[1]); if (syscall (__NR_pipe2, fd, O_CLOEXEC) != 0) { puts ("pipe2(O_CLOEXEC) failed"); return 1; } for (int i = 0; i < 2; ++i) { int coe = fcntl (fd[i], F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { printf ("pipe2(O_CLOEXEC) does not set close-on-exit for fd[%d]\n", i); return 1; } } close (fd[0]); close (fd[1]); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: NUlrich Drepper <drepper@redhat.com> Acked-by: NDavide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ulrich Drepper 提交于
This patch adds the new dup3 syscall. It extends the old dup2 syscall by one parameter which is meant to hold a flag value. Support for the O_CLOEXEC flag is added in this patch. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <stdio.h> #include <time.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_dup3 # ifdef __x86_64__ # define __NR_dup3 292 # elif defined __i386__ # define __NR_dup3 330 # else # error "need __NR_dup3" # endif #endif int main (void) { int fd = syscall (__NR_dup3, 1, 4, 0); if (fd == -1) { puts ("dup3(0) failed"); return 1; } int coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { puts ("dup3(0) set close-on-exec flag"); return 1; } close (fd); fd = syscall (__NR_dup3, 1, 4, O_CLOEXEC); if (fd == -1) { puts ("dup3(O_CLOEXEC) failed"); return 1; } coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { puts ("dup3(O_CLOEXEC) set close-on-exec flag"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: NUlrich Drepper <drepper@redhat.com> Acked-by: NDavide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ulrich Drepper 提交于
This patch adds the new epoll_create2 syscall. It extends the old epoll_create syscall by one parameter which is meant to hold a flag value. In this patch the only flag support is EPOLL_CLOEXEC which causes the close-on-exec flag for the returned file descriptor to be set. A new name EPOLL_CLOEXEC is introduced which in this implementation must have the same value as O_CLOEXEC. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <stdio.h> #include <time.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_epoll_create2 # ifdef __x86_64__ # define __NR_epoll_create2 291 # elif defined __i386__ # define __NR_epoll_create2 329 # else # error "need __NR_epoll_create2" # endif #endif #define EPOLL_CLOEXEC O_CLOEXEC int main (void) { int fd = syscall (__NR_epoll_create2, 1, 0); if (fd == -1) { puts ("epoll_create2(0) failed"); return 1; } int coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { puts ("epoll_create2(0) set close-on-exec flag"); return 1; } close (fd); fd = syscall (__NR_epoll_create2, 1, EPOLL_CLOEXEC); if (fd == -1) { puts ("epoll_create2(EPOLL_CLOEXEC) failed"); return 1; } coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { puts ("epoll_create2(EPOLL_CLOEXEC) set close-on-exec flag"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: NUlrich Drepper <drepper@redhat.com> Acked-by: NDavide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ulrich Drepper 提交于
This patch adds the new eventfd2 syscall. It extends the old eventfd syscall by one parameter which is meant to hold a flag value. In this patch the only flag support is EFD_CLOEXEC which causes the close-on-exec flag for the returned file descriptor to be set. A new name EFD_CLOEXEC is introduced which in this implementation must have the same value as O_CLOEXEC. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <stdio.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_eventfd2 # ifdef __x86_64__ # define __NR_eventfd2 290 # elif defined __i386__ # define __NR_eventfd2 328 # else # error "need __NR_eventfd2" # endif #endif #define EFD_CLOEXEC O_CLOEXEC int main (void) { int fd = syscall (__NR_eventfd2, 1, 0); if (fd == -1) { puts ("eventfd2(0) failed"); return 1; } int coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { puts ("eventfd2(0) sets close-on-exec flag"); return 1; } close (fd); fd = syscall (__NR_eventfd2, 1, EFD_CLOEXEC); if (fd == -1) { puts ("eventfd2(EFD_CLOEXEC) failed"); return 1; } coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { puts ("eventfd2(EFD_CLOEXEC) does not set close-on-exec flag"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ [akpm@linux-foundation.org: add sys_ni stub] Signed-off-by: NUlrich Drepper <drepper@redhat.com> Acked-by: NDavide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ulrich Drepper 提交于
This patch adds the new signalfd4 syscall. It extends the old signalfd syscall by one parameter which is meant to hold a flag value. In this patch the only flag support is SFD_CLOEXEC which causes the close-on-exec flag for the returned file descriptor to be set. A new name SFD_CLOEXEC is introduced which in this implementation must have the same value as O_CLOEXEC. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <signal.h> #include <stdio.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_signalfd4 # ifdef __x86_64__ # define __NR_signalfd4 289 # elif defined __i386__ # define __NR_signalfd4 327 # else # error "need __NR_signalfd4" # endif #endif #define SFD_CLOEXEC O_CLOEXEC int main (void) { sigset_t ss; sigemptyset (&ss); sigaddset (&ss, SIGUSR1); int fd = syscall (__NR_signalfd4, -1, &ss, 8, 0); if (fd == -1) { puts ("signalfd4(0) failed"); return 1; } int coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { puts ("signalfd4(0) set close-on-exec flag"); return 1; } close (fd); fd = syscall (__NR_signalfd4, -1, &ss, 8, SFD_CLOEXEC); if (fd == -1) { puts ("signalfd4(SFD_CLOEXEC) failed"); return 1; } coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { puts ("signalfd4(SFD_CLOEXEC) does not set close-on-exec flag"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ [akpm@linux-foundation.org: add sys_ni stub] Signed-off-by: NUlrich Drepper <drepper@redhat.com> Acked-by: NDavide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Adrian Bunk 提交于
Trying to compile the v850 port brings many compile errors, one of them exists since at least kernel 2.6.19. There also seems to be noone willing to bring this port back into a usable state. This patch therefore removes the v850 port. If anyone ever decides to revive the v850 port the code will still be available from older kernels, and it wouldn't be impossible for the port to reenter the kernel if it would become actively maintained again. Signed-off-by: NAdrian Bunk <bunk@kernel.org> Acked-by: NGreg Ungerer <gerg@uclinux.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 WANG Cong 提交于
- Make some variables and functions static, since they don't need to be global. - Remove an unused function - arch/um/kernel/time.c::sched_clock(). - Clean the style a bit as complained by checkpatch.pl. Cc: Jeff Dike <jdike@addtoit.com> Signed-off-by: NWANG Cong <wangcong@zeuux.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 WANG Cong 提交于
- Remove arch_validate(), because no one uses it. - Remove useless macro HAVE_ARCH_VALIDATE. - Make the variable 'empty_bad_page' static. Cc: Jeff Dike <jdike@addtoit.com> Signed-off-by: NWANG Cong <wangcong@zeuux.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 WANG Cong 提交于
Make activate_fd() and free_irq_by_irq_and_dev() static. Remove init_aio_irq() since it has no users. Cc: Jeff Dike <jdike@addtoit.com> Signed-off-by: NWANG Cong <wangcong@zeuux.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Akinobu Mita 提交于
Signed-off-by: NAkinobu Mita <akinobu.mita@gmail.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Shaohua Li 提交于
ACPI defines a hardware signature. BIOS calculates the signature according to hardware configure and if hardware changes while hibernated, the signature will change. In that case, S4 resume should fail. Still, there may be systems on which this mechanism does not work correctly, so it is better to provide a workaround for them. For this reason, add a new switch to the acpi_sleep= command line argument allowing one to disable hardware signature checking. [shaohua.li@intel.com: build fix] Signed-off-by: NShaohua Li <shaohua.li@intel.com> Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl> Cc: Andi Kleen <andi@firstfloor.org> Cc: Len Brown <lenb@kernel.org> Acked-by: NPavel Machek <pavel@ucw.cz> Cc: <Valdis.Kletnieks@vt.edu> Cc: Shaohua Li <shaohua.li@intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Adrian Bunk 提交于
Remove the obsolete and no longer used include/linux/pm_legacy.h Reviewed-by: NRobert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: NAdrian Bunk <bunk@kernel.org> Cc: Pavel Machek <pavel@suse.cz> Acked-by: N"Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Adrian Bunk 提交于
The real option is named AGP_ALPHA_CORE. Reviewed-by: NRobert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: NAdrian Bunk <bunk@kernel.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Andrea Righi 提交于
On 32-bit architectures PAGE_ALIGN() truncates 64-bit values to the 32-bit boundary. For example: u64 val = PAGE_ALIGN(size); always returns a value < 4GB even if size is greater than 4GB. The problem resides in PAGE_MASK definition (from include/asm-x86/page.h for example): #define PAGE_SHIFT 12 #define PAGE_SIZE (_AC(1,UL) << PAGE_SHIFT) #define PAGE_MASK (~(PAGE_SIZE-1)) ... #define PAGE_ALIGN(addr) (((addr)+PAGE_SIZE-1)&PAGE_MASK) The "~" is performed on a 32-bit value, so everything in "and" with PAGE_MASK greater than 4GB will be truncated to the 32-bit boundary. Using the ALIGN() macro seems to be the right way, because it uses typeof(addr) for the mask. Also move the PAGE_ALIGN() definitions out of include/asm-*/page.h in include/linux/mm.h. See also lkml discussion: http://lkml.org/lkml/2008/6/11/237 [akpm@linux-foundation.org: fix drivers/media/video/uvc/uvc_queue.c] [akpm@linux-foundation.org: fix v850] [akpm@linux-foundation.org: fix powerpc] [akpm@linux-foundation.org: fix arm] [akpm@linux-foundation.org: fix mips] [akpm@linux-foundation.org: fix drivers/media/video/pvrusb2/pvrusb2-dvb.c] [akpm@linux-foundation.org: fix drivers/mtd/maps/uclinux.c] [akpm@linux-foundation.org: fix powerpc] Signed-off-by: NAndrea Righi <righi.andrea@gmail.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Johannes Weiner 提交于
Almost all users of this field need a PFN instead of a physical address, so replace node_boot_start with node_min_pfn. [Lee.Schermerhorn@hp.com: fix spurious BUG_ON() in mark_bootmem()] Signed-off-by: NJohannes Weiner <hannes@saeureba.de> Cc: <linux-arch@vger.kernel.org> Signed-off-by: NLee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jon Tollefson 提交于
Instead of using the variable mmu_huge_psize to keep track of the huge page size we use an array of MMU_PAGE_* values. For each supported huge page size we need to know the hugepte_shift value and have a pgtable_cache. The hstate or an mmu_huge_psizes index is passed to functions so that they know which huge page size they should use. The hugepage sizes 16M and 64K are setup(if available on the hardware) so that they don't have to be set on the boot cmd line in order to use them. The number of 16G pages have to be specified at boot-time though (e.g. hugepagesz=16G hugepages=5). Signed-off-by: NJon Tollefson <kniht@linux.vnet.ibm.com> Signed-off-by: NNick Piggin <npiggin@suse.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jon Tollefson 提交于
The huge page size is defined for 16G pages. If a hugepagesz of 16G is specified at boot-time then it becomes the huge page size instead of the default 16M. The change in pgtable-64K.h is to the macro pte_iterate_hashed_subpages to make the increment to va (the 1 being shifted) be a long so that it is not shifted to 0. Otherwise it would create an infinite loop when the shift value is for a 16G page (when base page size is 64K). Signed-off-by: NJon Tollefson <kniht@linux.vnet.ibm.com> Signed-off-by: NNick Piggin <npiggin@suse.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jon Tollefson 提交于
The 16G huge pages have to be reserved in the HMC prior to boot. The location of the pages are placed in the device tree. This patch adds code to scan the device tree during very early boot and save these page locations until hugetlbfs is ready for them. Acked-by: NAdam Litke <agl@us.ibm.com> Signed-off-by: NJon Tollefson <kniht@linux.vnet.ibm.com> Signed-off-by: NNick Piggin <npiggin@suse.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-