提交 · c71d227fc4133f949dae620ed5e3a250b43f2415 · openanolis / cloud-kernel

30 11月, 2017 1 次提交

make kernel-side POLL... arch-independent · c71d227f

由 Al Viro 提交于 11月 29, 2017

mangle/demangle on the way to/from userland
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

c71d227f

28 11月, 2017 1 次提交
- A
  define __poll_t, annotate constants · 8ced390c
  由 Al Viro 提交于 7月 02, 2017
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  8ced390c
15 11月, 2017 1 次提交

vDSO for sparc · 9a08862a

由 Nagarathnam Muthusamy 提交于 9月 21, 2017

Following patch is based on work done by Nick Alcock on 64-bit vDSO for sparc
in Oracle linux. I have extended it to include support for 32-bit vDSO for sparc
on 64-bit kernel.

vDSO for sparc is based on the X86 implementation. This patch
provides vDSO support for both 64-bit and 32-bit programs on 64-bit kernel.
vDSO will be disabled on 32-bit linux kernel on sparc.

*) vclock_gettime.c contains all the vdso functions. Since data page is mapped
   before the vdso code page, the pointer to data page is got by subracting offset
   from an address in the vdso code page. The return address stored in
   %i7 is used for this purpose.
*) During compilation, both 32-bit and 64-bit vdso images are compiled and are
   converted into raw bytes by vdso2c program to be ready for mapping into the
   process. 32-bit images are compiled only if CONFIG_COMPAT is enabled. vdso2c
   generates two files vdso-image-64.c and vdso-image-32.c which contains the
   respective vDSO image in C structure.
*) During vdso initialization, required number of vdso pages are allocated and
   raw bytes are copied into the pages.
*) During every exec, these pages are mapped into the process through
   arch_setup_additional_pages and the location of mapping is passed on to the
   process through aux vector AT_SYSINFO_EHDR which is used by glibc.
*) A new update_vsyscall routine for sparc is added to keep the data page in
   vdso updated.
*) As vDSO cannot contain dynamically relocatable references, a new version of
   cpu_relax is added for the use of vDSO.

This change also requires a putback to glibc to use vDSO. For testing,
programs planning to try vDSO can be compiled against the generated
vdso(64/32).so in the source.

Testing:

========
[root@localhost ~]# cat vdso_test.c
int main() {
        struct timespec tv_start, tv_end;
        struct timeval tv_tmp;
	int i;
        int count = 1 * 1000 * 10000;
	long long diff;

        clock_gettime(0, &tv_start);
        for (i = 0; i < count; i++)
              gettimeofday(&tv_tmp, NULL);
        clock_gettime(0, &tv_end);
        diff = (long long)(tv_end.tv_sec -
		tv_start.tv_sec)*(1*1000*1000*1000);
        diff += (tv_end.tv_nsec - tv_start.tv_nsec);
	printf("Start sec: %d\n", tv_start.tv_sec);
	printf("End sec  : %d\n", tv_end.tv_sec);
        printf("%d cycles in %lld ns = %f ns/cycle\n", count, diff,
		(double)diff / (double)count);
        return 0;
}

[root@localhost ~]# cc vdso_test.c -o t32_without_fix -m32 -lrt
[root@localhost ~]# ./t32_without_fix
Start sec: 1502396130
End sec  : 1502396140
10000000 cycles in 9565148528 ns = 956.514853 ns/cycle
[root@localhost ~]# cc vdso_test.c -o t32_with_fix -m32 ./vdso32.so.dbg
[root@localhost ~]# ./t32_with_fix
Start sec: 1502396168
End sec  : 1502396169
10000000 cycles in 798141262 ns = 79.814126 ns/cycle
[root@localhost ~]# cc vdso_test.c -o t64_without_fix -m64 -lrt
[root@localhost ~]# ./t64_without_fix
Start sec: 1502396208
End sec  : 1502396218
10000000 cycles in 9846091800 ns = 984.609180 ns/cycle
[root@localhost ~]# cc vdso_test.c -o t64_with_fix -m64 ./vdso64.so.dbg
[root@localhost ~]# ./t64_with_fix
Start sec: 1502396257
End sec  : 1502396257
10000000 cycles in 380984048 ns = 38.098405 ns/cycle

V1 to V2 Changes:
=================
	Added hot patching code to switch the read stick instruction to read
tick instruction based on the hardware.

V2 to V3 Changes:
=================
	Merged latest changes from sparc-next and moved the initialization
of clocksource_tick.archdata.vclock_mode to time_init_early. Disabled
queued spinlock and rwlock configuration when simulating 32-bit config
to compile 32-bit VDSO.

V3 to V4 Changes:
=================
	Hardcoded the page size as 8192 in linker script for both 64-bit and
32-bit binaries. Removed unused variables in vdso2c.h. Added -mv8plus flag to
Makefile to prevent the generation of relocation entries for __lshrdi3 in 32-bit
vdso binary.
Signed-off-by: NNick Alcock <nick.alcock@oracle.com>
Signed-off-by: NNagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com>
Reviewed-by: NShannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9a08862a

02 11月, 2017 1 次提交

License cleanup: add SPDX license identifier to uapi header files with no license · 6f52b16c

由 Greg Kroah-Hartman 提交于 11月 01, 2017

Many user space API headers are missing licensing information, which
makes it hard for compliance tools to determine the correct license.

By default are files without license information under the default
license of the kernel, which is GPLV2. Marking them GPLV2 would exclude
them from being included in non GPLV2 code, which is obviously not
intended. The user space API headers fall under the syscall exception
which is in the kernels COPYING file:

NOTE! This copyright does *not* cover user programs that use kernel
services by normal system calls - this is merely considered normal use
of the kernel, and does *not* fall under the heading of "derived work".

otherwise syscall usage would not be possible.

Update the files which contain no license information with an SPDX
license identifier. The chosen identifier is 'GPL-2.0 WITH
Linux-syscall-note' which is the officially assigned identifier for the
Linux syscall exception. SPDX license identifiers are a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne. See the previous patch in this series for the
methodology of how this patch was researched.
Reviewed-by: NKate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: NPhilippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

6f52b16c

04 8月, 2017 1 次提交

sock: add SOCK_ZEROCOPY sockopt · 76851d12

由 Willem de Bruijn 提交于 8月 03, 2017

The send call ignores unknown flags. Legacy applications may already
unwittingly pass MSG_ZEROCOPY. Continue to ignore this flag unless a
socket opts in to zerocopy.

Introduce socket option SO_ZEROCOPY to enable MSG_ZEROCOPY processing.
Processes can also query this socket option to detect kernel support
for the feature. Older kernels will return ENOPROTOOPT.
Signed-off-by: NWillem de Bruijn <willemb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

76851d12

25 7月, 2017 1 次提交

signal: Remove kernel interal si_code magic · cc731525

由 Eric W. Biederman 提交于 7月 16, 2017

struct siginfo is a union and the kernel since 2.4 has been hiding a union
tag in the high 16bits of si_code using the values:
__SI_KILL
__SI_TIMER
__SI_POLL
__SI_FAULT
__SI_CHLD
__SI_RT
__SI_MESGQ
__SI_SYS

While this looks plausible on the surface, in practice this situation has
not worked well.

- Injected positive signals are not copied to user space properly
  unless they have these magic high bits set.

- Injected positive signals are not reported properly by signalfd
  unless they have these magic high bits set.

- These kernel internal values leaked to userspace via ptrace_peek_siginfo

- It was possible to inject these kernel internal values and cause the
  the kernel to misbehave.

- Kernel developers got confused and expected these kernel internal values
  in userspace in kernel self tests.

- Kernel developers got confused and set si_code to __SI_FAULT which
  is SI_USER in userspace which causes userspace to think an ordinary user
  sent the signal and that it was not kernel generated.

- The values make it impossible to reorganize the code to transform
  siginfo_copy_to_user into a plain copy_to_user.  As si_code must
  be massaged before being passed to userspace.

So remove these kernel internal si codes and make the kernel code simpler
and more maintainable.

To replace these kernel internal magic si_codes introduce the helper
function siginfo_layout, that takes a signal number and an si_code and
computes which union member of siginfo is being used.  Have
siginfo_layout return an enumeration so that gcc will have enough
information to warn if a switch statement does not handle all of union
members.

A couple of architectures have a messed up ABI that defines signal
specific duplications of SI_USER which causes more special cases in
siginfo_layout than I would like.  The good news is only problem
architectures pay the cost.

Update all of the code that used the previous magic __SI_ values to
use the new SIL_ values and to call siginfo_layout to get those
values.  Escept where not all of the cases are handled remove the
defaults in the switch statements so that if a new case is missed in
the future the lack will show up at compile time.

Modify the code that copies siginfo si_code to userspace to just copy
the value and not cast si_code to a short first.  The high bits are no
longer used to hold a magic union member.

Fixup the siginfo header files to stop including the __SI_ values in
their constants and for the headers that were missing it to properly
update the number of si_codes for each signal type.

The fixes to copy_siginfo_from_user32 implementations has the
interesting property that several of them perviously should never have
worked as the __SI_ values they depended up where kernel internal.
With that dependency gone those implementations should work much
better.

The idea of not passing the __SI_ values out to userspace and then
not reinserting them has been tested with criu and criu worked without
changes.

Ref: 2.4.0-test1
Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>

cc731525

20 7月, 2017 1 次提交

signal/sparc: Document a conflict with SI_USER with SIGFPE · cc9f72e4

由 Eric W. Biederman 提交于 7月 16, 2017

Setting si_code to __SI_FAULT results in a userspace seeing
an si_code of 0.  This is the same si_code as SI_USER.  Posix
and common sense requires that SI_USER not be a signal specific
si_code.  As such this use of 0 for the si_code is a pretty
horribly broken ABI.

This was introduced in 2.3.41 so this mess has had a long time for
people to be able to start depending on it.

As this bug has existed for 17 years already I don't know if it is
worth fixing.  It is definitely worth documenting what is going
on so that no one decides to copy this bad decision.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: sparclinux@vger.kernel.org
Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>

cc9f72e4

17 7月, 2017 1 次提交

tty: Fix TIOCGPTPEER ioctl definition · c6325179

由 Gleb Fotengauer-Malinovskiy 提交于 7月 17, 2017

This ioctl does nothing to justify an _IOC_READ or _IOC_WRITE flag
because it doesn't copy anything from/to userspace to access the
argument.

Fixes: 54ebbfb1 ("tty: add TIOCGPTPEER ioctl")
Signed-off-by: NGleb Fotengauer-Malinovskiy <glebfm@altlinux.org>
Acked-by: NAleksa Sarai <asarai@suse.de>
Acked-by: NArnd Bergmann <arnd@arndb.de>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

c6325179

11 7月, 2017 1 次提交

sparc: move generic-y of exported headers to uapi/asm/Kbuild · b6744e04

由 Masahiro Yamada 提交于 7月 10, 2017

Since commit fcc8487d ("uapi: export all headers under uapi
directories"), all (and only) headers under uapi directories are
exported, but asm-generic wrappers are still exceptions.

To complete de-coupling the uapi from kernel headers, move generic-y
of exported headers to uapi/asm/Kbuild.

With this change, "make headers_install" will just need to parse
uapi/asm/Kbuild to build up exported headers.
Signed-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>

b6744e04

21 6月, 2017 1 次提交

net: introduce SO_PEERGROUPS getsockopt · 28b5ba2a

由 David Herrmann 提交于 6月 21, 2017

This adds the new getsockopt(2) option SO_PEERGROUPS on SOL_SOCKET to
retrieve the auxiliary groups of the remote peer. It is designed to
naturally extend SO_PEERCRED. That is, the underlying data is from the
same credentials. Regarding its syntax, it is based on SO_PEERSEC. That
is, if the provided buffer is too small, ERANGE is returned and @optlen
is updated. Otherwise, the information is copied, @optlen is set to the
actual size, and 0 is returned.

While SO_PEERCRED (and thus `struct ucred') already returns the primary
group, it lacks the auxiliary group vector. However, nearly all access
controls (including kernel side VFS and SYSVIPC, but also user-space
polkit, DBus, ...) consider the entire set of groups, rather than just
the primary group. But this is currently not possible with pure
SO_PEERCRED. Instead, user-space has to work around this and query the
system database for the auxiliary groups of a UID retrieved via
SO_PEERCRED.

Unfortunately, there is no race-free way to query the auxiliary groups
of the PID/UID retrieved via SO_PEERCRED. Hence, the current user-space
solution is to use getgrouplist(3p), which itself falls back to NSS and
whatever is configured in nsswitch.conf(3). This effectively checks
which groups we *would* assign to the user if it logged in *now*. On
normal systems it is as easy as reading /etc/group, but with NSS it can
resort to quering network databases (eg., LDAP), using IPC or network
communication.

Long story short: Whenever we want to use auxiliary groups for access
checks on IPC, we need further IPC to talk to the user/group databases,
rather than just relying on SO_PEERCRED and the incoming socket. This
is unfortunate, and might even result in dead-locks if the database
query uses the same IPC as the original request.

So far, those recursions / dead-locks have been avoided by using
primitive IPC for all crucial NSS modules. However, we want to avoid
re-inventing the wheel for each NSS module that might be involved in
user/group queries. Hence, we would preferably make DBus (and other IPC
that supports access-management based on groups) work without resorting
to the user/group database. This new SO_PEERGROUPS ioctl would allow us
to make dbus-daemon work without ever calling into NSS.

Cc: Michal Sekletar <msekleta@redhat.com>
Cc: Simon McVittie <simon.mcvittie@collabora.co.uk>
Reviewed-by: NTom Gundersen <teg@jklm.no>
Signed-off-by: NDavid Herrmann <dh.herrmann@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

28b5ba2a

09 6月, 2017 1 次提交

tty: add TIOCGPTPEER ioctl · 54ebbfb1

由 Aleksa Sarai 提交于 6月 04, 2017

When opening the slave end of a PTY, it is not possible for userspace to
safely ensure that /dev/pts/$num is actually a slave (in cases where the
mount namespace in which devpts was mounted is controlled by an
untrusted process). In addition, there are several unresolvable
race conditions if userspace were to attempt to detect attacks through
stat(2) and other similar methods [in addition it is not clear how
userspace could detect attacks involving FUSE].

Resolve this by providing an interface for userpace to safely open the
"peer" end of a PTY file descriptor by using the dentry cached by
devpts. Since it is not possible to have an open master PTY without
having its slave exposed in /dev/pts this interface is safe. This
interface currently does not provide a way to get the master pty (since
it is not clear whether such an interface is safe or even useful).

Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Valentin Rothberg <vrothberg@suse.com>
Signed-off-by: NAleksa Sarai <asarai@suse.de>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

54ebbfb1

22 5月, 2017 1 次提交

net: Define SCM_TIMESTAMPING_PKTINFO on all architectures. · 1c4f676a

由 David S. Miller 提交于 5月 21, 2017

A definition was only provided for asm-generic/socket.h
using platforms, define it for the others as well
Reported-by: NStephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1c4f676a

10 5月, 2017 1 次提交

uapi: export all headers under uapi directories · fcc8487d

由 Nicolas Dichtel 提交于 3月 27, 2017

Regularly, when a new header is created in include/uapi/, the developer
forgets to add it in the corresponding Kbuild file. This error is usually
detected after the release is out.

In fact, all headers under uapi directories should be exported, thus it's
useless to have an exhaustive list.

After this patch, the following files, which were not exported, are now
exported (with make headers_install_all):
asm-arc/kvm_para.h
asm-arc/ucontext.h
asm-blackfin/shmparam.h
asm-blackfin/ucontext.h
asm-c6x/shmparam.h
asm-c6x/ucontext.h
asm-cris/kvm_para.h
asm-h8300/shmparam.h
asm-h8300/ucontext.h
asm-hexagon/shmparam.h
asm-m32r/kvm_para.h
asm-m68k/kvm_para.h
asm-m68k/shmparam.h
asm-metag/kvm_para.h
asm-metag/shmparam.h
asm-metag/ucontext.h
asm-mips/hwcap.h
asm-mips/reg.h
asm-mips/ucontext.h
asm-nios2/kvm_para.h
asm-nios2/ucontext.h
asm-openrisc/shmparam.h
asm-parisc/kvm_para.h
asm-powerpc/perf_regs.h
asm-sh/kvm_para.h
asm-sh/ucontext.h
asm-tile/shmparam.h
asm-unicore32/shmparam.h
asm-unicore32/ucontext.h
asm-x86/hwcap2.h
asm-xtensa/kvm_para.h
drm/armada_drm.h
drm/etnaviv_drm.h
drm/vgem_drm.h
linux/aspeed-lpc-ctrl.h
linux/auto_dev-ioctl.h
linux/bcache.h
linux/btrfs_tree.h
linux/can/vxcan.h
linux/cifs/cifs_mount.h
linux/coresight-stm.h
linux/cryptouser.h
linux/fsmap.h
linux/genwqe/genwqe_card.h
linux/hash_info.h
linux/kcm.h
linux/kcov.h
linux/kfd_ioctl.h
linux/lightnvm.h
linux/module.h
linux/nbd-netlink.h
linux/nilfs2_api.h
linux/nilfs2_ondisk.h
linux/nsfs.h
linux/pr.h
linux/qrtr.h
linux/rpmsg.h
linux/sched/types.h
linux/sed-opal.h
linux/smc.h
linux/smc_diag.h
linux/stm.h
linux/switchtec_ioctl.h
linux/vfio_ccw.h
linux/wil6210_uapi.h
rdma/bnxt_re-abi.h

Note that I have removed from this list the files which are generated in every
exported directories (like .install or .install.cmd).

Thanks to Julien Floret <julien.floret@6wind.com> for the tip to get all
subdirs with a pure makefile command.

For the record, note that exported files for asm directories are a mix of
files listed by:
 - include/uapi/asm-generic/Kbuild.asm;
 - arch/<arch>/include/uapi/asm/Kbuild;
 - arch/<arch>/include/asm/Kbuild.
Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: NRussell King <rmk+kernel@armlinux.org.uk>
Acked-by: NMark Salter <msalter@redhat.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Signed-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>

fcc8487d

24 4月, 2017 1 次提交

sparc: Update syscall tables. · f6ebf0bb

由 David S. Miller 提交于 4月 23, 2017

Hook up statx.

Ignore pkeys system calls, we don't have protection keeys
on SPARC.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f6ebf0bb

08 4月, 2017 1 次提交

New getsockopt option to get socket cookie · 5daab9db

由 Chenbo Feng 提交于 4月 05, 2017

Introduce a new getsockopt operation to retrieve the socket cookie
for a specific socket based on the socket fd. It returns a unique
non-decreasing cookie for each socket.
Tested: https://android-review.googlesource.com/#/c/358163/Acked-by: NWillem de Bruijn <willemb@google.com>
Signed-off-by: NChenbo Feng <fengc@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5daab9db

25 3月, 2017 1 次提交

net: Introduce SO_INCOMING_NAPI_ID · 6d433902

由 Sridhar Samudrala 提交于 3月 24, 2017

This socket option returns the NAPI ID associated with the queue on which
the last frame is received. This information can be used by the apps to
split the incoming flows among the threads based on the Rx queue on which
they are received.

If the NAPI ID actually represents a sender_cpu then the value is ignored
and 0 is returned.
Signed-off-by: NSridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com>
Acked-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6d433902

23 3月, 2017 1 次提交

sock: introduce SO_MEMINFO getsockopt · a2d133b1

由 Josh Hunt 提交于 3月 20, 2017

Allows reading of SK_MEMINFO_VARS via socket option. This way an
application can get all meminfo related information in single socket
option call instead of multiple calls.

Adds helper function, sk_get_meminfo(), and uses that for both
getsockopt and sock_diag_put_meminfo().

Suggested by Eric Dumazet.
Signed-off-by: NJosh Hunt <johunt@akamai.com>
Reviewed-by: NJason Baron <jbaron@akamai.com>
Acked-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a2d133b1

30 11月, 2016 1 次提交

tcp: SOF_TIMESTAMPING_OPT_STATS option for SO_TIMESTAMPING · 1c885808

由 Francis Yan 提交于 11月 27, 2016

This patch exports the sender chronograph stats via the socket
SO_TIMESTAMPING channel. Currently we can instrument how long a
particular application unit of data was queued in TCP by tracking
SOF_TIMESTAMPING_TX_SOFTWARE and SOF_TIMESTAMPING_TX_SCHED. Having
these sender chronograph stats exported simultaneously along with
these timestamps allow further breaking down the various sender
limitation.  For example, a video server can tell if a particular
chunk of video on a connection takes a long time to deliver because
TCP was experiencing small receive window. It is not possible to
tell before this patch without packet traces.

To prepare these stats, the user needs to set
SOF_TIMESTAMPING_OPT_STATS and SOF_TIMESTAMPING_OPT_TSONLY flags
while requesting other SOF_TIMESTAMPING TX timestamps. When the
timestamps are available in the error queue, the stats are returned
in a separate control message of type SCM_TIMESTAMPING_OPT_STATS,
in a list of TLVs (struct nlattr) of types: TCP_NLA_BUSY_TIME,
TCP_NLA_RWND_LIMITED, TCP_NLA_SNDBUF_LIMITED. Unit is microsecond.
Signed-off-by: NFrancis Yan <francisyyan@gmail.com>
Signed-off-by: NYuchung Cheng <ycheng@google.com>
Signed-off-by: NSoheil Hassas Yeganeh <soheil@google.com>
Acked-by: NNeal Cardwell <ncardwell@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1c885808

30 3月, 2016 1 次提交
- D
  sparc: Write up preadv2/pwritev2 syscalls. · 5ec71293
  由 David S. Miller 提交于 3月 29, 2016
```
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
  5ec71293
21 3月, 2016 1 次提交

sparc: Convert naked unsigned uses to unsigned int · 9ef595d8

由 Joe Perches 提交于 3月 10, 2016

Use the more normal kernel definition/declaration style.

Done via:

$ git ls-files arch/sparc | \
  xargs ./scripts/checkpatch.pl -f --fix-inplace --types=unspecified_int
Signed-off-by: NJoe Perches <joe@perches.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9ef595d8

26 2月, 2016 1 次提交

net: Facility to report route quality of connected sockets · a87cb3e4

由 Tom Herbert 提交于 2月 24, 2016

This patch add the SO_CNX_ADVICE socket option (setsockopt only). The
purpose is to allow an application to give feedback to the kernel about
the quality of the network path for a connected socket. The value
argument indicates the type of quality report. For this initial patch
the only supported advice is a value of 1 which indicates "bad path,
please reroute"-- the action taken by the kernel is to call
dst_negative_advice which will attempt to choose a different ECMP route,
reset the TX hash for flow label and UDP source port in encapsulation,
etc.

This facility should be useful for connected UDP sockets where only the
application can provide any feedback about path quality. It could also
be useful for TCP applications that have additional knowledge about the
path outside of the normal TCP control loop.
Signed-off-by: NTom Herbert <tom@herbertland.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a87cb3e4

22 1月, 2016 1 次提交
- D
  sparc: Hook up copy_file_range syscall. · c10910c3
  由 David S. Miller 提交于 1月 21, 2016
```
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
  c10910c3
05 1月, 2016 1 次提交

soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPF · 538950a1

由 Craig Gallek 提交于 1月 04, 2016

Expose socket options for setting a classic or extended BPF program
for use when selecting sockets in an SO_REUSEPORT group.  These options
can be used on the first socket to belong to a group before bind or
on any socket in the group after bind.

This change includes refactoring of the existing sk_filter code to
allow reuse of the existing BPF filter validation checks.
Signed-off-by: NCraig Gallek <kraig@google.com>
Acked-by: NAlexei Starovoitov <ast@kernel.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

538950a1

01 1月, 2016 2 次提交

D
sparc: Wire up mlock2 system call. · 42d85c52
由 David S. Miller 提交于 12月 31, 2015
```
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
42d85c52

sparc: Add all necessary direct socket system calls. · 8b30ca73

由 David S. Miller 提交于 12月 31, 2015

The GLIBC folks would like to eliminate socketcall support
eventually, and this makes sense regardless so wire them
all up.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8b30ca73

24 12月, 2015 1 次提交

sparc: Hook up userfaultfd system call · 9bcfd78a

由 Mike Kravetz 提交于 11月 20, 2015

After hooking up system call, userfaultfd selftest was successful for
both 32 and 64 bit version of test.
Signed-off-by: NMike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9bcfd78a

10 11月, 2015 1 次提交

sparc/sparc64: allocate sys_membarrier system call number · 9c2d5eeb

由 Mathieu Desnoyers 提交于 11月 09, 2015

Signed-off-by: NMathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: N"David S. Miller" <davem@davemloft.net>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9c2d5eeb

06 11月, 2015 1 次提交

mm: mlock: add mlock flags to enable VM_LOCKONFAULT usage · b0f205c2

由 Eric B Munson 提交于 11月 05, 2015

The previous patch introduced a flag that specified pages in a VMA should
be placed on the unevictable LRU, but they should not be made present when
the area is created.  This patch adds the ability to set this state via
the new mlock system calls.

We add MLOCK_ONFAULT for mlock2 and MCL_ONFAULT for mlockall.
MLOCK_ONFAULT will set the VM_LOCKONFAULT modifier for VM_LOCKED.
MCL_ONFAULT should be used as a modifier to the two other mlockall flags.
When used with MCL_CURRENT, all current mappings will be marked with
VM_LOCKED | VM_LOCKONFAULT.  When used with MCL_FUTURE, the mm->def_flags
will be marked with VM_LOCKED | VM_LOCKONFAULT.  When used with both
MCL_CURRENT and MCL_FUTURE, all current mappings and mm->def_flags will be
marked with VM_LOCKED | VM_LOCKONFAULT.

Prior to this patch, mlockall() will unconditionally clear the
mm->def_flags any time it is called without MCL_FUTURE.  This behavior is
maintained after adding MCL_ONFAULT.  If a call to mlockall(MCL_FUTURE) is
followed by mlockall(MCL_CURRENT), the mm->def_flags will be cleared and
new VMAs will be unlocked.  This remains true with or without MCL_ONFAULT
in either mlockall() invocation.

munlock() will unconditionally clear both vma flags.  munlockall()
unconditionally clears for VMA flags on all VMAs and in the mm->def_flags
field.
Signed-off-by: NEric B Munson <emunson@akamai.com>
Acked-by: NMichal Hocko <mhocko@suse.com>
Acked-by: NVlastimil Babka <vbabka@suse.cz>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b0f205c2

08 8月, 2015 1 次提交

sparc64: Fix incorrect ASI_ST_BLKINIT_MRU_S value · c1b3475d

由 Rob Gardner 提交于 8月 07, 2015

ASI_ST_BLKINIT_MRU_S is incorrectly defined as 0xf2 when it
should be 0xf3 according to section 10.3.1 of the Sparc
Architecture manual.
Signed-off-by: NRob Gardner <rob.gardner@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c1b3475d

07 8月, 2015 1 次提交

treewide: Fix typo compatability -> compatibility · 60acc4eb

由 Laurent Pinchart 提交于 5月 27, 2015

Even though 'compatability' has a dedicated entry in the Wiktionary,
it's listed as 'Mispelling of compatibility'. Fix it.
Signed-off-by: NLaurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: NDavid S. Miller <davem@davemloft.net>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> for the atomic_helper.c
Signed-off-by: NJiri Kosina <jkosina@suse.com>

60acc4eb

14 12月, 2014 1 次提交

sparc: hook up execveat system call · 38351a32

由 David Drysdale 提交于 12月 12, 2014

Signed-off-by: NDavid Drysdale <drysdale@google.com>
Acked-by: NDavid S. Miller <davem@davemloft.net>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

38351a32

06 12月, 2014 1 次提交

net: sock: allow eBPF programs to be attached to sockets · 89aa0758

由 Alexei Starovoitov 提交于 12月 01, 2014

introduce new setsockopt() command:

setsockopt(sock, SOL_SOCKET, SO_ATTACH_BPF, &prog_fd, sizeof(prog_fd))

where prog_fd was received from syscall bpf(BPF_PROG_LOAD, attr, ...)
and attr->prog_type == BPF_PROG_TYPE_SOCKET_FILTER

setsockopt() calls bpf_prog_get() which increments refcnt of the program,
so it doesn't get unloaded while socket is using the program.

The same eBPF program can be attached to multiple sockets.

User task exit automatically closes socket which calls sk_filter_uncharge()
which decrements refcnt of eBPF program
Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

89aa0758

17 11月, 2014 1 次提交

sparc64: Fix constraints on swab helpers. · 5a2b59d3

由 David S. Miller 提交于 11月 16, 2014

We are reading the memory location, so we have to have a memory
constraint in there purely for the sake of showing the data flow
to the compiler.
Reported-by: NMartin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5a2b59d3

12 11月, 2014 1 次提交

net: introduce SO_INCOMING_CPU · 2c8c56e1

由 Eric Dumazet 提交于 11月 11, 2014

Alternative to RPS/RFS is to use hardware support for multiple
queues.

Then split a set of million of sockets into worker threads, each
one using epoll() to manage events on its own socket pool.

Ideally, we want one thread per RX/TX queue/cpu, but we have no way to
know after accept() or connect() on which queue/cpu a socket is managed.

We normally use one cpu per RX queue (IRQ smp_affinity being properly
set), so remembering on socket structure which cpu delivered last packet
is enough to solve the problem.

After accept(), connect(), or even file descriptor passing around
processes, applications can use :

 int cpu;
 socklen_t len = sizeof(cpu);

 getsockopt(fd, SOL_SOCKET, SO_INCOMING_CPU, &cpu, &len);

And use this information to put the socket into the right silo
for optimal performance, as all networking stack should run
on the appropriate cpu, without need to send IPI (RPS/RFS).
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2c8c56e1

29 10月, 2014 1 次提交
- D
  sparc: Hook up bpf system call. · c20ce793
  由 David S. Miller 提交于 10月 28, 2014
```
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
  c20ce793
10 9月, 2014 1 次提交

sparc/uapi: Add definition of TIOC[SG]RS485 · 999156ad

由 Ricardo Ribalda Delgado 提交于 9月 09, 2014

Commit: e676253b (serial/8250: Add
support for RS485 IOCTLs), adds support for RS485 ioctls for 825_core on
all the archs. Unfortunaltely the definition of TIOCSRS485 and
TIOCGRS485 was missing on the ioctls.h file
Signed-off-by: NRicardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Acked-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

999156ad

14 8月, 2014 1 次提交
- D
  sparc: Hook up memfd_create system call. · 10cf15e1
  由 David S. Miller 提交于 8月 13, 2014
```
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
  10cf15e1
07 8月, 2014 1 次提交
- D
  sparc: Hook up seccomp and getrandom system calls. · caa9199b
  由 David S. Miller 提交于 8月 06, 2014
```
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
  caa9199b
22 7月, 2014 1 次提交
- D
  sparc: Hook up renameat2 syscall. · 26053926
  由 David S. Miller 提交于 7月 21, 2014
```
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
  26053926
29 1月, 2014 1 次提交
- D
  sparc: Hook up sched_setattr and sched_getattr syscalls. · a54983ae
  由 David S. Miller 提交于 1月 29, 2014
```
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
  a54983ae

openanolis / cloud-kernel 接近 2 年 前同步成功

openanolis / cloud-kernel
接近 2 年前同步成功