提交 · e0d1095ae3405404d247afb00233ef837d58da83 · openanolis / cloud-kernel

02 8月, 2013 1 次提交

net: rename CONFIG_NET_LL_RX_POLL to CONFIG_NET_RX_BUSY_POLL · e0d1095a

由 Cong Wang 提交于 8月 01, 2013

Eliezer renames several *ll_poll to *busy_poll, but forgets
CONFIG_NET_LL_RX_POLL, so in case of confusion, rename it too.

Cc: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: NCong Wang <amwang@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e0d1095a

11 7月, 2013 2 次提交

net: rename busy poll socket op and globals · 64b0dc51

由 Eliezer Tamir 提交于 7月 10, 2013

Rename LL_SO to BUSY_POLL_SO
Rename sysctl_net_ll_{read,poll} to sysctl_busy_{read,poll}
Fix up users of these variables.
Fix documentation for sysctl.

a patch for the socket.7  man page will follow separately,
because of limitations of my mail setup.
Signed-off-by: NEliezer Tamir <eliezer.tamir@linux.intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

64b0dc51

net: rename include/net/ll_poll.h to include/net/busy_poll.h · 076bb0c8

由 Eliezer Tamir 提交于 7月 10, 2013

Rename the file and correct all the places where it is included.
Signed-off-by: NEliezer Tamir <eliezer.tamir@linux.intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

076bb0c8

09 7月, 2013 1 次提交

net: rename low latency sockets functions to busy poll · cbf55001

由 Eliezer Tamir 提交于 7月 08, 2013

Rename functions in include/net/ll_poll.h to busy wait.
Clarify documentation about expected power use increase.
Rename POLL_LL to POLL_BUSY_LOOP.
Add need_resched() testing to poll/select busy loops.

Note, that in select and poll can_busy_poll is dynamic and is
updated continuously to reflect the existence of supported
sockets with valid queue information.
Signed-off-by: NEliezer Tamir <eliezer.tamir@linux.intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cbf55001

26 6月, 2013 1 次提交

net: poll/select low latency socket support · 2d48d67f

由 Eliezer Tamir 提交于 6月 24, 2013

select/poll busy-poll support.

Split sysctl value into two separate ones, one for read and one for poll.
updated Documentation/sysctl/net.txt

Add a new poll flag POLL_LL. When this flag is set, sock_poll will call
sk_poll_ll if possible. sock_poll sets this flag in its return value
to indicate to select/poll when a socket that can busy poll is found.

When poll/select have nothing to report, call the low-level
sock_poll again until we are out of time or we find something.

Once the system call finds something, it stops setting POLL_LL, so it can
return the result to the user ASAP.
Signed-off-by: NEliezer Tamir <eliezer.tamir@linux.intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2d48d67f

18 6月, 2013 2 次提交

net: add socket option for low latency polling · dafcc438

由 Eliezer Tamir 提交于 6月 14, 2013

adds a socket option for low latency polling.
This allows overriding the global sysctl value with a per-socket one.
Unexport sysctl_net_ll_poll since for now it's not needed in modules.
Signed-off-by: NEliezer Tamir <eliezer.tamir@linux.intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

dafcc438

net: change sysctl_net_ll_poll into an unsigned int · eb6db622

由 Eliezer Tamir 提交于 6月 14, 2013

There is no reason for sysctl_net_ll_poll to be an unsigned long.
Change it into an unsigned int.
Fix the proc handler.
Signed-off-by: NEliezer Tamir <eliezer.tamir@linux.intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

eb6db622

11 6月, 2013 1 次提交

net: add low latency socket poll · 06021292

由 Eliezer Tamir 提交于 6月 10, 2013

Adds an ndo_ll_poll method and the code that supports it.
This method can be used by low latency applications to busy-poll
Ethernet device queues directly from the socket code.
sysctl_net_ll_poll controls how many microseconds to poll.
Default is zero (disabled).
Individual protocol support will be added by subsequent patches.
Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: NJesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: NEliezer Tamir <eliezer.tamir@linux.intel.com>
Acked-by: NEric Dumazet <edumazet@google.com>
Tested-by: NWillem de Bruijn <willemb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

06021292

07 6月, 2013 1 次提交

net: Unbreak compat_sys_{send,recv}msg · a7526eb5

由 Andy Lutomirski 提交于 6月 05, 2013

I broke them in this commit:

    commit 1be374a0
    Author: Andy Lutomirski <luto@amacapital.net>
    Date:   Wed May 22 14:07:44 2013 -0700

        net: Block MSG_CMSG_COMPAT in send(m)msg and recv(m)msg

This patch adds __sys_sendmsg and __sys_sendmsg as common helpers that accept
MSG_CMSG_COMPAT and blocks MSG_CMSG_COMPAT at the syscall entrypoints.  It
also reverts some unnecessary checks in sys_socketcall.

Apparently I was suffering from underscore blindness the first time around.
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
Tested-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a7526eb5

29 5月, 2013 1 次提交

net: Block MSG_CMSG_COMPAT in send(m)msg and recv(m)msg · 1be374a0

由 Andy Lutomirski 提交于 5月 22, 2013

To: linux-kernel@vger.kernel.org
Cc: x86@kernel.org, trinity@vger.kernel.org, Andy Lutomirski <luto@amacapital.net>, netdev@vger.kernel.org, "David S.
	Miller" <davem@davemloft.net>
Subject: [PATCH 5/5] net: Block MSG_CMSG_COMPAT in send(m)msg and recv(m)msg

MSG_CMSG_COMPAT is (AFAIK) not intended to be part of the API --
it's a hack that steals a bit to indicate to other networking code
that a compat entry was used.  So don't allow it from a non-compat
syscall.

This prevents an oops when running this code:

int main()
{
	int s;
	struct sockaddr_in addr;
	struct msghdr *hdr;

	char *highpage = mmap((void*)(TASK_SIZE_MAX - 4096), 4096,
	                      PROT_READ | PROT_WRITE,
	                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
	if (highpage == MAP_FAILED)
		err(1, "mmap");

	s = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
	if (s == -1)
		err(1, "socket");

        addr.sin_family = AF_INET;
        addr.sin_port = htons(1);
        addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	if (connect(s, (struct sockaddr*)&addr, sizeof(addr)) != 0)
		err(1, "connect");

	void *evil = highpage + 4096 - COMPAT_MSGHDR_SIZE;
	printf("Evil address is %p\n", evil);

	if (syscall(__NR_sendmmsg, s, evil, 1, MSG_CMSG_COMPAT) < 0)
		err(1, "sendmmsg");

	return 0;
}

Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1be374a0

23 5月, 2013 1 次提交

netfilter: don't panic on error while walking through the init path · 6d11cfdb

由 Pablo Neira Ayuso 提交于 5月 22, 2013

Don't panic if we hit an error while adding the nf_log or pernet
netfilter support, just bail out.
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
Acked-by: NGao feng <gaofeng@cn.fujitsu.com>

6d11cfdb

30 4月, 2013 1 次提交
- A
  sock_close() couldn't have been called with NULL inode since at least 2.1.early · 0e2bcaae
  由 Al Viro 提交于 4月 14, 2013
```
... if not since 0.99 or so.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  0e2bcaae
20 4月, 2013 1 次提交

net: socket: move ktime2ts to ktime header api · 6e94d1ef

由 Daniel Borkmann 提交于 4月 16, 2013

Currently, ktime2ts is a small helper function that is only used in
net/socket.c. Move this helper into the ktime API as a small inline
function, so that i) it's maintained together with ktime routines,
and ii) also other files can make use of it. The function is named
ktime_to_timespec_cond() and placed into the generic part of ktime,
since we internally make use of ktime_to_timespec(). ktime_to_timespec()
itself does not check the ktime variable for zero, hence, we name
this function ktime_to_timespec_cond() for only a conditional
conversion, and adapt its users to it.
Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6e94d1ef

15 4月, 2013 1 次提交

net: sock: make sock_tx_timestamp void · bf84a010

由 Daniel Borkmann 提交于 4月 14, 2013

Currently, sock_tx_timestamp() always returns 0. The comment that
describes the sock_tx_timestamp() function wrongly says that it
returns an error when an invalid argument is passed (from commit
20d49473, ``net: socket infrastructure for SO_TIMESTAMPING'').
Make the function void, so that we can also remove all the unneeded
if conditions that check for such a _non-existant_ error case in the
output path.
Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

bf84a010

11 4月, 2013 1 次提交

kernel: audit: beautify code, for extern function, better to check its parameters by itself · 2950fa9d

由 Chen Gang 提交于 4月 07, 2013

  __audit_socketcall is an extern function.
  better to check its parameters by itself.

    also can return error code, when fail (find invalid parameters).
    also use macro instead of real hard code number
    also give related comments for it.
Signed-off-by: NChen Gang <gang.chen@asianux.com>
[eparis: fix the return value when !CONFIG_AUDIT]
Signed-off-by: NEric Paris <eparis@redhat.com>

2950fa9d

26 2月, 2013 1 次提交
- A
  get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero · 21d20681
  由 Al Viro 提交于 2月 23, 2013
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  21d20681
23 2月, 2013 1 次提交

fs: Preserve error code in get_empty_filp(), part 2 · 39b65252

由 Anatol Pomozov 提交于 9月 12, 2012

Allocating a file structure in function get_empty_filp() might fail because
of several reasons:
 - not enough memory for file structures
 - operation is not allowed
 - user is over its limit

Currently the function returns NULL in all cases and we loose the exact
reason of the error. All callers of get_empty_filp() assume that the function
can fail with ENFILE only.

Return error through pointer. Change all callers to preserve this error code.

[AV: cleaned up a bit, carved the get_empty_filp() part out into a separate commit
(things remaining here deal with alloc_file()), removed pipe(2) behaviour change]
Signed-off-by: NAnatol Pomozov <anatol.pomozov@gmail.com>
Reviewed-by: N"Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

39b65252

12 2月, 2013 1 次提交

ethtool: fix sparse warning · 954b1244

由 Stephen Hemminger 提交于 2月 11, 2013

Fixes sparse complaints about dropping __user in casts.
 warning: cast removes address space of expression
Signed-off-by: NStephen Hemminger <stephen@networkplumber.org>
Acked-by: NBen Hutchings <bhutchings@solarflare.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

954b1244

01 2月, 2013 1 次提交

wanrouter: completely decouple obsolete code from kernel. · a786a7c0

由 Paul Gortmaker 提交于 1月 30, 2013

The original suggestion to delete wanrouter started earlier
with the mainline commit f0d1b3c2
("net/wanrouter: Deprecate and schedule for removal") in May 2012.

More importantly, Dan Carpenter found[1] that the driver had a
fundamental breakage introduced back in 2008, with commit
7be6065b ("netdevice wanrouter: Convert directly reference of
netdev->priv").  So we know with certainty that the code hasn't been
used by anyone willing to at least take the effort to send an e-mail
report of breakage for at least 4 years.

This commit does a decouple of the wanrouter subsystem, by going
after the Makefile/Kconfig and similar files, so that these mainline
files that we are keeping do not have the big wanrouter file/driver
deletion commit tied into their history.

Once this commit is in place, we then can remove the obsolete cyclomx
drivers and similar that have a dependency on CONFIG_WAN_ROUTER_DRIVERS.

[1] http://www.spinics.net/lists/netdev/msg218670.htmlOriginally-by: NJoe Perches <joe@perches.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>

a786a7c0

26 10月, 2012 2 次提交

cgroup: net_cls: Rework update socket logic · 6a328d8c

由 Daniel Wagner 提交于 10月 25, 2012

The cgroup logic part of net_cls is very similar as the one in
net_prio. Let's stream line the net_cls logic with the net_prio one.

The net_prio update logic was changed by following commit (note there
were some changes necessary later on)

commit 406a3c63
Author: John Fastabend <john.r.fastabend@intel.com>
Date:   Fri Jul 20 10:39:25 2012 +0000

    net: netprio_cgroup: rework update socket logic

    Instead of updating the sk_cgrp_prioidx struct field on every send
    this only updates the field when a task is moved via cgroup
    infrastructure.

    This allows sockets that may be used by a kernel worker thread
    to be managed. For example in the iscsi case today a user can
    put iscsid in a netprio cgroup and control traffic will be sent
    with the correct sk_cgrp_prioidx value set but as soon as data
    is sent the kernel worker thread isssues a send and sk_cgrp_prioidx
    is updated with the kernel worker threads value which is the
    default case.

    It seems more correct to only update the field when the user
    explicitly sets it via control group infrastructure. This allows
    the users to manage sockets that may be used with other threads.

Since classid is now updated when the task is moved between the
cgroups, we don't have to call sock_update_classid() from various
places to ensure we always using the latest classid value.

[v2: Use iterate_fd() instead of open coding]
Signed-off-by: NDaniel Wagner <daniel.wagner@bmw-carit.de>
Cc:  Li Zefan <lizefan@huawei.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Joe Perches <joe@perches.com>
Cc: John Fastabend <john.r.fastabend@intel.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Stanislav Kinsbursky <skinsbursky@parallels.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: <netdev@vger.kernel.org>
Cc: <cgroups@vger.kernel.org>
Acked-by: NNeil Horman <nhorman@tuxdriver.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6a328d8c

cgroup: net_cls: Pass in task to sock_update_classid() · fd9a08a7

由 Daniel Wagner 提交于 10月 25, 2012

sock_update_classid() assumes that the update operation always are
applied on the current task. sock_update_classid() needs to know on
which tasks to work on in order to be able to migrate task between
cgroups using the struct cgroup_subsys attach() callback.
Signed-off-by: NDaniel Wagner <daniel.wagner@bmw-carit.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Joe Perches <joe@perches.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Stanislav Kinsbursky <skinsbursky@parallels.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: <netdev@vger.kernel.org>
Cc: <cgroups@vger.kernel.org>
Acked-by: NNeil Horman <nhorman@tuxdriver.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fd9a08a7

28 9月, 2012 1 次提交

net: remove sk_init() helper · e2bcabec

由 Eric Dumazet 提交于 9月 25, 2012

It seems sk_init() has no value today and even does strange things :

# grep . /proc/sys/net/core/?mem_*
/proc/sys/net/core/rmem_default:212992
/proc/sys/net/core/rmem_max:131071
/proc/sys/net/core/wmem_default:212992
/proc/sys/net/core/wmem_max:131071

We can remove it completely.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Reviewed-by: NShan Wei <davidshan@tencent.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e2bcabec

27 9月, 2012 2 次提交

unexport sock_map_fd(), switch to sock_alloc_file() · 56b31d1c

由 Al Viro 提交于 8月 18, 2012

Both modular callers of sock_map_fd() had been buggy; sctp one leaks
descriptor and file if copy_to_user() fails, 9p one shouldn't be
exposing file in the descriptor table at all.

Switch both to sock_alloc_file(), export it, unexport sock_map_fd() and
make it static.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

56b31d1c

A
take descriptor handling from sock_alloc_file() to callers · 28407630
由 Al Viro 提交于 8月 17, 2012
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
28407630

06 9月, 2012 1 次提交

Fix order of arguments to compat_put_time[spec|val] · ed6fe9d6

由 Mikulas Patocka 提交于 9月 01, 2012

Commit 644595f8 ("compat: Handle COMPAT_USE_64BIT_TIME in
net/socket.c") introduced a bug where the helper functions to take
either a 64-bit or compat time[spec|val] got the arguments in the wrong
order, passing the kernel stack pointer off as a user pointer (and vice
versa).

Because of the user address range check, that in turn then causes an
EFAULT due to the user pointer range checking failing for the kernel
address.  Incorrectly resuling in a failed system call for 32-bit
processes with a 64-bit kernel.

On odder architectures like HP-PA (with separate user/kernel address
spaces), it can be used read kernel memory.
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

ed6fe9d6

05 9月, 2012 1 次提交

net: Providing protocol type via system.sockprotoname xattr of /proc/PID/fd entries · 600e1779

由 Masatake YAMATO 提交于 8月 29, 2012

lsof reports some of socket descriptors as "can't identify protocol" like:

    [yamato@localhost]/tmp% sudo lsof | grep dbus | grep iden
    dbus-daem   652          dbus    6u     sock ... 17812 can't identify protocol
    dbus-daem   652          dbus   34u     sock ... 24689 can't identify protocol
    dbus-daem   652          dbus   42u     sock ... 24739 can't identify protocol
    dbus-daem   652          dbus   48u     sock ... 22329 can't identify protocol
    ...

lsof cannot resolve the protocol used in a socket because procfs
doesn't provide the map between inode number on sockfs and protocol
type of the socket.

For improving the situation this patch adds an extended attribute named
'system.sockprotoname' in which the protocol name for
/proc/PID/fd/SOCKET is stored. So lsof can know the protocol for a
given /proc/PID/fd/SOCKET with getxattr system call.

A few weeks ago I submitted a patch for the same purpose. The patch
was introduced /proc/net/sockfs which enumerates inodes and protocols
of all sockets alive on a system. However, it was rejected because (1)
a global lock was needed, and (2) the layout of struct socket was
changed with the patch.

This patch doesn't use any global lock; and doesn't change the layout
of any structs.

In this patch, a protocol name is stored to dentry->d_name of sockfs
when new socket is associated with a file descriptor. Before this
patch dentry->d_name was not used; it was just filled with empty
string. lsof may use an extended attribute named
'system.sockprotoname' to retrieve the value of dentry->d_name.

It is nice if we can see the protocol name with ls -l
/proc/PID/fd. However, "socket:[#INODE]", the name format returned
from sockfs_dname() was already defined. To keep the compatibility
between kernel and user land, the extended attribute is used to
prepare the value of dentry->d_name.
Signed-off-by: NMasatake YAMATO <yamato@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

600e1779

16 8月, 2012 1 次提交

net: fix info leak in compat dev_ifconf() · 43da5f2e

由 Mathias Krause 提交于 8月 15, 2012

The implementation of dev_ifconf() for the compat ioctl interface uses
an intermediate ifc structure allocated in userland for the duration of
the syscall. Though, it fails to initialize the padding bytes inserted
for alignment and that for leaks four bytes of kernel stack. Add an
explicit memset(0) before filling the structure to avoid the info leak.
Signed-off-by: NMathias Krause <minipli@googlemail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

43da5f2e

23 7月, 2012 1 次提交

net: netprio_cgroup: rework update socket logic · 406a3c63

由 John Fastabend 提交于 7月 20, 2012

Instead of updating the sk_cgrp_prioidx struct field on every send
this only updates the field when a task is moved via cgroup
infrastructure.

This allows sockets that may be used by a kernel worker thread
to be managed. For example in the iscsi case today a user can
put iscsid in a netprio cgroup and control traffic will be sent
with the correct sk_cgrp_prioidx value set but as soon as data
is sent the kernel worker thread isssues a send and sk_cgrp_prioidx
is updated with the kernel worker threads value which is the
default case.

It seems more correct to only update the field when the user
explicitly sets it via control group infrastructure. This allows
the users to manage sockets that may be used with other threads.
Signed-off-by: NJohn Fastabend <john.r.fastabend@intel.com>
Acked-by: NNeil Horman <nhorman@tuxdriver.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

406a3c63

21 7月, 2012 1 次提交

tun: fix a crash bug and a memory leak · b09e786b

由 Mikulas Patocka 提交于 7月 19, 2012

This patch fixes a crash
tun_chr_close -> netdev_run_todo -> tun_free_netdev -> sk_release_kernel ->
sock_release -> iput(SOCK_INODE(sock))
introduced by commit 1ab5ecb9

The problem is that this socket is embedded in struct tun_struct, it has
no inode, iput is called on invalid inode, which modifies invalid memory
and optionally causes a crash.

sock_release also decrements sockets_in_use, this causes a bug that
"sockets: used" field in /proc/*/net/sockstat keeps on decreasing when
creating and closing tun devices.

This patch introduces a flag SOCK_EXTERNALLY_ALLOCATED that instructs
sock_release to not free the inode and not decrement sockets_in_use,
fixing both memory corruption and sockets_in_use underflow.

It should be backported to 3.3 an 3.4 stabke.
Signed-off-by: NMikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Cc: stable@kernel.org
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b09e786b

16 5月, 2012 1 次提交

net: Convert net_ratelimit uses to net_<level>_ratelimited · e87cc472

由 Joe Perches 提交于 5月 13, 2012

Standardize the net core ratelimited logging functions.

Coalesce formats, align arguments.
Change a printk then vprintk sequence to use printf extension %pV.
Signed-off-by: NJoe Perches <joe@perches.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e87cc472

15 5月, 2012 1 次提交

net: replace percpu_xxx funcs with this_cpu_xxx or __this_cpu_xxx · 19e8d69c

由 Alex Shi 提交于 5月 14, 2012

percpu_xxx funcs are duplicated with this_cpu_xxx funcs, so replace
them for further code clean up.

And in preempt safe scenario, __this_cpu_xxx funcs may has a bit
better performance since __this_cpu_xxx has no redundant
preempt_enable/preempt_disable on some architectures.
Signed-off-by: NAlex Shi <alex.shi@intel.com>
Acked-by: NEric Dumazet <eric.dumazet@gmail.com>
Acked-by: NDavid S. Miller <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NTejun Heo <tj@kernel.org>

19e8d69c

22 4月, 2012 1 次提交

net: change big iov allocations · a74e9106

由 Eric Dumazet 提交于 4月 20, 2012

iov of more than 8 entries are allocated in sendmsg()/recvmsg() through
sock_kmalloc()

As these allocations are temporary only and small enough, it makes sense
to use plain kmalloc() and avoid sk_omem_alloc atomic overhead.

Slightly changed fast path to be even faster.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Mike Waychison <mikew@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a74e9106

21 4月, 2012 1 次提交

net sysctl: Initialize the network sysctls sooner to avoid problems. · 2ca794e5

由 Eric W. Biederman 提交于 4月 19, 2012

If the netfilter code is modified to use register_net_sysctl_table the
kernel fails to boot because the per net sysctl infrasturce is not setup
soon enough. So to avoid races call net_sysctl_init from sock_init().
Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
Acked-by: NPavel Emelyanov <xemul@parallels.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2ca794e5

16 4月, 2012 1 次提交

net: cleanup unsigned to unsigned int · 95c96174

由 Eric Dumazet 提交于 4月 15, 2012

Use of "unsigned int" is preferred to bare "unsigned" in net tree.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

95c96174

06 4月, 2012 1 次提交

tcp: tcp_sendpages() should call tcp_push() once · 35f9c09f

由 Eric Dumazet 提交于 4月 05, 2012

commit 2f533844 (tcp: allow splice() to build full TSO packets) added
a regression for splice() calls using SPLICE_F_MORE.

We need to call tcp_flush() at the end of the last page processed in
tcp_sendpages(), or else transmits can be deferred and future sends
stall.

Add a new internal flag, MSG_SENDPAGE_NOTLAST, acting like MSG_MORE, but
with different semantic.

For all sendpage() providers, its a transparent change. Only
sock_sendpage() and tcp_sendpages() can differentiate the two different
flags provided by pipe_to_sendpage()
Reported-by: NTom Herbert <therbert@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: H.K. Jerry Chu <hkchu@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Mahesh Bandewar <maheshb@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: NEric Dumazet <eric.dumazet@gmail&gt;com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

35f9c09f

12 3月, 2012 1 次提交

net: get rid of some pointless casts to sockaddr · 43db362d

由 Maciej Żenczykowski 提交于 3月 11, 2012

The following 4 functions:
  move_addr_to_kernel
  move_addr_to_user
  verify_iovec
  verify_compat_iovec
are always effectively called with a sockaddr_storage.

Make this explicit by changing their signature.

This removes a large number of casts from sockaddr_storage to sockaddr.
Signed-off-by: NMaciej Żenczykowski <maze@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

43db362d

21 2月, 2012 1 次提交

compat: Handle COMPAT_USE_64BIT_TIME in net/socket.c · 644595f8

由 H. Peter Anvin 提交于 2月 19, 2012

Use helper functions aware of COMPAT_USE_64BIT_TIME to write struct
timeval and struct timespec to userspace in net/socket.c.
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

644595f8

13 1月, 2012 1 次提交

net: reintroduce missing rcu_assign_pointer() calls · cf778b00

由 Eric Dumazet 提交于 1月 12, 2012

commit a9b3cd7f (rcu: convert uses of rcu_assign_pointer(x, NULL) to
RCU_INIT_POINTER) did a lot of incorrect changes, since it did a
complete conversion of rcu_assign_pointer(x, y) to RCU_INIT_POINTER(x,
y).

We miss needed barriers, even on x86, when y is not NULL.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
CC: Stephen Hemminger <shemminger@vyatta.com>
CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cf778b00

06 1月, 2012 1 次提交

vfs: fix up ENOIOCTLCMD error handling · 07d106d0

由 Linus Torvalds 提交于 1月 05, 2012

We're doing some odd things there, which already messes up various users
(see the net/socket.c code that this removes), and it was going to add
yet more crud to the block layer because of the incorrect error code
translation.

ENOIOCTLCMD is not an error return that should be returned to user mode
from the "ioctl()" system call, but it should *not* be translated as
EINVAL ("Invalid argument").  It should be translated as ENOTTY
("Inappropriate ioctl for device").

That EINVAL confusion has apparently so permeated some code that the
block layer actually checks for it, which is sad.  We continue to do so
for now, but add a big comment about how wrong that is, and we should
remove it entirely eventually.  In the meantime, this tries to keep the
changes localized to just the EINVAL -> ENOTTY fix, and removing code
that makes it harder to do the right thing.
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

07d106d0

05 1月, 2012 1 次提交

ethtool: Allow drivers to select RX NFC rule locations · 55664f32

由 Ben Hutchings 提交于 1月 03, 2012

Define special location values for RX NFC that request the driver to
select the actual rule location.  This allows for implementation on
devices that use hash-based filter lookup, whereas currently the API is
more suited to devices with TCAM lookup or linear search.

In ethtool_set_rxnfc() and the compat wrapper ethtool_ioctl(), copy
the structure back to user-space after insertion so that the actual
location is returned.
Signed-off-by: NBen Hutchings <bhutchings@solarflare.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

55664f32

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功