Commits · 738b52bb9845da183b6ff46a8f685b56a63379d1 · openanolis / cloud-kernel

07 Feb, 2014 1 commit

inet: defines IPPROTO_* needed for module alias generation · ee262ad8

Authored by Jan Moskyto Matejka on Feb 06, 2014

Commit cfd280c9 ("net: sync some IP headers with glibc") changed a set of
defines to an enum (with no explanation why), which introduced a bug
in module mip6, where aliases are generated using the IPPROTO_* defines;
mip6 doesn't load if request_module() is called with the aliases from
xfrm_get_type().

Revert this change back to defines to fix the aliases.

modinfo mip6 (before this change)
alias:          xfrm-type-10-IPPROTO_DSTOPTS
alias:          xfrm-type-10-IPPROTO_ROUTING

modinfo mip6 (after this change)
alias:          xfrm-type-10-43
alias:          xfrm-type-10-60
Signed-off-by: Jan Moskyto Matejka <mq@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
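
The alias difference in the inet commit above comes down to which names the C preprocessor can see. The following is a minimal Python model of that behaviour, not kernel code: it assumes only that stringification substitutes names introduced with #define and leaves enum constants untouched. The helper `expand_alias` and the two dictionaries are hypothetical; the numeric values are the ones shown in the modinfo output.

```python
def expand_alias(family, proto_token, macros):
    """Model stringification: a token is substituted only if it is a known macro."""
    family_text = str(macros.get(family, family))
    proto_text = str(macros.get(proto_token, proto_token))
    return f"xfrm-type-{family_text}-{proto_text}"


# IPPROTO_* as defines: the preprocessor can substitute the numbers.
as_defines = {"AF_INET6": 10, "IPPROTO_ROUTING": 43, "IPPROTO_DSTOPTS": 60}
# IPPROTO_* as enum members: the preprocessor never sees those names.
as_enum = {"AF_INET6": 10}

alias_with_defines = expand_alias("AF_INET6", "IPPROTO_ROUTING", as_defines)
alias_with_enum = expand_alias("AF_INET6", "IPPROTO_ROUTING", as_enum)

print(alias_with_defines)
print(alias_with_enum)
```

With the enum, the alias keeps the literal token name, so a lookup by numeric protocol can no longer match it.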

29 Jan, 2014 3 commits

Btrfs: add support for inode properties · 63541927

Authored by Filipe David Borba Manana on Jan 07, 2014

This change adds infrastructure to allow for generic properties for
inodes. Properties are name/value pairs that can be associated with
inodes for different purposes. They are stored as xattrs with the
prefix "btrfs."

Properties can be inherited - this means when a directory inode has
inheritable properties set, these are added to new inodes created
under that directory. Further, subvolumes can also have properties
associated with them, and they can be inherited from their parent
subvolume. Naturally, directory properties have priority over subvolume
properties (in practice a subvolume property is just a regular
property associated with the root inode, objectid 256, of the
subvolume's fs tree).

This change also adds one specific property implementation, named
"compression", whose values can be "lzo" or "zlib" and it's an
inheritable property.
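
The inheritance rule described above can be sketched as a simple merge. This is a simplified model under stated assumptions, not the btrfs implementation: properties are plain dictionaries, the function name `inherited_properties` is hypothetical, and the only rule modelled is that directory properties take priority over subvolume properties.

```python
def inherited_properties(subvolume_props, directory_props):
    """Return the properties a newly created inode would inherit.

    Subvolume properties are the fallback; directory properties win
    whenever both define the same name.
    """
    effective = dict(subvolume_props)
    effective.update(directory_props)
    return effective


subvolume = {"btrfs.compression": "zlib"}
parent_dir = {"btrfs.compression": "lzo"}

new_inode_props = inherited_properties(subvolume, parent_dir)
fallback_props = inherited_properties(subvolume, {})

print(new_inode_props)
print(fallback_props)
```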

The corresponding changes to btrfs-progs were also implemented.
A patch with xfstests for this feature will follow once there's
agreement on this change/feature.

Further, the script at the bottom of this commit message was used to
do some benchmarks to measure any performance penalties of this feature.

Basically the tests correspond to:

Test 1 - create a filesystem and mount it with compress-force=lzo,
then sequentially create N files of 64 KiB each, measure how long it took
to create the files, unmount the filesystem, mount the filesystem and
perform an 'ls -lha' against the test directory holding the N files, and
report the time the command took.

Test 2 - create a filesystem and don't use any compression option when
mounting it - instead set the compression property of the subvolume's
root to 'lzo'. Then create N files of 64 KiB each, and report the time it
took. Then unmount the filesystem, mount it again and perform an 'ls -lha'
like in the former test. This means every single file ends up with a
property (xattr) associated with it.

Test 3 - same as test 2, but uses 4 properties - 3 are duplicates of the
compression property, have no real effect other than adding more work
when inheriting properties and taking more btree leaf space.

Test 4 - same as test 3 but with 10 properties per file.

Results (in seconds, and averages of 5 runs each), for different N
numbers of files follow.

* Without properties (test 1)

                    file creation time        ls -lha time
10 000 files              3.49                   0.76
100 000 files            47.19                   8.37
1 000 000 files         518.51                 107.06

* With 1 property (compression property set to lzo - test 2)

                    file creation time        ls -lha time
10 000 files              3.63                    0.93
100 000 files            48.56                    9.74
1 000 000 files         537.72                  125.11

* With 4 properties (test 3)

                    file creation time        ls -lha time
10 000 files              3.94                    1.20
100 000 files            52.14                   11.48
1 000 000 files         572.70                  142.13

* With 10 properties (test 4)

                    file creation time        ls -lha time
10 000 files              4.61                    1.35
100 000 files            58.86                   13.83
1 000 000 files         656.01                  177.61
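
To make the tables above easier to compare, the relative overhead against the run without properties can be computed directly. The sketch below uses only the 1 000 000 file averages copied from the tables; the helper name `overhead_percent` is introduced here for illustration.

```python
# Averages for 1 000 000 files, in seconds, copied from the tables above.
BASELINE = {"create": 518.51, "ls": 107.06}
WITH_PROPERTIES = {
    1: {"create": 537.72, "ls": 125.11},
    4: {"create": 572.70, "ls": 142.13},
    10: {"create": 656.01, "ls": 177.61},
}


def overhead_percent(measured, baseline):
    """Relative slowdown of `measured` over `baseline`, in percent."""
    return (measured / baseline - 1.0) * 100.0


overheads = {
    count: {
        phase: overhead_percent(times[phase], BASELINE[phase])
        for phase in BASELINE
    }
    for count, times in WITH_PROPERTIES.items()
}

for count, result in overheads.items():
    print(f"{count:>2} properties: create +{result['create']:.1f}%, "
          f"ls +{result['ls']:.1f}%")
```

In this data set the listing time grows proportionally faster than the creation time as properties are added.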

The increased latencies with properties are essentially because of:

*) When creating an inode, we now synchronously write 1 more item
   (an xattr item) for each property inherited from the parent dir
   (or subvolume). This could be done in an asynchronous way such
   as we do for dir index items (delayed-inode.c), which could help
   reduce the file creation latency;

*) With properties, we now have larger fs trees. For this particular
   test each xattr item uses 75 bytes of leaf space in the fs tree.
   This could be less by using a new item for xattr items, instead of
   the current btrfs_dir_item, since we could cut the 'location' and
   'type' fields (saving 18 bytes) and maybe 'transid' too (saving a
   total of 26 bytes per xattr item) from the btrfs_dir_item type.
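
The leaf-space argument above is simple arithmetic, sketched below. The 75, 18 and 26 byte figures come from the text; the function name is hypothetical, and the estimate deliberately ignores leaf packing and any other per-leaf overhead.

```python
XATTR_ITEM_BYTES = 75          # per xattr item in this particular test
SAVING_LOCATION_AND_TYPE = 18  # dropping the 'location' and 'type' fields
SAVING_WITH_TRANSID = 26       # additionally dropping 'transid'


def leaf_bytes(num_files, props_per_file, bytes_per_item=XATTR_ITEM_BYTES):
    """Leaf space consumed by inherited property xattr items."""
    return num_files * props_per_file * bytes_per_item


current = leaf_bytes(1_000_000, 1)
without_location_type = leaf_bytes(
    1_000_000, 1, XATTR_ITEM_BYTES - SAVING_LOCATION_AND_TYPE)
without_transid = leaf_bytes(
    1_000_000, 1, XATTR_ITEM_BYTES - SAVING_WITH_TRANSID)

print(current, without_location_type, without_transid)
```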

Also tried batching the xattr insertions (ignoring proper hash
collision handling, since it didn't exist) when creating files that
inherit properties from their parent inode/subvolume, but the end
results were (surprisingly) essentially the same.

Test script:

$ cat test.pl
  #!/usr/bin/perl -w

  use strict;
  use Time::HiRes qw(time);
  use constant NUM_FILES => 10_000;
  use constant FILE_SIZES => (64 * 1024);
  use constant DEV => '/dev/sdb4';
  use constant MNT_POINT => '/home/fdmanana/btrfs-tests/dev';
  use constant TEST_DIR => (MNT_POINT . '/testdir');

  system("mkfs.btrfs", "-l", "16384", "-f", DEV) == 0 or die "mkfs.btrfs failed!";

  # following line for testing without properties
  #system("mount", "-o", "compress-force=lzo", DEV, MNT_POINT) == 0 or die "mount failed!";

  # following 2 lines for testing with properties
  system("mount", DEV, MNT_POINT) == 0 or die "mount failed!";
  system("btrfs", "prop", "set", MNT_POINT, "compression", "lzo") == 0 or die "set prop failed!";

  system("mkdir", TEST_DIR) == 0 or die "mkdir failed!";
  my ($t1, $t2);

  $t1 = time();
  for (my $i = 1; $i <= NUM_FILES; $i++) {
      my $p = TEST_DIR . '/file_' . $i;
      open(my $f, '>', $p) or die "Error opening file!";
      $f->autoflush(1);
      for (my $j = 0; $j < FILE_SIZES; $j += 4096) {
          print $f ('A' x 4096) or die "Error writing to file!";
      }
      close($f);
  }
  $t2 = time();
  print "Time to create " . NUM_FILES . ": " . ($t2 - $t1) . " seconds.\n";
  system("umount", DEV) == 0 or die "umount failed!";
  system("mount", DEV, MNT_POINT) == 0 or die "mount failed!";

  $t1 = time();
  system("bash -c 'ls -lha " . TEST_DIR . " > /dev/null'") == 0 or die "ls failed!";
  $t2 = time();
  print "Time to ls -lha all files: " . ($t2 - $t1) . " seconds.\n";
  system("umount", DEV) == 0 or die "umount failed!";
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>

btrfs: add ioctl to export size of global metadata reservation · 01e219e8

Authored by Jeff Mahoney on Nov 01, 2013

btrfs filesystem df output will show the size of the metadata space
and how much of it is used, and the user assumes that the difference
is all usable space. Since that's not actually the case due to the
global metadata reservation, we should provide the full picture to the
user.

This patch adds an ioctl that exports the size of the global metadata
reservation so that btrfs filesystem df can report it.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <clm@fb.com>

btrfs: add ioctls to query/change feature bits online · 2eaa055f

Authored by Jeff Mahoney on Nov 15, 2013

There are some feature bits that require no offline setup and can
be enabled online. I've only reviewed extended irefs, but there will
probably be more.

We introduce three new ioctls:
- BTRFS_IOC_GET_SUPPORTED_FEATURES: query the kernel for supported features.
- BTRFS_IOC_GET_FEATURES: query the kernel for enabled features on a per-fs
  basis, as well as for which features can be changed while mounted.
- BTRFS_IOC_SET_FEATURES: change features on a per-fs basis.

We introduce two new masks per feature set (_SAFE_SET and _SAFE_CLEAR) that
allow us to define which features are safe to change at runtime.

The failure modes for BTRFS_IOC_SET_FEATURES are as follows:
- Enabling a completely unsupported feature: warns and returns -ENOTSUPP
- Enabling a feature that can only be done offline: warns and returns -EPERM
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <clm@fb.com>
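
The two failure modes listed for BTRFS_IOC_SET_FEATURES can be modelled with bit masks. This is a rough sketch under assumptions, not the kernel logic: the bit values are made up, only enabling (not clearing) is covered, the function name is hypothetical, and 524 is assumed to be the kernel-internal value of ENOTSUPP, which Python's errno module does not export.

```python
import errno

ENOTSUPP = 524        # assumed kernel-internal errno value
EPERM = errno.EPERM


def check_set_features(requested, supported, safe_set):
    """Return 0 if every requested bit may be enabled online, else -errno."""
    if requested & ~supported:
        return -ENOTSUPP   # completely unsupported feature
    if requested & ~safe_set:
        return -EPERM      # supported, but can only be enabled offline
    return 0


SUPPORTED = 0b0111   # hypothetical set of features this kernel knows
SAFE_SET = 0b0001    # hypothetical subset that is safe to enable at runtime

unsupported_result = check_set_features(0b1000, SUPPORTED, SAFE_SET)
offline_only_result = check_set_features(0b0010, SUPPORTED, SAFE_SET)
online_ok_result = check_set_features(0b0001, SUPPORTED, SAFE_SET)

print(unsupported_result, offline_only_result, online_ok_result)
```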

28 Jan, 2014 1 commit

NVMe: Abort timed out commands · c30341dc

Authored by Keith Busch on Dec 10, 2013

Send nvme abort command to io requests that have timed out on an
initialized device. If the command is not returned after another timeout,
schedule the controller for reset.
Signed-off-by: Keith Busch <keith.busch@intel.com>
[fix endianness issues]
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

24 Jan, 2014 5 commits

rtnetlink: remove IFLA_BOND_SLAVE definition · f55aa836

Authored by Jiri Pirko on Jan 24, 2014

This is in net-next only, for couple of days. Not used anymore, and never
should have been. So just remove it and pretend it was never there.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>

uapi: convert u64 to __u64 in exported headers · 0d9dfc23

Authored by Mike Frysinger on Jan 23, 2014

The u64 type is not defined in any exported kernel headers, so trying to
use it will lead to build failures.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

include/uapi/linux/dn.h: pull in ioctl.h header · c3189245

Authored by Mike Frysinger on Jan 23, 2014

This header uses _IOW/_IOR defines but doesn't include ioctl.h for it.
If you try to use this w/out including ioctl.h yourself, it can fail to
build, so add the explicit include.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

include/uapi/linux/ppp-ioctl.h: pull in ppp_defs.h · e8b67146

Authored by Mike Frysinger on Jan 23, 2014

This header uses enum NPmode but doesn't include ppp_defs.h.  If you try
to use this header w/out including the defs header first, it leads to a
build failure.  So add the explicit include to fix it.

Don't know of any packages directly impacted, but noticed while building
some ppp code by hand.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

rtnetlink: add missing IFLA_BOND_AD_INFO_UNSPEC · 237266f7

Authored by Jiri Pirko on Jan 23, 2014

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>

23 Jan, 2014 3 commits

rtnetlink: provide api for getting and setting slave info · ba7d49b1

Authored by Jiri Pirko on Jan 22, 2014

The recent patch "bonding: add netlink attributes to slave link dev"
(1d3ee88a) introduced yet another device-specific way to access slave
information over rtnetlink. There is one already there for bridge.

This patch introduces a generic way to do this, for both getting and
setting info, by extending link_ops. Later on, this new interface will
be used for bridge ports as well.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>

rtnetlink: put "BOND" into nl attribute names which are related to bonding · df7dbcbb

Authored by Jiri Pirko on Jan 22, 2014

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

af_packet: Add Queue mapping mode to af_packet fanout operation · 2d36097d

Authored by Neil Horman on Jan 22, 2014

This patch adds a queue mapping mode to the fanout operation of af_packet
sockets.  This allows user space af_packet users to better filter on flows
ingressing and egressing via a specific hardware queue, and avoids the potential
packet reordering that can occur when FANOUT_CPU is being used and irq affinity
varies.

Tested successfully by myself.  applies to net-next
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

22 Jan, 2014 2 commits

neighbour.h: fix comment · c04e7da0

Authored by Li Zhong on Jan 22, 2014

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

dm log userspace: allow mark requests to piggyback on flush requests · 5066a4df

Authored by Dongmao Zhang on Jan 15, 2014

In a cluster environment, cluster writes perform poorly because
userspace_flush() has to contact a userspace program (cmirrord) for
clear/mark/flush requests.  Both mark and flush requests require
cmirrord to communicate the message to all the cluster nodes for each
flush call.  This behaviour is really slow.

To address this we now merge mark and flush requests together to reduce
the kernel-userspace-kernel time.  We allow a new directive,
"integrated_flush" that can be used to instruct the kernel log code to
combine flush and mark requests when directed by userspace.  If not
directed by userspace (due to an older version of the userspace code
perhaps), the kernel will function as it did previously - preserving
backwards compatibility.  Additionally, flush requests are performed
lazily when only clear requests exist.
Signed-off-by: Dongmao Zhang <dmzhang@suse.com>
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

21 Jan, 2014 5 commits

uapi: Use __kernel_long_t in struct mq_attr · 63159f5d

Authored by H.J. Lu on Dec 27, 2013

Both x32 and x86-64 use the same struct mq_attr for system calls. But
x32 long is 32-bit. This patch replaces long with __kernel_long_t in
struct mq_attr.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Link: http://lkml.kernel.org/r/1388182464-28428-9-git-send-email-hjl.tools@gmail.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
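
The x32 problem addressed by this commit and the related uapi commits in this list is a layout mismatch: the kernel fills the structure with 64-bit fields while a caller declaring plain long reads 32-bit ones. The sketch below illustrates that with Python's struct module. It is an illustration only: the four field names are the POSIX mq_attr members, any reserved padding is left out, and the format strings stand in for the two C layouts.

```python
import struct

FIELDS = ("mq_flags", "mq_maxmsg", "mq_msgsize", "mq_curmsgs")

# "<l" is a 4-byte integer, "<q" an 8-byte integer in struct's standard sizes.
size_if_long_is_32_bit = struct.calcsize("<" + "l" * len(FIELDS))
size_if_long_is_64_bit = struct.calcsize("<" + "q" * len(FIELDS))

# The kernel side writes four 64-bit values...
packed = struct.pack("<4q", 0, 10, 8192, 0)
# ...and a reader assuming 32-bit long sees the wrong fields.
misread = struct.unpack("<8l", packed)
misread_fields = dict(zip(FIELDS, misread[:4]))

print(size_if_long_is_32_bit, size_if_long_is_64_bit)
print(misread_fields)
```

Declaring the members as __kernel_long_t makes both sides agree on the 64-bit layout.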

uapi: Use __kernel_ulong_t in shmid64_ds/shminfo64/shm_info · f8dcdf01

Authored by H.J. Lu on Dec 27, 2013

Both x32 and x86-64 use the same struct shmid64_ds/shminfo64/shm_info for
system calls. But x32 long is 32-bit. This patch replaces unsigned long
with __kernel_ulong_t in struct shmid64_ds/shminfo64/shm_info.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Link: http://lkml.kernel.org/r/1388182464-28428-8-git-send-email-hjl.tools@gmail.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

uapi: Use __kernel_long_t in struct msgbuf · 443d5670

Authored by H.J. Lu on Dec 27, 2013

x32 msgsnd/msgrcv system calls are the same as x86-64 msgsnd/msgrcv system
calls, which use a 64-bit integer for long in struct msgbuf. But x32 long
is 32-bit. This patch replaces long in struct msgbuf with __kernel_long_t.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Link: http://lkml.kernel.org/r/1388182464-28428-5-git-send-email-hjl.tools@gmail.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

uapi: Use __kernel_long_t/__kernel_ulong_t in <linux/resource.h> · b684bfed

Authored by H.J. Lu on Dec 27, 2013

Both x32 and x86-64 use the same struct rusage and struct rlimit for
system calls.  But x32 long is 32-bit.  This patch changes uapi
<linux/resource.h> to use __kernel_long_t in struct rusage and
__kernel_ulong_t in struct rlimit.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Link: http://lkml.kernel.org/r/1388182464-28428-3-git-send-email-hjl.tools@gmail.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

uapi: Use __kernel_long_t in struct timex · 7fb30128

Authored by H.J. Lu on Dec 27, 2013

x32 adjtimex system call is the same as x86-64 adjtimex system call,
which uses 64-bit integer for long in struct timex. But x32 long is
32 bit. This patch replaces long in struct timex with __kernel_long_t.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Link: http://lkml.kernel.org/r/1388182464-28428-2-git-send-email-hjl.tools@gmail.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

20 Jan, 2014 2 commits

ipv6: add a flag to get the flow label used remotly · 46e5f401

Authored by Florent Fourcot on Jan 17, 2014

This information is already available via the IPV6_FLOWINFO field
of IPV6_2292PKTOPTIONS, followed by filtering to extract the flow label
information. But it is probably more logical and easier for users to add
it here, and to control both sent and received flow label values with the
IPV6_FLOWLABEL_MGR option.
Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: add the IPV6_FL_F_REFLECT flag to IPV6_FL_A_GET · df3687ff

Authored by Florent Fourcot on Jan 17, 2014

With this option, the socket will reply with the flow label value read
on received packets.

The goal is to have a connection with the same flow label in both
directions of the communication.

Changelog of V4:
 * Do not erase the flow label on the listening socket. Use pktopts to
 store the received value
Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

18 Jan, 2014 1 commit

bonding: add netlink attributes to slave link dev · 1d3ee88a

Authored by sfeldma@cumulusnetworks.com on Jan 16, 2014

If link is IFF_SLAVE, extend link dev netlink attributes to include
slave attributes with new IFLA_SLAVE nest.  Add netlink notification
(RTM_NEWLINK) when slave status changes from backup to active, or
vice versa.

Adds new ndo_get_slave op to net_device_ops to fill skb with IFLA_SLAVE
attributes.  Currently only used by bonding driver, but could be
used by other aggregating devices with slaves.
Signed-off-by: Scott Feldman <sfeldma@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

17 Jan, 2014 3 commits

floppy: bail out in open() if drive is not responding to block0 read · 7b7b68bb

Authored by Jiri Kosina on Jan 10, 2014

In case reading of block 0 during open() fails, it is not the right thing
to let open() succeed.

Fix this by introducing FD_OPEN_SHOULD_FAIL_BIT flag, and setting it in
case the bio callback encounters an error while trying to read block 0.

As a bonus, this works around certain broken userspace (blkid), which is
not able to properly handle read()s returning IO errors. Hence be nice to
those, and bail out during open() already; if block 0 is not readable,
read()s are not going to provide any meaningful data anyway.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

add support for Hyper-V reference time counter · e984097b

Authored by Vadim Rozenfeld on Jan 16, 2014

Signed-off: Peter Lieven <pl@kamp.de>
Signed-off: Gleb Natapov
Signed-off: Vadim Rozenfeld <vrozenfe@redhat.com>

After some consideration I decided to submit only Hyper-V reference
counters support this time. I will submit iTSC support as a separate
patch as soon as it is ready.

v1 -> v2
1. mark TSC page dirty as suggested by
    Eric Northup <digitaleric@google.com> and Gleb
2. disable local irq when calling get_kernel_ns,
    as it was done by Peter Lieven <pl@kamp.de>
3. move check for TSC page enable from second patch
    to this one.

v3 -> v4
    Get rid of ref counter offset.

v4 -> v5
    replace __copy_to_user with kvm_write_guest
    when updating iTSC page.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

net_sched: act: pick a different type for act_xt · 6c80563c

Authored by WANG Cong on Jan 15, 2014

In tcf_register_action() we check either ->type or ->kind to see if
there is an existing action registered, but ipt action registers two
actions with same type but different kinds. They should have different
types too.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
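
The registration check described in the net_sched commit above rejects a new action when either its type or its kind is already registered. The sketch below models that rule with a plain list. It is not the tcf_register_action() code: the function name, the kind strings and the numeric type ids are illustrative assumptions.

```python
def register_action(registry, kind, type_id):
    """Register (kind, type_id) unless either value is already in use."""
    for existing_kind, existing_type in registry:
        if existing_type == type_id or existing_kind == kind:
            return False
    registry.append((kind, type_id))
    return True


# Two kinds sharing one type: the second registration is rejected.
shared_type_registry = []
first_shared = register_action(shared_type_registry, "ipt", 6)
second_shared = register_action(shared_type_registry, "xt", 6)

# Distinct types: both registrations succeed.
distinct_type_registry = []
first_distinct = register_action(distinct_type_registry, "ipt", 6)
second_distinct = register_action(distinct_type_registry, "xt", 10)

print(first_shared, second_shared, first_distinct, second_distinct)
```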

16 Jan, 2014 2 commits

sched: Move SCHED_RESET_ON_FORK into attr::sched_flags · 7479f3c9

Authored by Peter Zijlstra on Jan 15, 2014

I noticed the new sched_{set,get}attr() calls didn't properly deal
with the SCHED_RESET_ON_FORK hack.

Instead of propagating the flags in high bits nonsense use the brand
spanking new attr::sched_flags field.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@gmail.com>
Cc: Dario Faggioli <raistlin@linux.it>
Link: http://lkml.kernel.org/r/20140115162242.GJ31570@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
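
The "flags in high bits" convention that the sched commit above moves away from can be sketched as follows. The constants are assumptions that do not appear in the commit text: 0x40000000 for the legacy SCHED_RESET_ON_FORK bit, 0x01 for the corresponding bit in attr::sched_flags, and 1 for SCHED_FIFO. The helper name is hypothetical.

```python
SCHED_RESET_ON_FORK = 0x40000000  # assumed legacy bit OR-ed into the policy
SCHED_FLAG_RESET_ON_FORK = 0x01   # assumed bit in attr::sched_flags
SCHED_FIFO = 1                    # assumed policy number


def split_legacy_policy(raw_policy):
    """Separate a legacy policy word into (policy, sched_flags)."""
    flags = 0
    if raw_policy & SCHED_RESET_ON_FORK:
        flags |= SCHED_FLAG_RESET_ON_FORK
    return raw_policy & ~SCHED_RESET_ON_FORK, flags


policy, sched_flags = split_legacy_policy(SCHED_FIFO | SCHED_RESET_ON_FORK)
plain_policy, plain_flags = split_legacy_policy(SCHED_FIFO)

print(policy, sched_flags, plain_policy, plain_flags)
```

Keeping the flag in its own field means the policy value never has to be masked before it is compared.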

ipv6 addrconf: add IFA_F_NOPREFIXROUTE flag to suppress creation of IP6 routes · 761aac73

Authored by Thomas Haller on Jan 15, 2014

When adding/modifying an IPv6 address, the userspace application needs
a way to suppress adding a prefix route. This is relevant, for example,
together with IFA_F_MANAGETEMPADDR, where userspace creates autoconf
generated addresses, but depending on on-link, no route for the
prefix should be added.
Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

14 Jan, 2014 4 commits

md: Change handling of save_raid_disk and metadata update during recovery. · f466722c

Authored by NeilBrown on Dec 09, 2013

Since commit d70ed2e4
   MD: Allow restarting an interrupted incremental recovery.

we don't write out the metadata to devices while they are recovering.
This had a good reason, but has unfortunate consequences.  This patch
changes things to make them work better.

At issue is what happens if the array is shut down while a recovery is
happening, particularly a bitmap-guided recovery.
Ideally the recovery should pick up where it left off.
However the metadata cannot represent the state "A recovery is in
process which is guided by the bitmap".

Before the above mentioned commit, we wrote metadata to the device
which said "this is being recovered and it is up to <here>".  So after
a restart, a full recovery (not bitmap-guided) would happen from
where-ever it was up to.

After the commit the metadata wasn't updated so it still said "This
device is fully in sync with <this> event count".  That leads to a
bitmap-based recovery following the whole bitmap, which should be a
lot less work than a full recovery from some starting point.  So this
was an improvement.

However, updating some metadata but not all leads to other problems.
In particular, the metadata written to the fully-up-to-date devices
records that the array has all devices present (even though some are
recovering).  So on restart, mdadm wants to find all devices and
expects them to have current event counts.
Obviously they don't (some have old event counts) so (when assembling
with --incremental) it waits indefinitely for the rest of the expected
devices.

It really is wrong to not update all the metadata together.  Doing that
is bound to cause confusion.
Instead, we should make it possible to record the truth in the
metadata.  i.e. we need to be able to record that a device is being
recovered based on the bitmap.
We already have a Feature flag to say that recovery is happening.  We
now add another one to say that it is a bitmap-based recovery.

With this we can remove the code that disables the write-out of
metadata on some devices.

So this patch:
 - moves the setting of 'saved_raid_disk' from add_new_disk to
   the validate_super methods.  This makes sure it is always set
   properly, both when adding a new device to an array, and when
   assembling an array from a collection of devices.
 - Adds a metadata flag MD_FEATURE_RECOVERY_BITMAP which is only
   used if MD_FEATURE_RECOVERY_OFFSET is set, and records that a
   bitmap-based recovery is allowed.
   This is only present in v1.x metadata. v0.90 doesn't support
   devices which are in the middle of recovery at all.
 - Only skips writing metadata to Faulty devices.

 - Also allows rdev state to be set to "-insync" via sysfs.
   This can be used for external-metadata arrays.  When the
   'role' is set the device is assumed to be in-sync.  If, after
   setting the role, we set the state to "-insync", the role is
   moved to saved_raid_disk which effectively says the device is
   partly in-sync with that slot and needs a bitmap recovery.

Cc: Andrei Warkentin <andreiw@vmware.com>
Signed-off-by: NeilBrown <neilb@suse.de>

audit: use define's for audit version · 70249a9c

Authored by Eric Paris on Jan 13, 2014

Give names to the audit versions.  Just something for a userspace
programmer to know what the version provides.
Signed-off-by: Eric Paris <eparis@redhat.com>

audit: add audit_backlog_wait_time configuration option · 51cc83f0

Authored by Richard Guy Briggs on Sep 18, 2013

readahead-collector abuses the audit logging facility to discover which files
are accessed at boot time to make a pre-load list.

Add a tuning option to audit_backlog_wait_time so that if auditd can't keep up,
or gets blocked, the callers won't be blocked.

Bump audit_status API version to "2".
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>

audit: clean up AUDIT_GET/SET local variables and future-proof API · 09f883a9

Authored by Richard Guy Briggs on Sep 18, 2013

Re-named confusing local variable names (status_set and status_get didn't agree
with their command type name) and reduced their scope.

Future-proof API changes by not depending on the exact size of the audit_status
struct and by adding an API version field.
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>

13 Jan, 2014 3 commits

[media] s5p-mfc: Add controls to set vp8 enc profile · bbd8f3fe

Authored by Kiran AVND on Dec 16, 2013

Add v4l2 controls to set desired profile for VP8 encoder.
Acceptable levels for VP8 encoder are
0: Version 0
1: Version 1
2: Version 2
3: Version 3
Signed-off-by: Kiran AVND <avnd.kiran@samsung.com>
Signed-off-by: Pawel Osciak <posciak@chromium.org>
Signed-off-by: Arun Kumar K <arun.kk@samsung.com>
Signed-off-by: Kamil Debski <k.debski@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>

[media] s5p-mfc: Add QP setting support for vp8 encoder · 4773ab99

Authored by Arun Kumar K on Nov 15, 2013

Adds v4l2 controls to set MIN, MAX QP values and
I, P frame QP for vp8 encoder.
Signed-off-by: Kiran AVND <avnd.kiran@samsung.com>
Signed-off-by: Arun Kumar K <arun.kk@samsung.com>
Signed-off-by: Kamil Debski <k.debski@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>

sched/deadline: Add SCHED_DEADLINE structures & implementation · aab03e05

Authored by Dario Faggioli on Nov 28, 2013

Introduces the data structures, constants and symbols needed for
SCHED_DEADLINE implementation.

Core data structures of SCHED_DEADLINE are defined, along with their
initializers. Hooks for checking if a task belongs to the new policy
are also added where they are needed.

Adds a scheduling class, in sched/dl.c and a new policy called
SCHED_DEADLINE. It is an implementation of the Earliest Deadline
First (EDF) scheduling algorithm, augmented with a mechanism (called
Constant Bandwidth Server, CBS) that makes it possible to isolate
the behaviour of tasks between each other.

The typical -deadline task will be made up of a computation phase
(instance) which is activated in a periodic or sporadic fashion. The
expected (maximum) duration of such computation is called the task's
runtime; the time interval by which each instance needs to be completed
is called the task's relative deadline. The task's absolute deadline
is dynamically calculated as the time instant a task (better, an
instance) activates plus the relative deadline.

The EDF algorithm selects the task with the smallest absolute
deadline as the one to be executed first, while the CBS ensures that each
task runs for at most its runtime every (relative) deadline
length time interval, avoiding any interference between different
tasks (bandwidth isolation).
Thanks to this feature, also tasks that do not strictly comply with
the computational model sketched above can effectively use the new
policy.

To summarize, this patch:
 - introduces the data structures, constants and symbols needed;
 - implements the core logic of the scheduling algorithm in the new
   scheduling class file;
 - provides all the glue code between the new scheduling class and
   the core scheduler and refines the interactions between sched/dl
   and the other existing scheduling classes.
Signed-off-by: Dario Faggioli <raistlin@linux.it>
Signed-off-by: Michael Trimarchi <michael@amarulasolutions.com>
Signed-off-by: Fabio Checconi <fchecconi@gmail.com>
Signed-off-by: Juri Lelli <juri.lelli@gmail.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1383831828-15501-4-git-send-email-juri.lelli@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
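
The EDF selection rule described in the SCHED_DEADLINE commit above can be sketched in a few lines. This is a toy model, not the sched/dl.c implementation: instances are plain tuples, the task names and times are invented, and the CBS bandwidth enforcement is not modelled at all.

```python
def absolute_deadline(instance):
    """Activation time plus relative deadline, as defined in the text."""
    _name, activation, relative_deadline = instance
    return activation + relative_deadline


def pick_next(ready):
    """EDF: choose the ready instance with the smallest absolute deadline."""
    return min(ready, key=absolute_deadline)


# (name, activation time, relative deadline), all in the same time unit.
ready_queue = [
    ("video", 0, 40),
    ("audio", 5, 10),
    ("logger", 2, 100),
]

chosen = pick_next(ready_queue)
deadlines = {inst[0]: absolute_deadline(inst) for inst in ready_queue}

print(chosen[0], deadlines)
```

The instance that activated latest still runs first here because its absolute deadline is the earliest.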

12 Jan, 2014 1 commit

perf: Introduce a flag to enable close-on-exec in perf_event_open() · a21b0b35

Authored by Yann Droneaud on Jan 05, 2014

Unlike recent modern userspace API such as:

  epoll_create1 (EPOLL_CLOEXEC), eventfd (EFD_CLOEXEC),
  fanotify_init (FAN_CLOEXEC), inotify_init1 (IN_CLOEXEC),
  signalfd (SFD_CLOEXEC), timerfd_create (TFD_CLOEXEC),
  or the venerable general purpose open (O_CLOEXEC),

the perf_event_open() syscall lacks a flag to atomically set the FD_CLOEXEC
(i.e. close-on-exec) flag on the file descriptor it returns to userspace.

The present patch adds a PERF_FLAG_FD_CLOEXEC flag to allow
perf_event_open() syscall to atomically set close-on-exec.

Having this flag will enable userspace to remove the file descriptor
from the list of file descriptors being inherited across exec,
without the need to call fcntl(fd, F_SETFD, FD_CLOEXEC) and the
associated race condition between the current thread and another
thread calling fork(2) then execve(2).

Links:

 - Secure File Descriptor Handling (Ulrich Drepper, 2008)
   http://udrepper.livejournal.com/20407.html

 - Excuse me son, but your code is leaking !!! (Dan Walsh, March 2012)
   http://danwalsh.livejournal.com/53603.html

 - Notes in DMA buffer sharing: leak and security hole
   http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/dma-buf-sharing.txt?id=v3.13-rc3#n428

Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/8c03f54e1598b1727c19706f3af03f98685d9fe6.1388952061.git.ydroneaud@opteya.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
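
The difference between setting close-on-exec atomically and setting it afterwards with fcntl() can be shown with an ordinary open() call. The sketch below assumes a Unix-like system and uses os.open on the null device as a stand-in; it does not call perf_event_open() or use PERF_FLAG_FD_CLOEXEC. Because Python marks new descriptors close-on-exec by default, the two-step case first clears the flag to emulate a descriptor created without it.

```python
import fcntl
import os


def has_cloexec(fd):
    """Report whether FD_CLOEXEC is set on the descriptor."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)


# One step: the flag is part of the call that creates the descriptor.
fd_atomic = os.open(os.devnull, os.O_RDONLY | os.O_CLOEXEC)
atomic_result = has_cloexec(fd_atomic)

# Two steps: emulate a descriptor that was created without the flag.
fd_two_step = os.open(os.devnull, os.O_RDONLY)
os.set_inheritable(fd_two_step, True)
before_fcntl = has_cloexec(fd_two_step)  # window in which fork+exec would leak it
fcntl.fcntl(fd_two_step, fcntl.F_SETFD, fcntl.FD_CLOEXEC)
after_fcntl = has_cloexec(fd_two_step)

os.close(fd_atomic)
os.close(fd_two_step)

print(atomic_result, before_fcntl, after_fcntl)
```

The interval between creating the descriptor and the fcntl() call is the race window that a creation-time flag removes.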

11 Jan, 2014 1 commit

tcp: metrics: New netlink attribute for src IP and dumped in netlink reply · 8a59359c

Authored by Christoph Paasch on Jan 08, 2014

This patch adds a new netlink attribute for the source-IP and appends it
to the netlink reply. Now, iproute2 can have access to the source-IP.
Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>

10 Jan, 2014 2 commits

netfilter: introduce l2tp match extension · 74f77a6b

Authored by James Chapman on Jan 06, 2014

Introduce an xtables add-on for matching L2TP packets. Supports L2TPv2
and L2TPv3 over IPv4 and IPv6. As well as filtering on L2TP tunnel-id
and session-id, the filtering decision can also include the L2TP
packet type (control or data), protocol version (2 or 3) and
encapsulation type (UDP or IP).

The most common use for this will likely be to filter L2TP data
packets of individual L2TP tunnels or sessions. While a u32 match can
be used, the L2TP protocol headers are such that field offsets differ
depending on bits set in the header, making rules for matching generic
L2TP connections cumbersome. This match extension takes care of all
that.
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
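
The remark in the l2tp commit above about field offsets depending on header bits can be illustrated with the L2TPv2 header. The sketch assumes the RFC 2661 layout (a 16-bit flags/version word, an optional Length field when the L bit is set, then tunnel ID and session ID). It is not the xt_l2tp code, it ignores L2TPv3 and the UDP/IP encapsulation, and the function name and sample packets are made up.

```python
import struct

L2TP_FLAG_T = 0x8000  # control message
L2TP_FLAG_L = 0x4000  # optional Length field is present
L2TP_VERSION_MASK = 0x000F


def parse_l2tpv2_ids(payload):
    """Locate tunnel and session IDs, whose offset depends on the L bit."""
    (flags_ver,) = struct.unpack_from("!H", payload, 0)
    offset = 2
    if flags_ver & L2TP_FLAG_L:
        offset += 2  # skip the Length field
    tunnel_id, session_id = struct.unpack_from("!HH", payload, offset)
    return {
        "is_control": bool(flags_ver & L2TP_FLAG_T),
        "version": flags_ver & L2TP_VERSION_MASK,
        "tunnel_id": tunnel_id,
        "session_id": session_id,
    }


# Data packet without Length: IDs start at byte 2.
data_packet = struct.pack("!HHH", 0x0002, 7, 9)
# Control packet with T, L and S set: Length, IDs, then Ns and Nr.
control_packet = struct.pack("!HHHHHH", 0xC802, 12, 7, 0, 0, 0)

data_info = parse_l2tpv2_ids(data_packet)
control_info = parse_l2tpv2_ids(control_packet)

print(data_info)
print(control_info)
```

A fixed-offset u32 match would need one rule per flag combination, which is the inconvenience the match extension is meant to remove.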

netfilter: nft_ct: Add support to set the connmark · c4ede3d3

Authored by Kristian Evensen on Jan 07, 2014

This patch adds kernel support for setting properties of tracked
connections. Currently, only connmark is supported. One use-case
for this feature is to provide the same functionality as
-j CONNMARK --save-mark in iptables.

Some restructuring was needed to implement the set op. The new
structure follows that of nft_meta.
Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

09 Jan, 2014 1 commit

bcache: Add bch_btree_keys_u64s_remaining() · 59158fde

Authored by Kent Overstreet on Nov 11, 2013

Helper function to explicitly check how much space is free in a btree node
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

openanolis / cloud-kernel · synced successfully more than 1 year ago
