1. 23 Apr, 2015 1 commit
  2. 22 Apr, 2015 3 commits
    • libceph: announce support for straw2 buckets · 7c1c4747
      Ilya Dryomov committed
      Sync up feature bits and enable CEPH_FEATURE_CRUSH_V4.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    • crush: straw2 bucket type with an efficient 64-bit crush_ln() · 958a2765
      Ilya Dryomov committed
      This is an improved straw bucket that correctly avoids any data movement
      between items A and B when neither A nor B's weights are changed.  Said
      differently, if we adjust the weight of item C (including adding it anew
      or removing it completely), we will only see inputs move to or from C,
      never between other items in the bucket.
      
      Notably, there is no intermediate scaling factor that needs to be
      calculated.  The mapping function is a simple function of the item weights.
      
      The below commits were squashed together into this one (mostly to avoid
      adding and then yanking a ~6000 lines worth of crush_ln_table):
      
      - crush: add a straw2 bucket type
      - crush: add crush_ln to calculate nature log efficently
      - crush: improve straw2 adjustment slightly
      - crush: change crush_ln to provide 32 more digits
      - crush: fix crush_get_bucket_item_weight and bucket destroy for straw2
      - crush/mapper: fix divide-by-0 in straw2
        (with div64_s64() for draw = ln / w and INT64_MIN -> S64_MIN - need
         to create a proper compat.h in ceph.git)
      
      Reflects ceph.git commits 242293c908e923d474910f2b8203fa3b41eb5a53,
                                32a1ead92efcd351822d22a5fc37d159c65c1338,
                                6289912418c4a3597a11778bcf29ed5415117ad9,
                                35fcb04e2945717cf5cfe150b9fa89cb3d2303a1,
                                6445d9ee7290938de1e4ee9563912a6ab6d8ee5f,
                                b5921d55d16796e12d66ad2c4add7305f9ce2353.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
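The straw2 property described in the commit message above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: it uses floating-point `math.log` instead of the kernel's fixed-point `crush_ln()`, and `blake2b` as a stand-in for CRUSH's own hash. The names `_uniform` and `straw2_choose` are hypothetical and do not appear in the kernel source.

```python
import hashlib
import math


def _uniform(x, item, r=0):
    """Deterministic pseudo-random value in (0, 1] for (input, item, replica).

    Stand-in for CRUSH's hash function; blake2b is used purely for illustration.
    """
    digest = hashlib.blake2b(f"{x}:{item}:{r}".encode(), digest_size=8).digest()
    return (int.from_bytes(digest, "big") + 1) / 2.0**64


def straw2_choose(weights, x, r=0):
    """Return the item whose draw ln(u) / weight is largest.

    Each draw depends only on that item's own hash and weight, so changing
    one item's weight cannot reorder the draws of the other items.
    Items with a non-positive weight are skipped (modelling the
    divide-by-zero guard mentioned in the squashed commits).
    """
    best_item, best_draw = None, -math.inf
    for item, weight in weights.items():
        if weight <= 0:
            continue
        draw = math.log(_uniform(x, item, r)) / weight
        if best_item is None or draw > best_draw:
            best_item, best_draw = item, draw
    return best_item
```

With this model, raising the weight of item C or removing it only ever moves inputs to or from C. An input that mapped to A never starts mapping to B, because the draws of A and B are computed without reference to C.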
    • ceph: rename snapshot support · 0ea611a3
      Yan, Zheng committed
      Signed-off-by: Yan, Zheng <zyan@redhat.com>
  3. 21 Apr, 2015 2 commits
    • net: add skb_checksum_complete_unset · 4e18b9ad
      Tom Herbert committed
      This function changes ip_summed to CHECKSUM_NONE if CHECKSUM_COMPLETE
      is set. This is called to discard checksum-complete when packet
      is being modified and checksum is not pulled for headers in a layer.
      Signed-off-by: Tom Herbert <tom@herbertland.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
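The behaviour described above is small enough to model directly. The following is a hedged Python sketch, not the kernel code: `Skb` is a hypothetical stand-in for `struct sk_buff` reduced to one field, and the constant values are assumptions chosen only to be distinct.

```python
from dataclasses import dataclass

# Assumed to mirror the kernel's checksum states; only their distinctness
# matters for this model.
CHECKSUM_NONE = 0
CHECKSUM_UNNECESSARY = 1
CHECKSUM_COMPLETE = 2
CHECKSUM_PARTIAL = 3


@dataclass
class Skb:
    """Hypothetical stand-in for struct sk_buff (one field only)."""
    ip_summed: int = CHECKSUM_NONE


def skb_checksum_complete_unset(skb):
    """Discard a checksum-complete state; leave every other state untouched."""
    if skb.ip_summed == CHECKSUM_COMPLETE:
        skb.ip_summed = CHECKSUM_NONE
```

A caller that modifies packet data without pulling the checksum for the affected headers would invoke this first, so that a stale full-packet checksum is not trusted later.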
    • smp: don't use 16-bit words for atomic accesses · f4d03bd1
      Linus Torvalds committed
      Yes, it should work, but it's a bad idea.  Not only did ARM64 not have
      the 16-bit access code (there's a separate patch to add it), it's just
      not a good atomic type.  Some architectures fundamentally don't do
      atomic accesses in them (alpha), and it's not like it saves any space
      here anyway because of structure packing issues.
      
      We normally should aim for flags to be "unsigned int" or "unsigned
      long".  And if space is at a premium, use a single byte (although that
      causes problems on alpha again).  There might be very special cases
      where a 16-bit entity is really wanted, but this is not one of them.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 20 Apr, 2015 5 commits
  5. 19 Apr, 2015 2 commits
  6. 18 Apr, 2015 3 commits
    • netns: remove BUG_ONs from net_generic() · 2591ffd3
      Denys Vlasenko committed
      This inline has ~500 callsites.
      
      On 04/14/2015 08:37 PM, David Miller wrote:
      > That BUG_ON() was added 7 years ago, and I don't remember it ever
      > triggering or helping us diagnose something, so just remove it and
      > keep the function inlined.
      
      On x86 allyesconfig build:
      
          text     data      bss       dec     hex filename
      82447071 22255384 20627456 125329911 77861f7 vmlinux4
      82441375 22255384 20627456 125324215 7784bb7 vmlinux5prime
      Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
      CC: Eric W. Biederman <ebiederm@xmission.com>
      CC: David S. Miller <davem@davemloft.net>
      CC: Jan Engelhardt <jengelh@medozas.de>
      CC: Jiri Pirko <jpirko@redhat.com>
      CC: linux-kernel@vger.kernel.org
      CC: netdev@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: remove unused 'dev' argument from netif_needs_gso() · 8b86a61d
      Johannes Berg committed
      In commit 04ffcb25 ("net: Add ndo_gso_check") Tom originally
      added the 'dev' argument to be able to call ndo_gso_check().
      
      Then later, when generalizing this in commit 5f35227e
      ("net: Generalize ndo_gso_check to ndo_features_check")
      Jesse removed the call to ndo_gso_check() in netif_needs_gso()
      by calling the new ndo_features_check() in a different place.
      This made the 'dev' argument unused.
      
      Remove the unused argument and go back to the code as before.
      
      Cc: Tom Herbert <therbert@google.com>
      Cc: Jesse Gross <jesse@nicira.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • inet_diag: fix access to tcp cc information · 521f1cf1
      Eric Dumazet committed
      Two different problems are fixed here:
      
      1) inet_sk_diag_fill() might be called without socket lock held.
         icsk->icsk_ca_ops can change under us and module be unloaded.
         -> Access to freed memory.
         Fix this using rcu_read_lock() to prevent module unload.
      
      2) Some TCP Congestion Control modules provide information
         but again this is not safe against icsk->icsk_ca_ops
         change and nla_put() errors were ignored. Some sockets
         could not get the additional info if skb was almost full.
      
      Fix this by returning a status from get_info() handlers and
      using rcu protection as well.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 17 Apr, 2015 17 commits
  8. 16 Apr, 2015 7 commits
    • cpumask: resurrect CPU_MASK_CPU0 · 1527781d
      Rusty Russell committed
      We removed it in 2f0f267e (cpumask: remove deprecated functions.),
      but grep shows it still used by MIPS, and not unreasonably.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • linux/bitmap.h: improve BITMAP_{LAST,FIRST}_WORD_MASK · 89c1e79e
      Rasmus Villemoes committed
      The macro BITMAP_LAST_WORD_MASK can be implemented without a conditional,
      which will generally lead to slightly better generated code (221 bytes
      saved for allmodconfig-GCOV_KERNEL, ~2k with GCOV_KERNEL).  As a small
      bonus, this also ensures that the nbits parameter is expanded exactly
      once.
      
      In BITMAP_FIRST_WORD_MASK, if start is signed gcc is technically allowed
      to assume it is positive (or divisible by BITS_PER_LONG), and hence just
      do the simple mask.  It doesn't seem to use this, and even on an
      architecture like x86 where the shift only depends on the lower 5 or 6
      bits, and these bits are not affected by the signedness of the expression,
      gcc still generates code to compute the C99 mandated value of start %
      BITS_PER_LONG.  So just use a mask explicitly, also for consistency with
      BITMAP_LAST_WORD_MASK.
      Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Tejun Heo <tj@kernel.org>
      Reviewed-by: George Spelvin <linux@horizon.com>
      Cc: Yury Norov <yury.norov@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
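The idea of computing the last-word mask without a conditional can be checked with a small model. This is a Python sketch assuming a 64-bit `unsigned long`. The macro bodies are not quoted in this log, so the expressions below are an assumption about the technique, and the function names are hypothetical.

```python
BITS_PER_LONG = 64                      # assumption: 64-bit unsigned long
ULONG_MAX = (1 << BITS_PER_LONG) - 1    # all-ones word, i.e. ~0UL


def last_word_mask_branchy(nbits):
    """Conditional form: special-case a length that exactly fills the word."""
    rem = nbits % BITS_PER_LONG
    return (1 << rem) - 1 if rem else ULONG_MAX


def last_word_mask(nbits):
    """Branch-free form: shift all-ones right by (-nbits mod BITS_PER_LONG)."""
    return ULONG_MAX >> (-nbits & (BITS_PER_LONG - 1))


def first_word_mask(start):
    """Explicit-mask form: clear the low (start mod BITS_PER_LONG) bits."""
    return (ULONG_MAX << (start & (BITS_PER_LONG - 1))) & ULONG_MAX
```

Both last-word forms agree for every length, including multiples of `BITS_PER_LONG`, where the shift count becomes zero and the mask is all ones. The branch-free form also mentions `nbits` only once, which matches the "expanded exactly once" remark in the commit message.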
    • lib/string_helpers.c: change semantics of string_escape_mem · 41416f23
      Rasmus Villemoes committed
      The current semantics of string_escape_mem are inadequate for one of its
      current users, vsnprintf().  If that is to honour its contract, it must
      know how much space would be needed for the entire escaped buffer, and
      string_escape_mem provides no way of obtaining that (short of allocating a
      large enough buffer (~4 times input string) to let it play with, and
      that's definitely a big no-no inside vsnprintf).
      
      So change the semantics for string_escape_mem to be more snprintf-like:
      Return the size of the output that would be generated if the destination
      buffer was big enough, but of course still only write to the part of dst
      it is allowed to, and (contrary to snprintf) don't do '\0'-termination.
      It is then up to the caller to detect whether output was truncated and to
      append a '\0' if desired.  Also, we must output partial escape sequences,
      otherwise a call such as snprintf(buf, 3, "%1pE", "\123") would cause
      printf to write a \0 to buf[2] but leaving buf[0] and buf[1] with whatever
      they previously contained.
      
      This also fixes a bug in the escaped_string() helper function, which used
      to unconditionally pass a length of "end-buf" to string_escape_mem();
      since the latter doesn't check osz for being insanely large, it would
      happily write to dst.  For example, kasprintf(GFP_KERNEL, "something and
      then %pE", ...); is an easy way to trigger an oops.
      
      In test-string_helpers.c, the -ENOMEM test is replaced with testing for
      getting the expected return value even if the buffer is too small.  We
      also ensure that nothing is written (by relying on a NULL pointer deref)
      if the output size is 0 by passing NULL - this has to work for
      kasprintf("%pE") to work.
      
      In net/sunrpc/cache.c, I think qword_add still has the same semantics.
      Someone should definitely double-check this.
      
      In fs/proc/array.c, I made the minimum possible change, but longer-term it
      should stop poking around in seq_file internals.
      
      [andriy.shevchenko@linux.intel.com: simplify qword_add]
      [andriy.shevchenko@linux.intel.com: add missed curly braces]
      Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
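The snprintf-like contract described above (return the full size needed, write only what fits, no terminator, partial escape sequences allowed) can be modelled in a few lines. This is a hedged Python sketch, not the kernel routine: `escape_into` is a hypothetical name, and the escaping rule (octal escapes for bytes outside printable ASCII) is a simplifying assumption that ignores the real function's flags.

```python
def escape_into(src, dst, osz):
    """Escape src into dst[:osz]; return the length the full output needs.

    Nothing is written at or beyond index osz, no terminator is appended,
    and an escape sequence may be cut off mid-way when space runs out.
    dst may be None when osz is 0, since it is then never indexed.
    """
    pos = 0
    for byte in src:
        # Printable ASCII is copied; everything else becomes "\ooo" (octal).
        piece = bytes([byte]) if 0x20 <= byte < 0x7F else b"\\%03o" % byte
        for ch in piece:
            if pos < osz:
                dst[pos] = ch
            pos += 1  # keep counting even after the buffer is full
    return pos
```

A caller detects truncation by comparing the return value with `osz`, and appends its own terminator if it wants one. Calling it with `dst=None` and `osz=0` yields just the required size, which is the pattern the commit message says `kasprintf("%pE")` relies on.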
    • printk: comment pr_cont() stating it is only to continue a line · 7b1460ec
      Steven Rostedt committed
      KERN_CONT is nicely commented in kern_levels.h, but pr_cont() is now used
      more often, and it lacks the comment stating what it is used for.  It can
      be confused as continuing the log level, but that is not its purpose.  Its
      purpose is to continue a line that had no newline enclosed.  This should
      be documented by pr_cont() as well.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Acked-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kernel/reboot.c: add orderly_reboot for graceful reboot · 7a54f46b
      Joel Stanley committed
      The kernel has orderly_poweroff which allows the kernel to initiate a
      graceful shutdown of userspace, by running /sbin/poweroff.  This adds
      orderly_reboot that will cause userspace to shut itself down by calling
      /sbin/reboot.
      
      This will be used for shutdown initiated by a system controller on
      platforms that do not use ACPI.
      
      orderly_reboot() should be used when the system wants to allow userspace
      to gracefully shut itself down.  For cases where the system may imminently
      catch on fire, the existing emergency_restart() provides an immediate
      reboot without involving userspace.
      Signed-off-by: Joel Stanley <joel@jms.id.au>
      Cc: Fabian Frederick <fabf@skynet.be>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jeremy Kerr <jk@ozlabs.org>
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kernel/resource.c: remove deprecated __check_region() and friends · 96831c0a
      Jakub Sitnicki committed
      All users of __check_region(), check_region(), and check_mem_region() are
      gone.  We got rid of the last user in v4.0-rc1.  Remove them.
      
      bloat-o-meter on x86_64 shows:
      
      add/remove: 0/3 grow/shrink: 0/0 up/down: 0/-102 (-102)
      function                                     old     new   delta
      __kstrtab___check_region                      15       -     -15
      __ksymtab___check_region                      16       -     -16
      __check_region                                71       -     -71
      Signed-off-by: Jakub Sitnicki <jsitnicki@gmail.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kernel: conditionally support non-root users, groups and capabilities · 2813893f
      Iulia Manda committed
      There are a lot of embedded systems that run most or all of their
      functionality in init, running as root:root.  For these systems,
      supporting multiple users is not necessary.
      
      This patch adds a new symbol, CONFIG_MULTIUSER, that makes support for
      non-root users, non-root groups, and capabilities optional.  It is enabled
      under CONFIG_EXPERT menu.
      
      When this symbol is not defined, UID and GID are zero in any possible case
      and processes always have all capabilities.
      
      The following syscalls are compiled out: setuid, setregid, setgid,
      setreuid, setresuid, getresuid, setresgid, getresgid, setgroups,
      getgroups, setfsuid, setfsgid, capget, capset.
      
      Also, groups.c is compiled out completely.
      
      In kernel/capability.c, capable function was moved in order to avoid
      adding two ifdef blocks.
      
      This change saves about 25 KB on a defconfig build.  The most minimal
      kernels have total text sizes in the high hundreds of kB rather than
      low MB.  (The 25k goes down a bit with allnoconfig, but not that much.)
      
      The kernel was booted in Qemu.  All the common functionalities work.
      Adding users/groups is not possible, failing with -ENOSYS.
      
      Bloat-o-meter output:
      add/remove: 7/87 grow/shrink: 19/397 up/down: 1675/-26325 (-24650)
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Iulia Manda <iulia.manda21@gmail.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>