1. 01 4月, 2014 5 次提交
    • L
      Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 971eae7c
      Linus Torvalds 提交于
      Pull scheduler changes from Ingo Molnar:
       "Bigger changes:
      
         - sched/idle restructuring: they are WIP preparation for deeper
           integration between the scheduler and idle state selection, by
           Nicolas Pitre.
      
         - add NUMA scheduling pseudo-interleaving, by Rik van Riel.
      
         - optimize cgroup context switches, by Peter Zijlstra.
      
         - RT scheduling enhancements, by Thomas Gleixner.
      
        The rest is smaller changes, non-urgnt fixes and cleanups"
      
      * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (68 commits)
        sched: Clean up the task_hot() function
        sched: Remove double calculation in fix_small_imbalance()
        sched: Fix broken setscheduler()
        sparc64, sched: Remove unused sparc64_multi_core
        sched: Remove unused mc_capable() and smt_capable()
        sched/numa: Move task_numa_free() to __put_task_struct()
        sched/fair: Fix endless loop in idle_balance()
        sched/core: Fix endless loop in pick_next_task()
        sched/fair: Push down check for high priority class task into idle_balance()
        sched/rt: Fix picking RT and DL tasks from empty queue
        trace: Replace hardcoding of 19 with MAX_NICE
        sched: Guarantee task priority in pick_next_task()
        sched/idle: Remove stale old file
        sched: Put rq's sched_avg under CONFIG_FAIR_GROUP_SCHED
        cpuidle/arm64: Remove redundant cpuidle_idle_call()
        cpuidle/powernv: Remove redundant cpuidle_idle_call()
        sched, nohz: Exclude isolated cores from load balancing
        sched: Fix select_task_rq_fair() description comments
        workqueue: Replace hardcoding of -20 and 19 with MIN_NICE and MAX_NICE
        sys: Replace hardcoding of -20 and 19 with MIN_NICE and MAX_NICE
        ...
      971eae7c
    • L
      Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 8c292f11
      Linus Torvalds 提交于
      Pull perf changes from Ingo Molnar:
       "Main changes:
      
        Kernel side changes:
      
         - Add SNB/IVB/HSW client uncore memory controller support (Stephane
           Eranian)
      
         - Fix various x86/P4 PMU driver bugs (Don Zickus)
      
        Tooling, user visible changes:
      
         - Add several futex 'perf bench' microbenchmarks (Davidlohr Bueso)
      
         - Speed up thread map generation (Don Zickus)
      
         - Introduce 'perf kvm --list-cmds' command line option for use by
           scripts (Ramkumar Ramachandra)
      
         - Print the evsel name in the annotate stdio output, prep to fix
           support outputting annotation for multiple events, not just for the
           first one (Arnaldo Carvalho de Melo)
      
         - Allow setting preferred callchain method in .perfconfig (Jiri Olsa)
      
         - Show in what binaries/modules 'perf probe's are set (Masami
           Hiramatsu)
      
         - Support distro-style debuginfo for uprobe in 'perf probe' (Masami
           Hiramatsu)
      
        Tooling, internal changes and fixes:
      
         - Use tid in mmap/mmap2 events to find maps (Don Zickus)
      
         - Record the reason for filtering an address_location (Namhyung Kim)
      
         - Apply all filters to an addr_location (Namhyung Kim)
      
         - Merge al->filtered with hist_entry->filtered in report/hists
           (Namhyung Kim)
      
         - Fix memory leak when synthesizing thread records (Namhyung Kim)
      
         - Use ui__has_annotation() in 'report' (Namhyung Kim)
      
         - hists browser refactorings to reuse code accross UIs (Namhyung Kim)
      
         - Add support for the new DWARF unwinder library in elfutils (Jiri
           Olsa)
      
         - Fix build race in the generation of bison files (Jiri Olsa)
      
         - Further streamline the feature detection display, trimming it a bit
           to show just the libraries detected, using VF=1 gets a more verbose
           output, showing the less interesting feature checks as well (Jiri
           Olsa).
      
         - Check compatible symtab type before loading dso (Namhyung Kim)
      
         - Check return value of filename__read_debuglink() (Stephane Eranian)
      
         - Move some hashing and fs related code from tools/perf/util/ to
           tools/lib/ so that it can be used by more tools/ living utilities
           (Borislav Petkov)
      
         - Prepare DWARF unwinding code for using an elfutils alternative
           unwinding library (Jiri Olsa)
      
         - Fix DWARF unwind max_stack processing (Jiri Olsa)
      
         - Add dwarf unwind 'perf test' entry (Jiri Olsa)
      
         - 'perf probe' improvements including memory leak fixes, sharing the
           intlist class with other tools, uprobes/kprobes code sharing and
           use of ref_reloc_sym (Masami Hiramatsu)
      
         - Shorten sample symbol resolving by adding cpumode to struct
           addr_location (Arnaldo Carvalho de Melo)
      
         - Fix synthesizing mmaps for threads (Don Zickus)
      
         - Fix invalid output on event group stdio report (Namhyung Kim)
      
         - Fixup header alignment in 'perf sched latency' output (Ramkumar
           Ramachandra)
      
         - Fix off-by-one error in 'perf timechart record' argv handling
           (Ramkumar Ramachandra)
      
        Tooling, cleanups:
      
         - Remove unused thread__find_map function (Jiri Olsa)
      
         - Remove unused simple_strtoul() function (Ramkumar Ramachandra)
      
        Tooling, documentation updates:
      
         - Update function names in debug messages (Ramkumar Ramachandra)
      
         - Update some code references in design.txt (Ramkumar Ramachandra)
      
         - Clarify load-latency information in the 'perf mem' docs (Andi
           Kleen)
      
         - Clarify x86 register naming in 'perf probe' docs (Andi Kleen)"
      
      * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (96 commits)
        perf tools: Remove unused simple_strtoul() function
        perf tools: Update some code references in design.txt
        perf evsel: Update function names in debug messages
        perf tools: Remove thread__find_map function
        perf annotate: Print the evsel name in the stdio output
        perf report: Use ui__has_annotation()
        perf tools: Fix memory leak when synthesizing thread records
        perf tools: Use tid in mmap/mmap2 events to find maps
        perf report: Merge al->filtered with hist_entry->filtered
        perf symbols: Apply all filters to an addr_location
        perf symbols: Record the reason for filtering an address_location
        perf sched: Fixup header alignment in 'latency' output
        perf timechart: Fix off-by-one error in 'record' argv handling
        perf machine: Factor machine__find_thread to take tid argument
        perf tools: Speed up thread map generation
        perf kvm: introduce --list-cmds for use by scripts
        perf ui hists: Pass evsel to hpp->header/width functions explicitly
        perf symbols: Introduce thread__find_cpumode_addr_location
        perf session: Change header.misc dump from decimal to hex
        perf ui/tui: Reuse generic __hpp__fmt() code
        ...
      8c292f11
    • L
      Merge branch 'core-types-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d31605dc
      Linus Torvalds 提交于
      Pull hweight type fix from Ingo Molnar:
       "This lone commit makes sure that __const_hweight8() is unsigned, which
        addresses a build warning if code is built with -Wsign-compare.
      
        I hope the type cast in this cleanup is fine - another option would be
        to eliminate the double unary negation and use a construct with more
        obvious integer type characteristics, along the lines of:
      
              ((w) & (1ULL << 1) ? 1U : 0U)
      
        or so"
      
      * 'core-types-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        bitops: Fix signedness of compile-time hweight implementations
      d31605dc
    • L
      Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b3fd4ea9
      Linus Torvalds 提交于
      Pull RCU updates from Ingo Molnar:
       "Main changes:
      
         - Torture-test changes, including refactoring of rcutorture and
           introduction of a vestigial locktorture.
      
         - Real-time latency fixes.
      
         - Documentation updates.
      
         - Miscellaneous fixes"
      
      * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (77 commits)
        rcu: Provide grace-period piggybacking API
        rcu: Ensure kernel/rcu/rcu.h can be sourced/used stand-alone
        rcu: Fix sparse warning for rcu_expedited from kernel/ksysfs.c
        notifier: Substitute rcu_access_pointer() for rcu_dereference_raw()
        Documentation/memory-barriers.txt: Clarify release/acquire ordering
        rcutorture: Save kvm.sh output to log
        rcutorture: Add a lock_busted to test the test
        rcutorture: Place kvm-test-1-run.sh output into res directory
        rcutorture: Rename TREE_RCU-Kconfig.txt
        locktorture: Add kvm-recheck.sh plug-in for locktorture
        rcutorture: Gracefully handle NULL cleanup hooks
        locktorture: Add vestigial locktorture configuration
        rcutorture: Introduce "rcu" directory level underneath configs
        rcutorture: Rename kvm-test-1-rcu.sh
        rcutorture: Remove RCU dependencies from ver_functions.sh API
        rcutorture: Create CFcommon file for common Kconfig parameters
        rcutorture: Create config files for scripted test-the-test testing
        rcutorture: Add an rcu_busted to test the test
        locktorture: Add a lock-torture kernel module
        rcutorture: Abstract kvm-recheck.sh
        ...
      b3fd4ea9
    • L
      Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 462bf234
      Linus Torvalds 提交于
      Pull core locking updates from Ingo Molnar:
       "The biggest change is the MCS spinlock generalization changes from Tim
        Chen, Peter Zijlstra, Jason Low et al.  There's also lockdep
        fixes/enhancements from Oleg Nesterov, in particular a false negative
        fix related to lockdep_set_novalidate_class() usage"
      
      * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits)
        locking/mutex: Fix debug checks
        locking/mutexes: Add extra reschedule point
        locking/mutexes: Introduce cancelable MCS lock for adaptive spinning
        locking/mutexes: Unlock the mutex without the wait_lock
        locking/mutexes: Modify the way optimistic spinners are queued
        locking/mutexes: Return false if task need_resched() in mutex_can_spin_on_owner()
        locking: Move mcs_spinlock.h into kernel/locking/
        m68k: Skip futex_atomic_cmpxchg_inatomic() test
        futex: Allow architectures to skip futex_atomic_cmpxchg_inatomic() test
        Revert "sched/wait: Suppress Sparse 'variable shadowing' warning"
        lockdep: Change lockdep_set_novalidate_class() to use _and_name
        lockdep: Change mark_held_locks() to check hlock->check instead of lockdep_no_validate
        lockdep: Don't create the wrong dependency on hlock->check == 0
        lockdep: Make held_lock->check and "int check" argument bool
        locking/mcs: Allow architecture specific asm files to be used for contended case
        locking/mcs: Order the header files in Kbuild of each architecture in alphabetical order
        sched/wait: Suppress Sparse 'variable shadowing' warning
        hung_task/Documentation: Fix hung_task_warnings description
        locking/mcs: Allow architectures to hook in to contended paths
        locking/mcs: Micro-optimize the MCS code, add extra comments
        ...
      462bf234
  2. 31 3月, 2014 10 次提交
    • L
      Linux 3.14 · 455c6fdb
      Linus Torvalds 提交于
      455c6fdb
    • L
      Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · fedc1ed0
      Linus Torvalds 提交于
      Pull vfs fixes from Al Viro:
       "Switch mnt_hash to hlist, turning the races between __lookup_mnt() and
        hash modifications into false negatives from __lookup_mnt() (instead
        of hangs)"
      
      On the false negatives from __lookup_mnt():
       "The *only* thing we care about is not getting stuck in __lookup_mnt().
        If it misses an entry because something in front of it just got moved
        around, etc, we are fine.  We'll notice that mount_lock mismatch and
        that'll be it"
      
      * 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        switch mnt_hash to hlist
        don't bother with propagate_mnt() unless the target is shared
        keep shadowed vfsmounts together
        resizable namespace.c hashes
      fedc1ed0
    • R
      MAINTAINERS: resume as Documentation maintainer · 01358e56
      Randy Dunlap 提交于
      I am the new kernel tree Documentation maintainer (except for parts that
      are handled by other people, of course).
      Signed-off-by: NRandy Dunlap <rdunlap@infradead.org>
      Acked-by: NRob Landley <rob@landley.net>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      01358e56
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 915ac4e2
      Linus Torvalds 提交于
      Pull input updates from Dmitry Torokhov:
       "Some more updates for the input subsystem.
      
        You will get a fix for race in mousedev that has been causing quite a
        few oopses lately and a small fixup for force feedback support in
        evdev"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
        Input: mousedev - fix race when creating mixed device
        Input: don't modify the id of ioctl-provided ff effect on upload failure
      915ac4e2
    • E
      AUDIT: Allow login in non-init namespaces · aa4af831
      Eric Paris 提交于
      It its possible to configure your PAM stack to refuse login if audit
      messages (about the login) were unable to be sent.  This is common in
      many distros and thus normal configuration of many containers.  The PAM
      modules determine if audit is enabled/disabled in the kernel based on
      the return value from sending an audit message on the netlink socket.
      If userspace gets back ECONNREFUSED it believes audit is disabled in the
      kernel.  If it gets any other error else it refuses to let the login
      proceed.
      
      Just about ever since the introduction of namespaces the kernel audit
      subsystem has returned EPERM if the task sending a message was not in
      the init user or pid namespace.  So many forms of containers have never
      worked if audit was enabled in the kernel.
      
      BUT if the container was not in net_init then the kernel network code
      would send ECONNREFUSED (instead of the audit code sending EPERM).  Thus
      by pure accident/dumb luck/bug if an admin configured the PAM stack to
      reject all logins that didn't talk to audit, but then ran the login
      untility in the non-init_net namespace, it would work!! Clearly this was
      a bug, but it is a bug some people expected.
      
      With the introduction of network namespace support in 3.14-rc1 the two
      bugs stopped cancelling each other out.  Now, containers in the
      non-init_net namespace refused to let users log in (just like PAM was
      configfured!) Obviously some people were not happy that what used to let
      users log in, now didn't!
      
      This fix is kinda hacky.  We return ECONNREFUSED for all non-init
      relevant namespaces.  That means that not only will the old broken
      non-init_net setups continue to work, now the broken non-init_pid or
      non-init_user setups will 'work'.  They don't really work, since audit
      isn't logging things.  But it's what most users want.
      
      In 3.15 we should have patches to support not only the non-init_net
      (3.14) namespace but also the non-init_pid and non-init_user namespace.
      So all will be right in the world.  This just opens the doors wide open
      on 3.14 and hopefully makes users happy, if not the audit system...
      Reported-by: NAndre Tomt <andre@tomt.net>
      Reported-by: NAdam Richter <adam_richter2004@yahoo.com>
      Signed-off-by: NEric Paris <eparis@redhat.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      aa4af831
    • T
      ext4: atomically set inode->i_flags in ext4_set_inode_flags() · 00a1a053
      Theodore Ts'o 提交于
      Use cmpxchg() to atomically set i_flags instead of clearing out the
      S_IMMUTABLE, S_APPEND, etc. flags and then setting them from the
      EXT4_IMMUTABLE_FL, EXT4_APPEND_FL flags, since this opens up a race
      where an immutable file has the immutable flag cleared for a brief
      window of time.
      Reported-by: NJohn Sullivan <jsrhbz@kanargh.force9.co.uk>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      00a1a053
    • A
      switch mnt_hash to hlist · 38129a13
      Al Viro 提交于
      fixes RCU bug - walking through hlist is safe in face of element moves,
      since it's self-terminating.  Cyclic lists are not - if we end up jumping
      to another hash chain, we'll loop infinitely without ever hitting the
      original list head.
      
      [fix for dumb braino folded]
      
      Spotted by: Max Kellermann <mk@cm4all.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      38129a13
    • A
      don't bother with propagate_mnt() unless the target is shared · 0b1b901b
      Al Viro 提交于
      If the dest_mnt is not shared, propagate_mnt() does nothing -
      there's no mounts to propagate to and thus no copies to create.
      Might as well don't bother calling it in that case.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      0b1b901b
    • A
      keep shadowed vfsmounts together · 1d6a32ac
      Al Viro 提交于
      preparation to switching mnt_hash to hlist
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      1d6a32ac
    • A
      resizable namespace.c hashes · 0818bf27
      Al Viro 提交于
      * switch allocation to alloc_large_system_hash()
      * make sizes overridable by boot parameters (mhash_entries=, mphash_entries=)
      * switch mountpoint_hashtable from list_head to hlist_head
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      0818bf27
  3. 30 3月, 2014 5 次提交
  4. 29 3月, 2014 20 次提交
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 49d8137a
      Linus Torvalds 提交于
      Pull networking fixes from David Miller:
      
       1) We've discovered a common error in several networking drivers, they
          put VLAN offload features into ->vlan_features, which would suggest
          that they support offloading 2 or more levels of VLAN encapsulation.
          Not only do these devices not do that, but we don't have the
          infrastructure yet to handle that at all.
      
          Fixes from Vlad Yasevich.
      
       2) Fix tcpdump crash with bridging and vlans, also from Vlad.
      
       3) Some MAINTAINERS updates for random32 and bonding.
      
       4) Fix late reseeds of prandom generator, from Sasha Levin.
      
       5) Bridge doesn't handle stacked vlans properly, fix from Toshiaki
          Makita.
      
       6) Fix deadlock in openvswitch, from Flavio Leitner.
      
       7) get_timewait4_sock() doesn't report delay times correctly, fix from
          Eric Dumazet.
      
       8) Duplicate address detection and addrconf verification need to run in
          contexts where RTNL can be obtained.  Move them to run from a
          workqueue.  From Hannes Frederic Sowa.
      
       9) Fix route refcount leaking in ip tunnels, from Pravin B Shelar.
      
      10) Don't return -EINTR from non-blocking recvmsg() on AF_UNIX sockets,
          from Eric Dumazet.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (28 commits)
        vlan: Warn the user if lowerdev has bad vlan features.
        veth: Turn off vlan rx acceleration in vlan_features
        ifb: Remove vlan acceleration from vlan_features
        qlge: Do not propaged vlan tag offloads to vlans
        bridge: Fix crash with vlan filtering and tcpdump
        net: Account for all vlan headers in skb_mac_gso_segment
        MAINTAINERS: bonding: change email address
        MAINTAINERS: bonding: change email address
        ipv6: move DAD and addrconf_verify processing to workqueue
        tcp: fix get_timewait4_sock() delay computation on 64bit
        openvswitch: fix a possible deadlock and lockdep warning
        bridge: Fix handling stacked vlan tags
        bridge: Fix inabillity to retrieve vlan tags when tx offload is disabled
        vhost: validate vhost_get_vq_desc return value
        vhost: fix total length when packets are too short
        random32: avoid attempt to late reseed if in the middle of seeding
        random32: assign to network folks in MAINTAINERS
        net/mlx4_core: pass pci_device_id.driver_data to __mlx4_init_one during reset
        core, nfqueue, openvswitch: Orphan frags in skb_zerocopy and handle errors
        vlan: Set hard_header_len according to available acceleration
        ...
      49d8137a
    • D
      Merge branch 'vlan_offloads' · 5f2feca2
      David S. Miller 提交于
      Vlad Yasevich says:
      
      ====================
      Audit all drivers for correct vlan_features.
      
      Some drivers set vlan acceleration features in vlan_features.  This causes
      issues with Q-in-Q/802.1ad configurations.
      
      Audit all the drivers for correct vlan_features.  Fix broken ones.
      Add a warning to vlan code to help catch future offenders.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5f2feca2
    • V
      vlan: Warn the user if lowerdev has bad vlan features. · 2adb956b
      Vlad Yasevich 提交于
      Some drivers incorrectly assign vlan acceleration features to
      vlan_features thus causing issues for Q-in-Q vlan configurations.
      Warn the user of such cases.
      Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2adb956b
    • V
      veth: Turn off vlan rx acceleration in vlan_features · 3f8c707b
      Vlad Yasevich 提交于
      For completeness, turn off vlan rx acceleration in vlan_features so
      that it doesn't show up on q-in-q setups.
      Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3f8c707b
    • V
      ifb: Remove vlan acceleration from vlan_features · 8dd6e147
      Vlad Yasevich 提交于
      Do not include vlan acceleration features in vlan_features as that
      precludes correct Q-in-Q operation.
      Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8dd6e147
    • V
      qlge: Do not propaged vlan tag offloads to vlans · f6d1ac4b
      Vlad Yasevich 提交于
      qlge driver turns off NETIF_F_HW_CTAG_FILTER, but forgets to
      turn off HW_CTAG_TX and HW_CTAG_RX on vlan devices.  With the
      current settings, q-in-q will only generate a single vlan header.
      Remember to mask off CTAG_TX and CTAG_RX features in vlan_features.
      
      CC: Shahed Shaikh <shahed.shaikh@qlogic.com>
      CC: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
      CC: Ron Mercer <ron.mercer@qlogic.com>
      Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
      Acked-by: NJitendra Kalsaria <jitendra.kalsaria@qlogic.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f6d1ac4b
    • V
      bridge: Fix crash with vlan filtering and tcpdump · fc92f745
      Vlad Yasevich 提交于
      When the vlan filtering is enabled on the bridge, but
      the filter is not configured on the bridge device itself,
      running tcpdump on the bridge device will result in a
      an Oops with NULL pointer dereference.  The reason
      is that br_pass_frame_up() will bypass the vlan
      check because promisc flag is set.  It will then try
      to get the table pointer and process the packet based
      on the table.  Since the table pointer is NULL, we oops.
      Catch this special condition in br_handle_vlan().
      Reported-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      CC: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
      Acked-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fc92f745
    • V
      net: Account for all vlan headers in skb_mac_gso_segment · 53d6471c
      Vlad Yasevich 提交于
      skb_network_protocol() already accounts for multiple vlan
      headers that may be present in the skb.  However, skb_mac_gso_segment()
      doesn't know anything about it and assumes that skb->mac_len
      is set correctly to skip all mac headers.  That may not
      always be the case.  If we are simply forwarding the packet (via
      bridge or macvtap), all vlan headers may not be accounted for.
      
      A simple solution is to allow skb_network_protocol to return
      the vlan depth it has calculated.  This way skb_mac_gso_segment
      will correctly skip all mac headers.
      Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      53d6471c
    • V
      898602a0
    • J
      MAINTAINERS: bonding: change email address · 79b30750
      Jay Vosburgh 提交于
      Update my email address.
      Signed-off-by: NJay Vosburgh <fubar@us.ibm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      79b30750
    • L
      Merge branch 'akpm' (patches from Andrew Morton) · bc53267e
      Linus Torvalds 提交于
      Merge two fixes from Andrew Morton:
       "The x86 fix should come from x86 guys but they appear to be
        conferencing or otherwise distracted.
      
        The ocfs2 fix is a bit of a mess - the code runs into an immediate
        NULL deref and we're trying to work out how this got through test and
        review, but we haven't heard from Goldwyn in the past few days.
        Sasha's patch fixes the oops, but the feature as a whole is probably
        broken.  So this is a stopgap for 3.14 - I'll aim to get the real
        fixes into 3.14.x"
      
      * emailed patches from Andrew Morton akpm@linux-foundation.org>:
        x86: fix boot on uniprocessor systems
        ocfs2: check if cluster name exists before deref
      bc53267e
    • A
      x86: fix boot on uniprocessor systems · 825600c0
      Artem Fetishev 提交于
      On x86 uniprocessor systems topology_physical_package_id() returns -1
      which causes rapl_cpu_prepare() to leave rapl_pmu variable uninitialized
      which leads to GPF in rapl_pmu_init().
      
      See arch/x86/kernel/cpu/perf_event_intel_rapl.c.
      
      It turns out that physical_package_id and core_id can actually be
      retreived for uniprocessor systems too.  Enabling them also fixes
      rapl_pmu code.
      Signed-off-by: NArtem Fetishev <artem_fetishev@epam.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      825600c0
    • S
      ocfs2: check if cluster name exists before deref · d9060742
      Sasha Levin 提交于
      Commit c74a3bdd ("ocfs2: add clustername to cluster connection") is
      trying to strlcpy a string which was explicitly passed as NULL in the
      very same patch, triggering a NULL ptr deref.
      
        BUG: unable to handle kernel NULL pointer dereference at           (null)
        IP: strlcpy (lib/string.c:388 lib/string.c:151)
        CPU: 19 PID: 19426 Comm: trinity-c19 Tainted: G        W     3.14.0-rc7-next-20140325-sasha-00014-g9476368-dirty #274
        RIP:  strlcpy (lib/string.c:388 lib/string.c:151)
        Call Trace:
         ocfs2_cluster_connect (fs/ocfs2/stackglue.c:350)
         ocfs2_cluster_connect_agnostic (fs/ocfs2/stackglue.c:396)
         user_dlm_register (fs/ocfs2/dlmfs/userdlm.c:679)
         dlmfs_mkdir (fs/ocfs2/dlmfs/dlmfs.c:503)
         vfs_mkdir (fs/namei.c:3467)
         SyS_mkdirat (fs/namei.c:3488 fs/namei.c:3472)
         tracesys (arch/x86/kernel/entry_64.S:749)
      
      akpm: this patch probably disables the feature.  A temporary thing to
      avoid triviel oopses.
      Signed-off-by: NSasha Levin <sasha.levin@oracle.com>
      Cc: Goldwyn Rodrigues <rgoldwyn@suse.com>
      Cc: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d9060742
    • H
      ipv6: move DAD and addrconf_verify processing to workqueue · c15b1cca
      Hannes Frederic Sowa 提交于
      addrconf_join_solict and addrconf_join_anycast may cause actions which
      need rtnl locked, especially on first address creation.
      
      A new DAD state is introduced which defers processing of the initial
      DAD processing into a workqueue.
      
      To get rtnl lock we need to push the code paths which depend on those
      calls up to workqueues, specifically addrconf_verify and the DAD
      processing.
      
      (v2)
      addrconf_dad_failure needs to be queued up to the workqueue, too. This
      patch introduces a new DAD state and stop the DAD processing in the
      workqueue (this is because of the possible ipv6_del_addr processing
      which removes the solicited multicast address from the device).
      
      addrconf_verify_lock is removed, too. After the transition it is not
      needed any more.
      
      As we are not processing in bottom half anymore we need to be a bit more
      careful about disabling bottom half out when we lock spin_locks which are also
      used in bh.
      
      Relevant backtrace:
      [  541.030090] RTNL: assertion failed at net/core/dev.c (4496)
      [  541.031143] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           O 3.10.33-1-amd64-vyatta #1
      [  541.031145] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
      [  541.031146]  ffffffff8148a9f0 000000000000002f ffffffff813c98c1 ffff88007c4451f8
      [  541.031148]  0000000000000000 0000000000000000 ffffffff813d3540 ffff88007fc03d18
      [  541.031150]  0000880000000006 ffff88007c445000 ffffffffa0194160 0000000000000000
      [  541.031152] Call Trace:
      [  541.031153]  <IRQ>  [<ffffffff8148a9f0>] ? dump_stack+0xd/0x17
      [  541.031180]  [<ffffffff813c98c1>] ? __dev_set_promiscuity+0x101/0x180
      [  541.031183]  [<ffffffff813d3540>] ? __hw_addr_create_ex+0x60/0xc0
      [  541.031185]  [<ffffffff813cfe1a>] ? __dev_set_rx_mode+0xaa/0xc0
      [  541.031189]  [<ffffffff813d3a81>] ? __dev_mc_add+0x61/0x90
      [  541.031198]  [<ffffffffa01dcf9c>] ? igmp6_group_added+0xfc/0x1a0 [ipv6]
      [  541.031208]  [<ffffffff8111237b>] ? kmem_cache_alloc+0xcb/0xd0
      [  541.031212]  [<ffffffffa01ddcd7>] ? ipv6_dev_mc_inc+0x267/0x300 [ipv6]
      [  541.031216]  [<ffffffffa01c2fae>] ? addrconf_join_solict+0x2e/0x40 [ipv6]
      [  541.031219]  [<ffffffffa01ba2e9>] ? ipv6_dev_ac_inc+0x159/0x1f0 [ipv6]
      [  541.031223]  [<ffffffffa01c0772>] ? addrconf_join_anycast+0x92/0xa0 [ipv6]
      [  541.031226]  [<ffffffffa01c311e>] ? __ipv6_ifa_notify+0x11e/0x1e0 [ipv6]
      [  541.031229]  [<ffffffffa01c3213>] ? ipv6_ifa_notify+0x33/0x50 [ipv6]
      [  541.031233]  [<ffffffffa01c36c8>] ? addrconf_dad_completed+0x28/0x100 [ipv6]
      [  541.031241]  [<ffffffff81075c1d>] ? task_cputime+0x2d/0x50
      [  541.031244]  [<ffffffffa01c38d6>] ? addrconf_dad_timer+0x136/0x150 [ipv6]
      [  541.031247]  [<ffffffffa01c37a0>] ? addrconf_dad_completed+0x100/0x100 [ipv6]
      [  541.031255]  [<ffffffff8105313a>] ? call_timer_fn.isra.22+0x2a/0x90
      [  541.031258]  [<ffffffffa01c37a0>] ? addrconf_dad_completed+0x100/0x100 [ipv6]
      
      Hunks and backtrace stolen from a patch by Stephen Hemminger.
      Reported-by: NStephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: NStephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c15b1cca
    • E
      tcp: fix get_timewait4_sock() delay computation on 64bit · e2a1d3e4
      Eric Dumazet 提交于
      It seems I missed one change in get_timewait4_sock() to compute
      the remaining time before deletion of IPV4 timewait socket.
      
      This could result in wrong output in /proc/net/tcp for tm->when field.
      
      Fixes: 96f817fe ("tcp: shrink tcp6_timewait_sock by one cache line")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e2a1d3e4
    • F
      openvswitch: fix a possible deadlock and lockdep warning · 4f647e0a
      Flavio Leitner 提交于
      There are two problematic situations.
      
      A deadlock can happen when is_percpu is false because it can get
      interrupted while holding the spinlock. Then it executes
      ovs_flow_stats_update() in softirq context which tries to get
      the same lock.
      
      The second sitation is that when is_percpu is true, the code
      correctly disables BH but only for the local CPU, so the
      following can happen when locking the remote CPU without
      disabling BH:
      
             CPU#0                            CPU#1
        ovs_flow_stats_get()
         stats_read()
       +->spin_lock remote CPU#1        ovs_flow_stats_get()
       |  <interrupted>                  stats_read()
       |  ...                       +-->  spin_lock remote CPU#0
       |                            |     <interrupted>
       |  ovs_flow_stats_update()   |     ...
       |   spin_lock local CPU#0 <--+     ovs_flow_stats_update()
       +---------------------------------- spin_lock local CPU#1
      
      This patch disables BH for both cases fixing the deadlocks.
      Acked-by: NJesse Gross <jesse@nicira.com>
      
      =================================
      [ INFO: inconsistent lock state ]
      3.14.0-rc8-00007-g632b06aa #1 Tainted: G          I
      ---------------------------------
      inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
      swapper/0/0 [HC0[0]:SC1[5]:HE1:SE0] takes:
      (&(&cpu_stats->lock)->rlock){+.?...}, at: [<ffffffffa05dd8a1>] ovs_flow_stats_update+0x51/0xd0 [openvswitch]
      {SOFTIRQ-ON-W} state was registered at:
      [<ffffffff810f973f>] __lock_acquire+0x68f/0x1c40
      [<ffffffff810fb4e2>] lock_acquire+0xa2/0x1d0
      [<ffffffff817d8d9e>] _raw_spin_lock+0x3e/0x80
      [<ffffffffa05dd9e4>] ovs_flow_stats_get+0xc4/0x1e0 [openvswitch]
      [<ffffffffa05da855>] ovs_flow_cmd_fill_info+0x185/0x360 [openvswitch]
      [<ffffffffa05daf05>] ovs_flow_cmd_build_info.constprop.27+0x55/0x90 [openvswitch]
      [<ffffffffa05db41d>] ovs_flow_cmd_new_or_set+0x4dd/0x570 [openvswitch]
      [<ffffffff816c245d>] genl_family_rcv_msg+0x1cd/0x3f0
      [<ffffffff816c270e>] genl_rcv_msg+0x8e/0xd0
      [<ffffffff816c0239>] netlink_rcv_skb+0xa9/0xc0
      [<ffffffff816c0798>] genl_rcv+0x28/0x40
      [<ffffffff816bf830>] netlink_unicast+0x100/0x1e0
      [<ffffffff816bfc57>] netlink_sendmsg+0x347/0x770
      [<ffffffff81668e9c>] sock_sendmsg+0x9c/0xe0
      [<ffffffff816692d9>] ___sys_sendmsg+0x3a9/0x3c0
      [<ffffffff8166a911>] __sys_sendmsg+0x51/0x90
      [<ffffffff8166a962>] SyS_sendmsg+0x12/0x20
      [<ffffffff817e3ce9>] system_call_fastpath+0x16/0x1b
      irq event stamp: 1740726
      hardirqs last  enabled at (1740726): [<ffffffff8175d5e0>] ip6_finish_output2+0x4f0/0x840
      hardirqs last disabled at (1740725): [<ffffffff8175d59b>] ip6_finish_output2+0x4ab/0x840
      softirqs last  enabled at (1740674): [<ffffffff8109be12>] _local_bh_enable+0x22/0x50
      softirqs last disabled at (1740675): [<ffffffff8109db05>] irq_exit+0xc5/0xd0
      
      other info that might help us debug this:
       Possible unsafe locking scenario:
      
             CPU0
             ----
        lock(&(&cpu_stats->lock)->rlock);
        <Interrupt>
          lock(&(&cpu_stats->lock)->rlock);
      
       *** DEADLOCK ***
      
      5 locks held by swapper/0/0:
       #0:  (((&ifa->dad_timer))){+.-...}, at: [<ffffffff810a7155>] call_timer_fn+0x5/0x320
       #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff81788a55>] mld_sendpack+0x5/0x4a0
       #2:  (rcu_read_lock_bh){.+....}, at: [<ffffffff8175d149>] ip6_finish_output2+0x59/0x840
       #3:  (rcu_read_lock_bh){.+....}, at: [<ffffffff8168ba75>] __dev_queue_xmit+0x5/0x9b0
       #4:  (rcu_read_lock){.+.+..}, at: [<ffffffffa05e41b5>] internal_dev_xmit+0x5/0x110 [openvswitch]
      
      stack backtrace:
      CPU: 0 PID: 0 Comm: swapper/0 Tainted: G          I  3.14.0-rc8-00007-g632b06aa #1
      Hardware name:                  /DX58SO, BIOS SOX5810J.86A.5599.2012.0529.2218 05/29/2012
       0000000000000000 0fcf20709903df0c ffff88042d603808 ffffffff817cfe3c
       ffffffff81c134c0 ffff88042d603858 ffffffff817cb6da 0000000000000005
       ffffffff00000001 ffff880400000000 0000000000000006 ffffffff81c134c0
      Call Trace:
       <IRQ>  [<ffffffff817cfe3c>] dump_stack+0x4d/0x66
       [<ffffffff817cb6da>] print_usage_bug+0x1f4/0x205
       [<ffffffff810f7f10>] ? check_usage_backwards+0x180/0x180
       [<ffffffff810f8963>] mark_lock+0x223/0x2b0
       [<ffffffff810f96d3>] __lock_acquire+0x623/0x1c40
       [<ffffffff810f5707>] ? __lock_is_held+0x57/0x80
       [<ffffffffa05e26c6>] ? masked_flow_lookup+0x236/0x250 [openvswitch]
       [<ffffffff810fb4e2>] lock_acquire+0xa2/0x1d0
       [<ffffffffa05dd8a1>] ? ovs_flow_stats_update+0x51/0xd0 [openvswitch]
       [<ffffffff817d8d9e>] _raw_spin_lock+0x3e/0x80
       [<ffffffffa05dd8a1>] ? ovs_flow_stats_update+0x51/0xd0 [openvswitch]
       [<ffffffffa05dd8a1>] ovs_flow_stats_update+0x51/0xd0 [openvswitch]
       [<ffffffffa05dcc64>] ovs_dp_process_received_packet+0x84/0x120 [openvswitch]
       [<ffffffff810f93f7>] ? __lock_acquire+0x347/0x1c40
       [<ffffffffa05e3bea>] ovs_vport_receive+0x2a/0x30 [openvswitch]
       [<ffffffffa05e4218>] internal_dev_xmit+0x68/0x110 [openvswitch]
       [<ffffffffa05e41b5>] ? internal_dev_xmit+0x5/0x110 [openvswitch]
       [<ffffffff8168b4a6>] dev_hard_start_xmit+0x2e6/0x8b0
       [<ffffffff8168be87>] __dev_queue_xmit+0x417/0x9b0
       [<ffffffff8168ba75>] ? __dev_queue_xmit+0x5/0x9b0
       [<ffffffff8175d5e0>] ? ip6_finish_output2+0x4f0/0x840
       [<ffffffff8168c430>] dev_queue_xmit+0x10/0x20
       [<ffffffff8175d641>] ip6_finish_output2+0x551/0x840
       [<ffffffff8176128a>] ? ip6_finish_output+0x9a/0x220
       [<ffffffff8176128a>] ip6_finish_output+0x9a/0x220
       [<ffffffff8176145f>] ip6_output+0x4f/0x1f0
       [<ffffffff81788c29>] mld_sendpack+0x1d9/0x4a0
       [<ffffffff817895b8>] mld_send_initial_cr.part.32+0x88/0xa0
       [<ffffffff817691b0>] ? addrconf_dad_completed+0x220/0x220
       [<ffffffff8178e301>] ipv6_mc_dad_complete+0x31/0x50
       [<ffffffff817690d7>] addrconf_dad_completed+0x147/0x220
       [<ffffffff817691b0>] ? addrconf_dad_completed+0x220/0x220
       [<ffffffff8176934f>] addrconf_dad_timer+0x19f/0x1c0
       [<ffffffff810a71e9>] call_timer_fn+0x99/0x320
       [<ffffffff810a7155>] ? call_timer_fn+0x5/0x320
       [<ffffffff817691b0>] ? addrconf_dad_completed+0x220/0x220
       [<ffffffff810a76c4>] run_timer_softirq+0x254/0x3b0
       [<ffffffff8109d47d>] __do_softirq+0x12d/0x480
      Signed-off-by: NFlavio Leitner <fbl@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4f647e0a
    • T
      bridge: Fix handling stacked vlan tags · 99b192da
      Toshiaki Makita 提交于
      If a bridge with vlan_filtering enabled receives frames with stacked
      vlan tags, i.e., they have two vlan tags, br_vlan_untag() strips not
      only the outer tag but also the inner tag.
      
      br_vlan_untag() is called only from br_handle_vlan(), and in this case,
      it is enough to set skb->vlan_tci to 0 here, because vlan_tci has already
      been set before calling br_handle_vlan().
      Signed-off-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Acked-by: NVlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      99b192da
    • T
      bridge: Fix inabillity to retrieve vlan tags when tx offload is disabled · 12464bb8
      Toshiaki Makita 提交于
      Bridge vlan code (br_vlan_get_tag()) assumes that all frames have vlan_tci
      if they are tagged, but if vlan tx offload is manually disabled on bridge
      device and frames are sent from vlan device on the bridge device, the tags
      are embedded in skb->data and they break this assumption.
      Extract embedded vlan tags and move them to vlan_tci at ingress.
      Signed-off-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Acked-by: NVlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      12464bb8
    • M
      vhost: validate vhost_get_vq_desc return value · a39ee449
      Michael S. Tsirkin 提交于
      vhost fails to validate negative error code
      from vhost_get_vq_desc causing
      a crash: we are using -EFAULT which is 0xfffffff2
      as vector size, which exceeds the allocated size.
      
      The code in question was introduced in commit
      8dd014ad
          vhost-net: mergeable buffers support
      
      CVE-2014-0055
      Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a39ee449
    • M
      vhost: fix total length when packets are too short · d8316f39
      Michael S. Tsirkin 提交于
      When mergeable buffers are disabled, and the
      incoming packet is too large for the rx buffer,
      get_rx_bufs returns success.
      
      This was intentional in order for make recvmsg
      truncate the packet and then handle_rx would
      detect err != sock_len and drop it.
      
      Unfortunately we pass the original sock_len to
      recvmsg - which means we use parts of iov not fully
      validated.
      
      Fix this up by detecting this overrun and doing packet drop
      immediately.
      
      CVE-2014-0077
      Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d8316f39