1. 10 7月, 2014 1 次提交
    • P
      selinux: fix the default socket labeling in sock_graft() · 4da6daf4
      Paul Moore 提交于
      The sock_graft() hook has special handling for AF_INET, AF_INET, and
      AF_UNIX sockets as those address families have special hooks which
      label the sock before it is attached its associated socket.
      Unfortunately, the sock_graft() hook was missing a default approach
      to labeling sockets which meant that any other address family which
      made use of connections or the accept() syscall would find the
      returned socket to be in an "unlabeled" state.  This was recently
      demonstrated by the kcrypto/AF_ALG subsystem and the newly released
      cryptsetup package (cryptsetup v1.6.5 and later).
      
      This patch preserves the special handling in selinux_sock_graft(),
      but adds a default behavior - setting the sock's label equal to the
      associated socket - which resolves the problem with AF_ALG and
      presumably any other address family which makes use of accept().
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NPaul Moore <pmoore@redhat.com>
      Tested-by: NMilan Broz <gmazyland@gmail.com>
      4da6daf4
  2. 05 6月, 2014 1 次提交
    • S
      percpu-refcount: fix usage of this_cpu_ops · 0c36b390
      Sebastian Ott 提交于
      The percpu-refcount infrastructure uses the underscore variants of
      this_cpu_ops in order to modify percpu reference counters.
      (e.g. __this_cpu_inc()).
      
      However the underscore variants do not atomically update the percpu
      variable, instead they may be implemented using read-modify-write
      semantics (more than one instruction).  Therefore it is only safe to
      use the underscore variant if the context is always the same (process,
      softirq, or hardirq). Otherwise it is possible to lose updates.
      
      This problem is something that Sebastian has seen within the aio
      subsystem which uses percpu refcounters both in process and softirq
      context leading to reference counts that never dropped to zeroes; even
      though the number of "get" and "put" calls matched.
      
      Fix this by using the non-underscore this_cpu_ops variant which
      provides correct per cpu atomic semantics and fixes the corrupted
      reference counts.
      
      Cc: Kent Overstreet <kmo@daterainc.com>
      Cc: <stable@vger.kernel.org> # v3.11+
      Reported-by: NSebastian Ott <sebott@linux.vnet.ibm.com>
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      References: http://lkml.kernel.org/g/alpine.LFD.2.11.1406041540520.21183@denkbrett
      0c36b390
  3. 03 6月, 2014 4 次提交
    • J
      kernfs: move the last knowledge of sysfs out from kernfs · c9482a5b
      Jianyu Zhan 提交于
      There is still one residue of sysfs remaining: the sb_magic
      SYSFS_MAGIC. However this should be kernfs user specific,
      so this patch moves it out. Kerrnfs user should specify their
      magic number while mouting.
      Signed-off-by: NJianyu Zhan <nasa4836@gmail.com>
      Acked-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c9482a5b
    • E
      netlink: Only check file credentials for implicit destinations · 2d7a85f4
      Eric W. Biederman 提交于
      It was possible to get a setuid root or setcap executable to write to
      it's stdout or stderr (which has been set made a netlink socket) and
      inadvertently reconfigure the networking stack.
      
      To prevent this we check that both the creator of the socket and
      the currentl applications has permission to reconfigure the network
      stack.
      
      Unfortunately this breaks Zebra which always uses sendto/sendmsg
      and creates it's socket without any privileges.
      
      To keep Zebra working don't bother checking if the creator of the
      socket has privilege when a destination address is specified.  Instead
      rely exclusively on the privileges of the sender of the socket.
      
      Note from Andy: This is exactly Eric's code except for some comment
      clarifications and formatting fixes.  Neither I nor, I think, anyone
      else is thrilled with this approach, but I'm hesitant to wait on a
      better fix since 3.15 is almost here.
      
      Note to stable maintainers: This is a mess.  An earlier series of
      patches in 3.15 fix a rather serious security issue (CVE-2014-0181),
      but they did so in a way that breaks Zebra.  The offending series
      includes:
      
          commit aa4cf945
          Author: Eric W. Biederman <ebiederm@xmission.com>
          Date:   Wed Apr 23 14:28:03 2014 -0700
      
              net: Add variants of capable for use on netlink messages
      
      If a given kernel version is missing that series of fixes, it's
      probably worth backporting it and this patch.  if that series is
      present, then this fix is critical if you care about Zebra.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2d7a85f4
    • J
      team: fix mtu setting · 9d0d68fa
      Jiri Pirko 提交于
      Now it is not possible to set mtu to team device which has a port
      enslaved to it. The reason is that when team_change_mtu() calls
      dev_set_mtu() for port device, notificator for NETDEV_PRECHANGEMTU
      event is called and team_device_event() returns NOTIFY_BAD forbidding
      the change. So fix this by returning NOTIFY_DONE here in case team is
      changing mtu in team_change_mtu().
      
      Introduced-by: 3d249d4c "net: introduce ethernet teaming device"
      Signed-off-by: NJiri Pirko <jiri@resnulli.us>
      Acked-by: NFlavio Leitner <fbl@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9d0d68fa
    • E
      net: fix inet_getid() and ipv6_select_ident() bugs · 39c36094
      Eric Dumazet 提交于
      I noticed we were sending wrong IPv4 ID in TCP flows when MTU discovery
      is disabled.
      Note how GSO/TSO packets do not have monotonically incrementing ID.
      
      06:37:41.575531 IP (id 14227, proto: TCP (6), length: 4396)
      06:37:41.575534 IP (id 14272, proto: TCP (6), length: 65212)
      06:37:41.575544 IP (id 14312, proto: TCP (6), length: 57972)
      06:37:41.575678 IP (id 14317, proto: TCP (6), length: 7292)
      06:37:41.575683 IP (id 14361, proto: TCP (6), length: 63764)
      
      It appears I introduced this bug in linux-3.1.
      
      inet_getid() must return the old value of peer->ip_id_count,
      not the new one.
      
      Lets revert this part, and remove the prevention of
      a null identification field in IPv6 Fragment Extension Header,
      which is dubious and not even done properly.
      
      Fixes: 87c48fa3 ("ipv6: make fragment identifications less predictable")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      39c36094
  4. 28 5月, 2014 2 次提交
  5. 26 5月, 2014 1 次提交
  6. 24 5月, 2014 1 次提交
  7. 22 5月, 2014 2 次提交
  8. 21 5月, 2014 1 次提交
    • A
      dmaengine: omap: hide filter_fn for built-in drivers · a8246fed
      Arnd Bergmann 提交于
      It is not possible to reference the omap_dma_filter_fn filter
      function from a built-in driver if the dmaengine driver itself
      is a loadable module, which is a valid configuration otherwise.
      
      This provides only the dummy alternative if the function
      is referenced by a built-in driver to allow a successful
      build. The filter function is only required by ATAGS based
      platforms, which will continue to be broken after this change
      for the bogus configuration. When booting from DT, with the
      dma channels correctly listed there, it will work fine.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Acked-by: NTony Lindgren <tony@atomide.com>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Vinod Koul <vinod.koul@intel.com>
      Cc: dmaengine@vger.kernel.org
      Signed-off-by: NVinod Koul <vinod.koul@intel.com>
      a8246fed
  9. 20 5月, 2014 2 次提交
  10. 19 5月, 2014 1 次提交
    • P
      perf: Fix a race between ring_buffer_detach() and ring_buffer_attach() · b69cf536
      Peter Zijlstra 提交于
      Alexander noticed that we use RCU iteration on rb->event_list but do
      not use list_{add,del}_rcu() to add,remove entries to that list, nor
      do we observe proper grace periods when re-using the entries.
      
      Merge ring_buffer_detach() into ring_buffer_attach() such that
      attaching to the NULL buffer is detaching.
      
      Furthermore, ensure that between any 'detach' and 'attach' of the same
      event we observe the required grace period, but only when strictly
      required. In effect this means that only ioctl(.request =
      PERF_EVENT_IOC_SET_OUTPUT) will wait for a grace period, while the
      normal initial attach and final detach will not be delayed.
      
      This patch should, I think, do the right thing under all
      circumstances, the 'normal' cases all should never see the extra grace
      period, but the two cases:
      
       1) PERF_EVENT_IOC_SET_OUTPUT on an event which already has a
          ring_buffer set, will now observe the required grace period between
          removing itself from the old and attaching itself to the new buffer.
      
          This case is 'simple' in that both buffers are present in
          perf_event_set_output() one could think an unconditional
          synchronize_rcu() would be sufficient; however...
      
       2) an event that has a buffer attached, the buffer is destroyed
          (munmap) and then the event is attached to a new/different buffer
          using PERF_EVENT_IOC_SET_OUTPUT.
      
          This case is more complex because the buffer destruction does:
            ring_buffer_attach(.rb = NULL)
          followed by the ioctl() doing:
            ring_buffer_attach(.rb = foo);
      
          and we still need to observe the grace period between these two
          calls due to us reusing the event->rb_entry list_head.
      
      In order to make 2 happen we use Paul's latest cond_synchronize_rcu()
      call.
      
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Reported-by: NAlexander Shishkin <alexander.shishkin@linux.intel.com>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20140507123526.GD13658@twins.programming.kicks-ass.netSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      b69cf536
  11. 17 5月, 2014 6 次提交
    • V
      bonding: Fix stacked device detection in arp monitoring · 44a40855
      Vlad Yasevich 提交于
      Prior to commit fbd929f2
      	bonding: support QinQ for bond arp interval
      
      the arp monitoring code allowed for proper detection of devices
      stacked on top of vlans.  Since the above commit, the
      code can still detect a device stacked on top of single
      vlan, but not a device stacked on top of Q-in-Q configuration.
      The search will only set the inner vlan tag if the route
      device is the vlan device.  However, this is not always the
      case, as it is possible to extend the stacked configuration.
      
      With this patch it is possible to provision devices on
      top Q-in-Q vlan configuration that should be used as
      a source of ARP monitoring information.
      
      For example:
      ip link add link bond0 vlan10 type vlan proto 802.1q id 10
      ip link add link vlan10 vlan100 type vlan proto 802.1q id 100
      ip link add link vlan100 type macvlan
      
      Note:  This patch limites the number of stacked VLANs to 2,
      just like before.  The original, however had another issue
      in that if we had more then 2 levels of VLANs, we would end
      up generating incorrectly tagged traffic.  This is no longer
      possible.
      
      Fixes: fbd929f2 (bonding: support QinQ for bond arp interval)
      CC: Jay Vosburgh <j.vosburgh@gmail.com>
      CC: Veaceslav Falico <vfalico@redhat.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      CC: Ding Tianhong <dingtianhong@huawei.com>
      CC: Patric McHardy <kaber@trash.net>
      Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      44a40855
    • V
      macvlan: Fix lockdep warnings with stacked macvlan devices · c674ac30
      Vlad Yasevich 提交于
      Macvlan devices try to avoid stacking, but that's not always
      successfull or even desired.  As an example, the following
      configuration is perefectly legal and valid:
      
      eth0 <--- macvlan0 <---- vlan0.10 <--- macvlan1
      
      However, this configuration produces the following lockdep
      trace:
      [  115.620418] ======================================================
      [  115.620477] [ INFO: possible circular locking dependency detected ]
      [  115.620516] 3.15.0-rc1+ #24 Not tainted
      [  115.620540] -------------------------------------------------------
      [  115.620577] ip/1704 is trying to acquire lock:
      [  115.620604]  (&vlan_netdev_addr_lock_key/1){+.....}, at: [<ffffffff815df49c>] dev_uc_sync+0x3c/0x80
      [  115.620686]
      but task is already holding lock:
      [  115.620723]  (&macvlan_netdev_addr_lock_key){+.....}, at: [<ffffffff815da5be>] dev_set_rx_mode+0x1e/0x40
      [  115.620795]
      which lock already depends on the new lock.
      
      [  115.620853]
      the existing dependency chain (in reverse order) is:
      [  115.620894]
      -> #1 (&macvlan_netdev_addr_lock_key){+.....}:
      [  115.620935]        [<ffffffff810d57f2>] lock_acquire+0xa2/0x130
      [  115.620974]        [<ffffffff816f62e7>] _raw_spin_lock_nested+0x37/0x50
      [  115.621019]        [<ffffffffa07296c3>] vlan_dev_set_rx_mode+0x53/0x110 [8021q]
      [  115.621066]        [<ffffffff815da557>] __dev_set_rx_mode+0x57/0xa0
      [  115.621105]        [<ffffffff815da5c6>] dev_set_rx_mode+0x26/0x40
      [  115.621143]        [<ffffffff815da6be>] __dev_open+0xde/0x140
      [  115.621174]        [<ffffffff815da9ad>] __dev_change_flags+0x9d/0x170
      [  115.621174]        [<ffffffff815daaa9>] dev_change_flags+0x29/0x60
      [  115.621174]        [<ffffffff815e7f11>] do_setlink+0x321/0x9a0
      [  115.621174]        [<ffffffff815ea59f>] rtnl_newlink+0x51f/0x730
      [  115.621174]        [<ffffffff815e6e75>] rtnetlink_rcv_msg+0x95/0x250
      [  115.621174]        [<ffffffff81608b19>] netlink_rcv_skb+0xa9/0xc0
      [  115.621174]        [<ffffffff815e6dca>] rtnetlink_rcv+0x2a/0x40
      [  115.621174]        [<ffffffff81608150>] netlink_unicast+0xf0/0x1c0
      [  115.621174]        [<ffffffff8160851f>] netlink_sendmsg+0x2ff/0x740
      [  115.621174]        [<ffffffff815bc9db>] sock_sendmsg+0x8b/0xc0
      [  115.621174]        [<ffffffff815bd4b9>] ___sys_sendmsg+0x369/0x380
      [  115.621174]        [<ffffffff815bdbb2>] __sys_sendmsg+0x42/0x80
      [  115.621174]        [<ffffffff815bdc02>] SyS_sendmsg+0x12/0x20
      [  115.621174]        [<ffffffff816ffd69>] system_call_fastpath+0x16/0x1b
      [  115.621174]
      -> #0 (&vlan_netdev_addr_lock_key/1){+.....}:
      [  115.621174]        [<ffffffff810d4d43>] __lock_acquire+0x1773/0x1a60
      [  115.621174]        [<ffffffff810d57f2>] lock_acquire+0xa2/0x130
      [  115.621174]        [<ffffffff816f62e7>] _raw_spin_lock_nested+0x37/0x50
      [  115.621174]        [<ffffffff815df49c>] dev_uc_sync+0x3c/0x80
      [  115.621174]        [<ffffffffa0696d2a>] macvlan_set_mac_lists+0xca/0x110 [macvlan]
      [  115.621174]        [<ffffffff815da557>] __dev_set_rx_mode+0x57/0xa0
      [  115.621174]        [<ffffffff815da5c6>] dev_set_rx_mode+0x26/0x40
      [  115.621174]        [<ffffffff815da6be>] __dev_open+0xde/0x140
      [  115.621174]        [<ffffffff815da9ad>] __dev_change_flags+0x9d/0x170
      [  115.621174]        [<ffffffff815daaa9>] dev_change_flags+0x29/0x60
      [  115.621174]        [<ffffffff815e7f11>] do_setlink+0x321/0x9a0
      [  115.621174]        [<ffffffff815ea59f>] rtnl_newlink+0x51f/0x730
      [  115.621174]        [<ffffffff815e6e75>] rtnetlink_rcv_msg+0x95/0x250
      [  115.621174]        [<ffffffff81608b19>] netlink_rcv_skb+0xa9/0xc0
      [  115.621174]        [<ffffffff815e6dca>] rtnetlink_rcv+0x2a/0x40
      [  115.621174]        [<ffffffff81608150>] netlink_unicast+0xf0/0x1c0
      [  115.621174]        [<ffffffff8160851f>] netlink_sendmsg+0x2ff/0x740
      [  115.621174]        [<ffffffff815bc9db>] sock_sendmsg+0x8b/0xc0
      [  115.621174]        [<ffffffff815bd4b9>] ___sys_sendmsg+0x369/0x380
      [  115.621174]        [<ffffffff815bdbb2>] __sys_sendmsg+0x42/0x80
      [  115.621174]        [<ffffffff815bdc02>] SyS_sendmsg+0x12/0x20
      [  115.621174]        [<ffffffff816ffd69>] system_call_fastpath+0x16/0x1b
      [  115.621174]
      other info that might help us debug this:
      
      [  115.621174]  Possible unsafe locking scenario:
      
      [  115.621174]        CPU0                    CPU1
      [  115.621174]        ----                    ----
      [  115.621174]   lock(&macvlan_netdev_addr_lock_key);
      [  115.621174]                                lock(&vlan_netdev_addr_lock_key/1);
      [  115.621174]                                lock(&macvlan_netdev_addr_lock_key);
      [  115.621174]   lock(&vlan_netdev_addr_lock_key/1);
      [  115.621174]
       *** DEADLOCK ***
      
      [  115.621174] 2 locks held by ip/1704:
      [  115.621174]  #0:  (rtnl_mutex){+.+.+.}, at: [<ffffffff815e6dbb>] rtnetlink_rcv+0x1b/0x40
      [  115.621174]  #1:  (&macvlan_netdev_addr_lock_key){+.....}, at: [<ffffffff815da5be>] dev_set_rx_mode+0x1e/0x40
      [  115.621174]
      stack backtrace:
      [  115.621174] CPU: 3 PID: 1704 Comm: ip Not tainted 3.15.0-rc1+ #24
      [  115.621174] Hardware name: Hewlett-Packard HP xw8400 Workstation/0A08h, BIOS 786D5 v02.38 10/25/2010
      [  115.621174]  ffffffff82339ae0 ffff880465f79568 ffffffff816ee20c ffffffff82339ae0
      [  115.621174]  ffff880465f795a8 ffffffff816e9e1b ffff880465f79600 ffff880465b019c8
      [  115.621174]  0000000000000001 0000000000000002 ffff880465b019c8 ffff880465b01230
      [  115.621174] Call Trace:
      [  115.621174]  [<ffffffff816ee20c>] dump_stack+0x4d/0x66
      [  115.621174]  [<ffffffff816e9e1b>] print_circular_bug+0x200/0x20e
      [  115.621174]  [<ffffffff810d4d43>] __lock_acquire+0x1773/0x1a60
      [  115.621174]  [<ffffffff810d3172>] ? trace_hardirqs_on_caller+0xb2/0x1d0
      [  115.621174]  [<ffffffff810d57f2>] lock_acquire+0xa2/0x130
      [  115.621174]  [<ffffffff815df49c>] ? dev_uc_sync+0x3c/0x80
      [  115.621174]  [<ffffffff816f62e7>] _raw_spin_lock_nested+0x37/0x50
      [  115.621174]  [<ffffffff815df49c>] ? dev_uc_sync+0x3c/0x80
      [  115.621174]  [<ffffffff815df49c>] dev_uc_sync+0x3c/0x80
      [  115.621174]  [<ffffffffa0696d2a>] macvlan_set_mac_lists+0xca/0x110 [macvlan]
      [  115.621174]  [<ffffffff815da557>] __dev_set_rx_mode+0x57/0xa0
      [  115.621174]  [<ffffffff815da5c6>] dev_set_rx_mode+0x26/0x40
      [  115.621174]  [<ffffffff815da6be>] __dev_open+0xde/0x140
      [  115.621174]  [<ffffffff815da9ad>] __dev_change_flags+0x9d/0x170
      [  115.621174]  [<ffffffff815daaa9>] dev_change_flags+0x29/0x60
      [  115.621174]  [<ffffffff811e1db1>] ? mem_cgroup_bad_page_check+0x21/0x30
      [  115.621174]  [<ffffffff815e7f11>] do_setlink+0x321/0x9a0
      [  115.621174]  [<ffffffff810d394c>] ? __lock_acquire+0x37c/0x1a60
      [  115.621174]  [<ffffffff815ea59f>] rtnl_newlink+0x51f/0x730
      [  115.621174]  [<ffffffff815ea169>] ? rtnl_newlink+0xe9/0x730
      [  115.621174]  [<ffffffff815e6e75>] rtnetlink_rcv_msg+0x95/0x250
      [  115.621174]  [<ffffffff810d329d>] ? trace_hardirqs_on+0xd/0x10
      [  115.621174]  [<ffffffff815e6dbb>] ? rtnetlink_rcv+0x1b/0x40
      [  115.621174]  [<ffffffff815e6de0>] ? rtnetlink_rcv+0x40/0x40
      [  115.621174]  [<ffffffff81608b19>] netlink_rcv_skb+0xa9/0xc0
      [  115.621174]  [<ffffffff815e6dca>] rtnetlink_rcv+0x2a/0x40
      [  115.621174]  [<ffffffff81608150>] netlink_unicast+0xf0/0x1c0
      [  115.621174]  [<ffffffff8160851f>] netlink_sendmsg+0x2ff/0x740
      [  115.621174]  [<ffffffff815bc9db>] sock_sendmsg+0x8b/0xc0
      [  115.621174]  [<ffffffff8119d4af>] ? might_fault+0x5f/0xb0
      [  115.621174]  [<ffffffff8119d4f8>] ? might_fault+0xa8/0xb0
      [  115.621174]  [<ffffffff8119d4af>] ? might_fault+0x5f/0xb0
      [  115.621174]  [<ffffffff815cb51e>] ? verify_iovec+0x5e/0xe0
      [  115.621174]  [<ffffffff815bd4b9>] ___sys_sendmsg+0x369/0x380
      [  115.621174]  [<ffffffff816faa0d>] ? __do_page_fault+0x11d/0x570
      [  115.621174]  [<ffffffff810cfe9f>] ? up_read+0x1f/0x40
      [  115.621174]  [<ffffffff816fab04>] ? __do_page_fault+0x214/0x570
      [  115.621174]  [<ffffffff8120a10b>] ? mntput_no_expire+0x6b/0x1c0
      [  115.621174]  [<ffffffff8120a0b7>] ? mntput_no_expire+0x17/0x1c0
      [  115.621174]  [<ffffffff8120a284>] ? mntput+0x24/0x40
      [  115.621174]  [<ffffffff815bdbb2>] __sys_sendmsg+0x42/0x80
      [  115.621174]  [<ffffffff815bdc02>] SyS_sendmsg+0x12/0x20
      [  115.621174]  [<ffffffff816ffd69>] system_call_fastpath+0x16/0x1b
      
      Fix this by correctly providing macvlan lockdep class.
      Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c674ac30
    • V
      vlan: Fix lockdep warning with stacked vlan devices. · d38569ab
      Vlad Yasevich 提交于
      This reverts commit dc8eaaa0.
      	vlan: Fix lockdep warning when vlan dev handle notification
      
      Instead we use the new new API to find the lock subclass of
      our vlan device.  This way we can support configurations where
      vlans are interspersed with other devices:
        bond -> vlan -> macvlan -> vlan
      Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d38569ab
    • V
      net: Allow for more then a single subclass for netif_addr_lock · 25175ba5
      Vlad Yasevich 提交于
      Currently netif_addr_lock_nested assumes that there can be only
      a single nesting level between 2 devices.  However, if we
      have multiple devices of the same type stacked, this fails.
      For example:
       eth0 <-- vlan0.10 <-- vlan0.10.20
      
      A more complicated configuration may stack more then one type of
      device in different order.
      Ex:
        eth0 <-- vlan0.10 <-- macvlan0 <-- vlan1.10.20 <-- macvlan1
      
      This patch adds an ndo_* function that allows each stackable
      device to report its nesting level.  If the device doesn't
      provide this function default subclass of 1 is used.
      Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      25175ba5
    • V
      net: Find the nesting level of a given device by type. · 4085ebe8
      Vlad Yasevich 提交于
      Multiple devices in the kernel can be stacked/nested and they
      need to know their nesting level for the purposes of lockdep.
      This patch provides a generic function that determines a nesting
      level of a particular device by its type (ex: vlan, macvlan, etc).
      We only care about nesting of the same type of devices.
      
      For example:
        eth0 <- vlan0.10 <- macvlan0 <- vlan1.20
      
      The nesting level of vlan1.20 would be 1, since there is another vlan
      in the stack under it.
      Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4085ebe8
    • M
      net/mlx4_core: Add UPDATE_QP SRIOV wrapper support · ce8d9e0d
      Matan Barak 提交于
      This patch adds UPDATE_QP SRIOV wrapper support.
      
      The mechanism is a general one, but currently only source MAC
      index changes are allowed for VFs.
      Signed-off-by: NMatan Barak <matanb@mellanox.com>
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ce8d9e0d
  12. 16 5月, 2014 2 次提交
    • D
      ipv6: update Destination Cache entries when gateway turn into host · be7a010d
      Duan Jiong 提交于
      RFC 4861 states in 7.2.5:
      
      	The IsRouter flag in the cache entry MUST be set based on the
               Router flag in the received advertisement.  In those cases
               where the IsRouter flag changes from TRUE to FALSE as a result
               of this update, the node MUST remove that router from the
               Default Router List and update the Destination Cache entries
               for all destinations using that neighbor as a router as
               specified in Section 7.3.3.  This is needed to detect when a
               node that is used as a router stops forwarding packets due to
               being configured as a host.
      
      Currently, when dealing with NA Message which IsRouter flag changes from
      TRUE to FALSE, the kernel only removes router from the Default Router List,
      and don't update the Destination Cache entries.
      
      Now in order to update those Destination Cache entries, i introduce
      function rt6_clean_tohost().
      Signed-off-by: NDuan Jiong <duanj.fnst@cn.fujitsu.com>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      be7a010d
    • C
      rtnetlink: wait for unregistering devices in rtnl_link_unregister() · 200b916f
      Cong Wang 提交于
      From: Cong Wang <cwang@twopensource.com>
      
      commit 50624c93 (net: Delay default_device_exit_batch until no
      devices are unregistering) introduced rtnl_lock_unregistering() for
      default_device_exit_batch(). Same race could happen we when rmmod a driver
      which calls rtnl_link_unregister() as we call dev->destructor without rtnl
      lock.
      
      For long term, I think we should clean up the mess of netdev_run_todo()
      and net namespce exit code.
      
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NCong Wang <cwang@twopensource.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      200b916f
  13. 15 5月, 2014 2 次提交
    • S
      of: fix CONFIG_OF=n prototype of of_node_full_name() · 4c358e15
      Stephen Rothwell 提交于
      Make the CONFIG_OF=n prototpe of of_node_full_name() mateh the CONFIG_OF=y
      version.
      
      Fixes compile warnings like this:
      
      sound/soc/soc-core.c: In function 'soc_check_aux_dev':
      sound/soc/soc-core.c:1667:3: warning: passing argument 1 of 'of_node_full_name' discards 'const' qualifier from pointer target type [enabled by default]
         codecname = of_node_full_name(aux_dev->codec_of_node);
      
      when CONFIG_OF is not defined.
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NGrant Likely <grant.likely@linaro.org>
      4c358e15
    • J
      asm-generic: remove _STK_LIM_MAX · ffe6902b
      James Hogan 提交于
      _STK_LIM_MAX could be used to override the RLIMIT_STACK hard limit from
      an arch's include/uapi/asm-generic/resource.h file, but is no longer
      used since both parisc and metag removed the override. Therefore remove
      it entirely, setting the hard RLIMIT_STACK limit to RLIM_INFINITY
      directly in include/asm-generic/resource.h.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: linux-arch@vger.kernel.org
      Cc: Helge Deller <deller@gmx.de>
      Cc: John David Anglin <dave.anglin@bell.net>
      ffe6902b
  14. 14 5月, 2014 2 次提交
  15. 13 5月, 2014 3 次提交
    • T
      cgroup: introduce task_css_is_root() · 5024ae29
      Tejun Heo 提交于
      Determining the css of a task usually requires RCU read lock as that's
      the only thing which keeps the returned css accessible till its
      reference is acquired; however, testing whether a task belongs to the
      root can be performed without dereferencing the returned css by
      comparing the returned pointer against the root one in init_css_set[]
      which never changes.
      
      Implement task_css_is_root() which can be invoked in any context.
      This will be used by the scheduled cgroup_freezer change.
      
      v2: cgroup no longer supports modular controllers.  No need to export
          init_css_set.  Pointed out by Li.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      5024ae29
    • J
      nl80211: fix NL80211_FEATURE_P2P_DEVICE_NEEDS_CHANNEL API · f5651986
      Johannes Berg 提交于
      My commit removing that also removed it from the header file
      which can break compilation of userspace that needed it, add
      it back for API/ABI compatibility purposes (but no code to
      implement anything for it.)
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      f5651986
    • T
      kernfs, sysfs, cgroup: restrict extra perm check on open to sysfs · 555724a8
      Tejun Heo 提交于
      The kernfs open method - kernfs_fop_open() - inherited extra
      permission checks from sysfs.  While the vfs layer allows ignoring the
      read/write permissions checks if the issuer has CAP_DAC_OVERRIDE,
      sysfs explicitly denied open regardless of the cap if the file doesn't
      have any of the UGO perms of the requested access or doesn't implement
      the requested operation.  It can be debated whether this was a good
      idea or not but the behavior is too subtle and dangerous to change at
      this point.
      
      After cgroup got converted to kernfs, this extra perm check also got
      applied to cgroup breaking libcgroup which opens write-only files with
      O_RDWR as root.  This patch gates the extra open permission check with
      a new flag KERNFS_ROOT_EXTRA_OPEN_PERM_CHECK and enables it for sysfs.
      For sysfs, nothing changes.  For cgroup, root now can perform any
      operation regardless of the permissions as it was before kernfs
      conversion.  Note that kernfs still fails unimplemented operations
      with -EINVAL.
      
      While at it, add comments explaining KERNFS_ROOT flags.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reported-by: NAndrey Wagin <avagin@gmail.com>
      Tested-by: NAndrey Wagin <avagin@gmail.com>
      Cc: Li Zefan <lizefan@huawei.com>
      References: http://lkml.kernel.org/g/CANaxB-xUm3rJ-Cbp72q-rQJO5mZe1qK6qXsQM=vh0U8upJ44+A@mail.gmail.com
      Fixes: 2bd59d48 ("cgroup: convert to kernfs")
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      555724a8
  16. 09 5月, 2014 2 次提交
  17. 08 5月, 2014 4 次提交
  18. 07 5月, 2014 3 次提交
    • J
      sched/deadline: Fix sched_yield() behavior · 5bfd126e
      Juri Lelli 提交于
      yield_task_dl() is broken:
      
       o it forces current to be throttled setting its runtime to zero;
       o it sets current's dl_se->dl_new to one, expecting that dl_task_timer()
         will queue it back with proper parameters at replenish time.
      
      Unfortunately, dl_task_timer() has this check at the very beginning:
      
      	if (!dl_task(p) || dl_se->dl_new)
      		goto unlock;
      
      So, it just bails out and the task is never replenished. It actually
      yielded forever.
      
      To fix this, introduce a new flag indicating that the task properly yielded
      the CPU before its current runtime expired. While this is a little overdoing
      at the moment, the flag would be useful in the future to discriminate between
      "good" jobs (of which remaining runtime could be reclaimed, i.e. recycled)
      and "bad" jobs (for which dl_throttled task has been set) that needed to be
      stopped.
      Reported-by: Nyjay.kim <yjay.kim@lge.com>
      Signed-off-by: NJuri Lelli <juri.lelli@gmail.com>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20140429103953.e68eba1b2ac3309214e3dc5a@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      5bfd126e
    • A
      genirq: Provide irq_force_affinity fallback for non-SMP · 4c88d7f9
      Arnd Bergmann 提交于
      Patch 01f8fa4f "genirq: Allow forcing cpu affinity of interrupts" added
      an irq_force_affinity() function, and 30ccf03b "clocksource: Exynos_mct:
      Use irq_force_affinity() in cpu bringup" subsequently uses it. However, the
      driver can be used with CONFIG_SMP disabled, but the function declaration
      is only available for CONFIG_SMP, leading to this build error:
      
      drivers/clocksource/exynos_mct.c:431:3: error: implicit declaration of function 'irq_force_affinity' [-Werror=implicit-function-declaration]
         irq_force_affinity(mct_irqs[MCT_L0_IRQ + cpu], cpumask_of(cpu));
      
      This patch introduces a dummy helper function for the non-SMP case
      that always returns success, to get rid of the build error.
      Since the patches causing the problem are marked for stable backports,
      this one should be as well.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Cc: Krzysztof Kozlowski <k.kozlowski@samsung.com>
      Acked-by: NKukjin Kim <kgene.kim@samsung.com>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/5619084.0zmrrIUZLV@wuerfelSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      4c88d7f9
    • C
      slub: use sysfs'es release mechanism for kmem_cache · 41a21285
      Christoph Lameter 提交于
      debugobjects warning during netfilter exit:
      
          ------------[ cut here ]------------
          WARNING: CPU: 6 PID: 4178 at lib/debugobjects.c:260 debug_print_object+0x8d/0xb0()
          ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x20
          Modules linked in:
          CPU: 6 PID: 4178 Comm: kworker/u16:2 Tainted: G        W 3.11.0-next-20130906-sasha #3984
          Workqueue: netns cleanup_net
          Call Trace:
            dump_stack+0x52/0x87
            warn_slowpath_common+0x8c/0xc0
            warn_slowpath_fmt+0x46/0x50
            debug_print_object+0x8d/0xb0
            __debug_check_no_obj_freed+0xa5/0x220
            debug_check_no_obj_freed+0x15/0x20
            kmem_cache_free+0x197/0x340
            kmem_cache_destroy+0x86/0xe0
            nf_conntrack_cleanup_net_list+0x131/0x170
            nf_conntrack_pernet_exit+0x5d/0x70
            ops_exit_list+0x5e/0x70
            cleanup_net+0xfb/0x1c0
            process_one_work+0x338/0x550
            worker_thread+0x215/0x350
            kthread+0xe7/0xf0
            ret_from_fork+0x7c/0xb0
      
      Also during dcookie cleanup:
      
          WARNING: CPU: 12 PID: 9725 at lib/debugobjects.c:260 debug_print_object+0x8c/0xb0()
          ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x20
          Modules linked in:
          CPU: 12 PID: 9725 Comm: trinity-c141 Not tainted 3.15.0-rc2-next-20140423-sasha-00018-gc4ff6c4 #408
          Call Trace:
            dump_stack (lib/dump_stack.c:52)
            warn_slowpath_common (kernel/panic.c:430)
            warn_slowpath_fmt (kernel/panic.c:445)
            debug_print_object (lib/debugobjects.c:262)
            __debug_check_no_obj_freed (lib/debugobjects.c:697)
            debug_check_no_obj_freed (lib/debugobjects.c:726)
            kmem_cache_free (mm/slub.c:2689 mm/slub.c:2717)
            kmem_cache_destroy (mm/slab_common.c:363)
            dcookie_unregister (fs/dcookies.c:302 fs/dcookies.c:343)
            event_buffer_release (arch/x86/oprofile/../../../drivers/oprofile/event_buffer.c:153)
            __fput (fs/file_table.c:217)
            ____fput (fs/file_table.c:253)
            task_work_run (kernel/task_work.c:125 (discriminator 1))
            do_notify_resume (include/linux/tracehook.h:196 arch/x86/kernel/signal.c:751)
            int_signal (arch/x86/kernel/entry_64.S:807)
      
      Sysfs has a release mechanism.  Use that to release the kmem_cache
      structure if CONFIG_SYSFS is enabled.
      
      Only slub is changed - slab currently only supports /proc/slabinfo and
      not /sys/kernel/slab/*.  We talked about adding that and someone was
      working on it.
      
      [akpm@linux-foundation.org: fix CONFIG_SYSFS=n build]
      [akpm@linux-foundation.org: fix CONFIG_SYSFS=n build even more]
      Signed-off-by: NChristoph Lameter <cl@linux.com>
      Reported-by: NSasha Levin <sasha.levin@oracle.com>
      Tested-by: NSasha Levin <sasha.levin@oracle.com>
      Acked-by: NGreg KH <greg@kroah.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Bart Van Assche <bvanassche@acm.org>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      41a21285