1. 07 1月, 2016 1 次提交
  2. 06 1月, 2016 2 次提交
  3. 05 1月, 2016 2 次提交
    • R
      af_unix: Fix splice-bind deadlock · c845acb3
      Rainer Weikusat 提交于
      On 2015/11/06, Dmitry Vyukov reported a deadlock involving the splice
      system call and AF_UNIX sockets,
      
      http://lists.openwall.net/netdev/2015/11/06/24
      
      The situation was analyzed as
      
      (a while ago) A: socketpair()
      B: splice() from a pipe to /mnt/regular_file
      	does sb_start_write() on /mnt
      C: try to freeze /mnt
      	wait for B to finish with /mnt
      A: bind() try to bind our socket to /mnt/new_socket_name
      	lock our socket, see it not bound yet
      	decide that it needs to create something in /mnt
      	try to do sb_start_write() on /mnt, block (it's
      	waiting for C).
      D: splice() from the same pipe to our socket
      	lock the pipe, see that socket is connected
      	try to lock the socket, block waiting for A
      B:	get around to actually feeding a chunk from
      	pipe to file, try to lock the pipe.  Deadlock.
      
      on 2015/11/10 by Al Viro,
      
      http://lists.openwall.net/netdev/2015/11/10/4
      
      The patch fixes this by removing the kern_path_create related code from
      unix_mknod and executing it as part of unix_bind prior acquiring the
      readlock of the socket in question. This means that A (as used above)
      will sb_start_write on /mnt before it acquires the readlock, hence, it
      won't indirectly block B which first did a sb_start_write and then
      waited for a thread trying to acquire the readlock. Consequently, A
      being blocked by C waiting for B won't cause a deadlock anymore
      (effectively, both A and B acquire two locks in opposite order in the
      situation described above).
      
      Dmitry Vyukov(<dvyukov@google.com>) tested the original patch.
      Signed-off-by: NRainer Weikusat <rweikusat@mobileactivedefense.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c845acb3
    • D
      net: Propagate lookup failure in l3mdev_get_saddr to caller · b5bdacf3
      David Ahern 提交于
      Commands run in a vrf context are not failing as expected on a route lookup:
          root@kenny:~# ip ro ls table vrf-red
          unreachable default
      
          root@kenny:~# ping -I vrf-red -c1 -w1 10.100.1.254
          ping: Warning: source address might be selected on device other than vrf-red.
          PING 10.100.1.254 (10.100.1.254) from 0.0.0.0 vrf-red: 56(84) bytes of data.
      
          --- 10.100.1.254 ping statistics ---
          2 packets transmitted, 0 received, 100% packet loss, time 999ms
      
      Since the vrf table does not have a route for 10.100.1.254 the ping
      should have failed. The saddr lookup causes a full VRF table lookup.
      Propogating a lookup failure to the user allows the command to fail as
      expected:
      
          root@kenny:~# ping -I vrf-red -c1 -w1 10.100.1.254
          connect: No route to host
      Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b5bdacf3
  4. 31 12月, 2015 2 次提交
    • X
      sctp: sctp should release assoc when sctp_make_abort_user return NULL in sctp_close · 068d8bd3
      Xin Long 提交于
      In sctp_close, sctp_make_abort_user may return NULL because of memory
      allocation failure. If this happens, it will bypass any state change
      and never free the assoc. The assoc has no chance to be freed and it
      will be kept in memory with the state it had even after the socket is
      closed by sctp_close().
      
      So if sctp_make_abort_user fails to allocate memory, we should abort
      the asoc via sctp_primitive_ABORT as well. Just like the annotation in
      sctp_sf_cookie_wait_prm_abort and sctp_sf_do_9_1_prm_abort said,
      "Even if we can't send the ABORT due to low memory delete the TCB.
      This is a departure from our typical NOMEM handling".
      
      But then the chunk is NULL (low memory) and the SCTP_CMD_REPLY cmd would
      dereference the chunk pointer, and system crash. So we should add
      SCTP_CMD_REPLY cmd only when the chunk is not NULL, just like other
      places where it adds SCTP_CMD_REPLY cmd.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      068d8bd3
    • N
      net, socket, socket_wq: fix missing initialization of flags · 574aab1e
      Nicolai Stange 提交于
      Commit ceb5d58b ("net: fix sock_wake_async() rcu protection") from
      the current 4.4 release cycle introduced a new flags member in
      struct socket_wq and moved SOCKWQ_ASYNC_NOSPACE and SOCKWQ_ASYNC_WAITDATA
      from struct socket's flags member into that new place.
      
      Unfortunately, the new flags field is never initialized properly, at least
      not for the struct socket_wq instance created in sock_alloc_inode().
      
      One particular issue I encountered because of this is that my GNU Emacs
      failed to draw anything on my desktop -- i.e. what I got is a transparent
      window, including the title bar. Bisection lead to the commit mentioned
      above and further investigation by means of strace told me that Emacs
      is indeed speaking to my Xorg through an O_ASYNC AF_UNIX socket. This is
      reproducible 100% of times and the fact that properly initializing the
      struct socket_wq ->flags fixes the issue leads me to the conclusion that
      somehow SOCKWQ_ASYNC_WAITDATA got set in the uninitialized ->flags,
      preventing my Emacs from receiving any SIGIO's due to data becoming
      available and it got stuck.
      
      Make sock_alloc_inode() set the newly created struct socket_wq's ->flags
      member to zero.
      
      Fixes: ceb5d58b ("net: fix sock_wake_async() rcu protection")
      Signed-off-by: NNicolai Stange <nicstange@gmail.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      574aab1e
  5. 30 12月, 2015 1 次提交
    • J
      openvswitch: Fix template leak in error cases. · 90c7afc9
      Joe Stringer 提交于
      Commit 5b48bb8506c5 ("openvswitch: Fix helper reference leak") fixed a
      reference leak on helper objects, but inadvertently introduced a leak on
      the ct template.
      
      Previously, ct_info.ct->general.use was initialized to 0 by
      nf_ct_tmpl_alloc() and only incremented when ovs_ct_copy_action()
      returned successful. If an error occurred while adding the helper or
      adding the action to the actions buffer, the __ovs_ct_free_action()
      cleanup would use nf_ct_put() to free the entry; However, this relies on
      atomic_dec_and_test(ct_info.ct->general.use). This reference must be
      incremented first, or nf_ct_put() will never free it.
      
      Fix the issue by acquiring a reference to the template immediately after
      allocation.
      
      Fixes: cae3a262 ("openvswitch: Allow attaching helpers to ct action")
      Fixes: 5b48bb8506c5 ("openvswitch: Fix helper reference leak")
      Signed-off-by: NJoe Stringer <joe@ovn.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      90c7afc9
  6. 28 12月, 2015 2 次提交
  7. 24 12月, 2015 1 次提交
  8. 23 12月, 2015 3 次提交
  9. 19 12月, 2015 2 次提交
  10. 18 12月, 2015 5 次提交
  11. 17 12月, 2015 1 次提交
  12. 16 12月, 2015 4 次提交
  13. 15 12月, 2015 10 次提交
  14. 14 12月, 2015 3 次提交
    • D
      net: Flush local routes when device changes vrf association · 7f49e7a3
      David Ahern 提交于
      The VRF driver cycles netdevs when an interface is enslaved or released:
      the down event is used to flush neighbor and route tables and the up
      event (if the interface was already up) effectively moves local and
      connected routes to the proper table.
      
      As of 4f823def the local route is left hanging around after a link
      down, so when a netdev is moved from one VRF to another (or released
      from a VRF altogether) local routes are left in the wrong table.
      
      Fix by handling the NETDEV_CHANGEUPPER event. When the upper dev is
      an L3mdev then call fib_disable_ip to flush all routes, local ones
      to.
      
      Fixes: 4f823def ("ipv4: fix to not remove local route on link down")
      Cc: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7f49e7a3
    • P
      sched/wait: Fix the signal handling fix · dfd01f02
      Peter Zijlstra 提交于
      Jan Stancek reported that I wrecked things for him by fixing things for
      Vladimir :/
      
      His report was due to an UNINTERRUPTIBLE wait getting -EINTR, which
      should not be possible, however my previous patch made this possible by
      unconditionally checking signal_pending().
      
      We cannot use current->state as was done previously, because the
      instruction after the store to that variable it can be changed.  We must
      instead pass the initial state along and use that.
      
      Fixes: 68985633 ("sched/wait: Fix signal handling in bit wait helpers")
      Reported-by: NJan Stancek <jstancek@redhat.com>
      Reported-by: NChris Mason <clm@fb.com>
      Tested-by: NJan Stancek <jstancek@redhat.com>
      Tested-by: NVladimir Murzin <vladimir.murzin@arm.com>
      Tested-by: NChris Mason <clm@fb.com>
      Reviewed-by: NPaul Turner <pjt@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: tglx@linutronix.de
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: hpa@zytor.com
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      dfd01f02
    • X
      netfilter: nf_tables: use reverse traversal commit_list in nf_tables_abort · a907e36d
      Xin Long 提交于
      When we use 'nft -f' to submit rules, it will build multiple rules into
      one netlink skb to send to kernel, kernel will process them one by one.
      meanwhile, it add the trans into commit_list to record every commit.
      if one of them's return value is -EAGAIN, status |= NFNL_BATCH_REPLAY
      will be marked. after all the process is done. it will roll back all the
      commits.
      
      now kernel use list_add_tail to add trans to commit, and use
      list_for_each_entry_safe to roll back. which means the order of adding
      and rollback is the same. that will cause some cases cannot work well,
      even trigger call trace, like:
      
      1. add a set into table foo  [return -EAGAIN]:
         commit_list = 'add set trans'
      2. del foo:
         commit_list = 'add set trans' -> 'del set trans' -> 'del tab trans'
      then nf_tables_abort will be called to roll back:
      firstly process 'add set trans':
                         case NFT_MSG_NEWSET:
                              trans->ctx.table->use--;
                              list_del_rcu(&nft_trans_set(trans)->list);
      
        it will del the set from the table foo, but it has removed when del
        table foo [step 2], then the kernel will panic.
      
      the right order of rollback should be:
        'del tab trans' -> 'del set trans' -> 'add set trans'.
      which is opposite with commit_list order.
      
      so fix it by rolling back commits with reverse order in nf_tables_abort.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      a907e36d
  15. 12 12月, 2015 1 次提交