1. 30 Jul, 2015 1 commit
    • M
      netfilter: nf_ct_sctp: minimal multihoming support · d7ee3519
      Michal Kubeček committed
      Currently nf_conntrack_proto_sctp module handles only packets between
      primary addresses used to establish the connection. Any packets between
      secondary addresses are classified as invalid so that usual firewall
      configurations drop them. Allowing HEARTBEAT and HEARTBEAT-ACK chunks to
      establish a new conntrack would allow traffic between secondary
      addresses to pass through. A more sophisticated solution based on the
      addresses advertised in the initial handshake (and possibly also later
      dynamic address addition and removal) would be much harder to implement.
      Moreover, in general we cannot assume to always see the initial
      handshake as it can be routed through a different path.
      
      The patch adds two new conntrack states:
      
        SCTP_CONNTRACK_HEARTBEAT_SENT  - a HEARTBEAT chunk seen but not acked
        SCTP_CONNTRACK_HEARTBEAT_ACKED - a HEARTBEAT acked by HEARTBEAT-ACK
      
      State transition rules:
      
      - HEARTBEAT_SENT responds to usual chunks the same way as NONE (so that
        the behaviour changes as little as possible)
      - HEARTBEAT_ACKED responds to usual chunks the same way as ESTABLISHED
        does, except the resulting state is HEARTBEAT_ACKED rather than
        ESTABLISHED
      - previously existing states except NONE are preserved when HEARTBEAT or
        HEARTBEAT-ACK is seen
      - NONE (in the initial direction) changes to HEARTBEAT_SENT on HEARTBEAT
        and to CLOSED on HEARTBEAT-ACK
      - HEARTBEAT_SENT changes to HEARTBEAT_ACKED on HEARTBEAT-ACK in the
        reply direction
      - HEARTBEAT_SENT and HEARTBEAT_ACKED are preserved on HEARTBEAT and
        HEARTBEAT-ACK otherwise
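The HEARTBEAT-related transition rules above can be sketched as a small lookup function. This is only an illustration under stated assumptions: the enum and function names (`ct_state`, `hb_next_state`, and so on) are hypothetical and are not the identifiers used in nf_conntrack_proto_sctp, the state set is reduced to the states the rules mention, and the case "NONE in the reply direction" is not covered by the rules, so the sketch simply leaves the state unchanged there.

```c
/* Hypothetical, reduced state set: only the states named in the rules. */
enum ct_state {
    ST_NONE,
    ST_CLOSED,
    ST_ESTABLISHED,
    ST_HEARTBEAT_SENT,
    ST_HEARTBEAT_ACKED
};

enum hb_chunk { CHUNK_HEARTBEAT, CHUNK_HEARTBEAT_ACK };
enum ct_dir { DIR_ORIGINAL, DIR_REPLY };

/* Next state when a HEARTBEAT or HEARTBEAT-ACK chunk is seen. */
static enum ct_state hb_next_state(enum ct_state cur, enum hb_chunk chunk,
                                   enum ct_dir dir)
{
    switch (cur) {
    case ST_NONE:
        if (dir != DIR_ORIGINAL)
            return cur; /* not specified by the rules; placeholder */
        /* NONE: HEARTBEAT opens a secondary conntrack, HEARTBEAT-ACK does not */
        return chunk == CHUNK_HEARTBEAT ? ST_HEARTBEAT_SENT : ST_CLOSED;
    case ST_HEARTBEAT_SENT:
        /* Only a HEARTBEAT-ACK coming back in the reply direction acks it */
        if (chunk == CHUNK_HEARTBEAT_ACK && dir == DIR_REPLY)
            return ST_HEARTBEAT_ACKED;
        return cur;
    default:
        /* HEARTBEAT_ACKED and all previously existing states are preserved */
        return cur;
    }
}
```

For example, `hb_next_state(ST_HEARTBEAT_SENT, CHUNK_HEARTBEAT_ACK, DIR_ORIGINAL)` stays in `ST_HEARTBEAT_SENT`, because the acknowledgement has to arrive in the reply direction.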
      
      Normally, vtag is set from the INIT chunk for the reply direction and
      from the INIT-ACK chunk for the originating direction (i.e. each of
      these defines vtag value for the opposite direction). For secondary
      conntracks, we can't rely on seeing INIT/INIT-ACK and even if we have
      seen them, we would need to connect two different conntracks. Therefore
      simplified logic is applied: vtag of first packet in each direction
      (HEARTBEAT in the originating and HEARTBEAT-ACK in reply direction) is
      saved and all following packets in that direction are compared with this
      saved value. While INIT and INIT-ACK define vtag for the opposite
      direction, vtags extracted from HEARTBEAT and HEARTBEAT-ACK are always
      for their direction.
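A minimal sketch of the simplified vtag logic described above: the tag of the first packet in each direction is stored, and later packets in that direction must match it. The structure and function names are hypothetical, and the explicit `vtag_seen` flag is an assumption made for clarity (the message does not say how "not yet saved" is represented).

```c
#include <stdbool.h>
#include <stdint.h>

enum ct_dir { DIR_ORIGINAL = 0, DIR_REPLY = 1 };

/* Hypothetical per-conntrack data for a secondary (heartbeat) conntrack. */
struct hb_conntrack {
    uint32_t vtag[2];      /* saved verification tag, indexed by direction */
    bool     vtag_seen[2]; /* assumption: explicit "already saved" marker  */
};

/*
 * The first packet in a direction defines the vtag for that same direction;
 * every following packet in that direction is compared with the saved value.
 */
static bool hb_vtag_ok(struct hb_conntrack *ct, enum ct_dir dir,
                       uint32_t pkt_vtag)
{
    if (!ct->vtag_seen[dir]) {
        ct->vtag[dir] = pkt_vtag;
        ct->vtag_seen[dir] = true;
        return true;
    }
    return ct->vtag[dir] == pkt_vtag;
}
```

Note the contrast with INIT/INIT-ACK handling: here the tag taken from a packet is stored under the packet's own direction, not the opposite one.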
      
      Default timeout values for new states are
      
        HEARTBEAT_SENT: 30 seconds (default hb_interval)
        HEARTBEAT_ACKED: 210 seconds (hb_interval * path_max_retry + max_rto)
      
      (We cannot expect to see the shutdown sequence so that, unlike
      ESTABLISHED, the HEARTBEAT_ACKED timeout shouldn't be too long.)
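The 210 second figure follows from the formula in the list above. The sketch below assumes the SCTP protocol defaults of 5 for path_max_retry and 60 seconds for max_rto; those two values are not stated in the commit message and are supplied here only to make the arithmetic concrete. The function name is hypothetical.

```c
/* hb_interval * path_max_retry + max_rto, all times in seconds. */
static unsigned int hb_acked_timeout(unsigned int hb_interval,
                                     unsigned int path_max_retry,
                                     unsigned int max_rto)
{
    return hb_interval * path_max_retry + max_rto;
}

/* With the assumed defaults: hb_acked_timeout(30, 5, 60) is 30 * 5 + 60. */
```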
      Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      d7ee3519
  2. 23 Jul, 2015 3 commits
    • P
      netfilter: rename local nf_hook_list to hook_list · 3bbd14e0
      Pablo Neira Ayuso committed
      085db2c0 ("netfilter: Per network namespace netfilter hooks.") introduced a
      new nf_hook_list that is global, so let's avoid this overlap.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
      3bbd14e0
    • P
      netfilter: fix possible removal of wrong hook · 7181ebaf
      Pablo Neira Ayuso committed
      nf_unregister_net_hook() uses the nf_hook_ops fields as a tuple to look up
      the corresponding hook in the list. However, we may have two hooks with exactly
      the same configuration.
      
      This shouldn't be a problem for nftables since every new chain has a unique
      priv field set, but this may still cause us problems in the future, so better
      address this problem now by keeping a reference to the original nf_hook_ops
      structure to make sure we delete the right hook from nf_unregister_net_hook().
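The idea of keeping a reference to the original structure can be sketched in user space as follows. Everything here is hypothetical (`hook_ops`, `hook_entry`, `hook_register`, `hook_unregister` and the fixed-size table are stand-ins, not the netfilter API): each registered entry remembers the caller's pointer, and unregistration matches on that pointer instead of comparing field values.

```c
#include <stddef.h>

#define MAX_HOOKS 8

/* Hypothetical stand-in for the caller-provided hook description. */
struct hook_ops {
    int   hooknum;
    int   priority;
    void *priv;
};

/* Entry owned by the registry: a private copy plus the caller's pointer. */
struct hook_entry {
    const struct hook_ops *orig_ops; /* NULL means the slot is free */
    struct hook_ops        ops;
};

static struct hook_entry hook_table[MAX_HOOKS];

/* Returns the slot index used, or -1 if the table is full. */
static int hook_register(const struct hook_ops *reg)
{
    for (int i = 0; i < MAX_HOOKS; i++) {
        if (hook_table[i].orig_ops == NULL) {
            hook_table[i].orig_ops = reg;
            hook_table[i].ops = *reg;
            return i;
        }
    }
    return -1;
}

/* Matches on pointer identity, so identical field values cannot collide. */
static int hook_unregister(const struct hook_ops *reg)
{
    if (reg == NULL)
        return -1;
    for (int i = 0; i < MAX_HOOKS; i++) {
        if (hook_table[i].orig_ops == reg) {
            hook_table[i].orig_ops = NULL;
            return i;
        }
    }
    return -1;
}
```

With two `hook_ops` objects that have exactly the same field values, unregistering the second one frees only the second slot; a lookup by field tuple would have hit whichever entry came first in the list.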
      
      Fixes: 085db2c0 ("netfilter: Per network namespace netfilter hooks.")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
      7181ebaf
    • P
      netfilter: nf_queue: fix nf_queue_nf_hook_drop() · 2385eb0c
      Pablo Neira Ayuso committed
      This function reacquires the rtnl_lock() which is already held by
      nf_unregister_hook().
      
      This can be triggered via: modprobe nf_conntrack_ipv4 && rmmod nf_conntrack_ipv4
      
      [  720.628746] INFO: task rmmod:3578 blocked for more than 120 seconds.
      [  720.628749]       Not tainted 4.2.0-rc2+ #113
      [  720.628752] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [  720.628754] rmmod           D ffff8800ca46fd58     0  3578   3571 0x00000080
      [...]
      [  720.628783] Call Trace:
      [  720.628790]  [<ffffffff8152ea0b>] schedule+0x6b/0x90
      [  720.628795]  [<ffffffff8152ecb3>] schedule_preempt_disabled+0x13/0x20
      [  720.628799]  [<ffffffff8152ff55>] mutex_lock_nested+0x1f5/0x380
      [  720.628803]  [<ffffffff81462622>] ? rtnl_lock+0x12/0x20
      [  720.628807]  [<ffffffff81462622>] ? rtnl_lock+0x12/0x20
      [  720.628812]  [<ffffffff81462622>] rtnl_lock+0x12/0x20
      [  720.628817]  [<ffffffff8148ab25>] nf_queue_nf_hook_drop+0x15/0x160
      [  720.628825]  [<ffffffff81488d48>] nf_unregister_net_hook+0x168/0x190
      [  720.628831]  [<ffffffff81488e24>] nf_unregister_hook+0x64/0x80
      [  720.628837]  [<ffffffff81488e60>] nf_unregister_hooks+0x20/0x30
      [...]
      
      Moreover, nf_unregister_net_hook() should only destroy the queue for this
      netns, not for every netns.
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Fixes: 085db2c0 ("netfilter: Per network namespace netfilter hooks.")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
      2385eb0c
  3. 20 Jul, 2015 1 commit
  4. 16 Jul, 2015 7 commits
    • F
      netfilter: xtables: remove __pure annotation · 6c7941de
      Florian Westphal committed
      sparse complains:
      ip_tables.c:361:27: warning: incorrect type in assignment (different modifiers)
      ip_tables.c:361:27:    expected struct ipt_entry *[assigned] e
      ip_tables.c:361:27:    got struct ipt_entry [pure] *
      
      doesn't change generated code.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      6c7941de
    • F
      netfilter: add and use jump label for xt_tee · dcebd315
      Florian Westphal committed
      Don't bother testing if we need to switch to alternate stack
      unless TEE target is used.
      Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      dcebd315
    • F
      netfilter: xtables: don't save/restore jumpstack offset · 7814b6ec
      Florian Westphal committed
      In most cases there is no reentrancy into ip/ip6tables.
      
      For skbs sent by REJECT or SYNPROXY targets, there is one level
      of reentrancy, but it's not relevant as those targets issue an absolute
      verdict, i.e. the jumpstack can be clobbered since it's not used
      after the target issues an absolute verdict (ACCEPT, DROP, STOLEN, etc).
      
      So the only special case where it is relevant is the TEE target, which
      returns XT_CONTINUE.
      
      This patch changes ip(6)_do_table to always use the jump stack starting
      from 0.
      
      When we detect we're operating on an skb sent via TEE (percpu
      nf_skb_duplicated is 1) we switch to an alternate stack to leave
      the original one alone.
      
      Since there is no TEE support for arptables, it doesn't need to
      test if tee is active.
      
      The jump stack overflow tests are no longer needed as well --
      since ->stacksize is the largest call depth we cannot exceed it.
      
      A much better alternative to the external jumpstack would be to just
      declare a jumps[32] stack on the local stack frame, but that would mean
      we'd have to reject iptables rulesets that used to work before.
      
      Another alternative would be to start rejecting rulesets with a larger
      call depth, e.g. 1000 -- in this case it would be feasible to allocate the
      entire stack in the percpu area which would avoid one dereference.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      7814b6ec
    • F
      netfilter: move tee_active to core · e7c8899f
      Florian Westphal committed
      This prepares for a TEE like expression in nftables.
      We want to ensure only one duplicate is sent, so both will
      use the same percpu variable to detect duplication.
      
      The other use case is detection of recursive calls to xtables, but since
      we don't want a dependency from nft on the xtables core, it's put into
      core.c instead of the x_tables core.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      e7c8899f
    • F
      netfilter: xtables: compute exact size needed for jumpstack · 98d1bd80
      Florian Westphal committed
      The {arp,ip,ip6tables} jump stack is currently sized based
      on the number of user chains.
      
      However, it's rather unlikely that every user defined chain jumps to the
      next, so let's use the existing loop detection logic to also track the
      chain depths.
      
      The stacksize is then set to the largest chain depth seen.
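To illustrate the difference between "number of user chains" and "largest chain depth", here is a small recursive sketch. It is not the xtables implementation (which reuses its loop detection pass); the adjacency-matrix layout, the `MAX_CHAINS` limit and the function name are assumptions made for the example, and the ruleset is assumed to be loop-free already.

```c
#define MAX_CHAINS 8

/*
 * jumps[i][j] != 0 means chain i contains a jump to chain j
 * (hypothetical representation of a loop-free ruleset).
 * Returns the deepest number of nested jumps reachable from 'chain',
 * i.e. how many return positions may need to be saved at once.
 */
static unsigned int chain_depth(unsigned char jumps[MAX_CHAINS][MAX_CHAINS],
                                unsigned int nchains, unsigned int chain)
{
    unsigned int deepest = 0;

    for (unsigned int j = 0; j < nchains; j++) {
        if (jumps[chain][j]) {
            unsigned int d = chain_depth(jumps, nchains, j) + 1;

            if (d > deepest)
                deepest = d;
        }
    }
    return deepest;
}
```

With a base chain 0 that jumps to user chains 1 and 3, and chain 1 jumping to chain 2, there are three user chains but the deepest call path (0, 1, 2) needs only two saved positions.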
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      98d1bd80
    • E
      netfilter: nftables: Only run the nftables chains in the proper netns · fd2ecda0
      Eric W. Biederman committed
      - Register the nftables chains in the network namespace that they need
        to run in.
      
      - Remove the hacks that stopped chains running in the wrong network
        namespace.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      fd2ecda0
    • E
      netfilter: Per network namespace netfilter hooks. · 085db2c0
      Eric W. Biederman committed
      - Add a new set of functions for registering and unregistering per
        network namespace hooks.
      
      - Modify the old global namespace hook functions to use the per
        network namespace hooks in their implementation, so there remains a
        single list that needs to be walked for any hook (this is important
        for keeping the hook priority working and for keeping the code
        walking the hooks simple).
      
      - Only allow registering the per netdevice hooks in the network
        namespace where the network device lives.
      
      - Dynamically allocate the structures in the per network namespace
        hook list in nf_register_net_hook, and unregister them in
        nf_unregister_net_hook.
      
        Dynamic allocation is required somewhere as the number of network
        namespaces is not fixed, so we might as well allocate them in the
        registration function.
      
        The chain of registered hooks on any list is expected to be small so
        the cost of walking that list to find the entry we are unregistering
        should also be small.
      
        Performing the management of the dynamically allocated list entries
        in the registration and unregistration functions keeps the complexity
        from spreading.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      085db2c0
  5. 15 Jul, 2015 5 commits
  6. 14 Jul, 2015 6 commits
    • D
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 638d3c63
      David S. Miller committed
      Conflicts:
      	net/bridge/br_mdb.c
      
      Minor conflict in br_mdb.c, in 'net' we added a memset of the
      on-stack 'ip' variable whereas in 'net-next' we assign a new
      member 'vid'.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      638d3c63
    • N
      bridge: mdb: add vlan support for user entries · 74fe61f1
      Nikolay Aleksandrov committed
      Until now all user mdb entries were added in vlan 0, this patch adds
      support to allow the user to specify the vlan for the entry.
      Regarding the uapi change: a hole in struct br_mdb_entry is used, so the
      size and offsets are kept the same (verified with pahole and tested with
      older iproute2).
      
      Example:
      $ bridge mdb
      dev br0 port eth1 grp 239.0.0.1 permanent vlan 2000
      dev br0 port eth1 grp 239.0.0.1 permanent vlan 200
      dev br0 port eth1 grp 239.0.0.1 permanent
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      74fe61f1
    • D
      ebpf: remove self-assignment in interpreter's tail call · c4675f93
      Daniel Borkmann committed
      ARG1 = BPF_R1 as it stands, evaluates to regs[BPF_REG_1] = regs[BPF_REG_1]
      and thus has no effect. Add a comment instead, explaining what happens and
      why it's okay to just remove it. Since from user space side, a tail call is
      invoked as a pseudo helper function via bpf_tail_call_proto, the verifier
      checks the arguments just like with any other helper function and makes
      sure that the first argument (regs[BPF_REG_1])'s type is ARG_PTR_TO_CTX.
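The no-op nature of the removed statement can be shown with a tiny stand-alone sketch. The macro definitions below are written to reproduce the expansion the message describes (both sides resolve to `regs[BPF_REG_1]`); the reduced register enum and the function `tail_call_arg1` are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Reduced, hypothetical register set for the illustration. */
enum { BPF_REG_0, BPF_REG_1, BPF_REG_2, MAX_BPF_REG };

/* Assumed to mirror the expansion described above. */
#define BPF_REG_ARG1 BPF_REG_1
#define BPF_R1       regs[BPF_REG_1]
#define ARG1         regs[BPF_REG_ARG1]

static uint64_t tail_call_arg1(uint64_t ctx)
{
    uint64_t regs[MAX_BPF_REG] = { 0 };

    BPF_R1 = ctx;
    ARG1 = BPF_R1; /* expands to regs[BPF_REG_1] = regs[BPF_REG_1]: no effect */
    return ARG1;   /* same value with or without the line above */
}
```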
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c4675f93
    • T
      net: Build IPv6 into kernel by default · de551f2e
      Tom Herbert committed
      This patch makes the default to build IPv6 into the kernel. IPv6
      now has significant traction and any remaining vestiges of IPv6
      not being provided parity with IPv4 should be swept away. IPv6 is now
      core to the Internet and kernel.
      
      Points on IPv6 adoption:
      
      - Per Google statistics, IPv6 usage has reached 7% on the Internet
        and continues to exhibit an exponential growth rate
        https://www.google.com/intl/en/ipv6/statistics.html
      - Just a few days ago ARIN officially depleted its IPv4 pool
      - IPv6 only data centers are being successfully built
        (e.g. at Facebook)
      
      This patch changes the IPv6 Kconfig for IPV6. Default for CONFIG_IPV6
      is set to "y" and the text has been updated to reflect the maturity of
      IPv6.
      
      Impact:
      
      Under some circumstances building modules into the kernel might have a
      performance advantage. In my testing, I did notice a very slight
      improvement.
      
      This will obviously increase the size of the kernel image. In my
      configuration I see:
      
      IPv6 as module:
      
         text    data     bss      dec    hex filename
      9703666 1899288  933888 12536842 bf4c0a vmlinux
      
      IPv6 built into kernel:
      
         text    data     bss      dec    hex filename
      9436490 1879600  913408 12229498 ba9b7a vmlinux
      
      Which increases text size by ~270K (2.8% increase in size for me). If
      image size is an issue, presumably for a device which does not do IP
      networking (IMO we should be discouraging IPv4-only devices), IPV6 can
      be disabled or still built as a module.
      Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
      Signed-off-by: Tom Herbert <tom@herbertland.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      de551f2e
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · f760b87f
      Linus Torvalds committed
      Pull networking fixes from David Miller:
      
       1) Missing list head init in bluetooth hidp session creation, from Tedd
          Ho-Jeong An.
      
       2) Don't leak SKB in bridge netfilter error paths, from Florian
          Westphal.
      
       3) ipv6 netdevice private leak in netfilter bridging, fixed by Julien
          Grall.
      
       4) Fix regression in IP over hamradio bpq encapsulation, from Ralf
          Baechle.
      
       5) Fix race between rhashtable resize events and table walks, from Phil
          Sutter.
      
       6) Missing validation of IFLA_VF_INFO netlink attributes, fix from
          Daniel Borkmann.
      
       7) Missing security layer socket state initialization in tipc code,
          from Stephen Smalley.
      
       8) Fix shared IRQ handling in boomerang 3c59x interrupt handler, from
          Denys Vlasenko.
      
       9) Missing minor_idr destroy on module unload on macvtap driver, from
          Johannes Thumshirn.
      
      10) Various pktgen kernel thread races, from Oleg Nesterov.
      
      11) Fix races that can cause packets to be processed in the backlog even
          after a device attached to that SKB has been fully unregistered.
          From Julian Anastasov.
      
      12) bcmgenet driver doesn't account packet drops vs.  errors properly,
          fix from Petri Gynther.
      
      13) Array index validation and off by one fix in DSA layer from Florian
          Fainelli
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (66 commits)
        can: replace timestamp as unique skb attribute
        ARM: dts: dra7x-evm: Prevent glitch on DCAN1 pinmux
        can: c_can: Fix default pinmux glitch at init
        can: rcar_can: unify error messages
        can: rcar_can: print request_irq() error code
        can: rcar_can: fix typo in error message
        can: rcar_can: print signed IRQ #
        can: rcar_can: fix IRQ check
        net: dsa: Fix off-by-one in switch address parsing
        net: dsa: Test array index before use
        net: switchdev: don't abort unsupported operations
        net: bcmgenet: fix accounting of packet drops vs errors
        cdc_ncm: update specs URL
        Doc: z8530book: Fix typo in API-z8530-sync-txdma-open.html
        net: inet_diag: always export IPV6_V6ONLY sockopt for listening sockets
        bridge: mdb: allow the user to delete mdb entry if there's a querier
        net: call rcu_read_lock early in process_backlog
        net: do not process device backlog during unregistration
        bridge: fix potential crash in __netdev_pick_tx()
        net: axienet: Fix devm_ioremap_resource return value check
        ...
      f760b87f
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 34bef46e
      Linus Torvalds committed
      Pull crypto fixes from Herbert Xu:
       "This fixes a duplicate dma_unmap_sg call in omap-des and reentrancy
        bugs in the powerpc nx driver which may cause bogus output or worse
        memory corruption"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: nx - Fix reentrancy bugs
        crypto: omap-des - Fix unmapping of dma channels
      34bef46e
  7. 13 Jul, 2015 16 commits
    • D
      Merge tag 'linux-can-fixes-for-4.2-20150712' of... · cee9f6d0
      David S. Miller committed
      Merge tag 'linux-can-fixes-for-4.2-20150712' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
      
      Marc Kleine-Budde says:
      
      ====================
      pull-request: can 2015-07-12
      
      this is a pull request of 8 patches for net/master.
      
      Sergei Shtylyov contributes 5 patches for the rcar_can driver, fixing the IRQ
      check and several info and error messages. There are two patches by J.D.
      Schroeder and Roger Quadros for the c_can driver and dra7x-evm device tree,
      which prevent a glitch in the DCAN1 pinmux. Oliver Hartkopp provides a better
      approach to make the CAN skbs unique: the timestamp is replaced by a counter.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cee9f6d0
    • L
      Linux 4.2-rc2 · bc0195aa
      Linus Torvalds committed
      bc0195aa
    • L
      Revert "drm/i915: Use crtc_state->active in primary check_plane func" · 01e2d062
      Linus Torvalds committed
      This reverts commit dec4f799.
      
      Jörg Otte reports a NULL pointer dereference due to this commit, as
      'crtc_state' very much can be NULL:
      
              crtc_state = state->base.state ?
                      intel_atomic_get_crtc_state(state->base.state, intel_crtc) : NULL;
      
      So the change to test 'crtc_state->base.active' cannot possibly be
      correct as-is.
      
      There may be some other minimal fix (like just checking crtc_state for
      NULL), but I'm just reverting it now for the rc2 release, and people
      like Daniel Vetter who actually know this code will figure out what the
      right solution is in the longer term.
      Reported-and-bisected-by: Jörg Otte <jrg.otte@gmail.com>
      Cc: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
      Cc: Jani Nikula <jani.nikula@linux.intel.com>
      Cc: Daniel Vetter <daniel.vetter@intel.com>
      CC: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      01e2d062
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · c83727a6
      Linus Torvalds committed
      Pull VFS fixes from Al Viro:
       "Fixes for this cycle regression in overlayfs and a couple of
        long-standing (== all the way back to 2.6.12, at least) bugs"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        freeing unlinked file indefinitely delayed
        fix a braino in ovl_d_select_inode()
        9p: don't leave a half-initialized inode sitting around
      c83727a6
    • L
      Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus · 7fbb58a0
      Linus Torvalds committed
      Pull MIPS fixes from Ralf Baechle:
       "A fair number of 4.2 fixes also because Markos opened the flood gates.
      
         - Patch up the math used to calculate the location for the page bitmap.
      
         - The FDC (Not what you think, FDC stands for Fast Debug Channel) IRQ
           workaround was causing issues on non-Malta platforms, so move the code
           to a Malta specific location.
      
         - A spelling fix replicated through several files.
      
         - Fix to the emulation of an R2 instruction for R6 cores.
      
         - Fix the JR emulation for R6.
      
         - Further patching of mindless 64 bit issues.
      
         - Ensure the kernel won't crash on CPUs with L2 caches with >= 8
           ways.
      
         - Use compat_sys_getsockopt for O32 ABI on 64 bit kernels.
      
         - Fix cache flushing for multithreaded cores.
      
         - A build fix"
      
      * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
        MIPS: O32: Use compat_sys_getsockopt.
        MIPS: c-r4k: Extend way_string array
        MIPS: Pistachio: Support CDMM & Fast Debug Channel
        MIPS: Malta: Make GIC FDC IRQ workaround Malta specific
        MIPS: c-r4k: Fix cache flushing for MT cores
        Revert "MIPS: Kconfig: Disable SMP/CPS for 64-bit"
        MIPS: cps-vec: Use macros for various arithmetics and memory operations
        MIPS: kernel: cps-vec: Replace KSEG0 with CKSEG0
        MIPS: kernel: cps-vec: Use ta0-ta3 pseudo-registers for 64-bit
        MIPS: kernel: cps-vec: Replace mips32r2 ISA level with mips64r2
        MIPS: kernel: cps-vec: Replace 'la' macro with PTR_LA
        MIPS: kernel: smp-cps: Fix 64-bit compatibility errors due to pointer casting
        MIPS: Fix erroneous JR emulation for MIPS R6
        MIPS: Fix branch emulation for BLTC and BGEC instructions
        MIPS: kernel: traps: Fix broken indentation
        MIPS: bootmem: Don't use memory holes for page bitmap
        MIPS: O32: Do not handle require 32 bytes from the stack to be readable.
        MIPS, CPUFREQ: Fix spelling of Institute.
        MIPS: Lemote 2F: Fix build caused by recent mass rename.
      7fbb58a0
    • O
      can: replace timestamp as unique skb attribute · d3b58c47
      Oliver Hartkopp committed
      Commit 514ac99c "can: fix multiple delivery of a single CAN frame for
      overlapping CAN filters" requires the skb->tstamp to be set to check for
      identical CAN skbs.
      
      When no user space application required timestamping, this timestamp was
      not generated, which led to commit 36c01245 "can: fix loss of CAN frames
      in raw_rcv" - which forces the timestamp to be set in all CAN related skbuffs
      by introducing several __net_timestamp() calls.
      
      This forces e.g. out of tree drivers which are not using alloc_can{,fd}_skb()
      to add __net_timestamp() after skbuff creation to prevent the frame loss fixed
      in mainline Linux.
      
      This patch removes the timestamp dependency and uses an atomic counter to
      create a unique identifier together with the skbuff pointer.
      
      Btw: the new skbcnt element introduced in struct can_skb_priv has to be
      initialized with zero in out-of-tree drivers which are not using
      alloc_can{,fd}_skb() too.
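A user-space sketch of the "pointer plus counter" idea described above, using C11 atomics. The types and names (`fake_skb`, `last_seen`, `skb_assign_id`, `is_duplicate_delivery`) are hypothetical stand-ins rather than kernel structures; only the `skbcnt` field name is taken from the message. The sketch treats 0 as "no id assigned yet", which is why the field has to start out as zero.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the private per-skb data; skbcnt == 0 means "not assigned". */
struct fake_skb {
    unsigned int skbcnt;
};

/* What a receiver remembers about the last frame it delivered. */
struct last_seen {
    const struct fake_skb *skb;
    unsigned int           skbcnt;
};

static atomic_uint skb_counter; /* global, starts at 0 */

/* Lazily give the buffer a non-zero id taken from the atomic counter. */
static void skb_assign_id(struct fake_skb *skb)
{
    while (skb->skbcnt == 0) /* repeats only if the counter wrapped to 0 */
        skb->skbcnt = atomic_fetch_add(&skb_counter, 1) + 1;
}

/*
 * A frame is a repeated delivery only if both the buffer address and its
 * counter value match the previous one. The address alone is not enough,
 * because a freed buffer's address can be reused by a new frame.
 */
static bool is_duplicate_delivery(struct last_seen *last, struct fake_skb *skb)
{
    skb_assign_id(skb);
    if (last->skb == skb && last->skbcnt == skb->skbcnt)
        return true;
    last->skb = skb;
    last->skbcnt = skb->skbcnt;
    return false;
}
```

If a buffer is recycled at the same address and its `skbcnt` is reset to zero, it receives a fresh counter value and is no longer mistaken for the previous frame.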
      Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      d3b58c47
    • R
      ARM: dts: dra7x-evm: Prevent glitch on DCAN1 pinmux · 2acb5c30
      Roger Quadros committed
      Driver core sets "default" pinmux on probe and CAN driver
      sets "sleep" pinmux during register. This causes a small window
      where the CAN pins are in "default" state with the DCAN module
      being disabled.
      
      Change the "default" state to be like sleep so this glitch is
      avoided. Add a new "active" state that is used by the driver
      when CAN is actually active.
      Signed-off-by: Roger Quadros <rogerq@ti.com>
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      2acb5c30
    • J
      can: c_can: Fix default pinmux glitch at init · 03336519
      J.D. Schroeder committed
      The previous change 3973c526 (net: can: c_can: Disable pins when CAN
      interface is down) causes a slight glitch on the pinctrl settings when used.
      Since commit ab78029e (drivers/pinctrl: grab default handles from device core),
      the device core will automatically set the default pins. This causes the pins
      to be momentarily set to the default and then to the sleep state in
      register_c_can_dev(). By adding an optional "enable" state, boards can set the
      default pin state to be disabled and avoid the glitch when the switch from
      default to sleep first occurs. If the "enable" state is not available
      c_can_pinctrl_select_state() falls back to using the "default" pinctrl state.
      
      [Roger Q] - Forward port to v4.2 and use pinctrl_get_select().
      Signed-off-by: J.D. Schroeder <jay.schroeder@garmin.com>
      Signed-off-by: Roger Quadros <rogerq@ti.com>
      Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      03336519
    • S
      can: rcar_can: unify error messages · 585bc2ac
      Sergei Shtylyov committed
      All the error messages in the driver but the ones from devm_clk_get() failures
      use a similar format. Make those two messages consistent with the others.
      Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      585bc2ac
    • S
      can: rcar_can: print request_irq() error code · ae185f19
      Sergei Shtylyov committed
      Also print the error code when the request_irq() call fails in rcar_can_open(),
      rewording the error message...
      Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      ae185f19
    • S
      can: rcar_can: fix typo in error message · 3255f68c
      Sergei Shtylyov committed
      Fix typo in the first error message printed by rcar_can_open().
      
      Based on the original patch by Vladimir Barinov.
      
      Fixes: 862e2b6a ("can: rcar_can: support all input clocks")
      Reported-by: Vladimir Barinov <vladimir.barinov@cogentembedded.com>
      Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      3255f68c
    • S
      can: rcar_can: print signed IRQ # · c1a4c87b
      Sergei Shtylyov committed
      Printing IRQ # using "%x" and "%u" unsigned formats isn't quite correct as
      'ndev->irq' is of type *int*, so the "%d" format needs to be used instead.
      
      While fixing this, beautify the dev_info() message in rcar_can_probe() a bit.
      
      Fixes: fd115931 ("can: add Renesas R-Car CAN driver")
      Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      c1a4c87b
    • S
      can: rcar_can: fix IRQ check · 5e63e6ba
      Sergei Shtylyov committed
      rcar_can_probe() regards 0 as a wrong IRQ #, although platform_get_irq(),
      which it calls, returns a negative error code in that case. This leads to
      the following being printed to the console when attempting to open the device:
      
      error requesting interrupt fffffffa
      
      because rcar_can_open() calls request_irq() with a negative IRQ #, and that
      function naturally fails with -EINVAL.
      
      Check for the negative error codes instead and propagate them upstream instead
      of just returning -ENODEV.
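The difference between the two checks, and why the console showed fffffffa, can be sketched like this. The helper names are hypothetical and the code is a user-space illustration, not the driver source. One assumption is labeled in the comments: fffffffa is simply -6 printed as an unsigned 32-bit hex number, and the sketch relies on a 32-bit `int`.

```c
#include <errno.h>
#include <stdio.h>

/* Old behaviour as described: only 0 is treated as a bad IRQ number. */
static int check_irq_old(int irq)
{
    return irq == 0 ? -ENODEV : 0;
}

/* New behaviour as described: negative values are errors and are passed on. */
static int check_irq_new(int irq)
{
    return irq < 0 ? irq : 0;
}

/*
 * Formats a value the way a "%x" message would. With a 32-bit int,
 * -6 comes out as "fffffffa" (assumption: 32-bit unsigned int).
 */
static void format_irq_hex(char *buf, size_t len, int value)
{
    snprintf(buf, len, "%x", (unsigned int)value);
}
```

With the old check, a negative return value slips through (`check_irq_old(-6)` reports success) and only fails later when the interrupt is requested; the new check returns the original error code immediately.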
      
      Fixes: fd115931 ("can: add Renesas R-Car CAN driver")
      Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      5e63e6ba
    • L
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 1daa1cfb
      Linus Torvalds committed
      Pull x86 fixes from Thomas Gleixner:
      
       - the high latency PIT detection fix, which slipped through the cracks
         for rc1
      
       - a regression fix for the early printk mechanism
      
       - the x86 part to plug irq/vector related hotplug races
      
       - move the allocation of the espfix pages on cpu hotplug to non atomic
         context.  The current code triggers a might_sleep() warning.
      
       - a series of KASAN fixes addressing boot crashes and usability
      
       - a trivial typo fix for Kconfig help text
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/kconfig: Fix typo in the CONFIG_CMDLINE_BOOL help text
        x86/irq: Retrieve irq data after locking irq_desc
        x86/irq: Use proper locking in check_irq_vectors_for_cpu_disable()
        x86/irq: Plug irq vector hotplug race
        x86/earlyprintk: Allow early_printk() to use console style parameters like '115200n8'
        x86/espfix: Init espfix on the boot CPU side
        x86/espfix: Add 'cpu' parameter to init_espfix_ap()
        x86/kasan: Move KASAN_SHADOW_OFFSET to the arch Kconfig
        x86/kasan: Add message about KASAN being initialized
        x86/kasan: Fix boot crash on AMD processors
        x86/kasan: Flush TLBs after switching CR3
        x86/kasan: Fix KASAN shadow region page tables
        x86/init: Clear 'init_level4_pgt' earlier
        x86/tsc: Let high latency PIT fail fast in quick_pit_calibrate()
    • L
      Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 7b732169
      Linus Torvalds committed
      Pull timer fixes from Thomas Gleixner:
       "This update from the timer department contains:
      
         - A series of patches which address a shortcoming in the tick
           broadcast code.
      
           If the broadcast device is not available or an hrtimer emulated
           broadcast device, some of the original assumptions lead to boot
           failures.  I rather plugged all of the corner cases instead of only
           addressing the issue reported, so the change got a little larger.
      
           Has been extensively tested on x86 and arm.
      
         - Get rid of the last holdouts using do_posix_clock_monotonic_gettime()
      
         - A regression fix for the imx clocksource driver
      
         - An update to the new state callbacks mechanism for clockevents.
           This is required to simplify the conversion, which will take place
           in 4.3"
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        tick/broadcast: Prevent NULL pointer dereference
        time: Get rid of do_posix_clock_monotonic_gettime
        cris: Replace do_posix_clock_monotonic_gettime()
        tick/broadcast: Unbreak CONFIG_GENERIC_CLOCKEVENTS=n build
        tick/broadcast: Handle spurious interrupts gracefully
        tick/broadcast: Check for hrtimer broadcast active early
        tick/broadcast: Return busy when IPI is pending
        tick/broadcast: Return busy if periodic mode and hrtimer broadcast
        tick/broadcast: Move the check for periodic mode inside state handling
        tick/broadcast: Prevent deep idle if no broadcast device available
        tick/broadcast: Make idle check independent from mode and config
        tick/broadcast: Sanity check the shutdown of the local clock_event
        tick/broadcast: Prevent hrtimer recursion
        clockevents: Allow set-state callbacks to be optional
        clocksource/imx: Define clocksource for mx27
    • L
      Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · c4bc680c
      Linus Torvalds committed
      Pull irq fix from Thomas Gleixner:
       "A single fix for a cpu hotplug race vs. interrupt descriptors:
      
        Prevent irq setup/teardown across the cpu starting/dying parts of cpu
        hotplug so that the starting/dying cpu has a stable view of the
        descriptor space.  This has been an issue for all architectures in the
        cpu dying phase, where interrupts are migrated away from the dying
        cpu.  In the starting phase it's mostly an x86 issue vs. the vector
        space update"
      
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        hotplug: Prevent alloc/free of irq descriptors during cpu up/down
  8. 12 Jul, 2015 1 commit
    • A
      freeing unlinked file indefinitely delayed · 75a6f82a
      Al Viro committed
      	Normally opening a file, unlinking it and then closing will have
      the inode freed upon close() (provided that it's not otherwise busy and
      has no remaining links, of course).  However, there's one case where that
      does *not* happen.  Namely, if you open it by fhandle with cold dcache,
      then unlink() and close().
      
      	In the normal case, d_delete() in unlink(2) notices that the dentry
      is busy and unhashes it; on the final dput() it will be forcibly evicted from
      dcache, triggering iput() and inode removal.  In this case, though, we end
      up with *two* dentries - a disconnected one (created by open-by-fhandle) and
      a regular one (used by unlink()).  The latter will have its reference to the
      inode dropped just fine, but the former will not - it's considered hashed (it
      is on the ->s_anon list), so it will stay around until memory pressure
      finally does it in.  As a result, we have the final iput() delayed
      indefinitely.  It's trivial to reproduce:
      
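      #define _GNU_SOURCE	/* name_to_handle_at(), open_by_handle_at() */
      #include <fcntl.h>
      #include <stdlib.h>
      #include <sys/stat.h>
      #include <unistd.h>

      /* a remount is used here to evict unused dentries from the dcache */
      void flush_dcache(void)
      {
              system("mount -o remount,rw /");
      }

      static char buf[20 * 1024 * 1024];

      int main(void)
      {
              int fd;
              union {
                      struct file_handle f;
                      char buf[MAX_HANDLE_SZ];
              } x;
              int m;

              /* handle_bytes is the room left for f_handle[] after the header */
              x.f.handle_bytes = sizeof(x) - sizeof(x.f);
              chdir("/root");
              mkdir("foo", 0700);
              fd = open("foo/bar", O_CREAT | O_RDWR, 0600);
              close(fd);
              name_to_handle_at(AT_FDCWD, "foo/bar", &x.f, &m, 0);
              flush_dcache();		/* make the dcache cold */
              fd = open_by_handle_at(AT_FDCWD, &x.f, O_RDWR);
              unlink("foo/bar");
              write(fd, buf, sizeof(buf));
              system("df .");			/* 20Mb eaten */
              close(fd);
              system("df .");			/* should've freed those 20Mb */
              flush_dcache();
              system("df .");			/* should be the same as #2 */
              return 0;
      }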
      
      will spit out something like
      Filesystem     1K-blocks   Used Available Use% Mounted on
      /dev/root         322023 303843      1131 100% /
      Filesystem     1K-blocks   Used Available Use% Mounted on
      /dev/root         322023 303843      1131 100% /
      Filesystem     1K-blocks   Used Available Use% Mounted on
      /dev/root         322023 283282     21692  93% /
      - the inode gets freed only when the dentry is finally evicted (here we
      trigger that by remount; normally it would've happened in response to memory
      pressure hell knows when).
      
      Cc: stable@vger.kernel.org # v2.6.38+; earlier ones need s/kill_it/unhash_it/
      Acked-by: J. Bruce Fields <bfields@fieldses.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>