1. 03 November 2015, 1 commit
  2. 26 September 2015, 1 commit
    • ppp: fix lockdep splat in ppp_dev_uninit() · 58a89eca
      Authored by Guillaume Nault
      ppp_dev_uninit() locks all_ppp_mutex while under rtnl mutex protection.
      ppp_create_interface() must then lock these mutexes in that same order
      to avoid possible deadlock.
      
      [  120.880011] ======================================================
      [  120.880011] [ INFO: possible circular locking dependency detected ]
      [  120.880011] 4.2.0 #1 Not tainted
      [  120.880011] -------------------------------------------------------
      [  120.880011] ppp-apitest/15827 is trying to acquire lock:
      [  120.880011]  (&pn->all_ppp_mutex){+.+.+.}, at: [<ffffffffa0145f56>] ppp_dev_uninit+0x64/0xb0 [ppp_generic]
      [  120.880011]
      [  120.880011] but task is already holding lock:
      [  120.880011]  (rtnl_mutex){+.+.+.}, at: [<ffffffff812e4255>] rtnl_lock+0x12/0x14
      [  120.880011]
      [  120.880011] which lock already depends on the new lock.
      [  120.880011]
      [  120.880011]
      [  120.880011] the existing dependency chain (in reverse order) is:
      [  120.880011]
      [  120.880011] -> #1 (rtnl_mutex){+.+.+.}:
      [  120.880011]        [<ffffffff81073a6f>] lock_acquire+0xcf/0x10e
      [  120.880011]        [<ffffffff813ab18a>] mutex_lock_nested+0x56/0x341
      [  120.880011]        [<ffffffff812e4255>] rtnl_lock+0x12/0x14
      [  120.880011]        [<ffffffff812d9d94>] register_netdev+0x11/0x27
      [  120.880011]        [<ffffffffa0147b17>] ppp_ioctl+0x289/0xc98 [ppp_generic]
      [  120.880011]        [<ffffffff8113b367>] do_vfs_ioctl+0x4ea/0x532
      [  120.880011]        [<ffffffff8113b3fd>] SyS_ioctl+0x4e/0x7d
      [  120.880011]        [<ffffffff813ad7d7>] entry_SYSCALL_64_fastpath+0x12/0x6f
      [  120.880011]
      [  120.880011] -> #0 (&pn->all_ppp_mutex){+.+.+.}:
      [  120.880011]        [<ffffffff8107334e>] __lock_acquire+0xb07/0xe76
      [  120.880011]        [<ffffffff81073a6f>] lock_acquire+0xcf/0x10e
      [  120.880011]        [<ffffffff813ab18a>] mutex_lock_nested+0x56/0x341
      [  120.880011]        [<ffffffffa0145f56>] ppp_dev_uninit+0x64/0xb0 [ppp_generic]
      [  120.880011]        [<ffffffff812d5263>] rollback_registered_many+0x19e/0x252
      [  120.880011]        [<ffffffff812d5381>] rollback_registered+0x29/0x38
      [  120.880011]        [<ffffffff812d53fa>] unregister_netdevice_queue+0x6a/0x77
      [  120.880011]        [<ffffffffa0146a94>] ppp_release+0x42/0x79 [ppp_generic]
      [  120.880011]        [<ffffffff8112d9f6>] __fput+0xec/0x192
      [  120.880011]        [<ffffffff8112dacc>] ____fput+0x9/0xb
      [  120.880011]        [<ffffffff8105447a>] task_work_run+0x66/0x80
      [  120.880011]        [<ffffffff81001801>] prepare_exit_to_usermode+0x8c/0xa7
      [  120.880011]        [<ffffffff81001900>] syscall_return_slowpath+0xe4/0x104
      [  120.880011]        [<ffffffff813ad931>] int_ret_from_sys_call+0x25/0x9f
      [  120.880011]
      [  120.880011] other info that might help us debug this:
      [  120.880011]
      [  120.880011]  Possible unsafe locking scenario:
      [  120.880011]
      [  120.880011]        CPU0                    CPU1
      [  120.880011]        ----                    ----
      [  120.880011]   lock(rtnl_mutex);
      [  120.880011]                                lock(&pn->all_ppp_mutex);
      [  120.880011]                                lock(rtnl_mutex);
      [  120.880011]   lock(&pn->all_ppp_mutex);
      [  120.880011]
      [  120.880011]  *** DEADLOCK ***
      
      Fixes: 8cb775bc ("ppp: fix device unregistration upon netns deletion")
      Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
      Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
      Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      58a89eca
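
      A minimal userspace sketch of the lock-ordering rule the fix enforces, with
      pthread mutexes standing in for rtnl_mutex and pn->all_ppp_mutex; the
      function names mirror the kernel ones, but this is an analogue, not the
      driver code:

          #include <pthread.h>
          #include <stdio.h>

          /* Stand-ins for rtnl_mutex and pn->all_ppp_mutex. */
          static pthread_mutex_t rtnl_mutex    = PTHREAD_MUTEX_INITIALIZER;
          static pthread_mutex_t all_ppp_mutex = PTHREAD_MUTEX_INITIALIZER;

          /* Analogue of ppp_create_interface() after the fix:
           * take "rtnl" first, then "all_ppp", like ppp_dev_uninit() does. */
          static void *create_interface(void *arg)
          {
              (void)arg;
              pthread_mutex_lock(&rtnl_mutex);
              pthread_mutex_lock(&all_ppp_mutex);
              puts("create: unit registered under both locks");
              pthread_mutex_unlock(&all_ppp_mutex);
              pthread_mutex_unlock(&rtnl_mutex);
              return NULL;
          }

          /* Analogue of unregister -> ppp_dev_uninit(): same order. */
          static void *unregister_device(void *arg)
          {
              (void)arg;
              pthread_mutex_lock(&rtnl_mutex);
              pthread_mutex_lock(&all_ppp_mutex);
              puts("uninit: unit removed under both locks");
              pthread_mutex_unlock(&all_ppp_mutex);
              pthread_mutex_unlock(&rtnl_mutex);
              return NULL;
          }

          int main(void)
          {
              pthread_t a, b;

              /* Both paths use the same order, so the ABBA scenario from the
               * splat above cannot happen however the threads interleave. */
              pthread_create(&a, NULL, create_interface, NULL);
              pthread_create(&b, NULL, unregister_device, NULL);
              pthread_join(a, NULL);
              pthread_join(b, NULL);
              return 0;
          }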
  3. 26 August 2015, 1 commit
    • ppp: implement x-netns support · 79c441ae
      Authored by Guillaume Nault
      Let packets move from one netns to the other at PPP encapsulation and
      decapsulation time.
      
      PPP units and channels remain in the netns in which they were
      originally created. Only the net_device may move to a different
      namespace. Cross netns handling is thus transparent to lower PPP
      layers (PPPoE, L2TP, etc.).
      
      PPP devices are automatically unregistered when their netns gets
      removed. So read() and poll() on the unit file descriptor will
      respectively receive EOF and POLLHUP. Channels aren't affected.
      Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      79c441ae
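
      The read()/poll() behaviour described above is what a control daemon sees
      from userspace. Below is a hedged sketch of the handling loop, assuming fd
      is a PPP unit file descriptor the caller has already set up (the usual
      /dev/ppp plus PPPIOCNEWUNIT dance is omitted); only standard
      poll(2)/read(2) semantics are used:

          #include <poll.h>
          #include <stdio.h>
          #include <unistd.h>

          static void unit_loop(int fd)
          {
              unsigned char buf[2048];

              for (;;) {
                  struct pollfd pfd = { .fd = fd, .events = POLLIN };

                  if (poll(&pfd, 1, -1) < 0) {
                      perror("poll");
                      return;
                  }
                  if (pfd.revents & POLLHUP) {      /* device's netns was removed */
                      fprintf(stderr, "unit gone (POLLHUP), tearing down\n");
                      return;
                  }
                  if (pfd.revents & POLLIN) {
                      ssize_t n = read(fd, buf, sizeof(buf));
                      if (n == 0) {                 /* EOF: device unregistered */
                          fprintf(stderr, "unit gone (EOF), tearing down\n");
                          return;
                      }
                      if (n > 0)
                          fprintf(stderr, "got %zd byte control packet\n", n);
                  }
              }
          }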
  4. 18 August 2015, 1 commit
    • ppp: fix device unregistration upon netns deletion · 8cb775bc
      Authored by Guillaume Nault
      PPP devices may get automatically unregistered when their network
      namespace is getting removed. This happens if the ppp control plane
      daemon (e.g. pppd) exits while it is the last user of this namespace.
      
      This leads to several races:
      
        * ppp_exit_net() may destroy the per namespace idr (pn->units_idr)
          before all file descriptors were released. Successive ppp_release()
          calls may then cleanup PPP devices with ppp_shutdown_interface() and
          try to use the already destroyed idr.
      
        * Automatic device unregistration may also happen before the
          ppp_release() call for that device gets executed. Once called on
          the file owning the device, ppp_release() will then clean it up and
          try to unregister it a second time.
      
      To fix these issues, operations defined in ppp_shutdown_interface() are
      moved to the PPP device's ndo_uninit() callback. This allows PPP
      devices to be properly cleaned up by unregister_netdev() and friends.
      So checking for ppp->owner is now an accurate test to decide if a PPP
      device should be unregistered.
      
      Setting ppp->owner is done in ppp_create_interface(), before device
      registration, in order to avoid unprotected modification of this field.
      
      Finally, ppp_exit_net() now starts by unregistering all remaining PPP
      devices to ensure that none will get unregistered after the call to
      idr_destroy().
      Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8cb775bc
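
      A small userspace analogue of the design described above: teardown lives in
      a single uninit callback that the unregister path invokes, so it runs
      exactly once no matter which path (namespace teardown or a later release)
      triggers unregistration. All types and names here are illustrative, not the
      kernel API:

          #include <stdbool.h>
          #include <stdio.h>
          #include <stdlib.h>
          #include <string.h>

          struct device;

          struct device_ops {
              void (*uninit)(struct device *dev);   /* analogue of ndo_uninit */
          };

          struct device {
              const struct device_ops *ops;
              bool registered;
              char *unit_state;                     /* stands in for the idr entry etc. */
          };

          /* Whichever path runs first does the cleanup; a second call is a no-op. */
          static void unregister_device(struct device *dev)
          {
              if (!dev->registered)
                  return;
              dev->registered = false;
              dev->ops->uninit(dev);                /* single, well-defined cleanup point */
          }

          static void ppp_dev_uninit(struct device *dev)
          {
              free(dev->unit_state);
              dev->unit_state = NULL;
              puts("uninit: unit removed from the per-netns table");
          }

          static const struct device_ops ppp_ops = { .uninit = ppp_dev_uninit };

          int main(void)
          {
              struct device dev = { .ops = &ppp_ops, .registered = true,
                                    .unit_state = strdup("unit0") };

              unregister_device(&dev);  /* e.g. netns teardown */
              unregister_device(&dev);  /* e.g. a later release: now a no-op */
              return 0;
          }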
  5. 21 April 2015, 1 commit
  6. 10 December 2014, 1 commit
  7. 12 November 2014, 1 commit
  8. 09 October 2014, 1 commit
    • fix misuses of f_count() in ppp and netlink · 24dff96a
      Authored by Al Viro
      we used to check for "nobody else could start doing anything with
      that opened file" by checking that the refcount was 2 or less - one
      for the descriptor table and one we'd acquired in fget() on the way to
      wherever we are.  That was race-prone (somebody else might have
      had a reference to the descriptor table and done fget() just as we'd
      been checking) and it had become flat-out incorrect back when
      we switched to fget_light() on those codepaths - unlike fget(),
      it doesn't grab an extra reference unless the descriptor table
      is shared.  The same change allowed a race-free check, though -
      we are safe exactly when the refcount is less than 2.
      
      It was a long time ago; pre-2.6.12 for ioctl() (the codepath leading
      to ppp one) and 2.6.17 for sendmsg() (netlink one).  OTOH,
      netlink hadn't grown that check until 3.9 and ppp used to live
      in drivers/net, not drivers/net/ppp until 3.1.  The bug existed
      well before that, though, and the same fix used to apply in old
      location of file.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      24dff96a
  9. 08 October 2014, 1 commit
    • net: better IFF_XMIT_DST_RELEASE support · 02875878
      Authored by Eric Dumazet
      Testing xmit_more support with netperf and connected UDP sockets,
      I found strange dst refcount false sharing.
      
      Current handling of IFF_XMIT_DST_RELEASE is not optimal.
      
      Dropping the dst in validate_xmit_skb() is certainly too late in case
      the packet was queued by cpu X but dequeued by cpu Y.
      
      The logical point to take care of drop/force is in __dev_queue_xmit(),
      before even taking the qdisc lock.
      
      As Julian Anastasov pointed out, the need for skb_dst() might come from
      some packet schedulers or classifiers.
      
      This patch adds a new helper to cleanly express the needs of various
      drivers or qdiscs/classifiers.
      
      Drivers that need skb_dst() in their ndo_start_xmit() should call the
      following helper in their setup instead of the prior:
      
      	dev->priv_flags &= ~IFF_XMIT_DST_RELEASE;
      ->
      	netif_keep_dst(dev);
      
      Instead of using a single bit, we use two bits, one of which is
      eventually rebuilt in the bonding/team drivers.
      
      The other one is permanent and blocks IFF_XMIT_DST_RELEASE from being
      rebuilt in bonding/team. Eventually, we could add something
      smarter later.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      02875878
  10. 03 August 2014, 1 commit
    • net: filter: split 'struct sk_filter' into socket and bpf parts · 7ae457c1
      Authored by Alexei Starovoitov
      clean up names related to socket filtering and bpf in the following way:
      - everything that deals with sockets keeps 'sk_*' prefix
      - everything that is pure BPF is changed to 'bpf_*' prefix
      
      split 'struct sk_filter' into
      struct sk_filter {
      	atomic_t        refcnt;
      	struct rcu_head rcu;
      	struct bpf_prog *prog;
      };
      and
      struct bpf_prog {
              u32                     jited:1,
                                      len:31;
              struct sock_fprog_kern  *orig_prog;
              unsigned int            (*bpf_func)(const struct sk_buff *skb,
                                                  const struct bpf_insn *filter);
              union {
                      struct sock_filter      insns[0];
                      struct bpf_insn         insnsi[0];
                      struct work_struct      work;
              };
      };
      so that 'struct bpf_prog' can be used independently of sockets and cleans up
      'unattached' bpf use cases
      
      split SK_RUN_FILTER macro into:
          SK_RUN_FILTER to be used with 'struct sk_filter *' and
          BPF_PROG_RUN to be used with 'struct bpf_prog *'
      
      __sk_filter_release(struct sk_filter *) gains
      __bpf_prog_release(struct bpf_prog *) helper function
      
      also perform related renames for the functions that work
      with 'struct bpf_prog *', since they're along the same lines:
      
      sk_filter_size -> bpf_prog_size
      sk_filter_select_runtime -> bpf_prog_select_runtime
      sk_filter_free -> bpf_prog_free
      sk_unattached_filter_create -> bpf_prog_create
      sk_unattached_filter_destroy -> bpf_prog_destroy
      sk_store_orig_filter -> bpf_prog_store_orig_filter
      sk_release_orig_filter -> bpf_release_orig_filter
      __sk_migrate_filter -> bpf_migrate_filter
      __sk_prepare_filter -> bpf_prepare_filter
      
      API for attaching classic BPF to a socket stays the same:
      sk_attach_filter(prog, struct sock *)/sk_detach_filter(struct sock *)
      and SK_RUN_FILTER(struct sk_filter *, ctx) to execute a program
      which is used by sockets, tun, af_packet
      
      API for 'unattached' BPF programs becomes:
      bpf_prog_create(struct bpf_prog **)/bpf_prog_destroy(struct bpf_prog *)
      and BPF_PROG_RUN(struct bpf_prog *, ctx) to execute a program
      which is used by isdn, ppp, team, seccomp, ptp, xt_bpf, cls_bpf, test_bpf
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7ae457c1
  11. 17 July 2014, 2 commits
    • net: ppp: fix creating PPP pass and active filters · cc25eaae
      Authored by Christoph Schulz
      Commit 568f194e ("net: ppp: use
      sk_unattached_filter api") inadvertently changed the logic when setting
      PPP pass and active filters. This applies to both the generic PPP subsystem
      implemented by drivers/net/ppp/ppp_generic.c and the ISDN PPP subsystem
      implemented by drivers/isdn/i4l/isdn_ppp.c. The original code in ppp_ioctl()
      (or isdn_ppp_ioctl(), resp.) handling PPPIOCSPASS and PPPIOCSACTIVE allowed
      removing a previously set pass/active filter by using a filter of length zero.
      However, with the new code this is not possible anymore as this case is not
      explicitly checked for, which leads to passing NULL as a filter to
      sk_unattached_filter_create(). This results in returning EINVAL to the caller.
      
      Additionally, the variables ppp->pass_filter and ppp->active_filter (or
      is->pass_filter and is->active_filter, resp.) are not reset to NULL, although
      the filters they point to may have been destroyed by
      sk_unattached_filter_destroy(), so in this EINVAL case dangling pointers are
      left behind (provided the pointers were previously non-NULL).
      
      This patch corrects both problems by checking whether the filter passed is
      empty or non-empty, and prevents sk_unattached_filter_create() from being
      called in the first case. Moreover, the pointers are always reset to NULL
      as soon as sk_unattached_filter_destroy() returns.
      Signed-off-by: Christoph Schulz <develop@kristov.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cc25eaae
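
      A small userspace sketch of the pattern the fix describes: an empty filter
      simply removes the previously installed one, and the stored pointer is
      always reset after destruction so no dangling pointer survives.
      prog_create()/prog_destroy() are hypothetical stand-ins for
      sk_unattached_filter_create()/sk_unattached_filter_destroy():

          #include <stddef.h>
          #include <stdlib.h>

          struct prog { size_t len; };                      /* placeholder program object */

          static struct prog *prog_create(const void *insns, size_t len)
          {
              struct prog *p = malloc(sizeof(*p));
              if (p)
                  p->len = len;
              (void)insns;
              return p;
          }

          static void prog_destroy(struct prog *p) { free(p); }

          /* Returns 0 on success, -1 on allocation failure. */
          static int set_pass_filter(struct prog **slot, const void *insns, size_t len)
          {
              if (*slot) {
                  prog_destroy(*slot);
                  *slot = NULL;                             /* never leave a dangling pointer */
              }
              if (len == 0)
                  return 0;                                 /* empty filter just removes the old one */

              *slot = prog_create(insns, len);
              return *slot ? 0 : -1;
          }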
    • net: ppp: access ppp->nextseq only if CONFIG_PPP_MULTILINK is defined · a9f559c3
      Authored by Christoph Schulz
      Commit d762d038 resets the counter holding the
      next sequence number for multilink PPP fragments to zero whenever the
      SC_MULTILINK flag is set. However, this counter only exists if
      CONFIG_PPP_MULTILINK is defined. Consequently, the new code has to be enclosed
      within #ifdef CONFIG_PPP_MULTILINK ... #endif.
      Signed-off-by: Christoph Schulz <develop@kristov.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a9f559c3
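
      A tiny compile-time illustration of the guard the commit adds: the field
      and every access to it must sit under the same config symbol. Build with or
      without -DCONFIG_PPP_MULTILINK; the field and flag names here are
      illustrative:

          #include <stdio.h>

          struct ppp_state {
              unsigned int flags;
          #ifdef CONFIG_PPP_MULTILINK
              unsigned int nextseq;        /* only exists in multilink builds */
          #endif
          };

          static void set_multilink(struct ppp_state *ppp)
          {
              ppp->flags |= 1u;            /* stand-in for SC_MULTILINK */
          #ifdef CONFIG_PPP_MULTILINK
              ppp->nextseq = 0;            /* the reset must sit under the same #ifdef */
          #endif
          }

          int main(void)
          {
              struct ppp_state ppp = { 0 };
              set_multilink(&ppp);
              printf("flags=%u\n", ppp.flags);
              return 0;
          }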
  12. 16 July 2014, 2 commits
  13. 15 July 2014, 1 commit
    • net: ppp: don't call sk_chk_filter twice · 3916a319
      Authored by Christoph Schulz
      Commit 568f194e ("net: ppp: use
      sk_unattached_filter api") causes sk_chk_filter() to be called twice when
      setting a PPP pass or active filter. This applies to both the generic PPP
      subsystem implemented by drivers/net/ppp/ppp_generic.c and the ISDN PPP
      subsystem implemented by drivers/isdn/i4l/isdn_ppp.c. The first call is from
      within get_filter(). The second one is through the call chain
      
        ppp_ioctl() or isdn_ppp_ioctl()
        --> sk_unattached_filter_create()
            --> __sk_prepare_filter()
                --> sk_chk_filter()
      
      The first call from within get_filter() should be deleted as get_filter() is
      called just before calling sk_unattached_filter_create() later on, which
      eventually calls sk_chk_filter() anyway.
      
      For 3.15.x, this proposed change is a bugfix rather than a pure optimization as
      in that branch, sk_chk_filter() may replace filter codes by other codes which
      are not recognized when executing sk_chk_filter() a second time. So with
      3.15.x, if sk_chk_filter() is called twice, the second invocation may yield
      EINVAL (this depends on the filter codes found in the filter to be set, but
      because the replacement is done for frequently used codes, this is almost
      always the case). The net effect is that setting pass and/or active PPP filters
      does not work anymore, since sk_unattached_filter_create() always returns
      EINVAL due to the second call to sk_chk_filter(), regardless of whether the
      filter was originally sane or not.
      Signed-off-by: Christoph Schulz <develop@kristov.de>
      Acked-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3916a319
  14. 24 May 2014, 1 commit
    • net: filter: let unattached filters use sock_fprog_kern · b1fcd35c
      Authored by Daniel Borkmann
      The sk_unattached_filter_create() API is used by BPF filters that
      are not directly attached or related to sockets, and are used in
      team, ptp, xt_bpf, cls_bpf, etc. As such, all users do their own
      internal management of obtaining filter blocks and thus already
      have them in kernel memory and set up before calling into
      sk_unattached_filter_create(). As a result, due to __user annotation
      in sock_fprog, sparse triggers false positives (incorrect type in
      assignment [different address space]) when filters are set up before
      passing them to sk_unattached_filter_create(). Therefore, let
      sk_unattached_filter_create() API use sock_fprog_kern to overcome
      this issue.
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b1fcd35c
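
      For reference, approximate shapes of the two structures the commit
      distinguishes, written from memory rather than copied from the tree;
      __user is defined away here purely so the snippet stands alone (in the
      kernel it is a sparse address-space annotation):

          #include <stdint.h>

          #define __user  /* sparse: pointer lives in the userspace address space */

          struct sock_filter {            /* one classic BPF instruction */
              uint16_t code;
              uint8_t  jt, jf;
              uint32_t k;
          };

          struct sock_fprog {             /* UAPI form: filter pointer is __user */
              unsigned short len;
              struct sock_filter __user *filter;
          };

          struct sock_fprog_kern {        /* kernel-internal form for unattached filters */
              uint16_t len;
              struct sock_filter *filter; /* plain kernel pointer: no sparse warning */
          };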
  15. 31 March 2014, 1 commit
  16. 28 February 2013, 1 commit
  17. 20 February 2013, 1 commit
  18. 16 February 2013, 1 commit
  19. 02 November 2012, 1 commit
  20. 04 August 2012, 1 commit
  21. 19 May 2012, 1 commit
  22. 14 April 2012, 1 commit
    • ppp: Fix race condition with queue start/stop · 9a5d2bd9
      Authored by David Woodhouse
      Commit e675f0cc ("ppp: Don't stop and
      restart queue on every TX packet") introduced a race condition which
      could leave the net queue stopped even when the channel is no longer
      busy. By calling netif_stop_queue() from ppp_start_xmit(), based on the
      return value from ppp_xmit_process() but *after* all the locks have been
      dropped, we could potentially do so *after* the channel has actually
      finished transmitting and attempted to re-wake the queue.
      
      Fix this by moving the netif_stop_queue() into ppp_xmit_process() under
      the xmit lock. I hadn't done this previously, because it gets called
      from other places than ppp_start_xmit(). But I now think it's the better
      option. The net queue *should* be stopped if the channel becomes
      congested due to writes from pppd, anyway.
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9a5d2bd9
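
      A minimal userspace analogue of the fixed pattern, using a pthread mutex in
      place of the xmit lock: the decision to stop the queue is made while the
      lock that the completion path also takes is still held, so the wake-up
      cannot slip into the window between the check and the stop. Names mirror
      the driver, but this is not the kernel code:

          #include <pthread.h>
          #include <stdbool.h>
          #include <stdio.h>
          #include <unistd.h>

          static pthread_mutex_t xmit_lock = PTHREAD_MUTEX_INITIALIZER;
          static bool channel_busy  = false;
          static bool queue_stopped = false;

          /* Analogue of ppp_xmit_process(): push a packet and, while still holding
           * the lock, stop the queue if the channel is now busy. */
          static void xmit_process(int pkt)
          {
              pthread_mutex_lock(&xmit_lock);
              printf("xmit packet %d\n", pkt);
              channel_busy = true;              /* pretend the channel filled up */
              if (channel_busy)
                  queue_stopped = true;         /* netif_stop_queue() analogue, under the lock */
              pthread_mutex_unlock(&xmit_lock);
          }

          /* Analogue of the channel completion path (ppp_output_wakeup()). */
          static void *completion_thread(void *arg)
          {
              (void)arg;
              usleep(1000);
              pthread_mutex_lock(&xmit_lock);
              channel_busy  = false;
              queue_stopped = false;            /* netif_wake_queue() analogue */
              pthread_mutex_unlock(&xmit_lock);
              return NULL;
          }

          int main(void)
          {
              pthread_t t;

              pthread_create(&t, NULL, completion_thread, NULL);
              xmit_process(1);
              pthread_join(t, NULL);

              pthread_mutex_lock(&xmit_lock);
              printf("queue stopped: %s\n", queue_stopped ? "yes" : "no");
              pthread_mutex_unlock(&xmit_lock);
              return 0;
          }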
  23. 04 April 2012, 1 commit
    • ppp: Don't stop and restart queue on every TX packet · e675f0cc
      Authored by David Woodhouse
      For every transmitted packet, ppp_start_xmit() will stop the netdev
      queue and then, if appropriate, restart it. This causes the TX softirq
      to run, entirely gratuitously.
      
      This is "only" a waste of CPU time in the normal case, but it's actively
      harmful when the PPP device is a TEQL slave — the wakeup will cause the
      offending device to receive the next TX packet from the TEQL queue, when
      it *should* have gone to the next slave in the list. We end up seeing
      large bursts of packets on just *one* slave device, rather than using
      the full available bandwidth over all slaves.
      
      This patch fixes the problem by *not* unconditionally stopping the queue
      in ppp_start_xmit(). It adds a return value from ppp_xmit_process()
      which indicates whether the queue should be stopped or not.
      
      It *doesn't* remove the call to netif_wake_queue() from
      ppp_xmit_process(), because other code paths (especially from
      ppp_output_wakeup()) need it there and it's messy to push it out to the
      other callers to do it based on the return value. So we leave it in
      place — it's a no-op in the case where the queue wasn't stopped, so it's
      harmless in the TX path.
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e675f0cc
  24. 05 March 2012, 1 commit
    • ppp: Move ioctl definitions from if_ppp.h to new ppp-ioctl.h · bf7daebb
      Authored by Paul Mackerras
      This moves the definitions of the ioctls, constants and structures
      relating to the ppp_generic interface to userspace out from if_ppp.h
      to a new file, ppp-ioctl.h.  The new file has my copyright since I
      designed and implemented the ppp_generic interface in the late 1990s.
      None of the contents of this file comes from the original if_ppp.h
      published by Carnegie Mellon University.
      
      Of the remainder of if_ppp.h, only the PPP_MTU definition was being
      used, and this replaces the uses of it with PPP_MRU (which is identical).
      Therefore, this replaces the entire file with the single line
      
      #include <linux/ppp-ioctl.h>
      
      which clearly doesn't contain any CMU code.  Thus I have removed the
      CMU copyright notice with its problematic advertising clause, and in
      fact since it's only one trivial line I have not added any other
      copyright notice.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bf7daebb
  25. 25 February 2012, 1 commit
    • ppp: fix 'ppp_mp_reconstruct bad seq' errors · 8a49ad6e
      Authored by Ben McKeegan
      This patch fixes a (mostly cosmetic) bug introduced by the patch
      'ppp: Use SKB queue abstraction interfaces in fragment processing'
      found here: http://www.spinics.net/lists/netdev/msg153312.html
      
      The above patch rewrote and moved the code responsible for cleaning
      up discarded fragments but the new code does not catch every case
      where this is necessary.  This results in some discarded fragments
      remaining in the queue, and triggering a 'bad seq' error on the
      subsequent call to ppp_mp_reconstruct.  Fragments are discarded
      whenever other fragments of the same frame have been lost.
      This can generate a lot of unwanted and misleading log messages.
      
      This patch also adds additional detail to the debug logging to
      make it clearer which fragments were lost and which other fragments
      were discarded as a result of losses. (Run pppd with 'kdebug 1'
      option to enable debug logging.)
      Signed-off-by: Ben McKeegan <ben@netservers.co.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8a49ad6e
  26. 14 February 2012, 1 commit
  27. 27 August 2011, 1 commit
    • ppp: Move the PPP drivers · 224cf5ad
      Authored by Jeff Kirsher
      Move the PPP drivers into drivers/net/ppp/ and make the
      necessary Kconfig and Makefile changes.
      
      CC: Paul Mackerras <paulus@samba.org>
      CC: Frank Cusack <fcusack@fcusack.com>
      CC: Michal Ostrowski <mostrows@speakeasy.net>
      CC: Michal Ostrowski <mostrows@earthlink.net>
      CC: Dmitry Kozlov <xeb@mail.ru>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      224cf5ad
  28. 27 July 2011, 1 commit
  29. 31 March 2011, 1 commit
  30. 21 January 2011, 3 commits
  31. 11 January 2011, 1 commit
  32. 29 December 2010, 1 commit
    • ppp: allow disabling multilink protocol ID compression · d39cd5e9
      Authored by Stephen Hemminger
      Linux would not connect to another router running an old version of
      Cisco IOS (12.0). This is most likely a bug in that version of IOS, since
      it is fixed in later versions. As a workaround, this patch adds a module
      parameter that can be set to disable compressing the protocol ID.
      
      See: https://bugzilla.vyatta.com/show_bug.cgi?id=3979
      
      RFC 1990 allows an implementation to formulate MP fragments as if protocol
      compression had been negotiated.  This allows us to always send compressed
      protocol IDs.  But some implementations don't accept MP fragments with
      compressed protocol IDs.  This parameter allows us to interoperate with
      them.  The default value of the configurable parameter is the same as the
      current behavior:  protocol compression is enabled.  If protocol compression
      is disabled we will not send compressed protocol IDs.
      
      This is based on an earlier patch by Bob Gilligan (using a sysctl).
      The module parameter is writable to allow enabling the workaround even if
      ppp is already loaded for other uses.
      Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
      Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
      Acked-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d39cd5e9
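
      A kernel-style sketch of the kind of writable module parameter described
      above; the parameter name and description are illustrative and may not
      match the tree exactly, and this is a fragment, not a standalone module.
      Mode 0644 is what makes it changeable at runtime via
      /sys/module/ppp_generic/parameters/:

          #include <linux/module.h>

          static bool mp_protocol_compress = true;   /* default keeps the old behaviour */
          module_param(mp_protocol_compress, bool, 0644);
          MODULE_PARM_DESC(mp_protocol_compress,
                           "compress protocol id in multilink fragments (set to 0 as a workaround)");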
  33. 29 November 2010, 1 commit
  34. 20 November 2010, 1 commit
    • filter: optimize sk_run_filter · 93aaae2e
      Authored by Eric Dumazet
      Remove the pc variable to avoid the arithmetic needed to compute fentry at
      each filter instruction. Jumps manipulate the fentry pointer directly.
      
      As the last instruction of filter[] is guaranteed to be a RETURN, and
      all jumps are before the last instruction, we don't need to check filter
      bounds (the number of instructions in the filter array) at each iteration,
      so we remove it from sk_run_filter()'s params.
      
      On x86_32 remove f_k var introduced in commit 57fe93b3
      (filter: make sure filters dont read uninitialized memory)
      
      Note : We could use a CONFIG_ARCH_HAS_{FEW|MANY}_REGISTERS in order to
      avoid too many ifdefs in this code.
      
      This helps compiler to use cpu registers to hold fentry and A
      accumulator.
      
      On x86_32, this saves 401 bytes and, more importantly, sk_run_filter()
      runs much faster because of reduced register pressure (one less conditional
      branch per BPF instruction).
      
      # size net/core/filter.o net/core/filter_pre.o
         text    data     bss     dec     hex filename
         2948       0       0    2948     b84 net/core/filter.o
         3349       0       0    3349     d15 net/core/filter_pre.o
      
      on x86_64 :
      # size net/core/filter.o net/core/filter_pre.o
         text    data     bss     dec     hex filename
         5173       0       0    5173    1435 net/core/filter.o
         5224       0       0    5224    1468 net/core/filter_pre.o
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Acked-by: Changli Gao <xiaosuo@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      93aaae2e
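
      A toy illustration of the iteration change: rather than keeping a pc index
      and recomputing &filter[pc] on every step, the interpreter walks a pointer
      and lets jumps adjust it directly. The instruction set below is made up;
      only the looping pattern mirrors the commit:

          #include <stdio.h>

          struct insn { int op; int arg; };          /* 0 = add, 1 = jump forward, 2 = return */

          static int run(const struct insn *prog)
          {
              int acc = 0;

              for (const struct insn *f = prog; ; f++) {
                  switch (f->op) {
                  case 0: acc += f->arg; break;
                  case 1: f += f->arg;   break;      /* jump: adjust the pointer, no index math */
                  case 2: return acc;                /* the last insn is guaranteed to return */
                  }
              }
          }

          int main(void)
          {
              const struct insn prog[] = {
                  { 0, 5 }, { 1, 1 }, { 0, 100 }, { 0, 2 }, { 2, 0 },
              };
              printf("%d\n", run(prog));             /* prints 7: the jump skips the +100 */
              return 0;
          }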
  35. 15 October 2010, 1 commit
    • llseek: automatically add .llseek fop · 6038f373
      Authored by Arnd Bergmann
      All file_operations should get a .llseek operation so we can make
      nonseekable_open the default for future file operations without a
      .llseek pointer.
      
      The three cases that we can automatically detect are no_llseek, seq_lseek
      and default_llseek. For cases where we can automatically prove that
      the file offset is always ignored, we use noop_llseek, which maintains
      the current behavior of not returning an error from a seek.
      
      New drivers should normally not use noop_llseek but instead use no_llseek
      and call nonseekable_open at open time.  Existing drivers can be converted
      to do the same when the maintainer knows for certain that no user code
      relies on calling seek on the device file.
      
      The generated code is often incorrectly indented and right now contains
      comments that clarify for each added line why a specific variant was
      chosen. In the version that gets submitted upstream, the comments will
      be gone and I will manually fix the indentation, because there does not
      seem to be a way to do that using coccinelle.
      
      Some amount of new code is currently sitting in linux-next that should get
      the same modifications, which I will do at the end of the merge window.
      
      Many thanks to Julia Lawall for helping me learn to write a semantic
      patch that does all this.
      
      ===== begin semantic patch =====
      // This adds an llseek= method to all file operations,
      // as a preparation for making no_llseek the default.
      //
      // The rules are
      // - use no_llseek explicitly if we do nonseekable_open
      // - use seq_lseek for sequential files
      // - use default_llseek if we know we access f_pos
      // - use noop_llseek if we know we don't access f_pos,
      //   but we still want to allow users to call lseek
      //
      @ open1 exists @
      identifier nested_open;
      @@
      nested_open(...)
      {
      <+...
      nonseekable_open(...)
      ...+>
      }
      
      @ open exists@
      identifier open_f;
      identifier i, f;
      identifier open1.nested_open;
      @@
      int open_f(struct inode *i, struct file *f)
      {
      <+...
      (
      nonseekable_open(...)
      |
      nested_open(...)
      )
      ...+>
      }
      
      @ read disable optional_qualifier exists @
      identifier read_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      expression E;
      identifier func;
      @@
      ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
      {
      <+...
      (
         *off = E
      |
         *off += E
      |
         func(..., off, ...)
      |
         E = *off
      )
      ...+>
      }
      
      @ read_no_fpos disable optional_qualifier exists @
      identifier read_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      @@
      ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
      {
      ... when != off
      }
      
      @ write @
      identifier write_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      expression E;
      identifier func;
      @@
      ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
      {
      <+...
      (
        *off = E
      |
        *off += E
      |
        func(..., off, ...)
      |
        E = *off
      )
      ...+>
      }
      
      @ write_no_fpos @
      identifier write_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      @@
      ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
      {
      ... when != off
      }
      
      @ fops0 @
      identifier fops;
      @@
      struct file_operations fops = {
       ...
      };
      
      @ has_llseek depends on fops0 @
      identifier fops0.fops;
      identifier llseek_f;
      @@
      struct file_operations fops = {
      ...
       .llseek = llseek_f,
      ...
      };
      
      @ has_read depends on fops0 @
      identifier fops0.fops;
      identifier read_f;
      @@
      struct file_operations fops = {
      ...
       .read = read_f,
      ...
      };
      
      @ has_write depends on fops0 @
      identifier fops0.fops;
      identifier write_f;
      @@
      struct file_operations fops = {
      ...
       .write = write_f,
      ...
      };
      
      @ has_open depends on fops0 @
      identifier fops0.fops;
      identifier open_f;
      @@
      struct file_operations fops = {
      ...
       .open = open_f,
      ...
      };
      
      // use no_llseek if we call nonseekable_open
      ////////////////////////////////////////////
      @ nonseekable1 depends on !has_llseek && has_open @
      identifier fops0.fops;
      identifier nso ~= "nonseekable_open";
      @@
      struct file_operations fops = {
      ...  .open = nso, ...
      +.llseek = no_llseek, /* nonseekable */
      };
      
      @ nonseekable2 depends on !has_llseek @
      identifier fops0.fops;
      identifier open.open_f;
      @@
      struct file_operations fops = {
      ...  .open = open_f, ...
      +.llseek = no_llseek, /* open uses nonseekable */
      };
      
      // use seq_lseek for sequential files
      /////////////////////////////////////
      @ seq depends on !has_llseek @
      identifier fops0.fops;
      identifier sr ~= "seq_read";
      @@
      struct file_operations fops = {
      ...  .read = sr, ...
      +.llseek = seq_lseek, /* we have seq_read */
      };
      
      // use default_llseek if there is a readdir
      ///////////////////////////////////////////
      @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier readdir_e;
      @@
      // any other fop is used that changes pos
      struct file_operations fops = {
      ... .readdir = readdir_e, ...
      +.llseek = default_llseek, /* readdir is present */
      };
      
      // use default_llseek if at least one of read/write touches f_pos
      /////////////////////////////////////////////////////////////////
      @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read.read_f;
      @@
      // read fops use offset
      struct file_operations fops = {
      ... .read = read_f, ...
      +.llseek = default_llseek, /* read accesses f_pos */
      };
      
      @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier write.write_f;
      @@
      // write fops use offset
      struct file_operations fops = {
      ... .write = write_f, ...
      +	.llseek = default_llseek, /* write accesses f_pos */
      };
      
      // Use noop_llseek if neither read nor write accesses f_pos
      ///////////////////////////////////////////////////////////
      
      @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read_no_fpos.read_f;
      identifier write_no_fpos.write_f;
      @@
      // write fops use offset
      struct file_operations fops = {
      ...
       .write = write_f,
       .read = read_f,
      ...
      +.llseek = noop_llseek, /* read and write both use no f_pos */
      };
      
      @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier write_no_fpos.write_f;
      @@
      struct file_operations fops = {
      ... .write = write_f, ...
      +.llseek = noop_llseek, /* write uses no f_pos */
      };
      
      @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read_no_fpos.read_f;
      @@
      struct file_operations fops = {
      ... .read = read_f, ...
      +.llseek = noop_llseek, /* read uses no f_pos */
      };
      
      @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      @@
      struct file_operations fops = {
      ...
      +.llseek = noop_llseek, /* no read or write fn */
      };
      ===== End semantic patch =====
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Julia Lawall <julia@diku.dk>
      Cc: Christoph Hellwig <hch@infradead.org>
      6038f373
  36. 05 October 2010, 1 commit