1. 19 Jun, 2012 2 commits
    •
      netfilter: nf_ct_helper: disable automatic helper re-assignment of different type · 32f53760
      Pablo Neira Ayuso authored
      This patch modifies __nf_ct_try_assign_helper in a way that invalidates support
      for the following scenario:
      
      1) attach helper A for the first time when the conntrack is created
      2) attach a new (different) helper B due to changes in the reply tuple caused by NAT

      e.g. port redirection from TCP/21 to TCP/5060 with both FTP and SIP helpers
      loaded, which seems to be a quite unorthodox scenario.
      
      I can provide a more elaborate patch to support this scenario, but explicit
      helper attachment provides a better solution since now the user can
      attach the helpers consistently, without relying on the automatic helper
      lookup magic.

      This patch fixes a possible out-of-bounds zeroing of the conntrack helper
      extension if the helper B uses more memory for its private data than
      helper A.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
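
The out-of-bounds zeroing described in this commit can be illustrated with a small user-space sketch. It is not the patch itself: `struct helper`, `struct help_ext`, `help_ext_alloc()` and `try_assign_helper()` are hypothetical names, and the sketch only assumes that the extension area was sized for the first helper and that attaching a helper clears `data_len` bytes of private data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins, not the kernel's structures. */
struct helper {
    const char *name;
    size_t data_len;          /* bytes of private data this helper needs */
};

struct help_ext {
    const struct helper *helper;
    size_t alloc_len;         /* bytes actually allocated for data[] */
    unsigned char data[];     /* helper private area (flexible array) */
};

/* Allocate the extension, sized for the first helper only. */
static struct help_ext *help_ext_alloc(const struct helper *h)
{
    struct help_ext *ext = calloc(1, sizeof(*ext) + h->data_len);
    if (ext) {
        ext->helper = h;
        ext->alloc_len = h->data_len;
    }
    return ext;
}

/*
 * Refuse to switch to a different helper once one is attached.
 * Zeroing new_h->data_len bytes would write past data[] whenever
 * new_h->data_len > ext->alloc_len.
 * Returns 0 on success, -1 if the request is rejected.
 */
static int try_assign_helper(struct help_ext *ext, const struct helper *new_h)
{
    assert(ext != NULL && new_h != NULL);
    if (ext->helper != NULL && ext->helper != new_h)
        return -1;            /* different type: keep the current helper */
    if (new_h->data_len > ext->alloc_len)
        return -1;            /* would not fit in the existing area */
    ext->helper = new_h;
    memset(ext->data, 0, new_h->data_len);
    return 0;
}
```

With a 16-byte helper attached first, a later request for a 64-byte helper is rejected instead of clearing 64 bytes in a 16-byte area.
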
    •
      netfilter: ctnetlink: fix NULL dereference while trying to change helper · fd7462de
      Pablo Neira Ayuso authored
      The patch 1afc5679: "netfilter: nf_ct_helper: implement variable
      length helper private data" from Jun 7, 2012, leads to the following
      Smatch complaint:
      
      net/netfilter/nf_conntrack_netlink.c:1231 ctnetlink_change_helper()
               error: we previously assumed 'help->helper' could be null (see line 1228)
      
      This NULL dereference can be triggered with the following sequence:
      
      1) attach the helper for the first time when the conntrack is created.
      2) remove the helper module or detach the helper from the conntrack
         via ctnetlink.
      3) attach a helper again (the same one or a different one, it does not
         matter) to that existing conntrack via ctnetlink.

      This patch fixes the problem by removing the use case that allows you
      to re-assign a helper to a conntrack entry via ctnetlink, since
      I cannot find any practical use for it.
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
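
A minimal sketch of the resulting policy, under stated assumptions: the types and the function `change_helper()` below are hypothetical, and the error codes are chosen for illustration only. The point is that an entry which already carries a helper extension never reaches code that dereferences a helper pointer previously seen as NULL.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the conntrack helper and its extension. */
struct helper { const char *name; };
struct help_ext { const struct helper *helper; };

/*
 * An entry that already has a helper extension accepts only the helper
 * it currently holds. A detached extension (helper == NULL) or a
 * different helper is rejected, so ext->helper is never dereferenced
 * after it has been observed to be NULL.
 */
static int change_helper(struct help_ext *ext, const struct helper *requested)
{
    assert(requested != NULL);
    if (ext == NULL)
        return -EOPNOTSUPP;   /* no extension: first attach is out of scope here */
    if (ext->helper == requested)
        return 0;             /* same helper: nothing to re-assign */
    return -EBUSY;            /* re-assignment is not supported */
}
```
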
  2. 18 Jun, 2012 3 commits
  3. 17 Jun, 2012 6 commits
  4. 16 Jun, 2012 18 commits
    •
      netfilter: add user-space connection tracking helper infrastructure · 12f7a505
      Pablo Neira Ayuso authored
      There are good reasons to support helpers in user-space instead:
      
      * Rapid connection tracking helper development, as developing code
        in user-space is usually faster.
      
      * Reliability: A buggy helper does not crash the kernel. Moreover,
        we can monitor the helper process and restart it in case of problems.
      
      * Security: Avoid complex string matching and mangling in kernel-space
        running in privileged mode. Going further, we can even think about
        running user-space helpers as a non-root process.
      
      * Extensibility: It allows the development of very specific helpers (most
        likely non-standard proprietary protocols) that are very likely not to be
        accepted for mainline inclusion in the form of kernel-space connection
        tracking helpers.
      
      This patch adds the infrastructure to allow the implementation of
      user-space conntrack helpers by means of the new nfnetlink subsystem
      `nfnetlink_cthelper' and the existing queueing infrastructure
      (nfnetlink_queue).
      
      I had to add the new hook NF_IP6_PRI_CONNTRACK_HELPER to register
      ipv[4|6]_helper which results from splitting ipv[4|6]_confirm into
      two pieces. This change is required not to break NAT sequence
      adjustment and conntrack confirmation for traffic that is enqueued
      to our user-space conntrack helpers.
      
      Basic operation, in a few steps:
      
      1) Register user-space helper by means of `nfct':
      
       nfct helper add ftp inet tcp
      
       [ It must be a valid existing helper supported by conntrack-tools ]
      
      2) Add rules to enable the FTP user-space helper which is
         used to track traffic going to TCP port 21.
      
      For locally generated packets:
      
       iptables -I OUTPUT -t raw -p tcp --dport 21 -j CT --helper ftp
      
      For non-locally generated packets:
      
       iptables -I PREROUTING -t raw -p tcp --dport 21 -j CT --helper ftp
      
      3) Run the test conntrackd in helper mode (see example files under
         doc/helper/conntrackd.conf):

       conntrackd

      4) Generate FTP traffic. If everything is OK, conntrackd should
         create expectations (you can check that with `conntrack'):
      
       conntrack -E expect
      
          [NEW] 301 proto=6 src=192.168.1.136 dst=130.89.148.12 sport=0 dport=54037 mask-src=255.255.255.255 mask-dst=255.255.255.255 sport=0 dport=65535 master-src=192.168.1.136 master-dst=130.89.148.12 sport=57127 dport=21 class=0 helper=ftp
      [DESTROY] 301 proto=6 src=192.168.1.136 dst=130.89.148.12 sport=0 dport=54037 mask-src=255.255.255.255 mask-dst=255.255.255.255 sport=0 dport=65535 master-src=192.168.1.136 master-dst=130.89.148.12 sport=57127 dport=21 class=0 helper=ftp
      
      This confirms that our test helper is receiving packets including the
      conntrack information, and adding expectations in kernel-space.
      
      The user-space helper can also store its private tracking information
      in the conntrack structure in the kernel via the CTA_HELP_INFO attribute. The
      kernel will consider this a binary blob whose layout is unknown. This
      information will be included in the information that is transferred
      to user-space via glue code that integrates nfnetlink_queue and
      ctnetlink.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    •
      netfilter: ctnetlink: add CTA_HELP_INFO attribute · ae243bee
      Pablo Neira Ayuso authored
      This attribute can be used to modify and to dump the internal
      protocol information.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    •
      netfilter: nfnetlink_queue: add NAT TCP sequence adjustment if packet mangled · 8c88f87c
      Pablo Neira Ayuso authored
      User-space programs that receive traffic via NFQUEUE may mangle packets.
      If NAT is enabled, this usually confuses TCP sequence tracking, leading to
      traffic disruptions.
      
      With this patch, nfnl_queue will make the corresponding NAT TCP sequence
      adjustment if:
      
      1) The packet has been mangled,
      2) the NFQA_CFG_F_CONNTRACK flag has been set, and
      3) NAT is detected.
      
      There are some reports on the Internet complaining about this issue:
      http://stackoverflow.com/questions/260757/packet-mangling-utilities-besides-iptables

      For now, we only support TCP since we have no helpers for DCCP or SCTP.
      It is better to add this if we ever have a helper for those layer 4 protocols.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    •
      netfilter: add glue code to integrate nfnetlink_queue and ctnetlink · 9cb01766
      Pablo Neira Ayuso authored
      This patch allows you to include the conntrack information together
      with the packet that is sent to user-space via NFQUEUE.
      
      Previously, there was no integration between ctnetlink and
      nfnetlink_queue. If you wanted to access conntrack information
      from your libnetfilter_queue program, you were required to query
      ctnetlink from user-space to obtain it, thus delaying the packet
      processing even more.

      Including the conntrack information is optional; you can enable it
      via the NFQA_CFG_F_CONNTRACK flag with the new NFQA_CFG_FLAGS attribute.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    •
      netfilter: nf_ct_helper: implement variable length helper private data · 1afc5679
      Pablo Neira Ayuso authored
      This patch uses the new variable length conntrack extensions.
      
      Instead of using union nf_conntrack_help, which contains all the
      helper private data information, we allocate a variable-length
      area to store the private helper data.

      This patch includes the modification of all existing helpers.
      It also adds a couple of header includes to avoid compilation
      warnings.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    •
      netfilter: nf_ct_ext: support variable length extensions · 3cf4c7e3
      Pablo Neira Ayuso authored
      We can now define conntrack extensions of variable size. This
      patch is useful to get rid of these unions:
      
      union nf_conntrack_help
      union nf_conntrack_proto
      union nf_conntrack_nat_help
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
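
To show why replacing a union with a variable-length area saves memory, here is a hedged user-space sketch. None of these types are the kernel's: `small_priv`, `large_priv`, `ext_fixed` and `ext_var` are hypothetical, and the sizes are picked only to make the difference visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-helper private data of very different sizes. */
struct small_priv { int state; };
struct large_priv { unsigned char buf[256]; };

/* Fixed-size approach: every entry pays for the largest member. */
union any_priv {
    struct small_priv small;
    struct large_priv large;
};

struct ext_fixed {
    int id;
    union any_priv priv;
};

/* Variable-length approach: the private area is sized per entry. */
struct ext_var {
    int id;
    size_t len;
    unsigned char data[];   /* C99 flexible array member */
};

/* Total allocation size for an entry with priv_len bytes of private data. */
static size_t ext_var_size(size_t priv_len)
{
    assert(priv_len <= SIZE_MAX - sizeof(struct ext_var)); /* no overflow */
    return sizeof(struct ext_var) + priv_len;
}

/* Allocate a zeroed entry whose private area is exactly priv_len bytes. */
static struct ext_var *ext_var_alloc(int id, size_t priv_len)
{
    struct ext_var *e = calloc(1, ext_var_size(priv_len));
    if (e) {
        e->id = id;
        e->len = priv_len;
    }
    return e;
}
```

An entry that only needs `struct small_priv` occupies far less than `sizeof(struct ext_fixed)`, which is always dominated by the 256-byte member.
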
    •
      netfilter: nf_ct_helper: allocate 16 bytes for the helper and policy names · 3a8fc53a
      Pablo Neira Ayuso authored
      This patch modifies the struct nf_conntrack_helper to allocate
      the room for the helper name. The maximum length is 16 bytes
      (this was already introduced in 2.6.24).
      
      For the maximum length for expectation policy names, I have
      also selected 16 bytes.
      
      This patch is required by the follow-up patch to support
      user-space connection tracking helpers.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    •
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · aee289ba
      David S. Miller authored
      Conflicts:
      	net/ipv6/route.c
      
      Pull in 'net' again to get the revert of Thomas's change
      which introduced regressions.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    •
      Revert "ipv6: Prevent access to uninitialized fib_table_hash via /proc/net/ipv6_route" · e8803b6c
      David S. Miller authored
      This reverts commit 2a0c451a.
      
      It causes crashes, because now ip6_null_entry is used before
      it is initialized.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    •
      ipv6: Fix types of ip6_update_pmtu(). · 42ae66c8
      David S. Miller authored
      The mtu should be a __be32, not the mark.
      Reported-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    •
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 7e52b33b
      David S. Miller authored
      Conflicts:
      	net/ipv6/route.c
      
      This deals with a merge conflict between the net-next addition of the
      inetpeer network namespace ops, and Thomas Graf's bug fix in
      2a0c451a which makes sure we don't
      register /proc/net/ipv6_route before it is actually safe to do so.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    •
      ipv6: Prevent access to uninitialized fib_table_hash via /proc/net/ipv6_route · 2a0c451a
      Thomas Graf authored
      /proc/net/ipv6_route reflects the contents of fib_table_hash. The proc
      handler is installed in ip6_route_net_init() whereas fib_table_hash is
      allocated in fib6_net_init() _after_ the proc handler has been installed.
      
      This opens up a short time frame to access fib_table_hash with its pants
      down.
      
      fib6_init() as a whole can't be moved to an earlier position as it also
      registers the rtnetlink message handlers which should be registered at
      the end. Therefore split it into fib6_init() which is run early and
      fib6_init_late() to register the rtnetlink message handlers.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Reviewed-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    •
      qlcnic: off by one in qlcnic_init_pci_info() · 0f6efff9
      Dan Carpenter authored
      The adapter->npars[] array has QLCNIC_MAX_PCI_FUNC elements.  We
      allocate it that way a few lines earlier in the function.  So this test
      is off by one.
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Acked-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
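
The class of bug described in this commit can be sketched as follows. This is an illustration, not the driver code: `MAX_PCI_FUNC` is a hypothetical stand-in for `QLCNIC_MAX_PCI_FUNC`, and the two predicates are invented names. For an array with `MAX_PCI_FUNC` elements the valid indices are `0 .. MAX_PCI_FUNC - 1`, so the rejection test has to use `>=`.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PCI_FUNC 8   /* hypothetical size, stands in for QLCNIC_MAX_PCI_FUNC */

/* Off by one: lets pfn == MAX_PCI_FUNC through, one past the last element. */
static bool pfn_rejected_buggy(unsigned int pfn)
{
    return pfn > MAX_PCI_FUNC;
}

/* Correct bound for an array declared with MAX_PCI_FUNC elements. */
static bool pfn_rejected_fixed(unsigned int pfn)
{
    return pfn >= MAX_PCI_FUNC;
}
```
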
    •
      net: remove skb_orphan_try() · 62b1a8ab
      Eric Dumazet authored
      Orphaning skb in dev_hard_start_xmit() makes bonding behavior
      unfriendly for applications sending big UDP bursts: once packets
      pass the bonding device and come to the real device, they might hit a full
      qdisc and be dropped. Without orphaning, the sender is automatically
      throttled because sk->sk_wmem_alloc reaches sk->sk_sndbuf (assuming
      sk_sndbuf is not too big).
      
      We could try to defer the orphaning adding another test in
      dev_hard_start_xmit(), but all this seems of little gain,
      now that BQL tends to make packets more likely to be parked
      in Qdisc queues instead of NIC TX ring, in cases where performance
      matters.
      
      Reverts commits :
      fc6055a5 net: Introduce skb_orphan_try()
      87fd308c net: skb_tx_hash() fix relative to skb_orphan_try()
      and removes SKBTX_DRV_NEEDS_SK_REF flag
      Reported-and-bisected-by: Jean-Michel Hautbois <jhautbois@gmail.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
      Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    •
      bnx2x: fix panic when TX ring is full · bc14786a
      Eric Dumazet authored
      There is an off-by-one error in the minimal number of BDs in
      bnx2x_start_xmit() and bnx2x_tx_int() before stopping/resuming the tx queue.

      A full size GSO packet, with data included in skb->head, really needs
      (MAX_SKB_FRAGS + 4) BDs, because of bnx2x_tx_split().
      
      This error triggers if BQL is disabled and heavy TCP transmit traffic
      occurs.
      
      bnx2x_tx_split() definitely can be called, remove a wrong comment.
      Reported-by: Tomas Hruby <thruby@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Eilon Greenstein <eilong@broadcom.com>
      Cc: Yaniv Rosner <yanivr@broadcom.com>
      Cc: Merav Sicron <meravs@broadcom.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Robert Evans <evansr@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    •
      can: c_can: precedence error in c_can_chip_config() · d9cb9bd6
      Dan Carpenter authored
      (CAN_CTRLMODE_LISTENONLY & CAN_CTRLMODE_LOOPBACK) is (0x02 & 0x01) which
      is zero so the condition is never true.  The intent here was to test
      that both flags were set.
      
      Cc: <stable@kernel.org> # 2.6.39+
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
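
The flag test described in this commit can be reproduced in a few lines. The flag values are the ones quoted in the commit message; the function names `both_set_buggy()` and `both_set()` are hypothetical, and the second function is one possible way to express the intended test, not necessarily the form used in the actual fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as quoted in the commit message. */
#define CAN_CTRLMODE_LOOPBACK   0x01
#define CAN_CTRLMODE_LISTENONLY 0x02

/* The two flags share no bits, so ANDing them together gives 0. */
_Static_assert((CAN_CTRLMODE_LISTENONLY & CAN_CTRLMODE_LOOPBACK) == 0,
               "flags are disjoint");

/* Buggy form: the inner & yields 0, so the condition is never true. */
static bool both_set_buggy(uint32_t ctrlmode)
{
    return (ctrlmode & (CAN_CTRLMODE_LISTENONLY & CAN_CTRLMODE_LOOPBACK)) != 0;
}

/* Intended form: both bits must be present in ctrlmode. */
static bool both_set(uint32_t ctrlmode)
{
    const uint32_t both = CAN_CTRLMODE_LISTENONLY | CAN_CTRLMODE_LOOPBACK;
    return (ctrlmode & both) == both;
}
```
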
    •
      ipv6: Handle PMTU in ICMP error handlers. · 81aded24
      David S. Miller authored
      One tricky issue on the ipv6 side vs. ipv4 is that the ICMP callouts
      to handle the error pass the 32-bit info cookie in network byte order
      whereas ipv4 passes it around in host byte order.
      
      Like the ipv4 side, we have two helper functions.  One for when we
      have a socket context and one for when we do not.
      
      ip6ip6 tunnels are not handled here, because they handle PMTU events
      by essentially relaying another ICMP packet-too-big message back to
      the original sender.
      
      This patch allows us to get rid of rt6_do_pmtu_disc().  It handles all
      kinds of situations that simply cannot happen when we do the PMTU
      update directly using a fully resolved route.
      
      In fact, the "plen == 128" check in ip6_rt_update_pmtu() can very
      likely be removed or changed into a BUG_ON() check.  We should never
      have a prefixed ipv6 route when we get there.
      
      Another piece of strange history here is that TCP and DCCP, unlike in
      ipv4, never invoke the update_pmtu() method from their ICMP error
      handlers.  This is incredibly astonishing since this is the context
      where we have the most accurate context in which to make a PMTU
      update, namely we have a fully connected socket and associated cached
      socket route.
      Signed-off-by: David S. Miller <davem@davemloft.net>
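
The byte-order difference mentioned at the top of this commit message can be sketched in user space. The two function names are hypothetical; the sketch only assumes that the IPv6 handler receives the 32-bit info value in network byte order while the IPv4 handler receives it in host byte order.

```c
#include <arpa/inet.h>   /* ntohl, htonl */
#include <assert.h>
#include <stdint.h>

/* IPv6-style handler: the cookie arrives in network byte order. */
static uint32_t mtu_from_icmpv6_info(uint32_t info_be)
{
    return ntohl(info_be);
}

/* IPv4-style handler: the cookie is already in host byte order. */
static uint32_t mtu_from_icmpv4_info(uint32_t info_host)
{
    return info_host;
}
```

Forgetting the conversion on a little-endian host would turn an MTU of 1280 into a very large bogus value, which is why the two paths cannot share code blindly.
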
  5. 15 Jun, 2012 1 commit
    •
      ipv4: Handle PMTU in all ICMP error handlers. · 36393395
      David S. Miller authored
      With ip_rt_frag_needed() removed, we have to explicitly update PMTU
      information in every ICMP error handler.
      
      Create two helper functions to facilitate this.
      
      1) ipv4_sk_update_pmtu()
      
         This updates the PMTU when we have a socket context to
         work with.
      
      2) ipv4_update_pmtu()
      
         Raw version, used when no socket context is available.  For this
         interface, we essentially just pass in explicit arguments for
         the flow identity information we would have extracted from the
         socket.
      
         And you'll notice that ipv4_sk_update_pmtu() is simply implemented
         in terms of ipv4_update_pmtu()
      
      Note that __ip_route_output_key() is used, rather than something like
      ip_route_output_flow() or ip_route_output_key().  This is because we
      absolutely do not want to end up with a route that does IPSEC
      encapsulation and the like.  Instead, we only want the route that
      would get us to the node described by the outermost IP header.
      Reported-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
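
The relationship between the two helpers (a socket-aware wrapper implemented in terms of a raw version that takes explicit flow arguments) can be sketched as below. Everything here is hypothetical: `struct flow_id`, `struct sock_ctx`, `struct pmtu_update` and both function signatures are invented for illustration and do not match the kernel's `ipv4_update_pmtu()` or `ipv4_sk_update_pmtu()` prototypes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flow identity; the field names are illustrative only. */
struct flow_id {
    uint32_t saddr;
    uint32_t daddr;
    int      oif;
    uint32_t mark;
    uint8_t  protocol;
};

/* Hypothetical socket context that caches its flow identity. */
struct sock_ctx {
    struct flow_id flow;
};

/* Hypothetical record of the most recent PMTU update. */
struct pmtu_update {
    struct flow_id flow;
    uint32_t mtu;
};

/* Raw version: the caller passes the flow identity explicitly. */
static void update_pmtu(struct pmtu_update *out, uint32_t mtu,
                        uint32_t saddr, uint32_t daddr, int oif,
                        uint32_t mark, uint8_t protocol)
{
    assert(out != NULL);
    out->flow = (struct flow_id){ saddr, daddr, oif, mark, protocol };
    out->mtu = mtu;
}

/* Socket version: extract the same arguments from the socket context. */
static void sk_update_pmtu(struct pmtu_update *out,
                           const struct sock_ctx *sk, uint32_t mtu)
{
    assert(sk != NULL);
    update_pmtu(out, mtu, sk->flow.saddr, sk->flow.daddr, sk->flow.oif,
                sk->flow.mark, sk->flow.protocol);
}
```
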
  6. 14 Jun, 2012 10 commits
    •
      Merge git://git.kernel.org/pub/scm/virt/kvm/kvm · 424d54d2
      Linus Torvalds authored
      Pull kvm fix from Marcelo Tosatti:
       "Fix a spurious warning on CPU offline path"
      
      * git://git.kernel.org/pub/scm/virt/kvm/kvm:
        x86: kvmclock: remove check_and_clear_guest_paused warning
    •
      Merge tag 'pinctrl-fixes-for-v3.5' of... · 09531359
      Linus Torvalds authored
      Merge tag 'pinctrl-fixes-for-v3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
      
      Pull pinctrl fixes from Linus Walleij:
       - section markup fixes
       - clk_prepare() fix to conform to the clk API
       - memory leaks
       - incorrect debug messages
       - bad errorpaths
       - typos
      
      * tag 'pinctrl-fixes-for-v3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
        pinctrl: pinctrl-mxs: set platform driver data to NULL at errpath and at unregister
        pinctrl: pinctrl-mxs: Take care of frees if the kzalloc fails
        pinctrl: pinctrl-imx: fix incorrect debug message of maps
        pinctrl: pinctrl-imx: free if of_get_parent fails to get the parent node
        pinctrl: pinctrl-imx: free allocated pinctrl_map structure only once and use kernel facilities for IMX_PMX_DUMP
        pinctrl: nomadik: fix up typo
        pinctrl: nomadik: add clk_prepare() call
        pinctrl: fix a minor harmless typo
        pinctrl: sirf: mark of_device_id match table as __devinitconst
    •
      Merge tag 'sound-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · b532ff20
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
      
       - Fix a regression of USB-audio PCM assignment since 3.4
       - A few VGA-switcheroo-related fixes for proper HDMI audio enablement
       - Fixed the missing initializations of HD-audio verbs, which may have
         resulted in various breakage
       - Some driver-specific ASoC updates
       - A few fixes for the dynamic PCM code
       - The addition of pinctrl support for the i.MX audmux which didn't make
         it into -rc1 due to cross tree dependency issues
       - A few minor fixes in compress API codes
      
      * tag 'sound-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: hda - Don't forget to call init verbs added by fixup list
        ALSA: HDA: Pin fixup for Zotac Z68 motherboard
        ALSA: compress_core: cleanup pointers on stop
        ALSA: compress_core: don't wake up on pause
        ALSA: hda - Fix detection of Creative SoundCore3D controllers
        vga_switcheroo: Enable/disable audio clients at the right time
        ALSA: hda - HDMI Audio init all connectors when VGA-switcheroo is off
        vga_switcheroo: Fix error without CONFIG_VGA_SWITCHEROO
        ALSA: hda - Fix uninitialized HDMI controllers with VGA-switcheroo
        vga_switcheroo: Add a helper function to get the client state
        ALSA: usb-audio: Fix substream assignments
        ASoC: tegra: add MODULE_DEVICE_TABLE to tegra30_ahub
        ASoC: wm2000: Always use a 4s timeout for the firmware
        ASoC: dapm: Fix input list to use source widgets
        ASoC: dpcm: Fix dpcm_get_be() to check that DAI is BE
        ASoC: wm8994: Apply volume updates with clocks enabled
        ASoC: wm8994: Ensure all AIFnCLK events are run from the _late variants
        ASoC: imx-audmux: add pinctrl support
        ASoC: dapm: Fix connected widget capture path query.
    •
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · fea7c783
      Linus Torvalds authored
      Pull networking fixes from David S. Miller:
      
      This has the fix for the wireless issues I ran into the other week as
      well as:
      
       1) Fix CAN c_can driver transmit handling resulting in BUG check
          triggers, from AnilKumar Ch.
      
       2) Fix packet drop monitor sleeping in atomic context, from Eric
          Dumazet.
      
       3) Fix mv643xx_eth driver build regression, from Andrew Lunn.
      
       4) Inetpeer freeing needs an RCU grace period in order to avoid races
          during tree invalidation.  From Eric Dumazet.
      
       5) Fix endianness bugs in xt_HMARK netfilter module, from Hans
          Schillstrom.
      
       6) Add proper module refcounting to l2tp_eth to avoid crash on module
          unload, from Eric Dumazet.
      
       7) Fix truncation of neighbour entry dumps due to logic errors in
          neigh_dump_info() and friends, from Eric Dumazet.
      
       8) The conversion of fib6_age() to dst_neigh_lookup() accidentally
          reversed the logic of a flags test, fix from Thomas Graf.
      
       9) Fix checksum configuration in newer sky2 chips, from Stephen
          Hemminger.
      
      10) Revert BQL support in NIU driver, doesn't work.
      
      11) l2tp_ip_sendmsg() illegally uses a route without a proper reference.
          From Eric Dumazet.
      
      12) be2net driver references an SKB after it's potentially been freed,
          also from Eric Dumazet.
      
      13) Fix RCU stalls in dummy net driver init.  Also from Eric Dumazet.
      
      14) lpc_eth has several bugs in its transmit engine leading to packet
          leaks and improper queue wakes, from Eric Dumazet.
      
      15) Apply short DMA workaround to more tg3 chips, from Matt Carlson.
      
      16) Add tilegx network driver.
      
      17) Bonding queue mapping for a packet can get corrupted, fix from Eric
          Dumazet.
      
      18) Fix bug in netpoll_send_udp() SKB management that can leave garbage
          in the payload in certain situations.  From Eric Dumazet.
      
      19) bnx2x driver interprets chip RX checksum offload incorrectly in
          encapsulation situations.  Fix from Eric Dumazet.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (75 commits)
        bnx2x: fix checksum validation
        netpoll: fix netpoll_send_udp() bugs
        bonding: Fix corrupted queue_mapping
        bonding:record primary when modify it via sysfs
        tilegx network driver: initial support
        tg3: Apply short DMA frag workaround to 5906
        net: stmmac: Fix clock en-/disable calls
        lpc_eth: fix tx completion
        lpc_eth: add missing ndo_change_mtu()
        dummy: fix rcu_sched self-detected stalls
        net: Reorder initialization in ip_route_output to fix gcc warning
        virtio-net: fix a race on 32bit arches
        r8169: avoid NAPI scheduling delay.
        net: Make linux/tcp.h C++ friendly (trivial)
        netdev: fix drivers/net/phy/ kernel-doc warnings
        net/core: fix kernel-doc warnings
        be2net: fix a race in be_xmit()
        l2tp: fix a race in l2tp_ip_sendmsg()
        mac80211: add back channel change flag
        NFC: Fix possible NULL ptr deref when getting the name of a socket
        ...
    •
      ixgbe: Check PTP Rx timestamps via BPF filter · 1d1a79b5
      Jacob Keller authored
      This patch fixes a potential Rx timestamp deadlock that causes the Rx
      timestamping to stall indefinitely. The issue could occur when a PTP packet is
      timestamped by hardware but never reaches the Rx queue. In order to prevent a
      permanent loss of timestamping, the RXSTMP(L/H) registers have to be read to
      unlock them. (This used to only occur when a packet that was timestamped
      reached the software.) However, the registers can't be read early, otherwise
      there is no way to correlate them to the packet.
      
      This patch introduces a filter function which can be used to determine if a
      packet should have been timestamped. Supplied with the filter setup by the
      hwtstamp ioctl, check to make sure the PTP protocol and message type match the
      expected values. If so, then read the timestamp registers (to free them.) At
      this point check the descriptor bit, if the bit is set then we know this
      packet correlates to the timestamp stored in the RXTSTAMP registers.
      Otherwise, assume that packet was dropped by the hardware, and ignore this
      timestamp value. However, we have at least unlocked the rxtstamp registers for
      future timestamping.
      
      Due to the way the driver handles skb data, it cannot be directly accessed. In
      order to work around this, a copy of the skb data into a linear buffer is
      made. From this buffer it becomes possible to read the data correctly.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Reviewed-by: Richard Cochran <richardcochran@gmail.com>
      Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
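
The decision flow described in this commit (filter match, then read the registers to unlock them, then trust the value only if the descriptor bit is set) can be modeled in user space. This is a sketch with hypothetical names: `struct rx_tstamp_regs`, `read_and_unlock()` and `rx_hwtstamp()` are not driver symbols, and the two boolean parameters stand in for the PTP filter check and the descriptor bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the latched Rx timestamp registers. */
struct rx_tstamp_regs {
    bool     locked;   /* a timestamp is latched and blocks new ones */
    uint64_t value;
};

/* Reading the registers returns the value and releases the latch. */
static uint64_t read_and_unlock(struct rx_tstamp_regs *regs)
{
    regs->locked = false;
    return regs->value;
}

/*
 * Returns true and stores the timestamp in *out when the packet owns the
 * latched value. matches_filter stands in for the PTP protocol and message
 * type check; desc_has_tstamp stands in for the descriptor bit.
 */
static bool rx_hwtstamp(struct rx_tstamp_regs *regs, bool matches_filter,
                        bool desc_has_tstamp, uint64_t *out)
{
    assert(regs != NULL && out != NULL);
    if (!matches_filter)
        return false;              /* leave the registers untouched */

    uint64_t ts = read_and_unlock(regs);
    if (!desc_has_tstamp)
        return false;              /* stale value from a dropped packet */

    *out = ts;
    return true;
}
```

In the second case (filter matches, descriptor bit clear) no timestamp is delivered, but the latch is still released, which is the property that prevents the stall.
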
    •
      ixgbe: PTP Fix hwtstamp mode settings · c19197a7
      Jacob Keller authored
      When enabling the hwtstamp mode for Rx timestamping the V2 ptp event type
      specific modes (Delay Request and Sync) have been rolled into the V2 all event
      packet modes, in order to more accurately represent what hardware is doing.
      Hardware always timestamps the Path delay packets when a V2 mode is selected,
      regardless of what type was selected (in order to always support Path delay
      mode). However, this means the user-selected modes of timestamping only Sync or
      Delay Request are not truly supported. This patch correctly sets the mode for
      the hwtstamp config and returns to the user that all V2 event packets will be
      timestamped.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    •
      ixgbe: ptp code cleanup · 0ede4a60
      Jacob Keller authored
      This patch fixes two minor nits from Richard Cochran. The first is a case of
      ambitious line wrapping that wasn't necessary. The second is to re-order the
      flag checks for PPS support. Previously, the hardware test was done first, and
      the interrupt flag test was done second. Now, test the interrupt flag and use
      the unlikely macro.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    •
      ixgbe: do not compile ixgbe_sysfs.c when CONFIG_IXGBE_HWMON is not set · 6cbc52ef
      Emil Tantilov authored
      ixgbe_sysfs.c is only needed when CONFIG_IXGBE_HWMON is configured in the
      kernel.
      Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
      Acked-by: Don Skidmore <Donald.c.skidmore@intel.com>
      Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    •
      ixgbe: align flow control DV macros with datasheet · 4f8a91ad
      John Fastabend authored
      The flow control DV macros are used to calculate the flow control
      high and low thresholds. This patch annotates these macros slightly
      better and fixes the issues below.
      
      The macro variables are renamed LINK to _max_frame_link and TC to
      _max_frame_tc. This was to avoid confusion and make them more
      readable. It was found that people auditing the code read TC to be
      'traffic class' in the 802.1Q definition instead of the max frame
      size of the tc. Hopefully it is clear now.
      
      This audit also found the following real deviations from the
      theoretical values. Fixed in this patch.
      
        * I multiplied the DV calculations by (36/25) which always
          evaluates to 1. This does not match the intended theoretical
          value of 1.44.
      
        * IXGBE_BT2KB added 1023 to account for rounding however this
          really should be 8 * 1023 - 1 to account for division by 8k.
      
        * x2 multiplication of max frame in DV calculations to account
          for updated hardware recommendations.
      
      With this patch the DV values are in line with the recommendations
      in the 82599 and 82598 data sheets. It's worth noting I did not
      see any dropped frames with flow control on in my experiments without
      this patch. However aligning with the hardware specs and
      recommendations seems like a good idea here to account for worst
      case scenarios.
      Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
      Tested-by: Ross Brattain <ross.b.brattain@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
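
The first deviation listed in this commit, `(36/25)` evaluating to 1, is a plain integer-division effect and can be shown directly. The function names below are hypothetical, and the second function is just one way to approximate a factor of 1.44 in integer arithmetic; it is not claimed to be the form used by the driver macros.

```c
#include <assert.h>
#include <stdint.h>

/* Integer division truncates first: (36 / 25) == 1, so this is a no-op. */
static uint32_t scale_truncating(uint32_t x)
{
    return x * (36 / 25);
}

/* Multiply before dividing to approximate x * 1.44 in integer math. */
static uint32_t scale_1_44(uint32_t x)
{
    return (uint32_t)(((uint64_t)x * 36) / 25);
}
```

The widening to 64 bits before the multiplication is a design choice of this sketch to avoid overflow for large inputs.
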
    •
      e1000e: use more informative logging macros when netdev not yet registered · 185095fb
      Bruce Allan authored
      Based on a report from Ethan Zhao, before calling register_netdev() the
      driver should be using logging macros that do not display the potentially
      confusing "(unregistered net_device)" yet still display the useful driver
      name and PCI bus/device/function.
      Reported-by: Ethan Zhao <ethan.kernel@gmail.com>
      Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
      Tested-by: Aaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>