1. 28 4月, 2011 1 次提交
    • E
      net: filter: Just In Time compiler for x86-64 · 0a14842f
      Eric Dumazet 提交于
      In order to speedup packet filtering, here is an implementation of a
      JIT compiler for x86_64
      
      It is disabled by default, and must be enabled by the admin.
      
      echo 1 >/proc/sys/net/core/bpf_jit_enable
      
      It uses module_alloc() and module_free() to get memory in the 2GB text
      kernel range since we call helpers functions from the generated code.
      
      EAX : BPF A accumulator
      EBX : BPF X accumulator
      RDI : pointer to skb   (first argument given to JIT function)
      RBP : frame pointer (even if CONFIG_FRAME_POINTER=n)
      r9d : skb->len - skb->data_len (headlen)
      r8  : skb->data
      
      To get a trace of generated code, use :
      
      echo 2 >/proc/sys/net/core/bpf_jit_enable
      
      Example of generated code :
      
      # tcpdump -p -n -s 0 -i eth1 host 192.168.20.0/24
      
      flen=18 proglen=147 pass=3 image=ffffffffa00b5000
      JIT code: ffffffffa00b5000: 55 48 89 e5 48 83 ec 60 48 89 5d f8 44 8b 4f 60
      JIT code: ffffffffa00b5010: 44 2b 4f 64 4c 8b 87 b8 00 00 00 be 0c 00 00 00
      JIT code: ffffffffa00b5020: e8 24 7b f7 e0 3d 00 08 00 00 75 28 be 1a 00 00
      JIT code: ffffffffa00b5030: 00 e8 fe 7a f7 e0 24 00 3d 00 14 a8 c0 74 49 be
      JIT code: ffffffffa00b5040: 1e 00 00 00 e8 eb 7a f7 e0 24 00 3d 00 14 a8 c0
      JIT code: ffffffffa00b5050: 74 36 eb 3b 3d 06 08 00 00 74 07 3d 35 80 00 00
      JIT code: ffffffffa00b5060: 75 2d be 1c 00 00 00 e8 c8 7a f7 e0 24 00 3d 00
      JIT code: ffffffffa00b5070: 14 a8 c0 74 13 be 26 00 00 00 e8 b5 7a f7 e0 24
      JIT code: ffffffffa00b5080: 00 3d 00 14 a8 c0 75 07 b8 ff ff 00 00 eb 02 31
      JIT code: ffffffffa00b5090: c0 c9 c3
      
      BPF program is 144 bytes long, so native program is almost same size ;)
      
      (000) ldh      [12]
      (001) jeq      #0x800           jt 2    jf 8
      (002) ld       [26]
      (003) and      #0xffffff00
      (004) jeq      #0xc0a81400      jt 16   jf 5
      (005) ld       [30]
      (006) and      #0xffffff00
      (007) jeq      #0xc0a81400      jt 16   jf 17
      (008) jeq      #0x806           jt 10   jf 9
      (009) jeq      #0x8035          jt 10   jf 17
      (010) ld       [28]
      (011) and      #0xffffff00
      (012) jeq      #0xc0a81400      jt 16   jf 13
      (013) ld       [38]
      (014) and      #0xffffff00
      (015) jeq      #0xc0a81400      jt 16   jf 17
      (016) ret      #65535
      (017) ret      #0
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Ben Hutchings <bhutchings@solarflare.com>
      Cc: Hagen Paul Pfeifer <hagen@jauu.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0a14842f
  2. 20 4月, 2011 1 次提交
  3. 15 4月, 2011 1 次提交
    • A
      ethtool: allow custom interval for physical identification · fce55922
      Allan, Bruce W 提交于
      When physical identification of an adapter is done by toggling the
      mechanism on and off through software utilizing the set_phys_id operation,
      it is done with a fixed duration for both on and off states.  Some drivers
      may want to set a custom duration for the on/off intervals.  This patch
      changes the API so the return code from the driver's entry point when it
      is called with ETHTOOL_ID_ACTIVE can specify the frequency at which to
      cycle the on/off states, and updates the drivers that have already been
      converted to use the new set_phys_id and use the synchronous method for
      identifying an adapter.
      
      The physical identification frequency set in the updated drivers is based
      on how it was done prior to the introduction of set_phys_id.
      
      Compile tested only.  Also fixes a compiler warning in sfc.
      
      v2: drivers do not return -EINVAL for ETHOOL_ID_ACTIVE
      v3: fold patchset into single patch and cleanup per Ben's feedback
      Signed-off-by: NBruce Allan <bruce.w.allan@intel.com>
      Cc: Ben Hutchings <bhutchings@solarflare.com>
      Cc: Sathya Perla <sathya.perla@emulex.com>
      Cc: Subbu Seetharaman <subbu.seetharaman@emulex.com>
      Cc: Ajit Khaparde <ajit.khaparde@emulex.com>
      Cc: Michael Chan <mchan@broadcom.com>
      Cc: Eilon Greenstein <eilong@broadcom.com>
      Cc: Divy Le Ray <divy@chelsio.com>
      Cc: Don Fry <pcnet32@frontier.com>
      Cc: Jon Mason <jdmason@kudzu.us>
      Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com>
      Cc: Steve Hodgson <shodgson@solarflare.com>
      Cc: Stephen Hemminger <shemminger@linux-foundation.org>
      Cc: Matt Carlson <mcarlson@broadcom.com>
      Acked-by: NJon Mason <jdmason@kudzu.us>
      Acked-by: NBen Hutchings <bhutchings@solarflare.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fce55922
  4. 14 4月, 2011 4 次提交
  5. 13 4月, 2011 9 次提交
  6. 12 4月, 2011 1 次提交
  7. 08 4月, 2011 1 次提交
  8. 07 4月, 2011 1 次提交
  9. 06 4月, 2011 3 次提交
    • M
      dm: improve block integrity support · a63a5cf8
      Mike Snitzer 提交于
      The current block integrity (DIF/DIX) support in DM is verifying that
      all devices' integrity profiles match during DM device resume (which
      is past the point of no return).  To some degree that is unavoidable
      (stacked DM devices force this late checking).  But for most DM
      devices (which aren't stacking on other DM devices) the ideal time to
      verify all integrity profiles match is during table load.
      
      Introduce the notion of an "initialized" integrity profile: a profile
      that was blk_integrity_register()'d with a non-NULL 'blk_integrity'
      template.  Add blk_integrity_is_initialized() to allow checking if a
      profile was initialized.
      
      Update DM integrity support to:
      - check all devices with _initialized_ integrity profiles match
        during table load; uninitialized profiles (e.g. for underlying DM
        device(s) of a stacked DM device) are ignored.
      - disallow a table load that would result in an integrity profile that
        conflicts with a DM device's existing (in-use) integrity profile
      - avoid clearing an existing integrity profile
      - validate all integrity profiles match during resume; but if they
        don't all we can do is report the mismatch (during resume we're past
        the point of no return)
      Signed-off-by: NMike Snitzer <snitzer@redhat.com>
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      a63a5cf8
    • J
      fs: export empty_aops · 7dcda1c9
      Jens Axboe 提交于
      With the ->sync_page() hook gone, we have a few users that
      add their own static address_space_operations without any
      functions defined.
      
      fs/inode.c already has an empty_aops that it uses for init
      purposes. Lets export that and use it in the places where
      an otherwise empty aops was defined.
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      7dcda1c9
    • J
      block: get rid of elv_insert() interface · b710a480
      Jens Axboe 提交于
      Merge it with __elv_add_request(), it's pretty pointless to
      have a function with only two callers. The main interface
      is elv_add_request()/__elv_add_request().
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      b710a480
  10. 05 4月, 2011 7 次提交
    • B
      ethtool: Change ETHTOOL_PHYS_ID implementation to allow dropping RTNL · 68f512f2
      Ben Hutchings 提交于
      The ethtool ETHTOOL_PHYS_ID command runs for an arbitrarily long
      period of time, holding the RTNL lock.  This blocks routing updates,
      device enumeration, and various important operations that one might
      want to keep running while hunting for the flashing LED.
      
      We need to drop the RTNL lock during this operation, but currently the
      core implementation is a thin wrapper around a driver operation and
      drivers may well depend upon holding the lock.
      
      Define a new driver operation 'set_phys_id' with an argument that sets
      the ID indicator on/off/inactive/active (the last optional, for any
      driver or firmware that prefers to handle blinking asynchronously).
      When this is defined, the ethtool core drops the lock while waiting
      and only acquires it around calls to this operation.
      
      Deprecate the 'phys_id' operation in favour of this.  It can be
      removed once all in-tree drivers are converted.
      Signed-off-by: NBen Hutchings <bhutchings@solarflare.com>
      68f512f2
    • B
      ethtool: Fill out and update comment for struct ethtool_ops · 8717d07b
      Ben Hutchings 提交于
      Briefly document all operations (except get_rx_ntuple), including
      whether they may return an error code and whether they are deprecated.
      Also mention some things that should be handled by the ethtool core
      rather than by drivers.
      
      Briefly document general requirements for callers.
      Signed-off-by: NBen Hutchings <bhutchings@solarflare.com>
      8717d07b
    • B
    • T
      net: Allow no-cache copy from user on transmit · c6e1a0d1
      Tom Herbert 提交于
      This patch uses __copy_from_user_nocache on transmit to bypass data
      cache for a performance improvement.  skb_add_data_nocache and
      skb_copy_to_page_nocache can be called by sendmsg functions to use
      this feature, initial support is in tcp_sendmsg.  This functionality is
      configurable per device using ethtool.
      
      Presumably, this feature would only be useful when the driver does
      not touch the data.  The feature is turned on by default if a device
      indicates that it does some form of checksum offload; it is off by
      default for devices that do no checksum offload or indicate no checksum
      is necessary.  For the former case copy-checksum is probably done
      anyway, in the latter case the device is likely loopback in which case
      the no cache copy is probably not beneficial.
      
      This patch was tested using 200 instances of netperf TCP_RR with
      1400 byte request and one byte reply.  Platform is 16 core AMD x86.
      
      No-cache copy disabled:
         672703 tps, 97.13% utilization
         50/90/99% latency:244.31 484.205 1028.41
      
      No-cache copy enabled:
         702113 tps, 96.16% utilization,
         50/90/99% latency 238.56 467.56 956.955
      
      Using 14000 byte request and response sizes demonstrate the
      effects more dramatically:
      
      No-cache copy disabled:
         79571 tps, 34.34 %utlization
         50/90/95% latency 1584.46 2319.59 5001.76
      
      No-cache copy enabled:
         83856 tps, 34.81% utilization
         50/90/95% latency 2508.42 2622.62 2735.88
      
      Note especially the effect on latency tail (95th percentile).
      
      This seems to provide a nice performance improvement and is
      consistent in the tests I ran.  Presumably, this would provide
      the greatest benfits in the presence of an application workload
      stressing the cache and a lot of transmit data happening.
      Signed-off-by: NTom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c6e1a0d1
    • R
    • B
      ieee80211: add HT extended capabilities masks · 4dd365fd
      Bing Zhao 提交于
      IEEE Std 802.11n, Oct. 29, 2009:
      7.3.2.56.5 HT Extended Capabilities field
      Signed-off-by: NBing Zhao <bzhao@marvell.com>
      Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>
      4dd365fd
    • S
      pkt_sched: QFQ - quick fair queue scheduler · 0545a303
      stephen hemminger 提交于
      This is an implementation of the Quick Fair Queue scheduler developed
      by Fabio Checconi. The same algorithm is already implemented in ipfw
      in FreeBSD. Fabio had an earlier version developed on Linux, I just
      cleaned it up.  Thanks to Eric Dumazet for testing this under load.
      Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0545a303
  11. 04 4月, 2011 4 次提交
  12. 03 4月, 2011 1 次提交
  13. 02 4月, 2011 2 次提交
    • M
      net: Fix dev dev_ethtool_get_rx_csum() for forced NETIF_F_RXCSUM · 4dd5ffe4
      Michał Mirosław 提交于
      dev_ethtool_get_rx_csum() won't report rx checksumming when it's not
      changeable and driver is converted to hw_features and friends. Fix this.
      
      (dev->hw_features & NETIF_F_RXCSUM) check is dropped - if the
      ethtool_ops->get_rx_csum is set, then driver is not coverted, yet.
      Signed-off-by: NMichał Mirosław <mirq-linux@rere.qmqm.pl>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4dd5ffe4
    • A
      usbnet: use eth%d name for known ethernet devices · c261344d
      Arnd Bergmann 提交于
      The documentation for the USB ethernet devices suggests that
      only some devices are supposed to use usb0 as the network interface
      name instead of eth0. The logic used there, and documented in
      Kconfig for CDC is that eth0 will be used when the mac address
      is a globally assigned one, but usb0 is used for the locally
      managed range that is typically used on point-to-point links.
      
      Unfortunately, this has caused a lot of pain on the smsc95xx
      device that is used on the popular pandaboard without an
      EEPROM to store the MAC address, which causes the driver to
      call random_ether_address().
      
      Obviously, there should be a proper MAC addressed assigned to
      the device, and discussions are ongoing about how to solve
      this, but this patch at least makes sure that the default
      interface naming gets a little saner and matches what the
      user can expect based on the documentation, including for
      new devices.
      
      The approach taken here is to flag whether a device might be a
      point-to-point link with the new FLAG_POINTTOPOINT setting in
      the usbnet driver_info. A driver can set both FLAG_POINTTOPOINT
      and FLAG_ETHER if it is not sure (e.g. cdc_ether), or just one
      of the two.  The usbnet framework only looks at the MAC address
      for device naming if both flags are set, otherwise it trusts the
      flag.
      Signed-off-by: NArnd Bergmann <arnd.bergmann@linaro.org>
      Tested-by: NAndy Green <andy.green@linaro.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c261344d
  14. 01 4月, 2011 2 次提交
  15. 31 3月, 2011 2 次提交
    • L
      Fix common misspellings · 25985edc
      Lucas De Marchi 提交于
      Fixes generated by 'codespell' and manually reviewed.
      Signed-off-by: NLucas De Marchi <lucas.demarchi@profusion.mobi>
      25985edc
    • P
      perf: Fix task context scheduling · ab711fe0
      Peter Zijlstra 提交于
      Jiri reported:
      
       |
       | - once an event is created by sys_perf_event_open, task context
       |   is created and it stays even if the event is closed, until the
       |   task is finished ... thats what I see in code and I assume it's
       |   correct
       |
       | - when the task opens event, perf_sched_events jump label is
       |   incremented and following callbacks are started from scheduler
       |
       |         __perf_event_task_sched_in
       |         __perf_event_task_sched_out
       |
       |   These callback *in/out set/unset cpuctx->task_ctx value to the
       |   task context.
       |
       | - close is called on event on CPU 0:
       |         - the task is scheduled on CPU 0
       |         - __perf_event_task_sched_in is called
       |         - cpuctx->task_ctx is set
       |         - perf_sched_events jump label is decremented and == 0
       |         - __perf_event_task_sched_out is not called
       |         - cpuctx->task_ctx on CPU 0 stays set
       |
       | - exit is called on CPU 1:
       |         - the task is scheduled on CPU 1
       |         - perf_event_exit_task is called
       |         - task_ctx_sched_out unsets cpuctx->task_ctx on CPU 1
       |         - put_ctx destroys the context
       |
       | - another call of perf_rotate_context on CPU 0 will use invalid
       |   task_ctx pointer, and eventualy panic.
       |
      
      Cure this the simplest possibly way by partially reverting the
      jump_label optimization for the sched_out case.
      Reported-and-tested-by: NJiri Olsa <jolsa@redhat.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: <stable@kernel.org> # .37+
      LKML-Reference: <1301520405.4859.213.camel@twins>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ab711fe0