1. 15 Feb, 2010 1 commit
  2. 13 Feb, 2010 5 commits
  3. 11 Feb, 2010 2 commits
  4. 05 Feb, 2010 3 commits
    • S
      packet: Add GSO/csum offload support. · bfd5f4a3
      Sridhar Samudrala committed
      This patch adds GSO/checksum offload to af_packet sockets using
      virtio_net_hdr. Based on Rusty's patch to add this support to tun.
      It allows GSO/checksum offload to be enabled when using raw socket
      backend with virtio_net.
      Adds PACKET_VNET_HDR socket option to prepend virtio_net_hdr in the
      receive path and process/skip virtio_net_hdr in the send path. This
      option is only allowed with SOCK_RAW sockets attached to ethernet
      type devices.
      
      v2 updates
      ----------
      Michael's Comments
      - Perform length check in packet_snd() when GSO is off even when
        vnet_hdr is present.
      - Check for SKB_GSO_FCOE type and return -EINVAL
      - Don't allow tx/rx ring when vnet_hdr is enabled.
      Herbert's Comments
      - Removed ethernet specific code.
      - protocol value is assumed to be passed in by the caller.
      Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bfd5f4a3
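A minimal user-space sketch of how the PACKET_VNET_HDR option described above might be used. `setsockopt()`, `SOL_PACKET`, `PACKET_VNET_HDR` and `struct virtio_net_hdr` come from the standard Linux headers; the helper names `packet_enable_vnet_hdr` and `frame_with_vnet_hdr` are hypothetical and not part of the patch. The second helper only shows the send-path layout (header first, frame second) for the simplest case where no offload is requested.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/if_packet.h>
#include <linux/virtio_net.h>

#ifndef SOL_PACKET
#define SOL_PACKET 263 /* fallback for C libraries that do not export it */
#endif

/* Ask the kernel to prepend a virtio_net_hdr on receive and to expect one
 * on send. Returns 0 on success, -1 on error (errno set by setsockopt). */
int packet_enable_vnet_hdr(int fd)
{
    int on = 1;
    return setsockopt(fd, SOL_PACKET, PACKET_VNET_HDR, &on, sizeof(on));
}

/* Hypothetical helper: write "virtio_net_hdr + frame" into out for a send
 * that requests neither GSO nor checksum offload.
 * Returns the total length, or 0 if out is too small. */
size_t frame_with_vnet_hdr(uint8_t *out, size_t cap,
                           const uint8_t *frame, size_t len)
{
    struct virtio_net_hdr hdr;

    assert(out != NULL && frame != NULL);
    if (cap < sizeof(hdr) || len > cap - sizeof(hdr))
        return 0;

    memset(&hdr, 0, sizeof(hdr));            /* flags = 0: no csum offload */
    hdr.gso_type = VIRTIO_NET_HDR_GSO_NONE;  /* no segmentation offload */

    memcpy(out, &hdr, sizeof(hdr));
    memcpy(out + sizeof(hdr), frame, len);
    return sizeof(hdr) + len;
}
```

In a real program the descriptor would come from `socket(AF_PACKET, SOCK_RAW, ...)`, which needs CAP_NET_RAW, and the buffer built by the second helper would be passed to `send()` or `sendmsg()`.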
    • J
      libphy: add phy_find_first function · f8f76db1
      Jiri Pirko committed
      Many drivers do this manually. Now they can use this function.
      Signed-off-by: Jiri Pirko <jpirko@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f8f76db1
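The commit message does not show the function itself. The following is a user-space sketch of what a "find the first PHY on a bus" helper plausibly looks like; the `_sketch` structures are simplified stand-ins for the kernel's `struct mii_bus` and `struct phy_device`, and the 32-entry address map is an assumption based on the MDIO address range.

```c
#include <assert.h>
#include <stddef.h>

#define PHY_MAX_ADDR 32 /* assumed: one slot per MDIO address */

/* Simplified stand-ins, not the kernel definitions. */
struct phy_device_sketch {
    int addr;
};

struct mii_bus_sketch {
    struct phy_device_sketch *phy_map[PHY_MAX_ADDR];
};

/* Return the PHY registered at the lowest address, or NULL if the bus
 * has no PHY at all. */
struct phy_device_sketch *phy_find_first_sketch(struct mii_bus_sketch *bus)
{
    assert(bus != NULL);
    for (int addr = 0; addr < PHY_MAX_ADDR; addr++) {
        if (bus->phy_map[addr])
            return bus->phy_map[addr];
    }
    return NULL;
}
```

A driver that previously open-coded this loop would call the helper once and check the result for NULL.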
    • J
      net: use helpers to access mc list V2 · 6683ece3
      Jiri Pirko committed
      This patch introduces helpers similar to those already added for the uc
      list. However, multicast lists are not list_head lists but are "made
      manually". The three macros added by this patch will make the transition
      of mc_list to list_head smooth in two steps:
      
      1) convert all drivers to use these macros (with the original iterator of type
         "struct dev_mc_list")
      2) once all drivers are converted, convert list type and iterators to "struct
         netdev_hw_addr" in one patch.
      
      From now on, drivers can (and should) use "netdev_for_each_mc_addr" to iterate
      over the addresses with iterator of type "struct netdev_hw_addr", and the macros
      "netdev_mc_count" and "netdev_mc_empty" to read the list's length. This is the
      state which should be reached in all drivers.
      Signed-off-by: Jiri Pirko <jpirko@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6683ece3
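A user-space sketch of the three macros in their step 1 form, that is, still iterating with the original hand-linked list type. The macro names are the ones quoted in the commit message; the macro bodies and the `_sketch` structures are assumptions made for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the hand-linked multicast list and the device. */
struct dev_mc_list_sketch {
    struct dev_mc_list_sketch *next;
    unsigned char dmi_addr[6];
};

struct net_device_sketch {
    struct dev_mc_list_sketch *mc_list;
    int mc_count;
};

/* Assumed macro bodies: drivers stop touching mc_list and mc_count
 * directly, so the list type can later change behind the macros. */
#define netdev_mc_count(dev) ((dev)->mc_count)
#define netdev_mc_empty(dev) (netdev_mc_count(dev) == 0)
#define netdev_for_each_mc_addr(mclist, dev) \
    for (mclist = (dev)->mc_list; mclist; mclist = mclist->next)

/* Driver-style usage: walk every multicast address once. */
int count_mc_by_walking(struct net_device_sketch *dev)
{
    struct dev_mc_list_sketch *mclist;
    int n = 0;

    assert(dev != NULL);
    netdev_for_each_mc_addr(mclist, dev)
        n++;
    return n;
}
```

Because the driver only sees the macros, step 2 can swap the iterator type in one patch without revisiting the loop bodies' structure.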
  5. 04 Feb, 2010 4 commits
    • A
      net: CONFIG_COMPAT redux · 1621e094
      Alexey Dobriyan committed
      Ifdef out
      	struct proto_ops::compat_ioctl
      	struct proto_ops::compat_setsockopt
      	struct proto_ops::compat_getsockopt
      to make structures smaller on COMPAT=n kernels.
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1621e094
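The size effect of this kind of change can be sketched in user space. The structure below is a heavily reduced, hypothetical stand-in for `struct proto_ops` with invented signatures; only the technique (compiling the compat members out unless `CONFIG_COMPAT` is defined) mirrors the commit.

```c
#include <assert.h>
#include <stddef.h>

struct socket_sketch; /* opaque placeholder */

/* Reduced stand-in: two regular operations plus their compat variants. */
struct proto_ops_sketch {
    int (*ioctl)(struct socket_sketch *sock, unsigned int cmd,
                 unsigned long arg);
    int (*setsockopt)(struct socket_sketch *sock, int level, int optname);
#ifdef CONFIG_COMPAT
    int (*compat_ioctl)(struct socket_sketch *sock, unsigned int cmd,
                        unsigned long arg);
    int (*compat_setsockopt)(struct socket_sketch *sock, int level,
                             int optname);
#endif
};

/* Number of function-pointer slots the structure occupies. */
size_t proto_ops_sketch_slots(void)
{
    assert(sizeof(struct proto_ops_sketch) % sizeof(void (*)(void)) == 0);
    return sizeof(struct proto_ops_sketch) / sizeof(void (*)(void));
}
```

Built as-is the structure has two slots; building with `-DCONFIG_COMPAT` restores the two compat slots.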
    • A
      net: macvtap driver · 20d29d7a
      Arnd Bergmann committed
      In order to use macvlan with qemu and other tools that require
      a tap file descriptor, the macvtap driver adds a small backend
      with a character device with the same interface as the tun
      driver, with a minimum set of features.
      
      Macvtap interfaces are created in the same way as macvlan
      interfaces using ip link, but the netif is just used as a
      handle for configuration and accounting, while the data
      goes through the chardev. Each macvtap interface has its
      own character device, simplifying permission management
      significantly over the generic tun/tap driver.
      
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: Stephen Hemminger <shemminger@linux-foundation.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Or Gerlitz <ogerlitz@voltaire.com>
      Cc: netdev@vger.kernel.org
      Cc: bridge@lists.linux-foundation.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      20d29d7a
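Since each macvtap interface has its own character device, a tool has to find the device node that belongs to a given interface. The sketch below assumes the node is named `/dev/tap<ifindex>`; that naming is not stated in the commit message and is an assumption here, as are both helper names. `if_nametoindex()` is the standard libc call.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <net/if.h>

/* Format the assumed device node path for an interface index.
 * Returns 0 on success, -1 for index 0 or a too-small buffer. */
int macvtap_chardev_path(unsigned int ifindex, char *buf, size_t len)
{
    int n;

    assert(buf != NULL);
    if (ifindex == 0)
        return -1;
    n = snprintf(buf, len, "/dev/tap%u", ifindex);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Same, starting from the interface name (e.g. "macvtap0").
 * if_nametoindex() returns 0 when the interface does not exist. */
int macvtap_chardev_path_by_name(const char *ifname, char *buf, size_t len)
{
    return macvtap_chardev_path(if_nametoindex(ifname), buf, len);
}
```

The resulting path would then be opened with `open(path, O_RDWR)` and handed to the tool as its tap file descriptor.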
    • A
      macvlan: allow multiple driver backends · fc0663d6
      Arnd Bergmann committed
      This makes it possible to hook into the macvlan driver
      from another kernel module. In particular, the goal is
      to extend it with the macvtap backend that provides
      a tun/tap compatible interface directly on the macvlan
      device.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fc0663d6
    • A
      net: maintain namespace isolation between vlan and real device · 8a83a00b
      Arnd Bergmann committed
      In the vlan and macvlan drivers, the start_xmit function forwards
      data to the dev_queue_xmit function for another device, which may
      potentially belong to a different namespace.
      
      To make sure that classification stays within a single namespace,
      this resets the potentially critical fields.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8a83a00b
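The idea of resetting classification state when a packet crosses into another namespace can be sketched as follows. The commit message does not list which fields are reset, so `mark` and `priority` are assumed examples, and all structures and the function name are simplified stand-ins rather than kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

struct net_sketch { int id; };                 /* a network namespace */
struct net_device_sketch { struct net_sketch *nd_net; };

struct sk_buff_sketch {
    struct net_device_sketch *dev;
    unsigned int mark;      /* assumed example of a classification field */
    unsigned int priority;  /* assumed example of a classification field */
};

/* Attach skb to a new device. If the new device lives in a different
 * namespace than the old one, clear the fields that were set by the old
 * namespace's classification rules. */
void skb_set_dev_sketch(struct sk_buff_sketch *skb,
                        struct net_device_sketch *dev)
{
    assert(skb != NULL && dev != NULL);
    if (skb->dev && skb->dev->nd_net != dev->nd_net) {
        skb->mark = 0;
        skb->priority = 0;
    }
    skb->dev = dev;
}
```

Packets that stay inside one namespace keep their classification, so the common case is unaffected.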
  6. 03 Feb, 2010 2 commits
    • E
      connector: Delete buggy notification code. · f98bfbd7
      Evgeniy Polyakov committed
      On Tue, Feb 02, 2010 at 02:57:14PM -0800, Greg KH (gregkh@suse.de) wrote:
      > > There are at least two ways to fix it: using a big cannon and a small
      > > one. The former way is to disable notification registration, since it is
      > > not used by anyone at all. Second way is to check whether calling
      > > process is root and its destination group is -1 (kind of privileged
      > > one) before command is dispatched to workqueue.
      > 
      > Well if no one is using it, removing it makes the most sense, right?
      > 
      > No objection from me, care to make up a patch either way for this?
      
      Given that it is not used, let's drop support for notifications about
      (un)registered events from connector.
      Another option was to check credentials on receive, but we can always
      restore the feature without bugs if needed. Genetlink has a wider code
      base, and no one has complained that userspace cannot get notifications
      when other clients are (un)registered.
      
      Kudos to Sebastian Krahmer <krahmer@suse.de>, who found a bug in the
      code.
      Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net>
      Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f98bfbd7
    • S
      virtio: Add ability to detach unused buffers from vrings · f9bfbebf
      Shirley Ma committed
      There's currently no way for a virtio driver to ask for unused
      buffers, so it has to keep a list itself to reclaim them at shutdown.
      This is redundant, since virtio_ring stores that information.  So
      add a new hook to do this.
      Signed-off-by: Shirley Ma <xma@us.ibm.com>
      Signed-off-by: Amit Shah <amit.shah@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f9bfbebf
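A user-space sketch of the idea: the ring already remembers which buffer token sits in each slot, so a "detach unused buffer" hook can hand those tokens back one at a time at shutdown. The fixed ring size, the structure layout and the function name are assumptions for illustration, not the virtio_ring implementation.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 8 /* assumed small ring for the sketch */

/* Simplified ring: data[i] holds the driver's token for slot i, or NULL
 * if the slot is free. */
struct vring_sketch {
    void *data[RING_SIZE];
    unsigned int num_free;
};

/* Return one buffer that was queued but never used, releasing its slot.
 * Returns NULL once no such buffer remains. */
void *detach_unused_buf_sketch(struct vring_sketch *vq)
{
    assert(vq != NULL);
    for (unsigned int i = 0; i < RING_SIZE; i++) {
        void *buf = vq->data[i];

        if (!buf)
            continue;
        vq->data[i] = NULL;
        vq->num_free++;
        return buf;
    }
    return NULL;
}
```

A driver's shutdown path would call this in a loop and free each returned buffer, instead of keeping its own list of outstanding buffers.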
  7. 02 Feb, 2010 1 commit
  8. 26 Jan, 2010 1 commit
  9. 23 Jan, 2010 2 commits
  10. 21 Jan, 2010 3 commits
    • P
      perf: Change the is_software_event() definition · 92b67598
      Peter Zijlstra committed
      The is_software_event() definition always confuses me because it's an
      exclusive expression; make it an inclusive one.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      92b67598
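The difference between an exclusive and an inclusive definition can be sketched like this. The event type list is a local stand-in, and which types count as "software" is an assumption for illustration; the commit message does not spell out either form.

```c
#include <assert.h>

/* Local stand-in for the event type enumeration. */
enum perf_type_sketch {
    TYPE_HARDWARE,
    TYPE_SOFTWARE,
    TYPE_TRACEPOINT,
    TYPE_HW_CACHE,
    TYPE_RAW,
    TYPE_COUNT /* number of known types; also used as "unknown type" */
};

/* Exclusive form: everything that is not a hardware-ish type. */
int is_software_event_exclusive(int type)
{
    return type != TYPE_RAW &&
           type != TYPE_HARDWARE &&
           type != TYPE_HW_CACHE;
}

/* Inclusive form: only the types explicitly listed as software. */
int is_software_event_inclusive(int type)
{
    switch (type) {
    case TYPE_SOFTWARE:
    case TYPE_TRACEPOINT:
        return 1;
    default:
        return 0;
    }
}
```

Both forms agree on the types that exist today. They differ for a type added later: the exclusive form silently treats it as software, while the inclusive form requires it to be listed explicitly.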
    • M
      sched: Fix vmark regression on big machines · 50b926e4
      Mike Galbraith committed
      SD_PREFER_SIBLING is set at the CPU domain level if power saving isn't
      enabled, leading to many cache misses on large machines as we traverse
      looking for an idle shared cache to wake to.  Change the enabler of
      select_idle_sibling() to SD_SHARE_PKG_RESOURCES, and enable same at the
      sibling domain level.
      Reported-by: Lin Ming <ming.m.lin@intel.com>
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1262612696.15495.15.camel@marge.simson.net>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      50b926e4
    • S
      USB: Fix duplicate sysfs problem after device reset. · 04a723ea
      Sarah Sharp committed
      Borislav Petkov reports issues with duplicate sysfs endpoint files after a
      resume from a hibernate.  It turns out that the code to support alternate
      settings under xHCI has issues when a device with a non-default alternate
      setting is reset during the hibernate:
      
      [  427.681810] Restarting tasks ...
      [  427.681995] hub 1-0:1.0: state 7 ports 6 chg 0004 evt 0000
      [  427.682019] usb usb3: usb resume
      [  427.682030] ohci_hcd 0000:00:12.0: wakeup root hub
      [  427.682191] hub 1-0:1.0: port 2, status 0501, change 0000, 480 Mb/s
      [  427.682205] usb 1-2: usb wakeup-resume
      [  427.682226] usb 1-2: finish reset-resume
      [  427.682886] done.
      [  427.734658] ehci_hcd 0000:00:12.2: port 2 high speed
      [  427.734663] ehci_hcd 0000:00:12.2: GetStatus port 2 status 001005 POWER sig=se0 PE CONNECT
      [  427.746682] hub 3-0:1.0: hub_reset_resume
      [  427.746693] hub 3-0:1.0: trying to enable port power on non-switchable hub
      [  427.786715] usb 1-2: reset high speed USB device using ehci_hcd and address 2
      [  427.839653] ehci_hcd 0000:00:12.2: port 2 high speed
      [  427.839666] ehci_hcd 0000:00:12.2: GetStatus port 2 status 001005 POWER sig=se0 PE CONNECT
      [  427.847717] ohci_hcd 0000:00:12.0: GetStatus roothub.portstatus [1] = 0x00010100 CSC PPS
      [  427.915497] hub 1-2:1.0: remove_intf_ep_devs: if: ffff88022f9e8800 ->ep_devs_created: 1
      [  427.915774] hub 1-2:1.0: remove_intf_ep_devs: bNumEndpoints: 1
      [  427.915934] hub 1-2:1.0: if: ffff88022f9e8800: endpoint devs removed.
      [  427.916158] hub 1-2:1.0: create_intf_ep_devs: if: ffff88022f9e8800 ->ep_devs_created: 0, ->unregistering: 0
      [  427.916434] hub 1-2:1.0: create_intf_ep_devs: bNumEndpoints: 1
      [  427.916609]  ep_81: create, parent hub
      [  427.916632] ------------[ cut here ]------------
      [  427.916644] WARNING: at fs/sysfs/dir.c:477 sysfs_add_one+0x82/0x96()
      [  427.916649] Hardware name: System Product Name
      [  427.916653] sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:12.2/usb1/1-2/1-2:1.0/ep_81'
      [  427.916658] Modules linked in: binfmt_misc kvm_amd kvm powernow_k8 cpufreq_ondemand cpufreq_powersave cpufreq_userspace freq_table cpufreq_conservative ipv6 vfat fat
      +8250_pnp 8250 pcspkr ohci_hcd serial_core k10temp edac_core
      [  427.916694] Pid: 278, comm: khubd Not tainted 2.6.33-rc2-00187-g08d869aa-dirty #13
      [  427.916699] Call Trace:
      
      The problem is caused by a mismatch between the USB core's view of the
      device state and the USB device and xHCI host's view of the device state.
      
      After the device reset and re-configuration, the device and the xHCI host
      think they are using alternate setting 0 of all interfaces.  However, the
      USB core keeps track of the old state, which may include non-zero
      alternate settings.  It uses intf->cur_altsetting to keep the endpoint
      sysfs files for the old state across the reset.
      
      The bandwidth allocation functions need to know what the xHCI host thinks
      the current alternate settings are, so the original patch set
      intf->cur_altsetting to the alternate setting 0.  This caused duplicate
      endpoint files to be created.
      
      The solution is to not set intf->cur_altsetting before calling
      usb_set_interface() in usb_reset_and_verify_device().  Instead, we add a
      new flag to struct usb_interface to tell usb_hcd_alloc_bandwidth() to use
      alternate setting 0 as the currently installed alternate setting.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Tested-by: Borislav Petkov <petkovbb@googlemail.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      04a723ea
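The fix described above keeps the USB core's bookkeeping untouched and passes the "device was just reset" fact through a separate flag. A minimal sketch of that pattern, with a reduced stand-in structure and a hypothetical flag name (`resetting_device`) and helper name:

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-in for the interface structure. */
struct usb_interface_sketch {
    int cur_altsetting;            /* USB core's view; backs the sysfs files */
    unsigned resetting_device : 1; /* hypothetical name for the new flag */
};

/* Which alternate setting should bandwidth allocation treat as currently
 * installed? After a reset the device and host are at setting 0, whatever
 * the core still has recorded. */
int installed_altsetting_sketch(const struct usb_interface_sketch *intf)
{
    assert(intf != NULL);
    return intf->resetting_device ? 0 : intf->cur_altsetting;
}
```

Because `cur_altsetting` is never overwritten, the endpoint sysfs files for the old state stay matched to the core's records and no duplicate files are created.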
  11. 20 Jan, 2010 1 commit
  12. 19 Jan, 2010 1 commit
    • A
      phylib: Move workqueue initialization to a proper place · 4f9c85a1
      Anton Vorontsov committed
      Commit 541cd3ee ("phylib: Fix deadlock on resume") caused the TI DaVinci
      EMAC ethernet driver to oops upon resume:
      
       PM: resume of devices complete after 237.098 msecs
       Restarting tasks ... done.
       kernel BUG at kernel/workqueue.c:354!
       Unable to handle kernel NULL pointer dereference at virtual address 00000000
       [...]
       Backtrace:
       [<c002c598>] (__bug+0x0/0x2c) from [<c0052a54>] (queue_delayed_work_on+0x74/0xf8)
       [<c00529e0>] (queue_delayed_work_on+0x0/0xf8) from [<c0052b30>] (queue_delayed_work+0x2c/0x30)
      
      The oops pops up because TI DaVinci EMAC driver detaches PHY on
      suspend and attaches it back on resume. Attaching makes phylib call
      phy_start_machine() that initializes a workqueue. On the other hand,
      PHY's resume routine will call phy_start_machine() again, and that
      will cause the oops since we just destroyed the already scheduled
      workqueue.
      
      This patch fixes the issue by moving workqueue initialization to
      phy_device_create().
      
      p.s. We don't see this oops with ucc_geth and gianfar drivers because
      they perform a fine-grained suspend, i.e. they just stop the PHYs
      without detaching.
      Reported-by: Sekhar Nori <nsekhar@ti.com>
      Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
      Tested-by: Sekhar Nori <nsekhar@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4f9c85a1
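The failure mode and the fix can be modelled in a few lines of user-space C. Everything here is a toy stand-in (there is no real workqueue): the point is only that initializing the work item once at creation time makes a repeated start harmless, whereas re-initializing inside the start routine would wipe an already scheduled item.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a delayed work item. */
struct delayed_work_sketch {
    int initialized;
    int pending;
};

struct phy_device_sketch {
    struct delayed_work_sketch state_queue;
};

/* Initialization resets the item, so it must not run while pending. */
static void init_delayed_work_sketch(struct delayed_work_sketch *w)
{
    w->initialized = 1;
    w->pending = 0;
}

/* After the fix: the work item is set up exactly once, at creation. */
void phy_device_create_sketch(struct phy_device_sketch *phy)
{
    assert(phy != NULL);
    init_delayed_work_sketch(&phy->state_queue);
}

/* Start only schedules. Returns 1 if newly scheduled, 0 if the item was
 * already pending, so calling it twice is safe. */
int phy_start_machine_sketch(struct phy_device_sketch *phy)
{
    assert(phy != NULL && phy->state_queue.initialized);
    if (phy->state_queue.pending)
        return 0;
    phy->state_queue.pending = 1;
    return 1;
}
```

In this model the attach path and the resume path can both call the start routine without disturbing each other.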
  13. 18 Jan, 2010 2 commits
  14. 17 Jan, 2010 11 commits
  15. 15 Jan, 2010 1 commit
    • M
      vhost_net: a kernel-level virtio server · 3a4d5c94
      Michael S. Tsirkin committed
      What it is: vhost net is a character device that can be used to reduce
      the number of system calls involved in virtio networking.
      Existing virtio net code is used in the guest without modification.
      
      There's similarity with vringfd, with some differences and reduced scope:
      - uses eventfd for signalling
      - structures can be moved around in memory at any time (good for
        migration, bug work-arounds in userspace)
      - write logging is supported (good for migration)
      - supports a memory table and not just an offset (needed for kvm)
      
      Common virtio-related code has been put in a separate file vhost.c and
      can be made into a separate module if/when more backends appear.  I used
      Rusty's lguest.c as the source for developing this part: this supplied
      me with witty comments I wouldn't be able to write myself.
      
      What it is not: vhost net is not a bus, and not a generic new system
      call. No assumptions are made on how guest performs hypercalls.
      Userspace hypervisors are supported as well as kvm.
      
      How it works: Basically, we connect virtio frontend (configured by
      userspace) to a backend. The backend could be a network device, or a tap
      device.  Backend is also configured by userspace, including vlan/mac
      etc.
      
      Status: This works for me, and I haven't seen any crashes.
      Compared to userspace, people reported improved latency (as I save up to
      4 system calls per packet), as well as better bandwidth and CPU
      utilization.
      
      Features that I plan to look at in the future:
      - mergeable buffers
      - zero copy
      - scalability tuning: figure out the best threading model to use
      
      Note on RCU usage (this is also documented in vhost.h, near
      private_pointer which is the value protected by this variant of RCU):
      what is happening is that the rcu_dereference() is being used in a
      workqueue item.  The role of rcu_read_lock() is taken on by the start of
      execution of the workqueue item, of rcu_read_unlock() by the end of
      execution of the workqueue item, and of synchronize_rcu() by
      flush_workqueue()/flush_work(). In the future we might need to apply
      some gcc attribute or sparse annotation to the function passed to
      INIT_WORK(). Paul's ack below is for this RCU usage.
      
      (Includes fixes by Alan Cox <alan@linux.intel.com>,
      David L Stevens <dlstevens@us.ibm.com>,
      Chris Wright <chrisw@redhat.com>)
      Acked-by: Rusty Russell <rusty@rustcorp.com.au>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3a4d5c94
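The commit message lists eventfd as the signalling mechanism. The sketch below shows only the generic eventfd semantics that make it suitable for "kick" notifications (several signals coalesce into one counter that a single read drains); it does not use the vhost interface itself, and the helper names are hypothetical. `eventfd()`, `read()` and `write()` are the standard Linux calls.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/eventfd.h>

/* Signal the other side once: adds 1 to the eventfd counter.
 * Returns 0 on success, -1 on error. */
int eventfd_kick(int efd)
{
    uint64_t one = 1;

    return write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Collect all signals sent since the last drain. On success stores the
 * accumulated count in *count, resets the counter, and returns 0.
 * On a default (blocking) eventfd this blocks while the counter is 0. */
int eventfd_drain(int efd, uint64_t *count)
{
    assert(count != NULL);
    return read(efd, count, sizeof(*count)) == (ssize_t)sizeof(*count) ? 0 : -1;
}
```

A typical use creates the descriptor with `eventfd(0, 0)`, hands it to the signalling side, and has the receiving side call the drain helper (or poll the descriptor) to learn that work is waiting.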