1. 18 Mar, 2013 2 commits
    • tcp: Remove TCPCT · 1a2c6181
      Committed by Christoph Paasch
      TCPCT uses option number 253, which is reserved for experimental use and
      should not be used in production environments.
      Further, TCPCT does not fully implement RFC 6013.
      
      As a nice side-effect, removing TCPCT increases TCP's performance for
      very short flows:
      
      Benchmark: ApacheBench run with -c 100 -n 100000, sending HTTP requests
      for files of 1 KB each.
      
      before this patch:
      	average (among 7 runs) of 20845.5 Requests/Second
      after:
      	average (among 7 runs) of 21403.6 Requests/Second
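
As a quick check of the figures above, the relative gain can be worked out directly. The sketch below only restates the two averages quoted in the commit message; the variable names are illustrative.

```python
# Averages quoted in the commit message (requests per second, 7 runs each).
before_rps = 20845.5
after_rps = 21403.6

# Relative throughput gain attributed to removing TCPCT.
gain_percent = (after_rps - before_rps) / before_rps * 100
print(f"{gain_percent:.2f}%")
```

That is an improvement of about 2.7% for this short-flow workload.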
      Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vxlan: generalize forwarding tables · 6681712d
      Committed by David Stevens
      This patch generalizes VXLAN forwarding table entries allowing an administrator
      to:
      	1) specify multiple destinations for a given MAC
      	2) specify alternate VNIs in the VXLAN header
      	3) specify alternate destination UDP ports
      	4) use multicast MAC addresses as fdb lookup keys
      	5) specify multicast destinations
      	6) specify the outgoing interface for forwarded packets
      
      The combination allows configuration of more complex topologies using VXLAN
      encapsulation.
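
The generalized entry described above can be pictured as a MAC-keyed table whose values are lists of destinations. This is a minimal sketch of that shape, not the kernel's data structure: `RemoteDest`, `ForwardingTable` and all field names are hypothetical, and `None` stands for "use the device default".

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RemoteDest:
    """One forwarding destination (hypothetical field names)."""
    ip: str                        # unicast or multicast destination address
    port: Optional[int] = None     # alternate destination UDP port
    vni: Optional[int] = None      # alternate VNI for the VXLAN header
    ifindex: Optional[int] = None  # outgoing interface for forwarded packets


class ForwardingTable:
    """MAC address -> list of destinations, so one MAC can fan out."""

    def __init__(self) -> None:
        self._entries = defaultdict(list)

    def add(self, mac: str, dest: RemoteDest) -> None:
        key = mac.lower()
        if dest not in self._entries[key]:
            self._entries[key].append(dest)

    def lookup(self, mac: str) -> List[RemoteDest]:
        # Unicast and multicast MACs are looked up the same way.
        return list(self._entries.get(mac.lower(), []))


fdb = ForwardingTable()
fdb.add("01:00:5e:00:00:01", RemoteDest("239.1.1.1"))
fdb.add("01:00:5e:00:00:01", RemoteDest("192.0.2.10", port=4789, vni=42, ifindex=3))
print(fdb.lookup("01:00:5E:00:00:01"))
```

Keeping a list per MAC is what allows items 1) and 5) in the list above; the optional fields correspond to items 2), 3) and 6).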
      
      Changes since v1: rebase to 3.9.0-rc2
      Signed-Off-By: David L Stevens <dlstevens@us.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 12 Mar, 2013 2 commits
    • tcp: TLP loss detection. · 9b717a8d
      Committed by Nandita Dukkipati
      This is the second of the TLP patch series; it augments the basic TLP
      algorithm with a loss detection scheme.
      
      This patch implements a mechanism for loss detection when a tail
      loss probe retransmission plugs a hole, thereby masking packet loss
      from the sender. The loss detection algorithm relies on counting
      TLP dupacks as outlined in Sec. 3 of:
      http://tools.ietf.org/html/draft-dukkipati-tcpm-tcp-loss-probe-01
      
      The basic idea is: the sender keeps track of a TLP "episode" upon
      retransmission of a TLP packet. An episode ends when the sender receives
      an ACK above the SND.NXT (tracked by tlp_high_seq) at the time of the
      episode. We want to make sure that before the episode ends the sender
      receives a "TLP dupack", indicating that the TLP retransmission was
      unnecessary, so there was no loss/hole that needed plugging. If the
      sender gets no TLP dupack before the end of the episode, then it reduces
      ssthresh and the congestion window, because the TLP packet arriving at
      the receiver probably plugged a hole.
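
The episode bookkeeping described above can be sketched as a small state holder. This is a simplified model under stated assumptions, not the kernel code: the class and method names are hypothetical, sequence-number wraparound is ignored, and deciding whether an ACK counts as a "TLP dupack" is left to the caller.

```python
class TlpEpisode:
    """Tracks one TLP episode; only tlp_high_seq is a name from the text."""

    def __init__(self) -> None:
        self.tlp_high_seq = None   # SND.NXT when the probe was sent, or None
        self.cwnd_reductions = 0   # how often the sender had to back off

    def on_probe_retransmit(self, snd_nxt: int) -> None:
        # A TLP retransmission starts an episode.
        self.tlp_high_seq = snd_nxt

    def on_ack(self, ack_seq: int, is_tlp_dupack: bool) -> bool:
        """Process an ACK; return True if ssthresh/cwnd should be reduced."""
        if self.tlp_high_seq is None:
            return False               # no episode in progress
        if is_tlp_dupack:
            # The probe was redundant, so no hole was plugged.
            self.tlp_high_seq = None
            return False
        if ack_seq > self.tlp_high_seq:
            # Episode ended without a TLP dupack: the probe probably
            # repaired a loss, so treat it as a congestion signal.
            self.tlp_high_seq = None
            self.cwnd_reductions += 1
            return True
        return False


episode = TlpEpisode()
episode.on_probe_retransmit(snd_nxt=1000)
print(episode.on_ack(ack_seq=1500, is_tlp_dupack=False))
```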
      Signed-off-by: Nandita Dukkipati <nanditad@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: Tail loss probe (TLP) · 6ba8a3b1
      Committed by Nandita Dukkipati
      This patch series implements the Tail loss probe (TLP) algorithm described
      in http://tools.ietf.org/html/draft-dukkipati-tcpm-tcp-loss-probe-01. The
      first patch implements the basic algorithm.
      
      TLP's goal is to reduce tail latency of short transactions. It achieves
      this by converting retransmission timeouts (RTOs) occurring due
      to tail losses (losses at the end of transactions) into fast recovery.
      TLP transmits one packet in two round-trips when a connection is in
      Open state and isn't receiving any ACKs. The transmitted packet, aka
      loss probe, can be either new or a retransmission. When there is tail
      loss, the ACK from a loss probe triggers FACK/early-retransmit based
      fast recovery, thus avoiding a costly RTO. In the absence of loss,
      there is no change in the connection state.
      
      PTO stands for probe timeout. It is a timer event indicating
      that an ACK is overdue and triggers a loss probe packet. The PTO value
      is set to max(2*SRTT, 10ms) and is adjusted to account for the delayed
      ACK timer when there is only one outstanding packet.
      
      TLP Algorithm
      
      On transmission of new data in Open state:
        -> packets_out > 1: schedule PTO in max(2*SRTT, 10ms).
        -> packets_out == 1: schedule PTO in max(2*RTT, 1.5*RTT + 200ms)
        -> PTO = min(PTO, RTO)
      
      Conditions for scheduling PTO:
        -> Connection is in Open state.
        -> Connection is either cwnd limited or has no new data to send.
        -> Number of probes per tail loss episode is limited to one.
        -> Connection is SACK enabled.
      
      When PTO fires:
        new_segment_exists:
          -> transmit new segment.
          -> packets_out++. cwnd remains same.
      
        no_new_packet:
          -> retransmit the last segment.
             Its ACK triggers FACK or early retransmit based recovery.
      
      ACK path:
        -> rearm RTO at start of ACK processing.
        -> reschedule PTO if need be.
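
The PTO scheduling rules in the algorithm above can be sketched as one function. This is an illustration under stated assumptions rather than the kernel implementation: the function name is hypothetical, times are in milliseconds, and the single-packet rule is assumed to use the same smoothed RTT estimate as the other branch.

```python
def probe_timeout_ms(srtt_ms: float, rto_ms: float, packets_out: int) -> float:
    """Return the probe timeout (PTO) in milliseconds."""
    if packets_out < 1:
        raise ValueError("a PTO is only scheduled while data is outstanding")
    if packets_out == 1:
        # Single outstanding packet: leave room for the receiver's
        # delayed ACK timer (the 200 ms term).
        pto = max(2 * srtt_ms, 1.5 * srtt_ms + 200)
    else:
        pto = max(2 * srtt_ms, 10)
    # The probe must never fire later than the retransmission timeout.
    return min(pto, rto_ms)


print(probe_timeout_ms(srtt_ms=50, rto_ms=300, packets_out=3))
```

The final `min` is what keeps TLP from delaying a regular RTO: when the RTO is shorter than the computed PTO, the timer simply fires at the RTO.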
      
      In addition, the patch includes a small variation to the Early Retransmit
      (ER) algorithm, such that ER and TLP together can in principle recover any
      N-degree of tail loss through fast recovery. TLP is controlled by the same
      sysctl as ER, the tcp_early_retrans sysctl.
      tcp_early_retrans==0; disables TLP and ER.
      		 ==1; enables RFC5827 ER.
      		 ==2; delayed ER.
      		 ==3; TLP and delayed ER. [DEFAULT]
      		 ==4; TLP only.
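
The sysctl values listed above can be read as a small lookup table. The sketch below merely restates that list; `EarlyRetransMode` and `decode_early_retrans` are hypothetical names, not kernel symbols.

```python
from typing import NamedTuple


class EarlyRetransMode(NamedTuple):
    early_retransmit: bool  # ER enabled
    delayed: bool           # ER uses the delayed variant
    tail_loss_probe: bool   # TLP enabled


# One row per documented tcp_early_retrans value.
_MODES = {
    0: EarlyRetransMode(False, False, False),  # TLP and ER disabled
    1: EarlyRetransMode(True, False, False),   # RFC5827 ER
    2: EarlyRetransMode(True, True, False),    # delayed ER
    3: EarlyRetransMode(True, True, True),     # TLP and delayed ER (default)
    4: EarlyRetransMode(False, False, True),   # TLP only
}


def decode_early_retrans(value: int) -> EarlyRetransMode:
    try:
        return _MODES[value]
    except KeyError:
        raise ValueError(f"unsupported tcp_early_retrans value: {value}") from None


print(decode_early_retrans(3))
```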
      
      The TLP patch series has been extensively tested on Google Web servers.
      It is most effective for short Web transactions, where it reduced RTOs by 15%
      and improved HTTP response time (average by 6%, 99th percentile by 10%).
      The transmitted probes account for <0.5% of the overall transmissions.
      Signed-off-by: Nandita Dukkipati <nanditad@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 09 Mar, 2013 1 commit
  4. 07 Mar, 2013 1 commit
  5. 03 Mar, 2013 1 commit
    • metag: ptrace · bc3966bf
      Committed by James Hogan
      The ptrace interface for metag provides access to some core register
      sets using the PTRACE_GETREGSET and PTRACE_SETREGSET operations. The
      details of the internal context structures are abstracted into user API
      structures to both ease use and allow flexibility to change the internal
      context layouts. Copyin and copyout functions for these register sets
      are exposed to allow signal handling code to use them to copy to and
      from the signal context.
      
      struct user_gp_regs (NT_PRSTATUS) provides access to the core general
      purpose register context.
      
      struct user_cb_regs (NT_METAG_CBUF) provides access to the TXCATCH*
      registers, which contain information about a memory fault, unaligned
      access error or watchpoint. This can be modified to alter the way the
      fault is replayed on resume ("catch replay"), or to prevent the replay
      taking place.
      
      struct user_rp_state (NT_METAG_RPIPE) provides access to the state of
      the Meta read pipeline which can be used to hide memory latencies in
      hand optimised data loops.
      
      Extended DSP register state, DSP RAM, and hardware breakpoint registers
      aren't yet exposed through ptrace.
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Denys Vlasenko <vda.linux@googlemail.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Tony Lindgren <tony@atomide.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
  6. 02 Mar, 2013 2 commits
    • dm ioctl: allow message to return data · a2606241
      Committed by Mikulas Patocka
      This patch introduces enhanced message support that allows the
      device-mapper core to recognise messages that are common to all devices,
      and for messages to return data to userspace.
      
      Core messages are processed by the function "message_for_md".  If the
      device mapper doesn't support the message, it is passed to the target
      driver.
      
      If the message returns data, the kernel sets the flag
      DM_MESSAGE_OUT_FLAG.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm ioctl: optimize functions without variable params · 02cde50b
      Committed by Mikulas Patocka
      Device-mapper ioctls receive and send data in a buffer supplied
      by userspace.  The buffer has two parts.  The first part contains
      a 'struct dm_ioctl' and has a fixed size.  The second part depends
      on the ioctl and has a variable size.
      
      This patch recognises the specific ioctls that do not use the variable
      part of the buffer and skips allocating memory for it.
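
The idea of skipping the variable part can be sketched as follows. This is a toy model, not the kernel's copy_params: `HEADER_SIZE` is a placeholder rather than the real size of 'struct dm_ioctl', and the command names and the set of commands without a variable part are hypothetical.

```python
HEADER_SIZE = 16  # placeholder, NOT the real sizeof(struct dm_ioctl)

# Hypothetical set of commands that never read the variable part.
COMMANDS_WITHOUT_VARIABLE_PART = frozenset({"resume"})


def copy_params(command: str, user_buffer: bytes):
    """Split a userspace buffer into (fixed header, variable payload)."""
    header = user_buffer[:HEADER_SIZE]
    if command in COMMANDS_WITHOUT_VARIABLE_PART:
        # Fast path: only the fixed-size header is needed, so nothing
        # is allocated for the variable part.
        return header, None
    # Slow path: the bytearray stands in for the kernel-side allocation.
    payload = bytearray(user_buffer[HEADER_SIZE:])
    return header, payload


buffer_from_user = bytes(HEADER_SIZE) + b"table-data"
print(copy_params("resume", buffer_from_user)[1])
```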
      
      In particular, when a device is suspended and a resume ioctl is sent,
      this now avoids memory allocation completely.
      
      The variable "struct dm_ioctl tmp" is moved from the function
      copy_params to its caller ctl_ioctl and renamed to param_kernel.
      It is used directly when the ioctl function doesn't need any arguments.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
  7. 28 Feb, 2013 5 commits
    • nbd: support FLUSH requests · 75f187ab
      Committed by Alex Bligh
      Currently, the NBD device does not accept flush requests from the Linux
      block layer.  If the NBD server opened the target with neither O_SYNC nor
      O_DSYNC, however, the device will be effectively backed by a writeback
      cache.  Without issuing flushes properly, operation of the NBD device will
      not be safe against power losses.
      
      The NBD protocol has support for both a cache flush command and a FUA
      command flag; the server will also pass a flag to note its support for
      these features.  This patch adds support for the cache flush command and
      flag.  In the kernel, we receive the flags via the NBD_SET_FLAGS ioctl,
      and map NBD_FLAG_SEND_FLUSH to the argument of blk_queue_flush.  When the
      flag is active the block layer will send REQ_FLUSH requests, which we
      translate to NBD_CMD_FLUSH commands.
      
      FUA support is not included in this patch because all free software
      servers implement it with a full fdatasync; thus it has no advantage over
      supporting flush only.  Because I [Paolo] cannot really benchmark it in a
      realistic scenario, I cannot tell if it is a good idea or not.  It is also
      not clear if it is valid for an NBD server to support FUA but not flush.
      The Linux block layer gives a warning for this combination; the NBD
      protocol documentation says nothing about it.
      
      The patch also fixes a small problem in the handling of flags: nbd->flags
      must be cleared at the end of NBD_DO_IT, but the driver was not doing
      that.  The bug manifests itself as follows.  Suppose you use two different
      client/server pairs to start the NBD device.  Suppose also that the first
      client supports NBD_SET_FLAGS, and the first server sends
      NBD_FLAG_SEND_FLUSH; the second pair instead does neither of these two
      things.  Before this patch, the second invocation of NBD_DO_IT will use a
      stale value of nbd->flags, and the second server will issue an error every
      time it receives an NBD_CMD_FLUSH command.
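
The stale-flags scenario above can be replayed with a toy model. This is a sketch and not driver code: `NbdDevice` and its methods are hypothetical stand-ins for the ioctls named in the text, and the numeric value of NBD_FLAG_SEND_FLUSH is assumed from the NBD protocol description.

```python
NBD_FLAG_SEND_FLUSH = 1 << 2  # assumed value; only the bit test matters here


class NbdDevice:
    """Toy device; clear_flags_on_exit=True models the fixed behaviour."""

    def __init__(self, clear_flags_on_exit: bool) -> None:
        self.flags = 0
        self.clear_flags_on_exit = clear_flags_on_exit

    def set_flags(self, flags: int) -> None:
        # Stands in for the NBD_SET_FLAGS ioctl.
        self.flags = flags

    def do_it(self) -> bool:
        """Stands in for NBD_DO_IT; returns whether flushes would be sent."""
        sends_flush = bool(self.flags & NBD_FLAG_SEND_FLUSH)
        if self.clear_flags_on_exit:
            self.flags = 0  # the fix: forget flags when the session ends
        return sends_flush


def second_session_sends_flush(device: NbdDevice) -> bool:
    device.set_flags(NBD_FLAG_SEND_FLUSH)  # first pair negotiates flush
    device.do_it()
    return device.do_it()                  # second pair never sets flags


print(second_session_sends_flush(NbdDevice(clear_flags_on_exit=False)))
print(second_session_sends_flush(NbdDevice(clear_flags_on_exit=True)))
```

Without the reset, the second session inherits the first session's flush flag, which is the situation in which the second server rejects every flush command.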
      
      This bug is pre-existing, but it becomes much more important after this
      patch; flush failures make the device pretty much unusable, unlike
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Alex Bligh <alex@alex.org.uk>
      Acked-by: Paul Clements <Paul.Clements@steeleye.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ipmi: remove superfluous kernel/userspace explanation · 59fb1b9f
      Committed by Robert P. J. Day
      Given the obvious distinction between kernel and userspace supported
      by uapi/, it seems unnecessary to comment on that.
      Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
      Signed-off-by: Corey Minyard <cminyard@mvista.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fat: mark fs as dirty on mount and clean on umount · b88a1058
      Committed by Oleksij Rempel
      There are no documented methods to mark FAT as dirty.  Unofficially, MS
      started to use a reserved byte in the boot sector for this purpose, at
      least since Win 2000.  With Win 7, the user is warned if the fs is dirty
      and asked to clean it.
      
      Different versions of Windows handle it in different ways, but the flag
      always has the same meaning:
      
      - Win 2000 and XP set it on write operations and
        remove it after the operation has finished
      - Win 7 sets the dirty flag on the first write and removes it on umount.
      
      We will do it as follows:
      
      - set the dirty flag on mount. If the fs was initially dirty, warn the
        user, remember it and do not make any changes to the boot sector.
      - clean it on umount. If the fs was initially dirty, leave it dirty.
      - do nothing if the fs is mounted read-only.
      - TODO: leave fs dirty if we found some error after mount.
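
The mount/umount rules above can be sketched as a tiny state model. It follows the list literally and is not the driver code: `FatVolume` and its attributes are hypothetical, and the on-disk state byte is reduced to a boolean.

```python
class FatVolume:
    """Toy model of the dirty-flag handling described in the list above."""

    def __init__(self, dirty_on_disk: bool, read_only: bool = False) -> None:
        self.dirty_on_disk = dirty_on_disk   # flag stored in the boot sector
        self.read_only = read_only
        self.was_dirty_at_mount = False
        self.warnings = []

    def mount(self) -> None:
        if self.read_only:
            return                           # never touch a read-only fs
        if self.dirty_on_disk:
            # Already dirty: warn, remember it, leave the boot sector alone.
            self.was_dirty_at_mount = True
            self.warnings.append("volume was not properly unmounted")
            return
        self.dirty_on_disk = True            # mark dirty while mounted

    def umount(self) -> None:
        if self.read_only or self.was_dirty_at_mount:
            return                           # leave the flag as it was found
        self.dirty_on_disk = False           # clean unmount


volume = FatVolume(dirty_on_disk=False)
volume.mount()
print(volume.dirty_on_disk)
volume.umount()
print(volume.dirty_on_disk)
```

Remembering the initial state matters because a volume that was dirty before mounting still needs checking afterwards, so a clean umount must not hide that.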
      Signed-off-by: Oleksij Rempel <bug-track@fisher-privat.net>
      Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fat: add extended fields to struct fat_boot_sector · 6b46419b
      Committed by Oleksij Rempel
      Later we will need the "state" field to check whether the volume was
      cleanly unmounted.
      Signed-off-by: Oleksij Rempel <bug-track@fisher-privat.net>
      Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hfsplus: add osx.* prefix for handling namespace of Mac OS X extended attributes · 5841ca09
      Committed by Vyacheslav Dubeyko
      hfsplus: reworked support of extended attributes.
      
      The current mainline implementation of the hfsplus file system driver
      treats only two fields (fdType and fdCreator) of the user_info field in
      the file description record (struct hfsplus_cat_file) as extended
      attributes, and only these two fields can be read or set that way.
      HFS+, however, treats the union of the user_info and finder_info fields
      as the com.apple.FinderInfo extended attribute, both for files (struct
      hfsplus_cat_file) and for folders (struct hfsplus_cat_folder).  Moreover,
      the current mainline implementation does not support the special metadata
      file known as the attributes tree.
      
      Mac OS X 10.4 and later support extended attributes by making use of the
      HFS+ filesystem Attributes file B*-tree feature which allows for named
      forks.  Mac OS X supports only inline extended attributes, limiting
      their size to 3802 bytes.  Any regular file may have a list of extended
      attributes.  HFS+ supports an arbitrary number of named forks.  Each
      attribute is denoted by a name and the associated data.  The name is a
      null-terminated Unicode string.  It is possible to list, to get, to set,
      and to remove extended attributes from files or directories.
      
      Listing extended attributes with the getfattr utility has a peculiarity.
      By default, getfattr expects the prefix "user." before an extended
      attribute's name and ignores any names that don't contain it.  As a
      result, the attribute list comes out unexpectedly empty even when the
      file (or folder) has extended attributes.  To list them, use an empty
      string as the regular expression pattern for name matching (getfattr
      --match="").
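
The effect of the match pattern can be illustrated with ordinary regular expressions. This sketch only mimics the filtering step: the default pattern is the one documented for getfattr's -m option, while the attribute names and the `visible` helper are made up for the example.

```python
import re

DEFAULT_PATTERN = r"^user\."  # documented default of getfattr -m

# Hypothetical attribute names as a file might report them.
names = ["osx.com.apple.FinderInfo", "user.comment"]


def visible(attribute_names, pattern):
    """Return the names that a given match pattern would let through."""
    return [name for name in attribute_names if re.search(pattern, name)]


print(visible(names, DEFAULT_PATTERN))  # default filter
print(visible(names, ""))               # empty pattern, as with --match=""
```

An empty pattern matches every string, which is why passing it shows attributes outside the "user." namespace.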
      
      To support extended attributes in HFS+:
      1. The necessary on-disk layout declarations related to the Attributes
         tree were added to the hfsplus_raw.h file.
      2. The attributes.c file was added, implementing manipulation of
         records in the Attributes tree.
      3. The hfsplus_listxattr, hfsplus_getxattr and hfsplus_setxattr
         functions in ioctl.c were reworked, and the hfsplus_removexattr
         method was added.
      
      This patch:
      
      Add osx.* prefix for handling namespace of Mac OS X extended attributes.
      
      [akpm@linux-foundation.org: checkpatch fixes]
      Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
      Reported-by: Hin-Tak Leung <htl10@users.sourceforge.net>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  8. 22 Feb, 2013 3 commits
  9. 21 Feb, 2013 4 commits
  10. 20 Feb, 2013 1 commit
  11. 19 Feb, 2013 2 commits
  12. 17 Feb, 2013 1 commit
    • drm/omap: move out of staging · 8bb0daff
      Committed by Rob Clark
      Now that the omapdss interface has been reworked so that omapdrm can use
      dispc directly, we have been able to fix the remaining functional kms
      issues with omapdrm.  In the meantime, the PM sequencing and many
      of the other open issues have been solved.  So I think it makes sense
      to finally move omapdrm out of staging.
      Signed-off-by: Rob Clark <robdclark@gmail.com>
  13. 15 Feb, 2013 5 commits
    • nl80211: renumber NL80211_FEATURE_FULL_AP_CLIENT_STATE · 932dd97c
      Committed by Johannes Berg
      Adding the flag to mac80211 already without testing was
      clearly a mistake, one that we now pay for by having to
      reserve bit 13 forever. The problem is cfg80211 doesn't
      allow capability/rate changes for station entries that
      were added unassociated, so the station entries cannot
      be set up properly when marked associated.
      
      Change the NL80211_FEATURE_FULL_AP_CLIENT_STATE value
      to make it clear to userspace implementations that all
      current kernels don't actually support it, even though
      the previous bit is set, and of course also remove the
      flag from mac80211 until we test and fix the issues.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • cfg80211: Pass station (extended) capability info to kernel · 9d62a986
      Committed by Jouni Malinen
      Information about the peer's capabilities and extended capabilities is
      required for the driver to perform TDLS Peer UAPSD operations and off
      channel operations. This information is passed from user space
      using the NL80211_CMD_SET_STATION command. This commit enhances
      the function nl80211_set_station to pass the capability information of
      the peer to the driver.
      
      Similarly, there may be a need for capability information for other modes,
      so allow this to be provided with both add_station and change_station.
      Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • cfg80211: advertise extended capabilities to userspace · a50df0c4
      Committed by Johannes Berg
      In many cases, userspace may need to know which of the
      802.11 extended capabilities ("Extended Capabilities
      element") are implemented in the driver or device, to
      include them e.g. in beacons, assoc request/response
      or other frames. Add a new nl80211 attribute to hold
      the extended capabilities bitmap for this.
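
On the userspace side, consuming such a bitmap comes down to testing individual bits. The helper below is a sketch under stated assumptions: its name is hypothetical, and it assumes capability bit n lives in octet n // 8 at bit position n % 8 (least significant bit first), which is how the Extended Capabilities element is usually numbered.

```python
def has_extended_capability(bitmap: bytes, bit: int) -> bool:
    """Return True if capability `bit` is set in the advertised bitmap."""
    if bit < 0:
        raise ValueError("capability bit numbers are non-negative")
    octet, offset = divmod(bit, 8)
    # Bits beyond the advertised length count as not supported.
    return octet < len(bitmap) and bool(bitmap[octet] & (1 << offset))


advertised = bytes([0b00000100, 0b00000010])  # made-up example bitmap
print(has_extended_capability(advertised, 2))
print(has_extended_capability(advertised, 40))
```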
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • nl80211: advertise HT/VHT channel limitations · 50640f16
      Committed by Johannes Berg
      When drivers or regulatory have limitations on
      40, 80 or 160 MHz channels, advertise these to
      userspace via nl80211. Also add a new feature
      flag to let userspace know this is supported.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • nl80211/cfg80211: add radar detection command/event · 04f39047
      Committed by Simon Wunderlich
      Add new NL80211_CMD_RADAR_DETECT, which starts the Channel
      Availability Check (CAC). This command will also notify
      userspace about events (CAC finished, CAC aborted, radar
      detected, NOP finished).
      Once radar detection has started it should continuously
      monitor for radars as long as the channel is active.
      
      This patch enables DFS for AP mode in nl80211/cfg80211.
      
      Based on original patch by Victor Goldenshtein <victorg@ti.com>
      Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
      [remove WIPHY_FLAG_HAS_RADAR_DETECT again -- my mistake]
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
  14. 14 Feb, 2013 8 commits
  15. 13 Feb, 2013 2 commits