1. 09 3月, 2012 2 次提交
    • B
      GFS2: call gfs2_write_alloc_required for each chunk · 58a7d5fb
      Benjamin Marzinski 提交于
      gfs2_fallocate was calling gfs2_write_alloc_required() once at the start of
      the function. This caused problems since gfs2_write_alloc_required used a
      long unsigned int for the len, but gfs2_fallocate could allocate a much
      larger amount.  This patch will move the call into the loop where the
      chunks are actually allocated and zeroed out. This will keep the allocation
      size under the limit, and also allow gfs2_fallocate to quickly skip over
      sections of the file that are already completely allocated.
      
      fallcate_chunk was also not correctly setting the file size.  It was using the
      len veriable to find the last block written to, but by the time it was setting
      the size, the len variable had already been decremented to 0.
      Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      58a7d5fb
    • S
      GFS2: Clean up log flush header writing · 34cc1781
      Steven Whitehouse 提交于
      We already send both a pre and post flush to the block device
      when writing a journal header. There is no need to wait for
      the previous I/O specifically when we do this, unless we've
      turned "barriers" off.
      
      As a side effect, this also cleans up the code path for flushing
      the journal and makes it more readable.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      34cc1781
  2. 08 3月, 2012 1 次提交
    • S
      GFS2: Remove a __GFP_NOFAIL allocation · 75ca61c1
      Steven Whitehouse 提交于
      In order to ensure that we've got enough buffer heads for flushing
      the journal, the orignal code used __GFP_NOFAIL when performing
      this allocation. Here we dispense with that in favour of using a
      mempool. This should improve efficiency in low memory conditions
      since flushing the journal is a good way to get memory back, we
      don't want to be spinning, waiting on memory allocations. The
      buffers which are allocated via this mempool are fairly short lived,
      so that we'll recycle them pretty quickly.
      
      Although there are other memory allocations which occur during the
      journal flush process, this is the one which can potentially require
      the most memory, so the most important one to fix.
      
      The amount of memory reserved is a fixed amount, and we should not need
      to scale it when there are a greater number of filesystems in use.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      75ca61c1
  3. 07 3月, 2012 1 次提交
  4. 05 3月, 2012 2 次提交
    • B
      GFS2: make sure rgrps are up to date in func gfs2_blk2rgrpd · 58884c4d
      Bob Peterson 提交于
      This patch adds a call to gfs2_rindex_update from function gfs2_blk2rgrpd
      and removes calls to it that are made redundant by it. The problem is
      that a gfs2_grow can add rgrps to the rindex, then put those rgrps into
      use, thus rendering the rindex we read in at mount time incomplete.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      58884c4d
    • B
      GFS2: Eliminate sd_rindex_mutex · 6aad1c3d
      Bob Peterson 提交于
      Over time, we've slowly eliminated the use of sd_rindex_mutex.
      Up to this point, it was only used in two places: function
      gfs2_ri_total (which totals the file system size by reading
      and parsing the rindex file) and function gfs2_rindex_update
      which updates the rgrps in memory. Both of these functions have
      the rindex glock to protect them, so the rindex is unnecessary.
      Since gfs2_grow writes to the rindex via the meta_fs, the mutex
      is in the wrong order according to the normal rules. This patch
      eliminates the mutex entirely to avoid the problem.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      6aad1c3d
  5. 01 3月, 2012 1 次提交
  6. 29 2月, 2012 7 次提交
    • S
      GFS2: Make bd_cmp() static · 08728f2d
      Steven Whitehouse 提交于
      Add missing static to bd_cmp()
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      08728f2d
    • B
      GFS2: Sort the ordered write list · 4a36d08d
      Bob Peterson 提交于
      This patch sorts the ordered write list for GFS2 writes.
      This increases the throughput for simultaneous writes.
      For example, if you have ten processes, all doing:
      dd if=/dev/zero of=/mnt/gfs2/fileX
      on different files, the throughput will be much better.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      4a36d08d
    • S
      GFS2: FITRIM ioctl support · 66fc061b
      Steven Whitehouse 提交于
      The FITRIM ioctl provides an alternative way to send discard requests to
      the underlying device. Using the discard mount option results in every
      freed block generating a discard request to the block device. This can
      be slow, since many block devices can only process discard requests of
      larger sizes, and also such operations can be time consuming.
      
      Rather than using the discard mount option, FITRIM allows a sweep of the
      filesystem on an occasional basis, and also to optionally avoid sending
      down discard requests for smaller regions.
      
      In GFS2 FITRIM will work at resource group granularity. There is a flag
      for each resource group which keeps track of which resource groups have
      been trimmed. This flag is reset whenever a deallocation occurs in the
      resource group, and set whenever a successful FITRIM of that resource
      group has taken place. This helps to reduce repeated discard requests
      for the same block ranges, again improving performance.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      66fc061b
    • S
      GFS2: Move two functions from log.c to lops.c · 47ac5537
      Steven Whitehouse 提交于
      gfs2_log_get_buf() and gfs2_log_fake_buf() are both used
      only in lops.c, so move them next to their callers and they
      can then become static.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      47ac5537
    • S
      GFS2: glock statistics gathering · a245769f
      Steven Whitehouse 提交于
      The stats are divided into two sets: those relating to the
      super block and those relating to an individual glock. The
      super block stats are done on a per cpu basis in order to
      try and reduce the overhead of gathering them. They are also
      further divided by glock type.
      
      In the case of both the super block and glock statistics,
      the same information is gathered in each case. The super
      block statistics are used to provide default values for
      most of the glock statistics, so that newly created glocks
      should have, as far as possible, a sensible starting point.
      
      The statistics are divided into three pairs of mean and
      variance, plus two counters. The mean/variance pairs are
      smoothed exponential estimates and the algorithm used is
      one which will be very familiar to those used to calculation
      of round trip times in network code.
      
      The three pairs of mean/variance measure the following
      things:
      
       1. DLM lock time (non-blocking requests)
       2. DLM lock time (blocking requests)
       3. Inter-request time (again to the DLM)
      
      A non-blocking request is one which will complete right
      away, whatever the state of the DLM lock in question. That
      currently means any requests when (a) the current state of
      the lock is exclusive (b) the requested state is either null
      or unlocked or (c) the "try lock" flag is set. A blocking
      request covers all the other lock requests.
      
      There are two counters. The first is there primarily to show
      how many lock requests have been made, and thus how much data
      has gone into the mean/variance calculations. The other counter
      is counting queueing of holders at the top layer of the glock
      code. Hopefully that number will be a lot larger than the number
      of dlm lock requests issued.
      
      So why gather these statistics? There are several reasons
      we'd like to get a better idea of these timings:
      
      1. To be able to better set the glock "min hold time"
      2. To spot performance issues more easily
      3. To improve the algorithm for selecting resource groups for
      allocation (to base it on lock wait time, rather than blindly
      using a "try lock")
      Due to the smoothing action of the updates, a step change in
      some input quantity being sampled will only fully be taken
      into account after 8 samples (or 4 for the variance) and this
      needs to be carefully considered when interpreting the
      results.
      
      Knowing both the time it takes a lock request to complete and
      the average time between lock requests for a glock means we
      can compute the total percentage of the time for which the
      node is able to use a glock vs. time that the rest of the
      cluster has its share. That will be very useful when setting
      the lock min hold time.
      
      The other point to remember is that all times are in
      nanoseconds. Great care has been taken to ensure that we
      measure exactly the quantities that we want, as accurately
      as possible. There are always inaccuracies in any
      measuring system, but I hope this is as accurate as we
      can reasonably make it.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      a245769f
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes · 891003ab
      Linus Torvalds 提交于
      * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes:
        GFS2: Read resource groups on mount
        GFS2: Ensure rindex is uptodate for fallocate
        GFS2: Read in rindex if necessary during unlink
        GFS2: Fix race between lru_list and glock ref count
      891003ab
    • L
      Merge tag 'iommu-fixes-v3.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · d5a74afd
      Linus Torvalds 提交于
      IOMMU fixes for Linux 3.3-rc5
      
      All the fixes are for the OMAP IOMMU driver. The first patch is the
      biggest one. It fixes the calls of the function omap_find_iovm_area() in
      the omap-iommu-debug module which expects a 'struct device' parameter
      since commit fabdbca8 instead of an omap_iommu handle. The
      omap-iommu-debug code still passed the handle to the function which
      caused a crash.
      
      The second patch fixes a NULL pointer dereference in the OMAP code and
      the third patch makes sure that the omap-iommu is initialized before the
      omap-isp driver, which relies on the iommu. The last patch is only a
      workaround until defered probing is implemented.
      
      * tag 'iommu-fixes-v3.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
        ARM: OMAP: make iommu subsys_initcall to fix builtin omap3isp
        iommu/omap: fix NULL pointer dereference
        iommu/omap: fix erroneous omap-iommu-debug API calls
      d5a74afd
  7. 28 2月, 2012 6 次提交
  8. 27 2月, 2012 14 次提交
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/aia21/ntfs · 5ffca28a
      Linus Torvalds 提交于
      Here are some trivial NTFS changes (a spelling fix and two use before
      NULL check cases found by Coverity as well as an update in MAINTAINERS
      for the path to the ntfs git repo) together with a simple LDM fix for
      parsing fragmented VBLKs.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/aia21/ntfs:
        NTFS: Update git repo path in MAINTAINERS file.
        LDM: Fix reassembly of extended VBLKs.
        NTFS: Correct two spelling errors "dealocate" to "deallocate" in mft.c.
        NTFS: Do not dereference pointer before checking for NULL.
        NTFS: Remove unused variable.
      5ffca28a
    • L
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e25bda56
      Linus Torvalds 提交于
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/mce/AMD: Fix UP build error
        x86: Specify a size for the cmp in the NMI handler
        x86/nmi: Test saved %cs in NMI to determine nested NMI case
        x86/amd: Fix L1i and L2 cache sharing information for AMD family 15h processors
        x86/microcode: Remove noisy AMD microcode warning
      e25bda56
    • L
      Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 70ca00db
      Linus Torvalds 提交于
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/events: Revert trace_sched_stat_sleeptime()
      70ca00db
    • L
      Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · faf3502a
      Linus Torvalds 提交于
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        genirq: Handle pending irqs in irq_startup()
        genirq: Unmask oneshot irqs when thread was not woken
      faf3502a
    • H
      compat: fix compile breakage on s390 · 048cd4e5
      Heiko Carstens 提交于
      The new is_compat_task() define for the !COMPAT case in
      include/linux/compat.h conflicts with a similar define in
      arch/s390/include/asm/compat.h.
      
      This is the minimal patch which fixes the build issues.
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      048cd4e5
    • O
      ARM: OMAP: make iommu subsys_initcall to fix builtin omap3isp · 435792d9
      Ohad Ben-Cohen 提交于
      omap3isp depends on omap's iommu and will fail to probe if
      initialized before it (which always happen if they are builtin).
      
      Make omap's iommu subsys_initcall as an interim solution until
      the probe deferral mechanism is merged.
      Reported-by: NJames <angweiyang@gmail.com>
      Debugged-by: NLaurent Pinchart <laurent.pinchart@ideasonboard.com>
      Signed-off-by: NOhad Ben-Cohen <ohad@wizery.com>
      Cc: stable <stable@vger.kernel.org>
      Cc: Tony Lindgren <tony@atomide.com>
      Cc: Hiroshi Doyu <hdoyu@nvidia.com>
      Cc: Joerg Roedel <Joerg.Roedel@amd.com>
      Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com>
      435792d9
    • A
      e6f4dee7
    • A
      f621c533
    • L
      Merge tag 'stable/for-linus-fixes-3.3-rc5' of... · 500dd237
      Linus Torvalds 提交于
      Merge tag 'stable/for-linus-fixes-3.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
      
      Two fixes to fix a memory corruption bug when WC pages never get
      converted back to WB but end up being recycled in the general memory
      pool as WC.
      
      There is a better way of fixing this, but there is not enough time to do
      the full benchmarking to pick one of the right options - so picking the
      one that favors stability for right now.
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      
      * tag 'stable/for-linus-fixes-3.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
        xen/pat: Disable PAT support for now.
        xen/setup: Remove redundant filtering of PTE masks.
      500dd237
    • L
      Merge tag 'for-linus' of git://github.com/rustyrussell/linux · f6bd5798
      Linus Torvalds 提交于
      * tag 'for-linus' of git://github.com/rustyrussell/linux:
        mod/file2alias: make modpost compile on darwin again
      f6bd5798
    • I
      autofs4 - update MAINTAINERS mailing list entry · f694fc97
      Ian Kent 提交于
      The autofs mailing list has moved to vger.kernel.org.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f694fc97
    • A
      mod/file2alias: make modpost compile on darwin again · dd2a3aca
      Andreas Bießmann 提交于
      commit e49ce141 breaks cross compiling
      the linux kernel on darwin hosts.
      This fix introduce some minimal glue to adopt linker section handling
      for darwin hosts.
      Signed-off-by: NAndreas Bießmann <andreas@biessmann.de>
      CC: Rusty Russell <rusty@rustcorp.com.au>
      CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      CC: Jochen Friedrich <jochen@scram.de>
      CC: Samuel Ortiz <sameo@linux.intel.com>
      CC: "K. Y. Srinivasan" <kys@microsoft.com>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Tested-by: NBernhard Walle <bernhard@bwalle.de>
      dd2a3aca
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 203738e5
      Linus Torvalds 提交于
      1) ICMP sockets leave err uninitialized but we try to return it for the
         unsupported MSG_OOB case, reported by Dave Jones.
      
      2) Add new Zaurus device ID entries, from Dave Jones.
      
      3) Pointer calculation in hso driver memset is wrong, from Dan
         Carpenter.
      
      4) ks8851_probe() checks unsigned value as negative, fix also from Dan
         Carpenter.
      
      5) Fix crashes in atl1c driver due to TX queue handling, from Eric
         Dumazet.  I anticipate some TX side locking fixes coming in the near
         future for this driver as well.
      
      6) The inline directive fix in Bluetooth which was breaking the build
         only with very new versions of GCC, from Johan Hedberg.
      
      7) Fix crashes in the ATP CLIP code due to ARP cleanups this merge
         window, reported by Meelis Roos and fixed by Eric Dumazet.
      
      8) JME driver doesn't flush RX FIFO correctly, from Guo-Fu Tseng.
      
      9) Some ip6_route_output() callers test the return value for NULL, but
         this never happens as the convention is to return a dst entry with
         dst->error set.  Fixes from RonQing Li.
      
      10) Logitech Harmony 900 should be handled by zaurus driver not
         cdc_ether, update white lists and black lists accordingly.  From
         Scott Talbert.
      
      11) Receiving from certain kinds of devices there won't be a MAC header,
         so there is no MAC header to fixup in the IPSEC code, and if we try
         to do it we'll crash.  Fix from Eric Dumazet.
      
      12) Port type array indexing off-by-one in mlx4 driver, fix from Yevgeny
         Petrilin.
      
      13) Fix regression in link-down handling in davinci_emac which causes
         all RX descriptors to be freed up and therefore RX to wedge
         completely, from Christian Riesch.
      
      14) It took two attempts, but ctnetlink soft lockups seem to be
         cured now, from Pablo Neira Ayuso.
      
      15) Endianness bug fix in ENIC driver, from Santosh Nayak.
      
      16) The long ago conversion of the PPP fragmentation code over to
         abstracted SKB list handling wasn't perfect, once we get an
         out of sequence SKB we don't flush the rest of them like we
         should.  From Ben McKeegan.
      
      17) Fix regression of ->ip_summed initialization in sfc driver.
         From Ben Hutchings.
      
      18) Bluetooth timeout mistakenly using msecs instead of jiffies,
         from Andrzej Kaczmarek.
      
      19) Using _sync variant of work cancellation results in deadlocks,
         use the non _sync variants instead.  From Andre Guedes.
      
      20) Bluetooth rfcomm code had reference counting problems leading
         to crashes, fix from Octavian Purdila.
      
      21) The conversion of netem over to classful qdisc handling added
         two bugs to netem_dequeue(), fixes from Eric Dumazet.
      
      22) Missing pci_iounmap() in ATM Solos driver.  Fix from Julia Lawall.
      
      23) b44_pci_exit() should not have __exit tag since it's invoked from
         non-__exit code.  From Nikola Pajkovsky.
      
      24) The conversion of the neighbour hash tables over to RCU added a
         race, fixed here by adding the necessary reread of tbl->nht, fix
         from Michel Machado.
      
      25) When we added VF (virtual function) attributes for network device
         dumps, this potentially bloats up the size of the dump of one
         network device such that the dump size is too large for the buffer
         allocated by properly written netlink applications.
      
         In particular, if you add 255 VFs to a network device, parts of
         GLIBC stop working.
      
         To fix this, we add an attribute that is used to turn on these
         extended portions of the network device dump.  Sophisticaed
         applications like 'ip' that want to see this stuff  will be changed
         to set the attribute, whereas things like GLIBC that don't care
         about VFs simply will not, and therefore won't be busted by the
         mere presence of VFs on a network device.
      
         Thanks to the tireless work of Greg Rose on this fix.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (53 commits)
        sfc: Fix assignment of ip_summed for pre-allocated skbs
        ppp: fix 'ppp_mp_reconstruct bad seq' errors
        enic: Fix endianness bug.
        gre: fix spelling in comments
        netfilter: ctnetlink: fix soft lockup when netlink adds new entries (v2)
        Revert "netfilter: ctnetlink: fix soft lockup when netlink adds new entries"
        davinci_emac: Do not free all rx dma descriptors during init
        mlx4_core: Fixing array indexes when setting port types
        phy: IC+101G and PHY_HAS_INTERRUPT flag
        netdev/phy/icplus: Correct broken phy_init code
        ipsec: be careful of non existing mac headers
        Move Logitech Harmony 900 from cdc_ether to zaurus
        hso: memsetting wrong data in hso_get_count()
        netfilter: ip6_route_output() never returns NULL.
        ethernet/broadcom: ip6_route_output() never returns NULL.
        ipv6: ip6_route_output() never returns NULL.
        jme: Fix FIFO flush issue
        atm: clip: remove clip_tbl
        ipv4: ping: Fix recvmsg MSG_OOB error handling.
        rtnetlink: Fix problem with buffer allocation
        ...
      203738e5
    • L
      Fix autofs compile without CONFIG_COMPAT · 3c761ea0
      Linus Torvalds 提交于
      The autofs compat handling fix caused a compile failure when
      CONFIG_COMPAT isn't defined.
      
      Instead of adding random #ifdef'fery in autofs, let's just make the
      compat helpers earlier to use: without CONFIG_COMPAT, is_compat_task()
      just hardcodes to zero.
      
      We could probably do something similar for a number of other cases where
      we have #ifdef's in code, but this is the low-hanging fruit.
      Reported-and-tested-by: NAndreas Schwab <schwab@linux-m68k.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3c761ea0
  9. 26 2月, 2012 4 次提交
    • L
      Linux 3.3-rc5 · 6b21d18e
      Linus Torvalds 提交于
      6b21d18e
    • L
      Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging · 00b10ecf
      Linus Torvalds 提交于
      Couple of minor driver fixes.
      
      * tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
        hwmon: (max34440) Fix resetting temperature history
        hwmon: (f75375s) Fix register write order when setting fans to full speed
        hwmon: (ads1015) Fix file leak in probe function
        hwmon: (max6639) Fix PPR register initialization to set both channels
        hwmon: (max6639) Fix FAN_FROM_REG calculation
      00b10ecf
    • L
      Merge branch 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild · 1e73fde5
      Linus Torvalds 提交于
      three kbuild fixes for 3.3:
       - make deb-pkg symlink race fix.
       - make coccicheck fix.
       - Dropping the check for modutils.  This is not a regression, but
         allows the module-init-tools replacement kmod work with the 3.3
         kernel.
      
      * 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
        coccicheck: change handling of C={1,2} when M= is set
        builddeb: Don't create files in /tmp with predictable names
        kbuild: do not check for ancient modutils tools
      1e73fde5
    • I
      autofs: work around unhappy compat problem on x86-64 · a32744d4
      Ian Kent 提交于
      When the autofs protocol version 5 packet type was added in commit
      5c0a32fc ("autofs4: add new packet type for v5 communications"), it
      obvously tried quite hard to be word-size agnostic, and uses explicitly
      sized fields that are all correctly aligned.
      
      However, with the final "char name[NAME_MAX+1]" array at the end, the
      actual size of the structure ends up being not very well defined:
      because the struct isn't marked 'packed', doing a "sizeof()" on it will
      align the size of the struct up to the biggest alignment of the members
      it has.
      
      And despite all the members being the same, the alignment of them is
      different: a "__u64" has 4-byte alignment on x86-32, but native 8-byte
      alignment on x86-64.  And while 'NAME_MAX+1' ends up being a nice round
      number (256), the name[] array starts out a 4-byte aligned.
      
      End result: the "packed" size of the structure is 300 bytes: 4-byte, but
      not 8-byte aligned.
      
      As a result, despite all the fields being in the same place on all
      architectures, sizeof() will round up that size to 304 bytes on
      architectures that have 8-byte alignment for u64.
      
      Note that this is *not* a problem for 32-bit compat mode on POWER, since
      there __u64 is 8-byte aligned even in 32-bit mode.  But on x86, 32-bit
      and 64-bit alignment is different for 64-bit entities, and as a result
      the structure that has exactly the same layout has different sizes.
      
      So on x86-64, but no other architecture, we will just subtract 4 from
      the size of the structure when running in a compat task.  That way we
      will write the properly sized packet that user mode expects.
      
      Not pretty.  Sadly, this very subtle, and unnecessary, size difference
      has been encoded in user space that wants to read packets of *exactly*
      the right size, and will refuse to touch anything else.
      Reported-and-tested-by: NThomas Meyer <thomas@m3y3r.de>
      Signed-off-by: NIan Kent <raven@themaw.net>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a32744d4
  10. 25 2月, 2012 2 次提交