1. 14 Aug, 2008 19 commits
  2. 13 Aug, 2008 21 commits
    • crypto: padlock - fix VIA PadLock instruction usage with irq_ts_save/restore() · e4914012
      Suresh Siddha committed
      Wolfgang Walter reported this oops on his VIA C3 using padlock for
      AES-encryption:
      
      ##################################################################
      
      BUG: unable to handle kernel NULL pointer dereference at 000001f0
      IP: [<c01028c5>] __switch_to+0x30/0x117
      *pde = 00000000
      Oops: 0002 [#1] PREEMPT
      Modules linked in:
      
      Pid: 2071, comm: sleep Not tainted (2.6.26 #11)
      EIP: 0060:[<c01028c5>] EFLAGS: 00010002 CPU: 0
      EIP is at __switch_to+0x30/0x117
      EAX: 00000000 EBX: c0493300 ECX: dc48dd00 EDX: c0493300
      ESI: dc48dd00 EDI: c0493530 EBP: c04cff8c ESP: c04cff7c
       DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
      Process sleep (pid: 2071, ti=c04ce000 task=dc48dd00 task.ti=d2fe6000)
      Stack: dc48df30 c0493300 00000000 00000000 d2fe7f44 c03b5b43 c04cffc8 00000046
             c0131856 0000005a dc472d3c c0493300 c0493470 d983ae00 00002696 00000000
             c0239f54 00000000 c04c4000 c04cffd8 c01025fe c04f3740 00049800 c04cffe0
      Call Trace:
       [<c03b5b43>] ? schedule+0x285/0x2ff
       [<c0131856>] ? pm_qos_requirement+0x3c/0x53
       [<c0239f54>] ? acpi_processor_idle+0x0/0x434
       [<c01025fe>] ? cpu_idle+0x73/0x7f
       [<c03a4dcd>] ? rest_init+0x61/0x63
       =======================
      
      Wolfgang also found out that adding kernel_fpu_begin() and kernel_fpu_end()
      around the padlock instructions fixes the oops.
      
      Suresh wrote:
      
      Although these padlock instructions don't use/touch SSE registers, they behave
      similarly to other SSE instructions. For example, they might cause DNA faults
      when cr0.ts is set. While this is a spurious DNA trap, it might cause an
      oops with the recent fpu code changes.
      
      This is the code sequence that is probably causing this problem:
      
      a) new app is getting exec'd and it is somewhere in between
         start_thread() and flush_old_exec() in the load_xyz_binary()
      
      b) At point "a", task's fpu state (like TS_USEDFPU, used_math() etc) is
         cleared.
      
      c) Now we get an interrupt/softirq which starts using these encrypt/decrypt
         routines in the network stack. This generates a math fault (as
         cr0.ts is '1') which sets TS_USEDFPU and restores the math that is
         in the task's xstate.
      
      d) Return to exec code path, which does start_thread() which does
         free_thread_xstate() and sets xstate pointer to NULL while
         the TS_USEDFPU is still set.
      
      e) At the next context switch from the new exec'd task to another task,
         we have a scenario where TS_USEDFPU is set but the xstate pointer is NULL.
         This can cause an oops during unlazy_fpu() in __switch_to()
      
      Now:
      
      1) This should happen with or without pre-emption. Viro also encountered
         a similar problem without CONFIG_PREEMPT.
      
      2) kernel_fpu_begin() and kernel_fpu_end() will fix this problem, because
         kernel_fpu_begin() will manually do a clts() and won't run into the
         situation of setting TS_USEDFPU in step "c" above.
      
      3) This was working before the fpu changes, because it's a spurious
         math fault which doesn't corrupt any fpu/sse registers and the task's
         math state was always in an allocated state.
      
      Without the recent lazy fpu allocation changes, while we don't see an oops,
      there is a possible race still present in older kernels (for example,
      while the kernel is using kernel_fpu_begin() in some optimized clear/copy
      page and an interrupt/softirq happens which uses these padlock
      instructions, generating a DNA fault).
      
      This is the failing scenario that existed even before the lazy fpu allocation
      changes:
      
      0. CPU's TS flag is set
      
      1. The kernel is using the FPU in some optimized copy routine and, while doing
      kernel_fpu_begin(), takes an interrupt just before doing clts()
      
      2. In that interrupt, ipsec uses a padlock instruction, and we
      take a DNA fault as the TS flag is still set.
      
      3. We handle the DNA fault and set TS_USEDFPU and clear cr0.ts
      
      4. We complete the padlock routine
      
      5. Go back to step-1, which resumes clts() in kernel_fpu_begin(), finishes
      the optimized copy routine and does kernel_fpu_end(). At this point,
      we have cr0.ts again set to '1' but the task's TS_USEDFPU is still
      set and not cleared.
      
      6. Now the kernel resumes its user operation. And at the next context
      switch, the kernel sees it has to do an FP save as TS_USEDFPU is still set
      and then will do an unlazy_fpu() in __switch_to(). unlazy_fpu()
      will take a DNA fault, as cr0.ts is '1' and now, because we are
      in __switch_to(), math_state_restore() will get confused and will
      restore the next task's FP state and will save it in the prev task's FP state.
      Remember, in __switch_to() we are already on the stack of the next task
      but take a DNA fault for the prev task.
      
      This causes the fpu leakage.
      
      Fix the padlock instruction usage by calling them inside the
      context of new routines irq_ts_save/restore(), which clear/restore cr0.ts
      manually in the interrupt context. This will not generate a spurious DNA
      fault in the context of the interrupt, which will fix the oops encountered and
      the possible FPU leakage issue.
      Reported-and-bisected-by: Wolfgang Walter <wolfgang.walter@stwm.de>
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
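The save/restore idea described in this commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct model_cpu` and every `model_*` function are hypothetical names invented for illustration, the flags are plain booleans rather than real CPU or task state, and the early return in process context is an assumption about how the routine treats non-interrupt callers. It is not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_cpu {
    bool cr0_ts;       /* models the cr0.ts flag */
    bool in_interrupt; /* models running in interrupt/softirq context */
    bool ts_usedfpu;   /* models the task's TS_USEDFPU flag */
    int dna_faults;    /* number of device-not-available faults taken */
};

/* Models an instruction that faults when cr0.ts is set, like the padlock ops. */
static void model_padlock_op(struct model_cpu *cpu)
{
    if (cpu->cr0_ts) {
        /* Modelled DNA fault handler: clears cr0.ts and marks the FPU as used. */
        cpu->dna_faults++;
        cpu->cr0_ts = false;
        cpu->ts_usedfpu = true;
    }
}

/* Clear TS when in interrupt context; report whether it must be set again. */
static bool model_irq_ts_save(struct model_cpu *cpu)
{
    if (!cpu->in_interrupt)
        return false; /* assumption: process context is left untouched */
    if (cpu->cr0_ts) {
        cpu->cr0_ts = false;
        return true;
    }
    return false;
}

/* Set TS again if the matching save call cleared it. */
static void model_irq_ts_restore(struct model_cpu *cpu, bool ts_was_set)
{
    if (ts_was_set)
        cpu->cr0_ts = true;
}

/* Runs the modelled padlock op bracketed by save/restore. */
static void model_padlock_op_guarded(struct model_cpu *cpu)
{
    bool ts_before = cpu->cr0_ts;
    bool ts_was_set = model_irq_ts_save(cpu);

    model_padlock_op(cpu);
    model_irq_ts_restore(cpu, ts_was_set);

    /* In interrupt context the guarded op must leave cr0.ts as it found it. */
    assert(!cpu->in_interrupt || cpu->cr0_ts == ts_before);
}
```

In this model, calling `model_padlock_op()` directly from interrupt context with TS set takes a fault and sets `ts_usedfpu`, which is the state change that steps "c" and 3 above describe. The guarded variant takes no fault and leaves both flags as they were.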
    • crypto: hash - Add missing top-level functions · 318e5313
      Herbert Xu committed
      The top-level functions init/update/final were missing for ahash.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: hash - Fix digest size check for digest type · dbaaba1d
      Herbert Xu committed
      The changeset ca786dc7
      
      	crypto: hash - Fixed digest size check
      
      missed one spot for the digest type.  This patch corrects that
      error.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: tcrypt - Fix AEAD chunk testing · f176e632
      Herbert Xu committed
      My changeset 4b22f0dd
      
      	crypto: tcrypt - Remove unnecessary kmap/kunmap calls
      
      introduced a typo that broke AEAD chunk testing.  In particular,
      axbuf should really be xbuf.
      
      There is also an issue with testing the last segment when encrypting.
      The additional part produced by AEAD wasn't tested.  Similarly, on
      decryption the additional part of the AEAD input is mistaken for
      corruption.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: talitos - Add handling for SEC 3.x treatment of link table · f3c85bc1
      Lee Nipper committed
      Later SEC revisions require the link table (used for scatter/gather)
      to have an extra entry to account for the total length in descriptor [4],
      which contains cipher input and ICV.
      This only applies to decrypt, not encrypt.
      Without this change, on 837x, a gather return/length error results
      when a decryption uses a link table to gather the fragments.
      This is observed by doing a ping with size of 1447 or larger with AES,
      or a ping with size 1455 or larger with 3DES.
      
      So, add a check for SEC compatible "fsl,3.0" for using the extra link table entry.
      Signed-off-by: Lee Nipper <lee.nipper@freescale.com>
      Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • [XFS] Fix use after free in xfs_log_done(). · c6a7b0f8
      Lachlan McIlroy committed
      The ticket allocation code got reworked in 2.6.26 and we now free tickets
      whereas before we used to cache them so the use-after-free went
      undetected.
      
      SGI-PV: 985525
      
      SGI-Modid: xfs-linux-melb:xfs-kern:31877a
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
      Signed-off-by: David Chinner <david@fromorbit.com>
    • [XFS] Make xfs_bmap_*_count_leaves void. · c94312de
      Ruben Porras committed
      xfs_bmap_count_leaves and xfs_bmap_disk_count_leaves always return 0,
      so make them void.
      
      SGI-PV: 981498
      
      SGI-Modid: xfs-linux-melb:xfs-kern:31844a
      Signed-off-by: Ruben Porras <ruben.porras@linworks.de>
      Signed-off-by: Donald Douwsma <donaldd@sgi.com>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
    • [XFS] Use KM_NOFS for debug trace buffers · 5695ef46
      Lachlan McIlroy committed
      Use KM_NOFS to prevent recursion back into the filesystem which can cause
      deadlocks.
      
      In the case of xfs_iread() we hold the lock on the inode cluster buffer
      while allocating memory for the trace buffers. If we recurse back into XFS
      to flush data, that may require a transaction to allocate extents, which
      needs log space. This can deadlock with the xfsaild thread, which can't
      push the tail of the log because it is trying to get the inode cluster
      buffer lock.
      
      SGI-PV: 981498
      
      SGI-Modid: xfs-linux-melb:xfs-kern:31838a
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
      Signed-off-by: David Chinner <david@fromorbit.com>
    • [XFS] use KM_MAYFAIL in xfs_mountfs · d62c251f
      Christoph Hellwig committed
      Use KM_MAYFAIL for the m_perag allocation; we can deal with the error
      easily, and blocking forever during mount is not a good idea either.
      
      SGI-PV: 981498
      
      SGI-Modid: xfs-linux-melb:xfs-kern:31837a
      Signed-off-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
    • [XFS] refactor xfs_mount_free · ff4f038c
      Christoph Hellwig committed
      xfs_mount_free mostly frees the perag data, which is something that is
      duplicated in the mount error path.
      
      Move the XFS_QM_DONE call to the caller and remove the useless
      mutex_destroy/spinlock_destroy calls so that we can re-use it for the
      mount error path. Also rename it to xfs_free_perag to reflect what it
      does.
      
      SGI-PV: 981498
      
      SGI-Modid: xfs-linux-melb:xfs-kern:31836a
      Signed-off-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
    • [XFS] don't call xfs_freesb from xfs_unmountfs · 6203300e
      Christoph Hellwig committed
      xfs_readsb is called before xfs_mount so xfs_freesb should be called after
      xfs_unmountfs, too. This means it now happens after a few things during
      xfs_unmount, none of which have anything to do with the superblock.
      
      SGI-PV: 981498
      
      SGI-Modid: xfs-linux-melb:xfs-kern:31835a
      Signed-off-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
    • [XFS] xfs_unmountfs should return void · 41b5c2e7
      Christoph Hellwig committed
      xfs_unmountfs can't and shouldn't return errors, so declare it as returning
      void.
      
      SGI-PV: 981498
      
      SGI-Modid: xfs-linux-melb:xfs-kern:31833a
      Signed-off-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
    • [XFS] cleanup xfs_mountfs · 4249023a
      Christoph Hellwig committed
      Remove all the useless flags and the code keyed off them in xfs_mountfs.
      
      SGI-PV: 981498
      
      SGI-Modid: xfs-linux-melb:xfs-kern:31831a
      Signed-off-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
    • [XFS] move root inode IRELE into xfs_unmountfs · 77508ec8
      Christoph Hellwig committed
      The root inode is allocated in xfs_mountfs so it should be released in
      xfs_unmountfs. For the unmount case that means we do it after the
      xfs_sync(mp, SYNC_WAIT | SYNC_CLOSE) in the forced shutdown case and the
      dmapi unmount event. Note that both reference the rip variable which might
      be freed by that time in case inode flushing has kicked in, so strictly
      speaking this might count as a bug fix.
      
      SGI-PV: 981498
      
      SGI-Modid: xfs-linux-melb:xfs-kern:31830a
      Signed-off-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
    • [XFS] stop using file_update_time · 3a76c1ea
      Christoph Hellwig committed
      xfs_ichgtime updates the xfs_inode and Linux inode timestamps just fine, so
      there is no need to call file_update_time and then copy the values over to
      the XFS inode. The only additional things in file_update_time are checks not
      applicable to the write path.
      
      SGI-PV: 981498
      
      SGI-Modid: xfs-linux-melb:xfs-kern:31829a
      Signed-off-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
      Signed-off-by: David Chinner <david@fromorbit.com>
    • [XFS] optimize xfs_ichgtime · 8e5975c8
      Christoph Hellwig committed
      Port a little optimization from file_update_time to xfs_ichgtime, and only
      update the timestamp and mark the inode dirty if the timestamp actually
      changes in the timer tick resolution supported by the running kernel.
      
      SGI-PV: 981498
      
      SGI-Modid: xfs-linux-melb:xfs-kern:31827a
      Signed-off-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
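The optimization described in this commit message, skipping the update when the timestamp has not changed at the supported resolution, can be sketched as a small user-space model. Everything here is an assumption made for illustration: `struct model_time`, `struct model_inode`, and the `model_*` functions are hypothetical, and the granularity is passed in as a plain nanosecond count rather than read from a running kernel. This is not the XFS code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical timestamp type: seconds plus nanoseconds. */
struct model_time {
    long sec;
    long nsec;
};

/* Hypothetical stand-in for an inode; not the XFS or VFS structure. */
struct model_inode {
    struct model_time mtime;
    bool dirty;
};

/* Returns true if both timestamps are identical. */
static bool model_time_equal(struct model_time a, struct model_time b)
{
    return a.sec == b.sec && a.nsec == b.nsec;
}

/* Round the nanosecond part down to a multiple of gran_ns. */
static struct model_time model_time_truncate(struct model_time t, long gran_ns)
{
    assert(gran_ns > 0);
    t.nsec -= t.nsec % gran_ns;
    return t;
}

/*
 * Store the truncated time and mark the inode dirty only if the stored
 * timestamp would actually change. Returns true when an update happened.
 */
static bool model_update_mtime(struct model_inode *ip, struct model_time now,
                               long gran_ns)
{
    struct model_time tv = model_time_truncate(now, gran_ns);

    if (model_time_equal(ip->mtime, tv))
        return false; /* same tick: skip the store and the dirtying */

    ip->mtime = tv;
    ip->dirty = true;
    return true;
}
```

With a 4 ms granularity, two calls whose times fall in the same 4 ms window dirty the inode at most once, which is the saving the message refers to.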
    • [XFS] update timestamp in xfs_ialloc manually · dff35fd4
      Christoph Hellwig committed
      In xfs_ialloc we just want to set all timestamps to the current time. We
      don't need to mark the inode dirty like xfs_ichgtime does, and we neither
      need nor want the optimizations in xfs_ichgtime that I will introduce in
      the next patch.
      
      So just opencode the timestamp update in xfs_ialloc, and remove the now
      unused XFS_ICHGTIME_ACC case in xfs_ichgtime.
      
      SGI-PV: 981498
      
      SGI-Modid: xfs-linux-melb:xfs-kern:31825a
      Signed-off-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
    • [XFS] remove the sema_t from XFS. · ab4a9b04
      David Chinner committed
      Now that all users of the sema_t are gone from XFS we can finally kill it.
      
      SGI-PV: 981498
      
      SGI-Modid: xfs-linux-melb:xfs-kern:31823a
      Signed-off-by: David Chinner <david@fromorbit.com>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
    • [XFS] replace dquot flush semaphore with a completion · e1f49cf2
      David Chinner committed
      Use the new completion flush code to implement the dquot flush lock.
      Removes one of the final users of semaphores in the XFS code base.
      
      SGI-PV: 981498
      
      SGI-Modid: xfs-linux-melb:xfs-kern:31822a
      Signed-off-by: David Chinner <david@fromorbit.com>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
    • [XFS] replace inode flush semaphore with a completion · c63942d3
      David Chinner committed
      Use the new completion flush code to implement the inode flush lock.
      Removes one of the final users of semaphores in the XFS code base.
      
      SGI-PV: 981498
      
      SGI-Modid: xfs-linux-melb:xfs-kern:31817a
      Signed-off-by: David Chinner <david@fromorbit.com>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
    • [XFS] extend completions to provide XFS object flush requirements · 39d2f1ab
      David Chinner committed
      XFS object flushing doesn't quite match existing completion semantics.  It
      mixes exclusive access with completion.  That is, we need to mark an object as
      being flushed before flushing it to disk, and then block any other attempt to
      flush it until the completion occurs.  We do this by adding an extra count to
      the completion before we start using them.  However, we still need to
      determine if there is a completion in progress, and allow non-blocking attempts
      for completions to decrement the count.
      
      To do this we introduce:
      
      int try_wait_for_completion(struct completion *x)
      	returns a failure status if done == 0, otherwise decrements done
      	to zero and returns a "started" status. This is provided
      	to allow counted completions to begin safely while holding
      	object locks in inverted order.
      
      int completion_done(struct completion *x)
      	returns 1 if there is no waiter, 0 if there is a waiter
      	(i.e. a completion in progress).
      
      This replaces the use of semaphores for providing this exclusion
      and completion mechanism.
      
      SGI-PV: 981498
      
      SGI-Modid: xfs-linux-melb:xfs-kern:31816a
      Signed-off-by: David Chinner <david@fromorbit.com>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
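The way a counted completion can double as a flush lock, as the last commit message describes, can be illustrated with a single-threaded user-space model. This is a sketch under stated assumptions: `struct model_completion` holds only a counter (no wait queue and no locking), the `model_*` names are hypothetical, and the `model_flush_*` wrappers are invented here to show the "extra count" idea. The return values follow the description in the message, where a zero count is treated as a flush in progress.

```c
#include <assert.h>
#include <stdbool.h>

/* Single-threaded stand-in for a completion: just the counter. */
struct model_completion {
    unsigned int done;
};

/* Models signalling one completion. */
static void model_complete(struct model_completion *x)
{
    x->done++;
}

/* Fails if done == 0; otherwise consumes one count and succeeds. */
static bool model_try_wait_for_completion(struct model_completion *x)
{
    if (x->done == 0)
        return false;
    x->done--;
    return true;
}

/* True when nothing is in progress, i.e. the count is non-zero. */
static bool model_completion_done(const struct model_completion *x)
{
    return x->done != 0;
}

/* Flush lock built on the model: start with one extra count ("unlocked"). */
static void model_flush_lock_init(struct model_completion *x)
{
    x->done = 0;
    model_complete(x);
}

/* Non-blocking attempt to become the flusher. */
static bool model_flush_trylock(struct model_completion *x)
{
    return model_try_wait_for_completion(x);
}

/* True while some flush holds the lock. */
static bool model_flush_is_locked(const struct model_completion *x)
{
    return !model_completion_done(x);
}

/* The flush finished: give the count back. */
static void model_flush_unlock(struct model_completion *x)
{
    assert(x->done == 0); /* must currently be held */
    model_complete(x);
}
```

A second `model_flush_trylock()` on a held lock fails instead of blocking, which corresponds to the non-blocking attempt the message asks for. A real implementation would also need a blocking wait and proper locking, both omitted here.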