-
由 Jesper Nilsson 提交于
- Slight tweaks, use $acr + addoq to propagate carry across the loop boundary. - Better use of latency cycles. - Remove duplicate folding of carry, it is not needed.
41f9412b
- Slight tweaks, use $acr + addoq to propagate carry across the loop boundary. - Better use of latency cycles. - Remove duplicate folding of carry, it is not needed.