## 1. Inference Benchmark ### 1.1 Environment The PP-Helixfold model's inference test is tested on single-card NVIDIA A100 (40G), batch size=1. To reproduce the results reported in our paper, specific environment settings are required as below. * Python: 3.7 * CUDA 11.2 * CUDNN 8.10.1 * NCCL 2.12.12. ### 1.2 Datasets For training, the PP-Helixfold model uses 25% of samples from RCSB PDB and 75% of self-distillation samples. For evaluation, we collect 87 domain targets from CASP14 and 371 protein targets from CAMEO, ranging from 2021-09-04 to 2022-02-19. ### 1.3 Performance Compared with the computational performance of AlphaFold2 reported in the paper and OpenFold implemented through PyTorch, PP-Helixfold reduces the training time from about 11 days to 7.5 days, and it can be further reduced to only 5.3 days when using hybrid parallelism. Training PP-Helixfold from scratch can achieve competitive accuracy with AlphaFold2. ![](https://github.com/PaddlePaddle/PaddleHelix/blob/dev/.github/HelixFold_computational_performance.png) ![](https://github.com/PaddlePaddle/PaddleHelix/blob/dev/.github/HelixFold_accuracy.png) ## 2. Reference Ref: https://github.com/PaddlePaddle/PaddleHelix/tree/dev/apps/protein_folding/helixfold