The PP-Helixfold model's inference is tested on a single NVIDIA A100 (40G) card with batch size 1. To reproduce the results reported in our paper, the specific environment settings below are required; a quick version-check sketch follows the list.
* Python: 3.7
* CUDA 11.6
* CUDNN 8.4.0
* NCCL 2.14.3
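
To confirm that your installed PaddlePaddle build was compiled against the expected CUDA and cuDNN versions, a minimal check such as the following can help. It uses only the standard `paddle.version` utilities and assumes a GPU build of PaddlePaddle is installed.

```python
# Minimal sketch: report the CUDA/cuDNN toolkit versions the installed
# PaddlePaddle wheel was compiled against (expected: CUDA 11.6, cuDNN 8.4.0).
import paddle

print("Paddle:", paddle.__version__)
print("Compiled with CUDA:", paddle.device.is_compiled_with_cuda())
print("CUDA:", paddle.version.cuda())     # expect "11.6"
print("cuDNN:", paddle.version.cudnn())   # expect "8.4.0"
```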
### 1.2 Datasets
For training, the PP-Helixfold model uses 25% of samples from RCSB PDB and 75% of self-distillation samples. For evaluation, we collect 87 domain targets from CASP14 and 60 protein targets from CAMEO, released between 2022-08-01 and 2022-08-31.
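
The 25%/75% split can be read as a per-example sampling probability. The sketch below illustrates one way such a mixture could be drawn; the function name and dataset arguments are hypothetical and not part of the PP-Helixfold codebase.

```python
import random

# Hypothetical sketch of the training mixture described above:
# each training example is drawn from RCSB PDB with probability 0.25
# and from the self-distillation set with probability 0.75.
def sample_training_example(pdb_samples, distill_samples, p_pdb=0.25):
    pool = pdb_samples if random.random() < p_pdb else distill_samples
    return random.choice(pool)
```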
### 1.3 Performance
Compared with the computational performance of AlphaFold2 reported in its paper and of OpenFold implemented in PyTorch, PP-Helixfold reduces the training time from about 11 days to 5.12 days, and it can be further reduced to only 2.89 days with hybrid parallelism. Trained from scratch, PP-Helixfold achieves accuracy competitive with AlphaFold2.
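
For context, data parallelism in PaddlePaddle replicates the model across GPUs and synchronizes gradients, while hybrid parallelism additionally partitions work inside the model. The snippet below is only a generic data-parallel sketch under that assumption, not the PP-Helixfold training script.

```python
# Generic data-parallel setup in PaddlePaddle (a sketch, not the actual
# PP-Helixfold trainer). Run under the distributed launcher, e.g.:
#   python -m paddle.distributed.launch --gpus "0,1,2,3" train.py
import paddle
import paddle.distributed as dist

dist.init_parallel_env()            # initialize the multi-GPU process group
model = paddle.nn.Linear(8, 8)      # stand-in for the real folding model
model = paddle.DataParallel(model)  # replicate model, all-reduce gradients
```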