The Alphafold Module
Alphafold is available to use as a module. Use the following command to load the latest version of Alphafold:
module load cuda module load alphafold
This will load the Alphafold module and also CUDA driver to ensure GPU is used for accelerating the prediction.
This package is still undergoing testing. Please help report issues and slow predictions to the service-now portal.
AlphaFold needs multiple genetic (sequence) databases to run:
PDB structures in the mmCIF format
PDB seqres – only for AlphaFold-Multimer
UniProt – only for AlphaFold-Multimer
Which are stored on the flash drive and accessible to all users at:
Ensure the protein sequence (.fasta) you’d like to perform prediction on has been downloaded to your JADE home directory.
Once the Alphafold module has been loaded, the script
run_alphafold.sh can be used to run the prediction. For example, if trying to run a sequence located at
run_alphafold.sh -d /jmain02/flash/share/datasets/GeneticDB -o predictions/ -f ~/my_protein.fasta -t 2023-02-03
Explanation of the parameters for the above command:
-d - Location of the genetic databases, on JADE it is stored at
-o - Path to a directory that will store the results.
-f - Path to a FASTA file containing sequence. If a FASTA file contains multiple sequences, then it will be folded as a multimer.
-t - Maximum template release date to consider (ISO-8601 format - i.e. YYYY-MM-DD). Important if folding historical test sets.
run_alphafold.sh --help for explanations on all the parameters that can be used.
A folder of the protein file name is created inside the output directory e.g.
predictions/my_protein/ for this example. Explanations of outputs can be found at on the official repository.