Supplementary table: benchmarking methods on CATHS20. We benchmark against HHblits, DIAMOND, MMseqs2, Foldseek, and ProtTucker.
We first created an HHblits database using MMseqs2, and then ran HHblits with the following command: ‘hhblits -i {input_seq} -o {write_path} -n 1 -d searchMsa --format-output "query,target,evalue,qaln,taln"’. We used DIAMOND v2.0.14 with the following command: ‘./diamond blastp -d reference -q cath-dataset-nonredundant-S20.fa -o matches.tsv --ultra-sensitive’. We ran MMseqs2 with the following commands: ‘mmseqs search -e 1000 -s 7.5 --num-iterations 3 --cov-mode 0 mmtargetdb mmtargetdb alnRes.m8 tmp’ and ‘mmseqs search -e 0.01 -s 7 mmtargetdb mmtargetdb alnRes.m8 tmp’. Lastly, we ran Foldseek with the following command: ‘foldseek easy-search targetDB targetDB aln.m8 tmpFolder --alignment-type 1 -e 1000 -s 9.5’.
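
The tabular hit files written by the commands above (for example aln.m8 or matches.tsv) could be post-processed along the following lines. This is a minimal Python sketch, not the evaluation code used for the tables. It assumes the default 12-column BLAST-style tabular layout (query, target, percent identity, alignment length, mismatches, gap opens, query start, query end, target start, target end, E-value, bit score) and assumes that self-hits should be discarded because the database is searched against itself. The function name top_hits and the skip_self parameter are hypothetical.

```python
import csv
from typing import Dict, Iterable, Tuple


def top_hits(lines: Iterable[str], skip_self: bool = True) -> Dict[str, Tuple[str, float]]:
    """Return the highest-bit-score target for each query.

    `lines` is any iterable of tab-separated rows, e.g. an open file handle.
    Assumed layout: query in column 1, target in column 2, bit score in column 12.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < 12:
            continue  # skip blank or malformed rows
        query, target = row[0], row[1]
        if skip_self and query == target:
            continue  # all-vs-all search: ignore a domain matching itself
        bits = float(row[11])
        if query not in best or bits > best[query][1]:
            best[query] = (target, bits)
    return best
```

With a file on disk, this could be called as top_hits(open("aln.m8")); the top hit per query can then be compared against the query's annotated class to score a method.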

Supplementary table: benchmarking methods on the ProtTucker benchmark dataset. We benchmark against HHblits, DIAMOND, MMseqs2, Foldseek, and ProtTucker. We ran MMseqs2 with the following command: ‘mmseqs search -e 1000 -s 7.5 --num-iterations 3 --cov-mode 0 mmquerydb mmtargetdb alnRes.m8 tmp’, and Foldseek with the following command: ‘foldseek easy-search pdbs_prottucker_test219/ targetDB aln_option1.m8 tmpFolder --alignment-type 1 -e 1000 -s 9.5’.
