Metadata-Version: 2.4
Name: af3parallel
Version: 1.1.0
Summary: Profile-driven multi-GPU toolkit for large-scale AlphaFold 3 inference
Author: AF3Parallel contributors
License: MIT
Project-URL: Homepage, https://github.com/Xin-DongXu/AF3Parallel
Project-URL: Repository, https://github.com/Xin-DongXu/AF3Parallel
Project-URL: Issues, https://github.com/Xin-DongXu/AF3Parallel/issues
Project-URL: Documentation, https://github.com/Xin-DongXu/AF3Parallel/tree/main/docs
Keywords: alphafold3,protein-structure,gpu-scheduling,bioinformatics,singularity
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: extras
Requires-Dist: psutil>=5.9; extra == "extras"
Requires-Dist: rdkit>=2022.9; extra == "extras"
Dynamic: license-file

# AF3Parallel

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
[![Python](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
[![AlphaFold 3](https://img.shields.io/badge/AlphaFold_3-v3.0.1-green.svg)](https://github.com/google-deepmind/alphafold3)

Profile-driven toolkit for running [AlphaFold 3](https://github.com/google-deepmind/alphafold3) inference at scale on multi-GPU Linux clusters.

AF3Parallel wraps the official AF3 Singularity workflow with VRAM-aware scheduling, temporal-wave batching, and companion utilities for profiling, runtime estimation, and input JSON preparation. Install with **pip** or **conda**, then invoke all tools through a single CLI — no manual script copying required.

---

## What's included

| Tool | CLI command | Purpose |
| --- | --- | --- |
| Multi-GPU executor | `af3parallel run` | Distribute AF3 jobs across GPUs with longest-processing-time-first (LPT) scheduling, VRAM-aware batching, and temporal-wave packing |
| Peak VRAM profiler | `af3parallel profile` | One-shot peak-memory scan → TSV profile for scheduling |
| Time-series profiler | `af3parallel profile-ts` | Sub-second VRAM sampling during AF3 runs |
| GPU runtime estimator | `af3parallel estimate-gpu` | Predict serial GPU wall time from a token profile |
| CPU/MSA estimator | `af3parallel estimate-cpu` | Predict data-pipeline wall time from a protein-length profile |
| JSON integrator | `af3parallel json` | Batch-edit AF3 inputs (seeds, ligands, nucleic acids, ions) |
| GPU monitor | `af3parallel monitor` | Standalone `nvidia-smi` memory logger |

Built-in VRAM/runtime profiles are **measured** on NVIDIA A800 80 GB and RTX 4090 24 GB; other GPUs require a one-time custom profile from `af3parallel profile`.
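
To make the idea of a peak-VRAM profile concrete, here is a minimal sketch of how per-GPU peak memory could be tracked from `nvidia-smi` output. It is not the code used by `af3parallel profile` or `af3parallel monitor`; the function names are hypothetical, and only the `nvidia-smi` query flags are standard.

```python
import subprocess
from typing import Dict

# Standard nvidia-smi query flags; everything else here is illustrative.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used",
    "--format=csv,noheader,nounits",
]


def parse_memory_used(csv_text: str) -> Dict[int, int]:
    """Parse 'index, memory.used' CSV rows into {gpu_index: used_MiB}."""
    usage: Dict[int, int] = {}
    for line in csv_text.strip().splitlines():
        index, used = (field.strip() for field in line.split(","))
        usage[int(index)] = int(used)
    return usage


def update_peaks(peaks: Dict[int, int], sample: Dict[int, int]) -> Dict[int, int]:
    """Fold one sample into the running per-GPU peak (MiB)."""
    for gpu, used in sample.items():
        peaks[gpu] = max(peaks.get(gpu, 0), used)
    return peaks


def sample_once() -> Dict[int, int]:
    """Run nvidia-smi once; requires an NVIDIA driver on the host."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_memory_used(result.stdout)
```

Calling `update_peaks(peaks, sample_once())` in a loop while a job runs leaves the highest observed usage per GPU in `peaks`. A shorter sampling interval catches brief spikes at the cost of more `nvidia-smi` invocations.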

---

## Installation

### Prerequisites

AF3Parallel is a wrapper around AlphaFold 3 — complete the [official AF3 v3.0.1 installation](https://github.com/google-deepmind/alphafold3/blob/v3.0.1/docs/installation.md) first (Singularity image, model weights, genetic databases). Full details: [docs/installation.md](docs/installation.md).

| Component | Required | Notes |
| --- | --- | --- |
| AlphaFold 3 v3.0.1 + Singularity | Yes | Run commands from your AF3 working directory |
| Linux + NVIDIA GPU (CC ≥ 8.0) | Yes | e.g. A100, H100, RTX 4090 |
| Python ≥ 3.8 | Yes | Core tools use only the standard library |
| `psutil` | Optional | Enables the automatic `--max-concurrent-tasks` cap in `af3parallel run` |
| `rdkit` | Optional | More accurate SMILES heavy-atom counts |

### pip (recommended)

```bash
pip install "af3parallel[extras]"

# from source
git clone https://github.com/Xin-DongXu/AF3Parallel.git
cd AF3Parallel
pip install -e ".[extras]"
```

Release guide: [docs/publishing.md](docs/publishing.md).

Verify:

```bash
af3parallel --version
af3parallel --help
```

### conda / mamba

```bash
conda install -c bioconda -c conda-forge af3parallel
```

If `af3parallel` is not yet available on Bioconda, use the pip install above; the recipe submission process is described in [docs/publishing.md](docs/publishing.md).

Development environment from source:

```bash
git clone https://github.com/Xin-DongXu/AF3Parallel.git
cd AF3Parallel
mamba env create -f environment.yml
conda activate af3parallel
```

### Legacy script wrappers

If you prefer the old `*.py` filenames inside your AF3 tree:

```bash
pip install -e /path/to/AF3Parallel
bash /path/to/AF3Parallel/tools/install-to-alphafold3.sh /path/to/alphafold3
python /path/to/alphafold3/AF3Parallel.py ...   # thin wrapper; requires pip install
```

---

## Quick start

All examples assume your current directory is the AF3 working tree (`alphafold3/`).

```bash
# 1. Profile once per GPU model (skip for built-in a800-80g / rtx4090 presets)
af3parallel profile \
    -i ./profile_inputs -o my_gpu_profile.tsv \
    --sif alphafold3.sif --af3-db ~/af3_DB --models ./models

# 2. (Optional) estimate batch runtime
af3parallel estimate-gpu \
    --input-dir ./af_input --profile my_gpu_profile.tsv \
    --output-tsv estimate_breakdown.tsv --workers 16

# 3. Run the batch across all GPUs
af3parallel run \
    -i ./af_input -o results.tsv --output-dir ./af_output \
    --sif alphafold3.sif --af3-db ~/af3_DB --models ./models \
    --gpus 0,1,2,3 --memory-profile my_gpu_profile.tsv
```

Dry-run the schedule before a large batch:

```bash
af3parallel run ... --test-only
```

Prepare inputs for a ligand screen:

```bash
af3parallel json replace-ligand \
    -i base.json --from-csv examples/ligands.csv \
    --output-dir ./outputs --workers 16
```

---

## Typical workflow

```
  AF3 input JSONs  ──►  af3parallel profile  ──►  TSV profile
         │                                              │
         │                                              │
         ├──►  af3parallel estimate-gpu/cpu             │
         │                                              ▼
         └──────────────────────────────►  af3parallel run  ──►  results.tsv
```

See [docs/workflow.md](docs/workflow.md) for the full pipeline.
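
As a rough illustration of the estimation step, serial GPU wall time can be approximated by looking up each job's token count in a measured profile and summing the results. The sketch below assumes a profile given as `(tokens, seconds)` pairs and uses piecewise-linear interpolation; both the data layout and the interpolation rule are assumptions for this example and may differ from what `af3parallel estimate-gpu` actually does.

```python
from bisect import bisect_left
from typing import Iterable, Sequence, Tuple

Profile = Sequence[Tuple[int, float]]  # (tokens, seconds), sorted by tokens


def interpolate_seconds(profile: Profile, tokens: int) -> float:
    """Piecewise-linear lookup; clamps to the endpoints outside the profile.

    Clamping will underestimate jobs larger than the biggest profiled size.
    """
    xs = [t for t, _ in profile]
    i = bisect_left(xs, tokens)
    if i == 0:
        return profile[0][1]
    if i == len(profile):
        return profile[-1][1]
    (x0, y0), (x1, y1) = profile[i - 1], profile[i]
    return y0 + (y1 - y0) * (tokens - x0) / (x1 - x0)


def serial_wall_time(profile: Profile, job_tokens: Iterable[int]) -> float:
    """Sum of per-job estimates, i.e. the time on a single GPU run back to back."""
    return sum(interpolate_seconds(profile, t) for t in job_tokens)
```

With a hypothetical profile of `[(256, 60.0), (512, 120.0), (1024, 300.0)]`, a 384-token job falls halfway between the first two points and is estimated at 90 seconds.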

---

## CLI reference

### Unified CLI

```bash
af3parallel <command> [arguments]
af3parallel run --help
af3parallel json set-seeds -i input.json -o output.json --seeds 1 2 3
python -m af3parallel --help          # equivalent
```

| Subcommand | Standalone alias | Replaces legacy script |
| --- | --- | --- |
| `run` | `af3parallel-run` | `AF3Parallel.py` |
| `profile` | `af3parallel-profile` | `AF3_GPU_Memory_Profiler.py` |
| `profile-ts` | `af3parallel-profile-ts` | `AF3_GPU_Memory_Time-Series_Profiler.py` |
| `estimate-gpu` | `af3parallel-estimate-gpu` | `AF3_GPU_time_estimate.py` |
| `estimate-cpu` | `af3parallel-estimate-cpu` | `AF3_CPU_time_estimate.py` |
| `json` | `af3parallel-json` | `AF3_JSON_Integrator.py` |
| `monitor` | `af3parallel-monitor` | `GPU_monitor.py` |

Full flag lists: [docs/cli-reference.md](docs/cli-reference.md) or `af3parallel <command> --help`.

---

## Features

- **Token-balanced LPT** distribution across multiple GPUs
- **VRAM-aware batching** with configurable safety margins
- **Temporal-wave scheduling** — pack small jobs into the VRAM shadow of long anchors
- **Built-in GPU profiles** for A800 80 GB and RTX 4090 24 GB (`--gpu-preset`)
- **Streaming TSV logs** written per task as jobs finish
- **Resilient execution** — per-task retry, SIGINT/SIGTERM JSON restore, optional strict mode
- **Global CPU-RAM cap** — auto-derived from system memory when `psutil` is installed
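
The token-balanced LPT idea from the list above can be sketched in a few lines: sort jobs from largest to smallest and always hand the next job to the GPU with the smallest total so far. This is the textbook longest-processing-time-first heuristic with token count standing in for runtime, which is an assumption of this example; the scheduler in `af3parallel run` also accounts for VRAM and temporal waves, which are not modelled here.

```python
import heapq
from typing import Dict, List, Tuple


def lpt_assign(job_tokens: Dict[str, int], gpus: List[int]) -> Dict[int, List[str]]:
    """Assign jobs to GPUs, largest first, always to the least-loaded GPU."""
    # Min-heap of (total tokens assigned, gpu id).
    heap: List[Tuple[int, int]] = [(0, gpu) for gpu in gpus]
    heapq.heapify(heap)
    plan: Dict[int, List[str]] = {gpu: [] for gpu in gpus}
    # Sort by descending token count; break ties by name for determinism.
    ordered = sorted(job_tokens.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, tokens in ordered:
        load, gpu = heapq.heappop(heap)
        plan[gpu].append(name)
        heapq.heappush(heap, (load + tokens, gpu))
    return plan
```

For jobs of 900, 700, 400, 300, and 100 tokens on two GPUs, this yields two queues of 1200 tokens each. Placing large jobs first matters because small jobs are then available to even out the remaining imbalance.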

---

## Repository layout

```
AF3Parallel/
├── README.md
├── LICENSE · CITATION.cff · pyproject.toml · requirements.txt
├── src/af3parallel/              # installable Python package
│   ├── parallel.py               # multi-GPU executor
│   ├── gpu_memory_profiler.py
│   ├── gpu_memory_timeseries_profiler.py
│   ├── gpu_time_estimate.py
│   ├── cpu_time_estimate.py
│   ├── json_integrator.py
│   ├── gpu_monitor.py
│   └── cli/                      # unified CLI dispatcher
├── docs/                         # detailed documentation
├── examples/                     # sample CSV manifests
├── conda/recipe/                 # Bioconda recipe template
├── environment.yml               # conda dev environment
├── scripts/                      # legacy thin wrappers
└── tools/
    └── install-to-alphafold3.sh
```

---

## Documentation

| Topic | Guide |
| --- | --- |
| Documentation index | [docs/README.md](docs/README.md) |
| PyPI / Bioconda release guide | [docs/publishing.md](docs/publishing.md) |
| pip / conda / legacy install | [docs/installation.md](docs/installation.md) |
| End-to-end workflow | [docs/workflow.md](docs/workflow.md) |
| Built-in GPU presets & profile TSV format | [docs/gpu-profiles.md](docs/gpu-profiles.md) |
| CLI flags & output columns | [docs/cli-reference.md](docs/cli-reference.md) |
| JSON Integrator (ligand/nucleic/ion screens) | [docs/json-integrator.md](docs/json-integrator.md) |
| Tips & troubleshooting | [docs/tips.md](docs/tips.md) |

---

## Development

```bash
git clone https://github.com/Xin-DongXu/AF3Parallel.git
cd AF3Parallel
pip install -e ".[extras]"
af3parallel --version
```

To publish: bump version in `src/af3parallel/__version__.py` and `pyproject.toml`, tag a release, then follow [docs/publishing.md](docs/publishing.md). GitHub Actions workflow: `.github/workflows/publish-pypi.yml`.

---

## License & citation

This project is released under the [MIT License](LICENSE).

AlphaFold 3 is licensed separately by Google DeepMind and is **not** distributed by this repository.

If you use AF3Parallel in academic work, please cite AlphaFold 3:

> Abramson, J., Adler, J., Dunger, J. *et al.* Accurate structure prediction of biomolecular interactions with AlphaFold 3. *Nature* **630**, 493–500 (2024). https://doi.org/10.1038/s41586-024-07487-w

See [CITATION.cff](CITATION.cff) for machine-readable metadata.
