Metadata-Version: 2.4
Name: oh-my-batch
Version: 0.7.3
Summary: 
License: GPL
License-File: LICENSE
Author: weihong.xu
Author-email: xuweihong.cn@gmail.com
Requires-Python: >=3.8,<4.0
Classifier: License :: Other/Proprietary License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Requires-Dist: fire (>=0.5.0,<0.6.0)
Description-Content-Type: text/markdown

# oh-my-batch

[![PyPI version](https://badge.fury.io/py/oh-my-batch.svg)](https://badge.fury.io/py/oh-my-batch)
[![PyPI - Downloads](https://img.shields.io/pypi/dm/oh-my-batch)](https://pypi.org/project/oh-my-batch/)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/oh-my-batch)](https://pypi.org/project/oh-my-batch/)

A command line toolkit to simplify batch tasks generation and execution. 

## Features
* `omb combo`: generate folders/files from different combinations of parameters
* `omb batch`: generate batch scripts from multiple working directories
* `omb job`: track the state of job in job scheduler
* `omb misc`: miscellaneous commands

## Install
```bash
pip install oh-my-batch
```

## Usage
`oh-my-batch` provides a command line interface `omb` with multiple sub-commands.
You can read the help message of each sub-command to learn how to use it.

For example,
```bash
omb  # see the help message of `omb` command
omb combo # see the help message of `combo` sub-command
omb combo set_broadcast --help # see the help message of `set_broadcast` command in combo
omb batch # see the help message of `batch` sub-command
omb batch add_work_dirs --help # see the help message of `add_work_dirs` command in batch
omb job # see the help message of `job` sub-command
omb job slurm submit --help # see the help message of slurm `submit` command
```

### Chaining function calls
`oh-my-batch` makes use of [fire](https://github.com/google/python-fire)'s [chaining function call](https://github.com/google/python-fire/blob/master/docs/guide.md#chaining-function-calls) to provide a flexible command line interface.
You can see it as builder pattern for command line tools.

For command that supports chaining function calls, you can chain multiple commands with `-` to build your desired command.
You will find many examples in `Examples` section below.

### Thinking in oh-my-batch
[English](./doc/thinking-in-omb.md) | [中文](./doc/thinking-in-omb.zh.md)

> Note that the document is not yet complete. 

## Examples
* [TESLA workflow](./examples/tesla/): A customizable active learning workflow for training machine learning potentials.
* [TESLA PIMD workflow](./examples/tesla-pimd/): A customizable active learning workflow for training machine learning potentials with path integral molecular dynamics.
* [LAMMPS benchmark](./examples/lammps-benchmark/): Find out the best MPI and OpenMP setup for LAMMPS through benchmarking.


## Use cases

### Generate files from different combinations of parameters

It's common to generate files with different combinations of parameters in scientific computing.
For example, you have 3 LAMMPS data files in `tmp` directory: `tmp/1.data`, `tmp/2.data`, `tmp/3.data`.
And you want to generate a series of input files with different parameters,
for example, different temperatures 300K, 400K, 500K, against each data file.

In this case, you can use `omb combo` command to generate a series of input files for you.

```bash
# prepare fake data files
mkdir -p tmp/
touch tmp/1.data tmp/2.data tmp/3.data

# prepare a lammps input file template
cat > tmp/in.lmp.tmp <<EOF
read_data @DATA_FILE
velocity all create @TEMP @RANDOM
run 1000
EOF

# prepare a run script template
cat > tmp/run.sh.tmp <<EOF
cat in.lmp  # simulate running lammps
EOF

# generate input files
omb combo \
    add_files DATA_FILE tmp/*.data - \
    add_var TEMP 300 400 500 - \
    add_randint RANDOM -n 3 -a 1 -b 1000 - \
    set_broadcast RANDOM - \
    make_files tmp/tasks/{i}-T-{TEMP}/in.lmp --template tmp/in.lmp.tmp - \
    make_files tmp/tasks/{i}-T-{TEMP}/run.sh --template tmp/run.sh.tmp --mode 755 - \
    done
```

The above script will generate 9 folders in `tmp/tasks` directory
with names from `0-T-300`, `1-T-400`, `2-T-500`, `3-T-300` to `8-T-500`.
Each folder will contain a `in.lmp` file and a `run.sh` file.

The 9 folders are the combinations of 3 data files and 3 temperatures,
and each input file will have a independent random number between 1 and 1000 as `RANDOM`.

You can run the about script by `./examples/omb-combo.sh`,
and you can also run `omb combo --help` to see the detailed usage of `combo` command.

### Generate batch scripts from multiple working directories
It's common to submit a lot of jobs to a job scheduler. `omb batch` is designed to help you generate batch scripts from multiple working directories and package them into several batch scripts.

Let's continue the above example, now you have 9 folders in `tmp/tasks` directory.
You want to package them into 2 batch scripts to submit to a job scheduler.

You can use `omb batch` to generate batch scripts for you like this:

```bash
cat > tmp/lammps_header.sh <<EOF
#!/bin/bash
#SBATCH -J lmp
#SBATCH -n 1
#SBATCH -t 1:00:00
EOF

omb batch \
    add_work_dirs tmp/tasks/* - \
    add_header_files tmp/lammps_header.sh - \
    add_cmds "./run.sh" - \
    make tmp/lmp-{i}.slurm --concurrency 2
```

You will find batch scripts `tmp/lmp-0.slurm` and `tmp/lmp-1.slurm` in `tmp` directory.

You can run the above script by `./examples/omb-batch.sh`,

### Track the state of job in job schedular

Let's continue the above example, now you have submitted the batch scripts to the job scheduler.
In this case, you can use `omb job` to track the state of the jobs.

```bash
omb job slurm submit tmp/*.slurm --max_tries 3 --wait --recovery lammps-jobs.json
```

The above command will submit the batch scripts to the job scheduler,
and wait for the jobs to finish. If the job fails, it will retry for at most 3 times.

The `--recovery` option will save the job information to `lammps-jobs.json` file.
If `omb job` is interrupted, you can rerun the exact same command to recover the job status,
so that you don't need to resubmit the jobs that are still running or completed.


## Shell tips
`oh-my-batch` is intended to help you implement computational workflows with shell scripts.
To make the best use of `oh-my-batch`, you need to know some shell tips.

* [Retry commands until success in shell script](https://stackoverflow.com/a/79191004/3099733)
* [Run multiple line shell script with ssh](https://stackoverflow.com/a/32082912/3099733)
