Metadata-Version: 2.4
Name: msfiddle
Version: 2.0.0
Summary: A package for predicting chemical formulas from tandem mass spectra
Home-page: https://github.com/JosieHong/msfiddle
Author: Yuhui Hong
Author-email: josieexception@outlook.com
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Topic :: Scientific/Engineering :: Chemistry
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: Topic :: Scientific/Engineering :: Medical Science Apps.
Requires-Python: >=3.6
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy<2.0.0,>=1.20.0
Requires-Dist: pandas>=2.0.0
Requires-Dist: tqdm>=4.60.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: scikit-learn>=1.0.0
Requires-Dist: scipy>=1.8.0
Requires-Dist: pyarrow>=10.0.0
Requires-Dist: rdkit>=2022.03.5
Requires-Dist: molmass
Requires-Dist: pyteomics
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: license-file
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# msfiddle

[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![PyPI](https://img.shields.io/pypi/v/msfiddle)](https://pypi.org/project/msfiddle/)
[![Documentation](https://readthedocs.org/projects/msfiddle/badge/?version=latest)](https://msfiddle.readthedocs.io)

This is the source code for the FIDDLE PyPI package, `msfiddle`. It provides:

* Chemical formula prediction from MS/MS spectra
* Formula refinement with confidence score estimation
* Integration with the BUDDY and SIRIUS tools

Paper: https://www.nature.com/articles/s41467-025-66060-9

Documentation: https://msfiddle.readthedocs.io

For the complete experimental code, see the GitHub repository: https://github.com/JosieHong/FIDDLE

## Installation

```bash
# Install msfiddle (without PyTorch)
pip install msfiddle
```

`msfiddle` does not install PyTorch for you. Before using it, install the `torch` version appropriate for your system separately, following the official guide:
🔗 [PyTorch Installation Guide](https://pytorch.org/get-started/locally/)
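
Because `torch` is installed separately, it can help to confirm that both packages are visible to the same Python environment before running predictions. The following is a minimal sketch using only the standard library; it assumes the import name of the package is `msfiddle` (matching the PyPI name), and the helper `is_installed` is a hypothetical name introduced here.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found in this environment."""
    return importlib.util.find_spec(module_name) is not None


# "msfiddle" as the import name is an assumption based on the PyPI name.
for name in ("msfiddle", "torch"):
    status = "found" if is_installed(name) else "MISSING"
    print(f"{name}: {status}")
```

If `torch` is reported as missing, install it with the command given by the PyTorch guide for your platform.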

## Usage 

**Step 1**: Download pre-trained models

```bash
# Download models to the default location (~/.msfiddle/check_point)
msfiddle-download-models

# Or specify a custom location and models
msfiddle-download-models --destination /path/to/models \
                          --models fiddle_tcn_qtof fiddle_rescore_qtof
```
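
To check that the download succeeded, you can list the files in the checkpoint directory. This is a rough sketch that assumes the default location quoted above (`~/.msfiddle/check_point`); the function name `list_checkpoint_files` is hypothetical and not part of msfiddle. The `msfiddle-checkpoint-paths` command described under Additional Options is the package's own way to show model paths.

```python
from pathlib import Path

# Default location taken from the download step above.
DEFAULT_CHECKPOINT_DIR = Path.home() / ".msfiddle" / "check_point"


def list_checkpoint_files(directory=DEFAULT_CHECKPOINT_DIR):
    """Return the sorted file names in `directory`, or [] if it does not exist."""
    directory = Path(directory).expanduser()
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.iterdir() if p.is_file())


files = list_checkpoint_files()
if files:
    print("Found:", ", ".join(files))
else:
    print(f"No files in {DEFAULT_CHECKPOINT_DIR}; run msfiddle-download-models first.")
```

Pass the custom directory as the argument if you downloaded the models with `--destination`.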

**Step 2**: Run predictions

Using demo data (simplest option): 

```bash
# Run prediction with the built-in demo data
msfiddle --demo --result_path ./output_demo.csv --device 0
```

Using your own data:

```bash
# Run prediction with your own data; the appropriate model is selected automatically
msfiddle --test_data /path/to/data.mgf \
         --instrument_type orbitrap \
         --result_path /path/to/results.csv \
         --device 0
```

The `--instrument_type` parameter can be either `orbitrap` (default) or `qtof`. 
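
When running many files from a script, it can be convenient to assemble the command in Python and reject an invalid instrument type early. The sketch below uses only the flags shown above; the helper `build_msfiddle_command` is a hypothetical name, not part of msfiddle.

```python
import shlex

# The two values documented for --instrument_type.
VALID_INSTRUMENTS = ("orbitrap", "qtof")


def build_msfiddle_command(test_data, result_path,
                           instrument_type="orbitrap", device=0):
    """Build the argument list for one msfiddle run."""
    if instrument_type not in VALID_INSTRUMENTS:
        raise ValueError(
            f"instrument_type must be one of {VALID_INSTRUMENTS}, "
            f"got {instrument_type!r}"
        )
    return [
        "msfiddle",
        "--test_data", str(test_data),
        "--instrument_type", instrument_type,
        "--result_path", str(result_path),
        "--device", str(device),
    ]


cmd = build_msfiddle_command("/path/to/data.mgf", "/path/to/results.csv", "qtof")
print(shlex.join(cmd))  # shell-quoted form, useful for logging
```

The returned list can be passed to `subprocess.run(cmd, check=True)` to execute the prediction.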

Below is an example of input MS/MS data in `.mgf` format. msfiddle requires the fields `TITLE`, `PRECURSOR_MZ`, `PRECURSOR_TYPE`, and `COLLISION_ENERGY`:

```mgf
BEGIN IONS
TITLE=EMBL_MCF_2_0_HRMS_Library000529
PEPMASS=111.02016
CHARGE=1-
PRECURSOR_TYPE=[M-H]-
PRECURSOR_MZ=111.02016
COLLISION_ENERGY=50.0
SMILES=[H]c1c([H])n([H])c(=O)n([H])c1=O
FORMULA=C4H4N2O2
THEORETICAL_PRECURSOR_MZ=111.019453
PPM=6.368253318682487
SIMULATED_PRECURSOR_MZ=111.01946768634916
41.0148 0.329893 
41.9986 89.226766 
55.8055 0.200544 
56.2625 0.194617 
67.0304 0.330612 
68.0258 0.402906 
111.0203 100.0 
112.0515 1.2809 
END IONS
```
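
Before running predictions, you may want to check that every spectrum in your file carries the required fields. The following is a minimal pure-Python sketch, not msfiddle's own reader: the names `parse_mgf`, `missing_fields`, and `ppm_error` are hypothetical, and the parser handles only the simple layout shown above. The `ppm_error` helper assumes the `PPM` field in the example is the relative deviation of `PRECURSOR_MZ` from `THEORETICAL_PRECURSOR_MZ` in parts per million, which is consistent with the values listed.

```python
REQUIRED_FIELDS = ("TITLE", "PRECURSOR_MZ", "PRECURSOR_TYPE", "COLLISION_ENERGY")


def parse_mgf(text):
    """Parse MGF text into a list of {'params': dict, 'peaks': list} records."""
    spectra, current = [], None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line == "BEGIN IONS":
            current = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if current is not None:
                spectra.append(current)
            current = None
        elif current is None:
            continue  # ignore anything outside a BEGIN/END block
        elif "=" in line and not line[0].isdigit():
            # KEY=VALUE line; split once so values such as SMILES may contain "="
            key, value = line.split("=", 1)
            current["params"][key.strip().upper()] = value.strip()
        else:
            # Peak line: m/z and intensity separated by whitespace
            mz, intensity = line.split()[:2]
            current["peaks"].append((float(mz), float(intensity)))
    return spectra


def missing_fields(spectrum):
    """Return the required fields that are absent or empty in one spectrum."""
    return [f for f in REQUIRED_FIELDS if not spectrum["params"].get(f)]


def ppm_error(observed_mz, theoretical_mz):
    """Relative m/z deviation in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


example = """BEGIN IONS
TITLE=EMBL_MCF_2_0_HRMS_Library000529
PRECURSOR_TYPE=[M-H]-
PRECURSOR_MZ=111.02016
COLLISION_ENERGY=50.0
THEORETICAL_PRECURSOR_MZ=111.019453
41.9986 89.226766
111.0203 100.0
END IONS
"""

for spectrum in parse_mgf(example):
    params = spectrum["params"]
    print(params["TITLE"], "missing:", missing_fields(spectrum))
    print("ppm:", ppm_error(float(params["PRECURSOR_MZ"]),
                            float(params["THEORETICAL_PRECURSOR_MZ"])))
```

To check a real file, read it with `open(path).read()` and pass the text to `parse_mgf`. The package already depends on `pyteomics`, which provides a full MGF reader if you need more than this check.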

### Additional Options

Show all model paths:

```bash
msfiddle-checkpoint-paths
```

Advanced usage with custom paths:

```bash
msfiddle --test_data /path/to/data.mgf \
         --config_path /path/to/config.yml \
         --resume_path /path/to/tcn_model.pt \
         --rescore_resume_path /path/to/rescore_model.pt \
         --result_path /path/to/results.csv \
         --device 0
```

## Citation

```bibtex
@article{hong2025fiddle,
  title={FIDDLE: a deep learning method for chemical formulas prediction from tandem mass spectra},
  author={Hong, Yuhui and Li, Sujun and Ye, Yuzhen and Tang, Haixu},
  journal={Nature Communications},
  volume={16},
  number={1},
  pages={11102},
  year={2025},
  publisher={Nature Publishing Group UK London}
}
```
