Metadata-Version: 2.3
Name: datasets-dump
Version: 0.1.0
Summary: A tool for dumping datasets from the Hugging Face datasets library
Author-email: Jacob Lim <jacoblincool@gmail.com>
License: MIT
Keywords: datasets
Requires-Python: >=3.10
Requires-Dist: datasets
Requires-Dist: numpy
Requires-Dist: pillow
Requires-Dist: soundfile
Requires-Dist: tqdm
Description-Content-Type: text/markdown

# datasets-dump

Dump datasets with embedded audio or images to an audio folder or an image folder.

This recovers the audio folder or image folder from the Parquet files it was packed into.
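
To illustrate what a dumped folder might look like, here is a minimal standard-library sketch. It assumes the output follows the Hugging Face ImageFolder / AudioFolder convention (media files next to a `metadata.jsonl` whose `file_name` field points at each file). The helper `write_image_folder` and the placeholder bytes are hypothetical and are not part of this package.

```python
import json
import tempfile
from pathlib import Path


def write_image_folder(records, dist):
    """Write (file_name, raw_bytes, extra_columns) records as a folder plus metadata.jsonl.

    Hypothetical helper for illustration only; not the package's implementation.
    """
    dist = Path(dist)
    dist.mkdir(parents=True, exist_ok=True)
    with open(dist / "metadata.jsonl", "w", encoding="utf-8") as meta:
        for file_name, payload, extra in records:
            # One media file per row, stored next to the metadata file.
            (dist / file_name).write_bytes(payload)
            # One JSON object per line; `file_name` links the row to its file.
            meta.write(json.dumps({"file_name": file_name, **extra}) + "\n")


# Placeholder bytes stand in for real encoded images.
records = [
    ("0.png", b"placeholder-0", {"label": "cat"}),
    ("1.png", b"placeholder-1", {"label": "dog"}),
]
out_dir = Path(tempfile.mkdtemp()) / "dist"
write_image_folder(records, out_dir)
print(sorted(p.name for p in out_dir.iterdir()))
# → ['0.png', '1.png', 'metadata.jsonl']
```

A folder laid out this way can usually be loaded again with the `imagefolder` or `audiofolder` loaders of the `datasets` library.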

## Usage

```bash
datasets-dump someone/dataset ./dist
```

Python API:

```python
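from pathlib import Path
from typing import Literal, Optional, Union

from datasets import Dataset


def dump(
    dataset: Union[str, Dataset],  # dataset name (as in the CLI example) or a Dataset object
    dist: str | Path,  # output directory
    audio_column: Optional[str] = None,
    image_column: Optional[str] = None,
    metadata_format: Literal["jsonl", "csv"] = "jsonl",
) -> None: ...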
```
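
The `metadata_format` parameter accepts `"jsonl"` or `"csv"`. As a rough illustration of how the same rows could be serialized in each format, here is a standard-library sketch. The function `render_metadata` and the sample rows are hypothetical, and the package's actual output may differ in details such as column order or quoting.

```python
import csv
import io
import json


def render_metadata(rows, metadata_format="jsonl"):
    """Serialize a list of row dicts as JSON Lines or CSV text (illustrative only)."""
    if not rows:
        return ""
    if metadata_format == "jsonl":
        # One JSON object per line.
        return "".join(json.dumps(row) + "\n" for row in rows)
    if metadata_format == "csv":
        buffer = io.StringIO()
        # Column names are taken from the first row.
        writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
        writer.writeheader()
        writer.writerows(rows)
        return buffer.getvalue()
    raise ValueError(f"unsupported metadata_format: {metadata_format!r}")


rows = [
    {"file_name": "0.wav", "text": "hello"},
    {"file_name": "1.wav", "text": "world"},
]
jsonl_text = render_metadata(rows, "jsonl")
csv_text = render_metadata(rows, "csv")
print(jsonl_text, end="")
print(csv_text, end="")
```

JSON Lines keeps value types and nested fields intact, while CSV is easier to open in a spreadsheet; which one suits a given dataset depends on its columns.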
