Metadata-Version: 2.4
Name: das2numpy
Version: 1.1
Summary: A simple and universal package for loading large amounts of distributed acoustic sensing (DAS) data.
Author-email: Erik Genthe <erik.genthe@desy.de>
Project-URL: Homepage, https://git.physnet.uni-hamburg.de/wave/das2numpy
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy
Requires-Dist: ffmpeg-python
Requires-Dist: h5py
Requires-Dist: scipy
Requires-Dist: numba
Dynamic: license-file

# Module for loading Distributed Acoustic Sensing (DAS) data (SILIXA / OPTASENSE)



## Install

Install with pip:
```
python -m pip install das2numpy
```

To load data from FLAC files, ffmpeg (https://ffmpeg.org) must be installed separately. It cannot be installed with pip.

On DESY's Maxwell cluster, ffmpeg is available as a module. Before using das2numpy, run:
```
module load maxwell ffmpeg
```
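
Because ffmpeg is an external program, it can help to check that it is visible on `PATH` before trying to load FLAC files. The following is a minimal sketch using only the standard library; the helper name `ffmpeg_available` is hypothetical and not part of das2numpy.

```python
import shutil


def ffmpeg_available() -> bool:
    """Return True if an `ffmpeg` executable can be found on PATH."""
    # shutil.which returns the full path of the executable, or None if missing.
    return shutil.which("ffmpeg") is not None


if ffmpeg_available():
    print("ffmpeg found:", shutil.which("ffmpeg"))
else:
    print("ffmpeg not found; loading FLAC files will likely fail.")
```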




## Python API

To get started quickly, have a look at [example.py](src/example.py).

Create an instance with:

```python
def loader(root_path:str, predefined_setup:str, num_worker_threads):
```
```
    Creates a loader instance for the given directory and setup.
    Args:
        root_path (str): Path to directory that contains the files to be loaded from. Subdirectories are (recursively) also searched.
        predefined_setup (str): One of ["SILIXA", "FLAC_200HZ", "OPTASENSE"]
        num_worker_threads (int): The number of worker threads used for loading files in parallel.
    Returns:
        A loader instance to load data. Call instance.load_array(...).
```
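
As a rough illustration of the call described above, the sketch below wraps `loader` in a small helper that validates the setup name first. It assumes that `loader` can be imported from the top-level `das2numpy` package; the helper `make_loader`, the constant `PREDEFINED_SETUPS`, and the default of 4 worker threads are hypothetical choices for this example.

```python
# Setup names as listed in the loader documentation.
PREDEFINED_SETUPS = ("SILIXA", "FLAC_200HZ", "OPTASENSE")


def make_loader(root_path: str, setup: str = "SILIXA", workers: int = 4):
    """Create a das2numpy loader after checking the setup name."""
    if setup not in PREDEFINED_SETUPS:
        raise ValueError(
            f"Unknown setup {setup!r}, expected one of {PREDEFINED_SETUPS}"
        )
    # Assumption: `loader` is exposed by the top-level package.
    # Imported lazily so the validation above works without das2numpy installed.
    from das2numpy import loader

    return loader(root_path, setup, workers)
```

Typical use would be `instance = make_loader("/path/to/data", "SILIXA")`, where the path is a placeholder for the directory that contains the recorded files.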

Use one of the load_array(..) functions of that instance.

```python
def load_array(t_start:datetime, t_end:datetime, channel_start:int, channel_end:int) -> NP.ndarray:
```
```
Loads data into a numpy array.
Warning: using a value other than 1 for t_step or channel_step can result in high CPU usage.
        Consider using a higher number of worker threads (num_worker_threads) if needed.
Args:
    t_start (datetime): datetime object which defines the start of the data to load.
    t_end (datetime): datetime object which defines the end of the data to load.
    channel_start (int): The starting index of the sensor position in the data (inclusive).
    channel_end (int): The ending index of the sensor position in the data (exclusive).
    t_step (int): Reduces the data on the time axis by factor t_step. Uses mean averaging. Default is 1. 
    channel_step (int): Like t_step, but for the sensor position.
Returns:
    A 2d-numpy-array containing the data.
    The first axis corresponds to the time, the second to the channel (sensor position)
```
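
To make the effect of `t_step` and `channel_step` concrete, here is a NumPy-only sketch of reduction by mean averaging over blocks. It illustrates the idea and is not the library's implementation; the function name `mean_reduce` is hypothetical, and dropping leftover samples that do not fill a complete block is an assumption of this sketch.

```python
import numpy as np


def mean_reduce(data: np.ndarray, t_step: int = 1, channel_step: int = 1) -> np.ndarray:
    """Average a (time, channel) array over blocks of t_step x channel_step."""
    # Trim both axes to a multiple of the step (assumption: leftovers are dropped).
    n_t = (data.shape[0] // t_step) * t_step
    n_c = (data.shape[1] // channel_step) * channel_step
    trimmed = data[:n_t, :n_c]
    # Split each axis into (number of blocks, block size) and average per block.
    blocks = trimmed.reshape(n_t // t_step, t_step, n_c // channel_step, channel_step)
    return blocks.mean(axis=(1, 3))


data = np.arange(12.0).reshape(4, 3)   # 4 time samples, 3 channels
reduced = mean_reduce(data, t_step=2)  # halve the time axis
print(reduced.shape)  # (2, 3)
```

With `t_step=2`, every two consecutive time samples are averaged, so the first axis shrinks by a factor of two while the channel axis is unchanged.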

For more details have a look at the inline documentation of [chunk.py](src/das2numpy/chunk.py)
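
Putting the two calls together, loading one minute of data might look like the sketch below. It follows the signatures as documented in this section and passes the arguments by keyword. The import location of `loader`, the helper names `one_minute_window` and `load_first_channels`, and the worker count are assumptions for illustration.

```python
from datetime import datetime, timedelta


def one_minute_window(start_iso: str):
    """Return (t_start, t_end) covering one minute from an ISO timestamp."""
    t_start = datetime.fromisoformat(start_iso)
    return t_start, t_start + timedelta(minutes=1)


def load_first_channels(root_path: str, start_iso: str, n_channels: int = 1000):
    """Load one minute of data for channels 0 .. n_channels - 1."""
    # Assumption: `loader` is exposed by the top-level package.
    from das2numpy import loader

    instance = loader(root_path, "SILIXA", 4)
    t_start, t_end = one_minute_window(start_iso)
    # channel_start is inclusive, channel_end is exclusive.
    data = instance.load_array(
        t_start=t_start,
        t_end=t_end,
        channel_start=0,
        channel_end=n_channels,
    )
    # Axis 0 is time, axis 1 is the channel (sensor position).
    return data
```

A call such as `load_first_channels("/path/to/data", "2024-07-23T10:01:00")` would then return a 2D array, where the path is a placeholder for a real data directory.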


## Command Line Interface

Creates a numpy file from the requested data. Optionally, the binary data can be printed to stdout.

Example call:
```
python -m das2numpy "SILIXA" /pnfs/desy.de/m/project/iDAS/raw/2024-DESY/2024-07-23-desy 2024-07-23T10:01:00 2024-07-23T10:02:00 10 0 1000 10 default
```

For more information:
```
python -m das2numpy -h
```


## Issues

- Loading from OPTASENSE may not work anymore. I haven't tested it for a long time.
