Metadata-Version: 2.1
Name: deeva
Version: 1.0.0
Summary: Object Detection Data Analysis Toolbox
Home-page: https://github.com/vbyan/DEEVA
Author: Vahram Babajanyan, 
Author-email: vahram.babadjanyan@gmail.com
Keywords: deeva,object-detection,analytics,visualization,datasets,toolkit,statistics,detection,streamlit,plotly
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: POSIX :: Linux
Classifier: Operating System :: MacOS
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
License-File: AUTHORS.txt
Requires-Dist: kaleido==0.2.1
Requires-Dist: lxml==5.2.2
Requires-Dist: numpy==2.1.3
Requires-Dist: opencv-python==4.10.0.84
Requires-Dist: packaging==24.2
Requires-Dist: pandas==2.2.3
Requires-Dist: pillow==11.0.0
Requires-Dist: pipenv==2024.4.0
Requires-Dist: plotly==5.24.1
Requires-Dist: scikit-learn==1.5.2
Requires-Dist: screeninfo==0.8.1
Requires-Dist: stqdm==0.0.5
Requires-Dist: streamlit==1.40.1
Requires-Dist: streamlit-extras==0.5.0
Requires-Dist: streamlit-image-select==0.6.0
Requires-Dist: tables==3.10.1
Requires-Dist: toml==0.10.2
Requires-Dist: tqdm==4.67.0
Requires-Dist: pytest==8.3.3

# Deeva 🚀

**Your Smart Analytics Companion for Object Detection Datasets**

## 🎯 Overview

**Deeva** is a powerful yet easy-to-use analytics toolkit that makes exploring **Object Detection** datasets a breeze, whether you're just starting out or a seasoned pro. 

Built with **Streamlit**, it offers an intuitive interface packed with features that let you dive into your data quickly or take a deeper look when you need it.
Deeva is designed to simplify data exploration and reporting, so you can get meaningful insights without the hassle.

### Key Features

- **💻 Run locally**: Launch effortlessly on your **_local machine_** for seamless, offline use.
- **🚀 Instant Setup**: Quickly start visualizing data by pointing Deeva to a specific dataset folder.
- **📊 Rich Interactive Dashboards**: Build insightful, interactive dashboards for rich data exploration with minimal effort.
- **🎨 Customizable CLI**: Use simple command-line commands to launch Deeva with flexible paths and configurations.
- **💾 Smart Caching**: Efficient processing with intelligent data caching for large datasets
- **🎲 Built-in Toy Datasets**: Quickly get started with the included `coco128` dataset, perfect for initial experimentation.

## 🛠 Installation

install with **pip**:

```bash
$ pip install deeva
```

Alternatively, use a **virtual environment** (recommended):

```bash
$ python3 -m venv myenv
$ source myenv/bin/activate

$ pip install deeva
```

## ⚡ Quickstart
After installation, launch **Deeva** by running:

```bash
$ deeva start
```

This will open the **input page** where you can specify the **data path**.

<img src="https://raw.githubusercontent.com/vbyan/deeva/main/assets/input_page.gif"></img>

### Data structure

Your dataset **folder** should look like this:

```plaintext
data-path/
├── images/        # Folder containing image files (e.g., .jpg, .png)
├── labels/        # Folder containing label files (e.g., .txt, .xml)
└── labelmap.txt   # A file mapping class IDs to class labels (optional)
```

## 💡 Insights & Analytics

Deeva offers a powerful set of **statistical insights** to give you a detailed understanding of your dataset, including:

### 1. **File Matching and Integrity**

<br> <img src="https://raw.githubusercontent.com/vbyan/deeva/main/assets/data_match.gif">

- **Image-Label Matching**: Calculates how many images have **corresponding labels** (and vice versa).
- **Filename Consistency**: Identifies **misaligned or corrupted files** in images and labels.
- **Data Cleaning**: Provides tools to **identify and isolate** mismatched or corrupted files.


### 2. **Dataset Overview**

<br> <img src="https://raw.githubusercontent.com/vbyan/deeva/main/assets/overall.gif">

- **File Formats & Backgrounds**: View format distribution (`yolo` vs. `voc`, `jpeg` vs. `png`).
- **Class Distribution**: Displays instance counts and images per class, highlighting any **class imbalances**.
- **Class Co-occurrence**: Shows how frequently different classes **appear together**.


### 3. **Annotation Insights**

<br> <img src="https://raw.githubusercontent.com/vbyan/deeva/main/assets/annotations.gif">

- **Bounding Box Analysis**: Provides insights into box center, width/height, and median box sizes.
- **Box Size Distribution**: Analyzes box size categories with **adjustable thresholds** for small, medium, and large sizes.


### 4. **Image Statistics**

<br> <img src="https://raw.githubusercontent.com/vbyan/deeva/main/assets/images.gif">

- **Color Analysis**: Displays **dominant colors** and their tones extracted from images.
- **Image Dimensions**: Examines **height, width,** and aspect ratios across your dataset.
- **CBS (Contrast, Brightness, Saturation)**: Shows **contrast, brightness, and saturation** distributions across the dataset.


### 5. **Overlap Statistics**

<br> <img src="https://raw.githubusercontent.com/vbyan/deeva/main/assets/overlaps.gif">

- **Cases**: Classify and **cluster** overlapping instances from two specific classes into `n` predefined cases. Display representative **example images** for each case to help visualize typical overlap patterns.
- **Ratios**: Calculate and visualize the overlap ratio distributions for each class
- **With/without overlaps**: Present a side-by-side comparison of images and co-occurrences with and without overlaps


## 🔖 **Caching & Version control**

Deeva employs efficient **caching** to streamline your data processing workflow. For large datasets, users have the option to **sample a subset of the data**—allowing for quicker initial exploration. 

Data extracted during time-consuming operations can be saved as a **dataframe on disk** for effortless access in future sessions, enabling a faster, more efficient experience by skipping redundant processing steps.

To **track different versions** of your dataset you need to simply put them into different folders and **Deeva** will do the rest


## 🌟 **Contributing**

**Deeva** welcomes contributions! If you have ideas or want to add new features, please feel free to open a pull request or start a discussion on **GitHub**. 

## License

Deeva is completely free and open-source and licensed under the [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) license.



