Metadata-Version: 2.1
Name: wordcloud-mapper
Version: 0.2.0
Summary: A package for creating wordcloud maps in Python.
Home-page: https://github.com/GabZech/wordcloud_mapper
Author: Gabriel da Silva Zech
Author-email: g.dev@posteo.net
License: GNU General Public License v3
Project-URL: Documentation, https://gabzech.github.io/wordcloud_mapper
Keywords: wordcloud_mapper
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Requires-Python: >=3.8
Description-Content-Type: text/x-rst
License-File: LICENSE

================
WordCloud_Mapper
================


.. image:: https://img.shields.io/pypi/v/wordcloud_mapper.svg
        :alt: PyPI - Version
        :target: https://pypi.python.org/pypi/wordcloud_mapper

.. image:: https://img.shields.io/github/pipenv/locked/python-version/GabZech/wordcloud_mapper
        :alt: GitHub Pipenv locked Python version

.. image:: https://img.shields.io/pypi/l/wordcloud_mapper
        :alt: PyPI - License
        :target: https://github.com/GabZech/wordcloud_mapper/blob/main/LICENSE

.. image:: https://img.shields.io/github/repo-size/GabZech/wordcloud_mapper?color=white
        :alt: GitHub repo size

`WordCloud_Mapper` is a Python package that allows one to **create wordclouds shaped like regions on a map**.

Such visualisations are especially useful for communicating datasets made up of many observations, where each observation is attributed to a specific region and has a size of occurrence. Take the example below: a dataset containing the names of the biggest companies (by estimated number of employees in 2019) in each state of Germany.

|

.. image:: https://github.com/GabZech/wordcloud_mapper/raw/main/docs/figures/germany_nuts1.png

Installation
------------

To install `WordCloud_Mapper`, run in your terminal:

.. code-block:: console

    pip install wordcloud_mapper

or

.. code-block:: console

    pip install wordcloud-mapper


Features and usage
------------------

* **Create a wordcloud map** from data stored in a DataFrame object using `wordcloud_map() <https://gabzech.github.io/wordcloud_mapper/build/html/functions.html#>`_.
* **Easily resize a map** by any desired scaling factor using `resize_map() <https://gabzech.github.io/wordcloud_mapper/build/html/functions.html#resize-map>`_.
* **Load dummy datasets** to test out the package's features using `load_companies() <https://gabzech.github.io/wordcloud_mapper/build/html/functions.html#load-companies>`_.
* **Calculate how unique a word is** to a particular region in comparison to other regions by calculating the Term Frequency-Inverse Document Frequency (TF-IDF) score for each word in each region using `calc_tfidf() <https://gabzech.github.io/wordcloud_mapper/build/html/functions.html#calc-tfidf>`_.

See the `documentation <https://GabZech.github.io/wordcloud_mapper>`_ for more information on how to use the package and its functions.
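
As a rough illustration of the TF-IDF idea mentioned above, the sketch below treats each region as one "document" and scores each word by how frequent it is within the region and how rare it is across regions. It is a minimal, standard-library version written for this README; the function name ``tfidf_by_region`` and the sample data are hypothetical, and the exact formula used by ``calc_tfidf()`` may differ.

.. code-block:: python

    import math
    from collections import Counter


    def tfidf_by_region(words_by_region):
        """Return {region: {word: score}}, treating each region as one document."""
        n_regions = len(words_by_region)

        # Document frequency: in how many regions does each word occur?
        doc_freq = Counter()
        for words in words_by_region.values():
            doc_freq.update(set(words))

        scores = {}
        for region, words in words_by_region.items():
            counts = Counter(words)
            total = len(words)
            scores[region] = {
                # Term frequency within the region times the log inverse
                # document frequency across all regions.
                word: (count / total) * math.log(n_regions / doc_freq[word])
                for word, count in counts.items()
            }
        return scores


    # Hypothetical sample data keyed by NUTS 1 codes.
    sample = {
        "DE1": ["auto", "auto", "software"],
        "DE2": ["auto", "bier", "bier"],
        "DE3": ["startup", "software"],
    }
    scores = tfidf_by_region(sample)

In this sample, "bier" scores higher than "auto" in ``DE2`` because it occurs only in that region, and a word that appears in every region would score zero.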


Notes on geographical nomenclature
----------------------------------

The classification of regions used here follows the European Union's Nomenclature of Territorial Units for Statistics (`NUTS <https://en.wikipedia.org/wiki/Nomenclature_of_Territorial_Units_for_Statistics>`_), a geocode standard for referencing the subdivisions of countries. The advantage of using this system is that the classification of regions across countries is **standardised and hierarchically structured**. For instance, Germany has the base code *DE* (NUTS 0), the state of Bavaria has the code *DE2* (NUTS 1), its subregion of Oberbayern has the code *DE21* (NUTS 2) and the city of Munich has the code *DE212* (NUTS 3). Since each region is given a unique identifier which is directly linked to the regional level above it, it is fairly easy to identify and match any dataset to these regions.
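
Because each code extends the code of the level above it, parent regions can be recovered by simple prefix slicing. The helpers below are a small sketch of that idea; ``nuts_ancestors`` and ``nuts_level`` are hypothetical names used for illustration and are not part of this package.

.. code-block:: python

    def _normalise(code):
        """Strip whitespace, upper-case and length-check a NUTS code."""
        code = code.strip().upper()
        if not 2 <= len(code) <= 5:
            raise ValueError(f"not a NUTS 0-3 code: {code!r}")
        return code


    def nuts_level(code):
        """Return the NUTS level (0 to 3) implied by the length of the code."""
        return len(_normalise(code)) - 2


    def nuts_ancestors(code):
        """List the code and its parents, from NUTS 0 down to the code itself."""
        code = _normalise(code)
        return [code[:n] for n in range(2, len(code) + 1)]


    print(nuts_ancestors("DE212"))  # ['DE', 'DE2', 'DE21', 'DE212']
    print(nuts_level("DE2"))        # 1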

However, this means that **this package currently only works for creating wordcloud maps for EU countries**. For an overview of the NUTS regions and levels, you can browse the available `maps for each EU country <https://ec.europa.eu/eurostat/web/nuts/nuts-maps>`_ or use `this interactive map <https://ec.europa.eu/statistical-atlas/viewer/?config=typologies.json&>`_ instead. If you have a dataset containing postcodes and want to convert these to NUTS regions, you can find the `correspondence tables here <https://ec.europa.eu/eurostat/web/nuts/correspondence-tables/postcodes-and-nuts>`_.
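
One possible way to apply such a correspondence table is sketched below with the standard library only. The column names ``postcode`` and ``nuts3``, the semicolon delimiter and the two sample rows are assumptions made for illustration; check the layout of the file you actually download before adapting this.

.. code-block:: python

    import csv
    import io

    # Illustrative stand-in for a correspondence table. The real files may
    # use different column names, delimiters and quoting.
    SAMPLE_TABLE = """postcode;nuts3
    80331;DE212
    10115;DE300
    """.replace("    ", "")


    def load_lookup(handle, delimiter=";"):
        """Build a {postcode: NUTS 3 code} dictionary from an open CSV file."""
        reader = csv.DictReader(handle, delimiter=delimiter)
        return {row["postcode"]: row["nuts3"] for row in reader}


    lookup = load_lookup(io.StringIO(SAMPLE_TABLE))

    # Hypothetical records; postcodes are kept as strings so that leading
    # zeros are preserved.
    records = [
        {"name": "Company A", "postcode": "80331"},
        {"name": "Company B", "postcode": "99999"},
    ]
    for record in records:
        # Unmatched postcodes get None rather than raising an error.
        record["nuts3"] = lookup.get(record["postcode"])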

In a future release, support for non-NUTS regional referencing systems will be implemented.

Feedback and contributions
--------------------------

This package is under active development, so any feedback, recommendations, suggestions or contribution requests are more than welcome!

Please read the contribution instructions or email g.dev@posteo.net if you would like to provide any feedback.


=======
History
=======

0.1.0 (2022-07-27)
------------------

* First release on PyPI.


0.2.0 (2022-09-11)
------------------

New functionality:

* Add new function ``calc_tfidf()`` to calculate TF-IDF score of each word in each region in a dataframe.
* Add wordcloud colour generating function based on rank of words.
* Add ``colour_hue`` parameter to ``wordcloud_map()`` allowing users to choose one specific colour hue for all regions.


Parameters exposed to users:

* Allow users to change the parameters when downloading NUTS shapefiles from Eurostat's API in ``wordcloud_map()``.
* Allow users to change the sharpness of the regional border lines by changing the DPI value used when creating the masks.
* Allow users to use shapefiles from a local filepath instead of downloading from GISCO's online database.

Others:

* Change default coordinate reference system when downloading shapefiles.
