Metadata-Version: 2.1
Name: PinPoint
Version: 0.2.1
Summary: A fast geo toolkit for academic affiliation strings
Home-page: http://pinpoint.eckert.science
Author: Hagen Eckert
Author-email: pinpoint@eckert.science
License: MIT
Download-URL: https://bitbucket.org/nathan-diodan/pinpoint/get/0.2.1.zip
Description: # PinPoint
        
        [![PyPI](https://img.shields.io/pypi/v/pinpoint.svg)](https://pypi.org/project/PinPoint/)
        [![PyPI - License](https://img.shields.io/pypi/l/pinpoint.svg)](https://bitbucket.org/nathan-diodan/pinpoint/src/master/LICENSE.txt)
        [![Codeship](https://img.shields.io/codeship/d6849780-6081-0136-b9bb-0a7a2647bd02.svg)](https://app.codeship.com/projects/296298)
        [![Bitbucket issues](https://img.shields.io/bitbucket/issues/nathan-diodan/pinpoint.svg)](https://bitbucket.org/nathan-diodan/pinpoint/issues?status=new&status=open)
        [![PyPI - Status](https://img.shields.io/pypi/status/pinpoint.svg)](https://pypi.org/project/PinPoint/)
        
        
        PinPoint is a fast geo toolkit for academic affiliation strings.
        It provides the following base functions:
        
        * find a location (information about mapped city and country)
        * calculate the apparent location and cooperation distance for a list of weighted affiliation strings
        
        ## Install
        Install and update using pip:
        
        ```
        pip install pinpoint
        ```
        
        ## Usage
        
        ```python
        from pinpoint import Locator
        loc = Locator()
        ```
        
        The first time `Locator` is initialized, the lookup tables and databases need to be created.
        To do this, four files are downloaded from the [GeoNames](http://www.geonames.org) [dump](http://download.geonames.org/export/dump/) (~150 MB) and optimized:
        
        * cities1000.zip
        * admin1CodesASCII.txt
        * countryInfo.txt
        * alternateNames.zip
        
        It is possible to rebuild the database at a later date:
        ```python
        from pinpoint import Locator
        loc = Locator(refresh=True)
        ```
        To avoid unnecessary load on the [GeoNames](http://www.geonames.org) servers, the data is not downloaded again if the cached files are less than a week old.
        The databases and cached files are stored in the appropriate folders depending on your operating system.
        If necessary, you can empty them by hand.
        
        ```python
        from pinpoint import Locator
        print(Locator.resources_dir)
        print(Locator.resources_cache_dir)
        ```
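        
        The one-week rule described above can be pictured as a simple file-age check.
        The following is a minimal sketch of that idea, not PinPoint's actual code; the function name `needs_download` and its parameters are assumptions made for illustration.
        
        ```python
        import os
        import time
        
        ONE_WEEK_SECONDS = 7 * 24 * 60 * 60
        
        
        def needs_download(path, max_age_seconds=ONE_WEEK_SECONDS, now=None):
            """Return True if the cached file is missing or older than max_age_seconds.
        
            Hypothetical helper: PinPoint's own freshness check may work differently.
            """
            if not os.path.exists(path):
                return True
            now = time.time() if now is None else now
            # Compare the file's last modification time with the current time
            return (now - os.path.getmtime(path)) > max_age_seconds
        ```
        
        A check like this keeps repeated `Locator(refresh=True)` calls from hitting the download server each time.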
        
        ### Find a location
        
        ```python
        test_string = "Department of Chemical and Biomolecular Engineering, Rice University, Houston, TX, United States"
        country, region, city = loc.find(test_string)
        ```
        
        This function returns either a `dict()` or `None` for each of the country, region, and city.
        The following information is returned based on the data from [GeoNames](http://www.geonames.org):
        
        * country
          * `'a2'` ISO 3166-1 alpha-2 country code
          * `'a3'` ISO 3166-1 alpha-3 country code
          * `'n3'` ISO 3166-1 numeric country code
          * `'name'`
          * `'short_name_list'` short name variants
          * `'name_list'` name in different languages
          * `'capital'`
          * `'continent'`
          * `'area'` in square kilometers
          * `'population'`
          * `'geonameid'` unique id given by [GeoNames](http://www.geonames.org)
        * region (currently only used for the USA and Canada)
          * `'name'`
          * `'short_name_list'` short name variants
          * `'name_list'` name in different languages
          * `'region_code'`
          * `'a2'` ISO 3166-1 alpha-2 country code
          * `'geonameid'` unique id given by [GeoNames](http://www.geonames.org)
        * city
          * `'name'`
          * `'asciiname'`
          * `'name_list'` name in different languages
          * `'latitude'`
          * `'longitude'`
          * `'a2'` ISO 3166-1 alpha-2 country code
          * `'admin1_code'`
          * `'elevation'` and `'dem'` both relate to the elevation in meters
          * `'timezone'`
          * `'geonameid'` unique id given by [GeoNames](http://www.geonames.org)
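        
        Because each of the three results may be `None`, code that consumes them should check before indexing.
        The sketch below uses hand-written stand-in dictionaries with a few of the keys listed above; the values and the helper `describe` are illustrative assumptions, not real output of `loc.find()`.
        
        ```python
        def describe(country, region, city):
            """Build a short label from the three results; any of them may be None."""
            parts = []
            if city is not None:
                parts.append(city['name'])
            if region is not None:
                parts.append(region['region_code'])
            if country is not None:
                parts.append(country['a2'])
            return ', '.join(parts) if parts else 'unknown'
        
        
        # Stand-ins for what loc.find() might return (illustrative values only)
        country = {'a2': 'US', 'a3': 'USA', 'name': 'United States'}
        region = {'name': 'Texas', 'region_code': 'TX', 'a2': 'US'}
        city = {'name': 'Houston', 'a2': 'US'}
        
        print(describe(country, region, city))  # prints "Houston, TX, US"
        print(describe(country, None, None))    # prints "US"
        ```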
        
        ### Calculate the apparent location and cooperation distance
        
        Based on a weighted list of affiliations, an apparent location for a scientific document can be calculated.
        
        ```python
        from pinpoint import Locator
        loc = Locator()
        
        weighted_affiliations = {
            "Dresden Center for Computational Material Science, Technische Universität Dresden, Dresden, Germany": 2,
            "Department of Chemical and Biomolecular Engineering, Rice University, Houston, TX, United States": 1,
            "Nanoscience and Nanotechnology Center, Institute of Scientific and Industrial Research (ISIR), Osaka University, 8-1 Mihogaoka, Ibaraki, Osaka, Japan": 0.5,
            "Centro/Departamento de Física da Universidade do Minho, Campus de Gualtar, 4710-057 Braga, Portugal": 0.5,
            }
        
        cooperation_distance, apparent_location = loc.calculate_str(weighted_affiliations)
        ```
        
        The cooperation distance is returned in kilometers.
        If the coordinates are already known, the calculation can be done directly, without the need to initialize the resources.
        
        ```python
        Locator.calculate_coordinates(weighted_coordinates)
        ```
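        
        To give an intuition for what such a calculation involves, here is a minimal sketch that computes a weighted centre on the sphere and the weighted mean great-circle distance to it.
        This is an assumed formulation for illustration only: the exact definition used by PinPoint and the input format expected by `calculate_coordinates` are not documented here, and all names below (`haversine_km`, `weighted_centre`, `weighted_mean_distance_km`, the `((lat, lon), weight)` layout) are hypothetical.
        
        ```python
        import math
        
        EARTH_RADIUS_KM = 6371.0  # mean Earth radius
        
        
        def haversine_km(a, b):
            """Great-circle distance in km between two (lat, lon) pairs in degrees."""
            lat1, lon1 = map(math.radians, a)
            lat2, lon2 = map(math.radians, b)
            h = (math.sin((lat2 - lat1) / 2) ** 2
                 + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
            return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))
        
        
        def weighted_centre(points):
            """Weighted centre of ((lat, lon), weight) pairs via 3D unit vectors."""
            x = y = z = 0.0
            for (lat, lon), weight in points:
                lat_r, lon_r = math.radians(lat), math.radians(lon)
                x += weight * math.cos(lat_r) * math.cos(lon_r)
                y += weight * math.cos(lat_r) * math.sin(lon_r)
                z += weight * math.sin(lat_r)
            # Project the summed vector back onto the sphere
            lat = math.degrees(math.atan2(z, math.hypot(x, y)))
            lon = math.degrees(math.atan2(y, x))
            return lat, lon
        
        
        def weighted_mean_distance_km(points):
            """Return (weighted mean distance to the centre in km, centre)."""
            points = list(points)
            centre = weighted_centre(points)
            total_weight = sum(weight for _, weight in points)
            distance = sum(weight * haversine_km(coord, centre)
                           for coord, weight in points) / total_weight
            return distance, centre
        
        
        # Two equally weighted points on the equator, 90 degrees of longitude apart
        distance, centre = weighted_mean_distance_km([((0.0, 0.0), 1), ((0.0, 90.0), 1)])
        ```
        
        Averaging unit vectors instead of raw latitude and longitude values avoids artifacts near the poles and the antimeridian.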
        
        ## Redis subsystem
        The underlying architecture of pinpoint is not well suited for use in a system that spawns many processes or threads.
        To enable its use under such conditions, the application data can be separated from the search logic.
        
        The lookup tables and location databases are then stored in a [redis](https://redis.io) database (version > 4.0).
        After [installing](https://redis.io/topics/quickstart) redis, two additional Python packages are needed:
        
        ```
        pip install redis
        pip install hiredis
        ```
        
        Using the redis subsystem does not change the way you interact with pinpoint.
        When `Locator` is initialized, `server` needs to be set to `True`.
        
        ```python
        from pinpoint import Locator
        loc = Locator(server=True)
        ```
        
        If different settings for the redis server are needed, `server` can be set to a dictionary containing the settings.
        The allowed keys are the same as listed in the redis-py [documentation](https://redis-py.readthedocs.io/en/latest/index.html#redis.Redis).
        
        ```python
        from pinpoint import Locator
        loc = Locator(server={'host': 'localhost', 'port': 6379, 'db': 0})
        ```
        
        This approach is noticeably slower than the default implementation.
        It should only be used if multiple instances of pinpoint need to run in parallel.
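        
        When several worker processes share one redis server, it can be convenient to build the `server` dictionary in one place.
        The sketch below reads the settings from environment variables; the variable names (`PINPOINT_REDIS_HOST` and so on) and the helper `redis_settings_from_env` are assumptions made for this example and are not part of PinPoint.
        
        ```python
        import os
        
        
        def redis_settings_from_env(env=None):
            """Build a settings dict for `server` from (hypothetical) environment variables."""
            env = os.environ if env is None else env
            return {
                'host': env.get('PINPOINT_REDIS_HOST', 'localhost'),
                'port': int(env.get('PINPOINT_REDIS_PORT', '6379')),
                'db': int(env.get('PINPOINT_REDIS_DB', '0')),
            }
        
        
        settings = redis_settings_from_env()
        # With pinpoint installed and a redis server running:
        # loc = Locator(server=settings)
        ```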
        
        ## Examples
        
        Various examples can be found in the *extras* folder of the [source distribution](https://bitbucket.org/nathan-diodan/pinpoint/src/master/extras/README.md).
        
        
Keywords: collaboration metrics,weighted distance,cooperation distance,apparent location,affiliation strings,bibliometrics
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3 :: Only
Requires-Python: ~=3.6
Description-Content-Type: text/markdown
