# -*- coding: utf-8 -*-
from setuptools import setup

packages = \
['csv_reconcile', 'csv_reconcile_dice']

package_data = \
{'': ['*']}

install_requires = \
['cython>=0.29.21,<0.30.0',
 'flask-cors>=3.0.10,<4.0.0',
 'flask>=1.1.2,<2.0.0',
 'normality>=2.1.1,<3.0.0']

extras_require = \
{':python_version >= "3.7" and python_version < "3.8"': ['importlib_metadata>=3.7.3,<4.0.0']}

entry_points = \
{'console_scripts': ['csv-reconcile = csv_reconcile:main'],
 'csv_reconcile.scorers': ['dice = csv_reconcile_dice']}

setup_kwargs = {
    'name': 'csv-reconcile',
    'version': '0.2.0',
    'description': 'OpenRefine reconciliation service backed by csv resource',
    'long_description': '\n# Table of Contents\n\n1.  [CSV Reconcile](#orga5c90a9)\n    1.  [Quick start](#org1b422ae)\n    2.  [Poetry](#org8b4627b)\n    3.  [Description](#org86e08ad)\n    4.  [Usage](#org55ecd97)\n    5.  [Common configuration](#orge7e9f64)\n    6.  [Scoring plugins](#org93437fb)\n        1.  [Implementing](#org02825f9)\n        2.  [Installing](#orgf9bbffa)\n        3.  [Using](#orgce3ac9e)\n    7.  [Future enhancements](#orga668c9a)\n\n\n<a id="orga5c90a9"></a>\n\n# CSV Reconcile\n\nA [reconciliation service](https://github.com/reconciliation-api/specs) for [OpenRefine](https://openrefine.org/) based on a CSV file similar to [reconcile-csv](http://okfnlabs.org/reconcile-csv/).  This one is written in Python and has some more configurability.\n\n\n<a id="org1b422ae"></a>\n\n## Quick start\n\n-   Clone this repository\n-   Run the service\n    \n        $ python -m venv venv                                             # create virtualenv\n        $ venv/bin/pip install dist/csv_reconcile-0.1.0-py3-none-any.whl  # install package\n        $ source venv/bin/activate                                        # activate virtual environment\n        (venv) $ csv-reconcile --init-db sample/reps.tsv item itemLabel   # start the service\n        (venv) $ deactivate                                               # remove virtual environment\n\nThe service is run at <http://127.0.0.1:5000/reconcile>.  You can point at a different host:port by\nadding [SERVER\\_NAME](https://flask.palletsprojects.com/en/0.12.x/config/) to the sample.cfg.  Since this is running from a virtualenv, you can simply\ndelete the whole lot to clean up.\n\nIf you have a C compiler installed you may prefer to install the sdist\n`dist/csv-reconcile-0.1.0.tar.gz` which will build a [Cython](https://cython.readthedocs.io/en/latest/) version of the computationally\nintensive fuzzy match routine for speed.\n\n\n<a id="org8b4627b"></a>\n\n## Poetry\n\nThis is packaged with [poetry](https://python-poetry.org/docs/), so you can use those commands if you have it installed.\n\n    $ poetry install\n    $ poetry run csv-reconcile --init-db sample/reps.tsv item itemLabel\n\n\n<a id="org86e08ad"></a>\n\n## Description\n\nThis reconciliation service uses [Dice coefficient scoring](https://en.wikipedia.org/wiki/S%C3%B8rensen%E2%80%93Dice_coefficient) to reconcile values against a given column\nin a [CSV](https://en.wikipedia.org/wiki/Comma-separated_values) file.  The CSV file must contain a column containing distinct values to reconcile to.\nWe\'ll call this the *id column*.  We\'ll call the column being reconciled against the *name column*.\n\nFor performance reasons, the *name column* is preprocessed to normalized values which are stored in\nan [sqlite](https://www.sqlite.org/index.html) database.  This database must be initialized at least once by passing the `--init-db` on\nthe command line.  Once initialized this option can be removed from subsequent runs.\n\nNote that the service supplies all its data with a dummy *type* so there is no reason to reconcile\nagainst any particular *type*.\n\nIn addition to reconciling against the *name column*, the service also functions as a [data extension\nservice](https://reconciliation-api.github.io/specs/latest/#data-extension-service), which offers any of the other columns of the CSV file.\n\nNote that Dice coefficient scoring is agnostic to word ordering.\n\n\n<a id="org55ecd97"></a>\n\n## Usage\n\nBasic usage requires passing the name of the CSV file, the *id column* and the *name column*.\n\n    $ poetry run csv-reconcile --help\n    Usage: csv-reconcile [OPTIONS] CSVFILE IDCOL NAMECOL\n    \n    Options:\n      --config TEXT  config file\n      --scorer TEXT  scoring plugin to use\n      --init-db      initialize the db\n      --help         Show this message and exit.\n    $\n\nIn addition to the `--init-db` switch mentioned above you may use the `--config` option to point to\na configuration file.  The file is a [Flask configuration](https://flask.palletsprojects.com/en/1.1.x/config/) and hence is Python code though most\nconfiguration is simply setting variables to constant values.\n\n\n<a id="orge7e9f64"></a>\n\n## Common configuration\n\n-   `SERVER_NAME`  - The host and port the service is bound to.\n    e.g. `SERVER_NAME=localhost:5555`.  ( Default localhost:5000 )\n-   `CSVKWARGS`  - Arguments to pass to [csv.reader](https://docs.python.org/3/library/csv.html).\n    e.g. `CSVKWARGS={\'delimiter\': \',\',  quotechar=\'"\'}` for comma delimited files using `"` as quote character.\n-   `CSVECODING` - Encoding of the CSV file.\n    e.g. `CSVECODING=\'utf-8-sig\'` is the encoding used for data downloaded from [GNIS](https://www.usgs.gov/core-science-systems/ngp/board-on-geographic-names/download-gnis-data).\n-   `SCOREOPTIONS`  - Options passed to scoring plugin during normalization.\n    e.g. `SCOREOPTIONS={\'stopwords\':[\'lake\',\'reservoir\']}`\n-   `LIMIT`      - The maximum number of reonciliation candidates returned per entry.  ( Default 10 )\n    e.g. `LIMIT=10`\n-   `THRESHOLD`  - The minimum score for returned reconciliation candidates.  ( Default 30.0 )\n    e.g. `THRESHOLD=80.5`\n-   `DATABASE`   - The name of the generated sqlite database containing pre-processed values.  (Default `csvreconcile.db`)\n    e.g. `DATABASE=\'lakes.db\'`  You may want to change the name of the database if you regularly switch between databases being used.\n-   `MANIFEST`   - Overrides for the service manifest.\n    e.g. `MANIFEST={"name": "My service"}` sets the name of the service to "My service".\n\nThis last is most interesting.  If your data is coming from [Wikidata](https://www.wikidata.org) and your *id column*\ncontains [Q values](https://www.wikidata.org/wiki/Help:Items), then a manifest like the following will allow your links to be clickable inside OpenRefine.\n\n    MANIFEST = {\n      "identifierSpace": "http://www.wikidata.org/entity/",\n      "schemaSpace": "http://www.wikidata.org/prop/direct/",\n      "view": {"url":"https://www.wikidata.org/wiki/{{id}}"},\n      "name": "My reconciliation service"\n    }\n\nIf your CSV is made up of data taken from another [reconciliation service](https://reconciliation-api.github.io/testbench/), you may similiarly copy\nparts of their manifest to make use of their features, such as the [preview service](https://reconciliation-api.github.io/specs/latest/#preview-service).  See the\nreconciliation spec for details.\n\n\n<a id="org93437fb"></a>\n\n## Scoring plugins\n\nAs mentioned above the default scoring method is to use [Dice coefficient scoring](https://en.wikipedia.org/wiki/S%C3%B8rensen%E2%80%93Dice_coefficient), but this method\ncan be overridden by implementing a `cvs_reconcile.scorers` plugin.\n\n\n<a id="org02825f9"></a>\n\n### Implementing\n\nA plugin module may override any of the methods in the `csv_reconcile.scorers` module by simply\nimplementing a method of the same name with the decorator `@cvs_reconcile.scorer.register`.\n\nSee `csv_reconcile_dice` for how Dice coefficient scoring is implemented.\n\nThe basic hooks are as follows:\n\n-   `normalizedWord(word, **scoreOptions)` preprocesses values to be reconciled to produce a\n    tuple used in fuzzy match scoring.  Note that this preprocessing happens for both the value to\n    be reconciled and to all the values in the csv column to be reconciled against.  The value of\n    `SCOREOPTIONS` in the configuration will be passed in to allow configuration of this\n    preprocessing.  This hook is required.\n-   `getNormalizedFields()` returns a tuple of names for the columns produced by `normalizeWord()`.\n    The length of the return value from both functions must match.  This hook is required.\n-   `processScoreOptions(options)` is passed the value of `SCOREOPTIONS` to allow it to be adjusted\n    prior to being used.  This can be used for adding defaults and/or validating the configuration.\n    This hook is optional\n-   `scoreMatch(left, right)` gets passed two tuples as returned by `normalizedWord()`.  The `left`\n    value is the value being reconciled and the `right` value is the value being reconciled\n    against.  This hook is required.\n-   `valid(normalizedFields)` is passed the normalized tuple prior to being scored to make sure\n    it\'s appropriate for the calculation.  This hook is optional.\n\n\n<a id="orgf9bbffa"></a>\n\n### Installing\n\nHooks are automatically discovered as long as they provide a `csv_reconcile.scorers` [setuptools\nentry point](https://setuptools.readthedocs.io/en/latest/userguide/entry_point.html).  Poetry supplies a [plugins](https://python-poetry.org/docs/pyproject/#plugins) configuration which wraps the setuptools funtionality.\n\nThe default Dice coefficent scoring is supplied via the following snippet from `pyproject.toml`\nfile.\n\n    [tool.poetry.plugins."csv_reconcile.scorers"]\n    "dice" = "csv_reconcile_dice"\n\nHere `dice` becomes the name of the scoring option and `csv_reconcile_dice` is the package\nimplementing the plugin.\n\n\n<a id="orgce3ac9e"></a>\n\n### Using\n\nIf there is only one scoring plugin available, that plugin is used.  If there are more than one\navailable, you will be prompted to pass the `--scorer` option to select among the scoring options.\n\n\n<a id="orga668c9a"></a>\n\n## Future enhancements\n\nIt would be nice to add support for using [properties](https://reconciliation-api.github.io/specs/latest/#structure-of-a-reconciliation-query) as part of the scoring, so that more than\none column of the csv could be taken into consideration.\n\n',
    'author': 'Douglas Mennella',
    'author_email': 'trx2358-pypi@yahoo.com',
    'maintainer': None,
    'maintainer_email': None,
    'url': 'https://github.com/gitonthescene/csv-reconcile',
    'packages': packages,
    'package_data': package_data,
    'install_requires': install_requires,
    'extras_require': extras_require,
    'entry_points': entry_points,
    'python_requires': '>=3.7,<4.0',
}
from build import *
build(setup_kwargs)

setup(**setup_kwargs)
