Metadata-Version: 2.1
Name: hydra-api
Version: 1.0.0
Summary: Official client for Siftrics' Hydra API, which is a text recognition documents-to-database service
Home-page: https://siftrics.com/
Keywords: hydra,api,text,recognition,ocr,computer,vision,siftrics,document,database
Author: Siftrics Founder
Author-email: siftrics@siftrics.com
Requires-Python: >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Requires-Dist: requests (>=2.23.0,<3.0.0)
Project-URL: Repository, https://github.com/siftrics/hydra-python
Description-Content-Type: text/markdown

This repository contains the official [Hydra API](https://siftrics.com/) Python client. The Hydra API is a text recognition service.

# Quickstart

1. Install the package.

```
pip install hydra-api
```

or

```
poetry add hydra-api
```

or with any other package manager that installs from PyPI.

2. Create a new data source on [siftrics.com](https://siftrics.com/).
3. Grab an API key from the page of your newly created data source.
4. Create a client, passing your API key into the constructor.
5. Use the client to process documents, passing in the ID of a data source and the file paths of the documents.

```
import hydra_api

client = hydra_api.Client('xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx')

rows = client.recognize('my_data_source_id', ['invoice.pdf', 'receipt_1.png'])
```

`rows` looks like this:

```
[
  {
    "Error": "",
    "FileIndex": 0,
    "RecognizedText": { ... }
  },
  ...
]
```

`FileIndex` is the index of this file in the original request's "files" array.

`RecognizedText` is a dictionary mapping labels to values. Labels are the titles of the bounding boxes drawn during the creation of the data source. Values are the recognized text inside those bounding boxes.
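
As a rough sketch of how the rows might be consumed, the snippet below pairs each row with the file it came from and separates failures from successes. It assumes that an empty `Error` string means the file was processed successfully and that `FileIndex` indexes the list of file paths passed to `recognize`. The helper `split_rows` and the sample labels, values, and error message are hypothetical and not part of the client.

```python
def split_rows(rows, files):
    """Return (successes, failures) as lists of (file path, payload) tuples.

    Hypothetical helper, not part of hydra_api. Assumes an empty "Error"
    string marks a successfully processed file.
    """
    successes = []
    failures = []
    for row in rows:
        # Map the row back to the file path it was produced from.
        path = files[row["FileIndex"]]
        if row["Error"]:
            failures.append((path, row["Error"]))
        else:
            successes.append((path, row["RecognizedText"]))
    return successes, failures


# Made-up rows shaped like the output shown above.
files = ["invoice.pdf", "receipt_1.png"]
rows = [
    {"Error": "", "FileIndex": 0, "RecognizedText": {"Total": "$12.00"}},
    {"Error": "could not read file", "FileIndex": 1, "RecognizedText": {}},
]

successes, failures = split_rows(rows, files)
for path, fields in successes:
    print(path, fields)
for path, error in failures:
    print(path, "failed:", error)
```

Lists of tuples are used rather than a dictionary keyed by file path, so the sketch does not depend on there being exactly one row per file.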


## Faster Results

`client.recognize(dataSourceId, files, doFaster=False)` takes an optional parameter, `doFaster`, which defaults to `False`. Setting it to `True` makes Siftrics process the documents faster, at the risk of lower text recognition accuracy. Experimentally, `doFaster=True` does not seem to affect accuracy when none of the documents to be processed is rotated more than 45 degrees.
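
A minimal sketch of opting in to the faster mode follows. The keyword argument `doFaster=True` comes from the signature above. The wrapper `recognize_quickly` and the `FakeClient` stand-in are hypothetical and exist only so the example runs without an API key or network access. In real use you would pass a `hydra_api.Client` instead.

```python
def recognize_quickly(client, data_source_id, files):
    """Call recognize with doFaster=True (hypothetical wrapper, not part of hydra_api)."""
    return client.recognize(data_source_id, files, doFaster=True)


class FakeClient:
    """Hypothetical stand-in for hydra_api.Client, used only to make the sketch runnable."""

    def recognize(self, dataSourceId, files, doFaster=False):
        # Record the arguments so the caller can see what was requested.
        self.last_call = (dataSourceId, list(files), doFaster)
        # Return one empty, error-free row per file, shaped like the rows above.
        return [
            {"Error": "", "FileIndex": i, "RecognizedText": {}}
            for i in range(len(files))
        ]


client = FakeClient()
rows = recognize_quickly(client, "my_data_source_id", ["invoice.pdf", "receipt_1.png"])
print(client.last_call)
```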

## Official API Documentation

Here is the [official documentation for the Hydra API](https://siftrics.com/docs/hydra.html).

# Apache V2 License

This code is licensed under Apache V2.0. The full text of the license can be found in the "LICENSE" file.

