Metadata-Version: 2.1
Name: ultocr
Version: 0.2.2
Summary: text detection + text recognition
Home-page: https://github.com/cuongngm/text-in-image
Author: cuongngm
Author-email: cuonghip0908@gmail.com
License: UNKNOWN
Description: ### Quickstart
        ```bash
        pip install torch==1.7.0+cu101 torchvision==0.8.1+cu101 -f https://download.pytorch.org/whl/torch_stable.html
        pip install --upgrade ultocr  # install the ultocr package
        ```
        
        ```python
        # inference
        from ultocr.inference import OCR
        from PIL import Image

        # DB for text detection, MASTER for text recognition
        model = OCR(det_model='DB', reg_model='MASTER')
        image = Image.open('..')  # replace '..' with the path to your image
        result = model.get_result(image)
        ```
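        
        To run the same call over several images, a small helper along these lines may be useful. This is only a sketch: `list_image_paths` and `run_on_folder` are hypothetical names that are not part of `ultocr`, the list of file suffixes is an assumption, and the helper makes no assumption about what `get_result` returns.
        
        ```python
        from pathlib import Path

        # Assumed set of image suffixes; adjust to your data.
        IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


        def list_image_paths(folder):
            """Return the image files directly inside `folder`, sorted by name."""
            return sorted(
                p for p in Path(folder).iterdir()
                if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
            )


        def run_on_folder(folder, get_result, open_image=None):
            """Apply `get_result` to every image in `folder`.

            `get_result` is any callable taking an opened image, for example
            `model.get_result` from the Quickstart. Returns {file name: result}.
            """
            if open_image is None:
                # Imported lazily so the helper itself does not require Pillow.
                from PIL import Image
                open_image = Image.open
            results = {}
            for path in list_image_paths(folder):
                results[path.name] = get_result(open_image(path))
            return results
        ```
        
        With the `model` from the Quickstart, `run_on_folder('..', model.get_result)` would return one entry per image file, where `'..'` is again the path to your own folder.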
        
        ### Install
        ```bash
        git clone https://github.com/cuongngm/text-in-image
        pip install torch==1.7.0+cu101 torchvision==0.8.1+cu101 -f https://download.pytorch.org/whl/torch_stable.html
        pip install -r requirements.txt
        ```
        ### Prepare data
        
        ### Train
        Customize the parameters in the relevant config file in the config folder, then run one of the following.
        
        Single-GPU training:
        ```bash
        python train.py --config config/db_resnet50.yaml --use_dist False
        # or track the run with MLflow
        mlflow run text-in-image -P config=config/db_resnet50.yaml -P use_dist=False -P device=1
        ```
        Multi-GPU training:
        ```bash
        # assuming one node with 2 GPUs
        python -m torch.distributed.launch --nnodes=1 --node_rank=0 --nproc_per_node=2 --master_addr=127.0.0.1 --master_port=5555 train.py --config config/db_resnet50.yaml
        ```
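        
        For orientation, `torch.distributed.launch` starts one process per GPU, passes each a `--local_rank` argument, and sets the `MASTER_ADDR`, `MASTER_PORT`, `WORLD_SIZE` and `RANK` environment variables. The sketch below shows one way a training script could read those values together with the `--config` and `--use_dist` flags used above. It is not the project's actual `train.py`: `str2bool`, `parse_args` and `dist_env` are hypothetical helpers, and the `--use_dist` default of `True` is an assumption.
        
        ```python
        import argparse
        import os


        def str2bool(value):
            """Parse 'True'/'False' style strings; plain `type=bool` treats 'False' as True."""
            text = str(value).strip().lower()
            if text in ("true", "1", "yes"):
                return True
            if text in ("false", "0", "no"):
                return False
            raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


        def parse_args(argv=None):
            parser = argparse.ArgumentParser()
            parser.add_argument("--config", required=True)
            # Assumed default: distributed mode unless explicitly disabled.
            parser.add_argument("--use_dist", type=str2bool, default=True)
            # Supplied by torch.distributed.launch to each spawned process.
            parser.add_argument("--local_rank", type=int, default=0)
            return parser.parse_args(argv)


        def dist_env(environ=os.environ):
            """Collect the rendezvous settings exported by the launcher."""
            return {
                "master_addr": environ.get("MASTER_ADDR", "127.0.0.1"),
                "master_port": int(environ.get("MASTER_PORT", "29500")),
                "world_size": int(environ.get("WORLD_SIZE", "1")),
                "rank": int(environ.get("RANK", "0")),
            }
        ```
        
        With the two-GPU command above, each process would see `world_size == 2`, `master_port == 5555`, and its own `rank` and `local_rank` (0 or 1).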
        
        ### Serve and Inference
        ```bash
        python run.py
        ```
        Then open http://127.0.0.1:8000/docs in your browser and submit the URL of an image. The result looks like this:
        
        <div align=center>
        <img src="assets/fastapi.png" width="1000" height="150" />
        </div>
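        
        The endpoint names are not listed here, so rather than guess them, the following sketch discovers them from the OpenAPI schema that FastAPI publishes at `/openapi.json` by default (this assumes the app has not disabled it). `fetch_openapi` and `list_routes` are hypothetical helper names and use only the Python standard library.
        
        ```python
        import json
        from urllib.request import urlopen

        BASE_URL = "http://127.0.0.1:8000"  # address of the server started by run.py


        def fetch_openapi(base_url=BASE_URL):
            """Download the OpenAPI schema from a running server."""
            with urlopen(f"{base_url}/openapi.json", timeout=10) as response:
                return json.load(response)


        def list_routes(schema):
            """Return sorted (METHOD, path) pairs found in an OpenAPI schema dict."""
            http_methods = {"get", "post", "put", "patch", "delete"}
            routes = []
            for path, operations in schema.get("paths", {}).items():
                for method in operations:
                    # Skip non-operation keys such as "parameters".
                    if method in http_methods:
                        routes.append((method.upper(), path))
            return sorted(routes)
        ```
        
        With the server running, `list_routes(fetch_openapi())` lists the available routes, which can then be called from any HTTP client.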
        
        ### Todo
        - [x] Multi-GPU training
        - [x] Experiment tracking with MLflow
        - [x] Model serving with FastAPI
        - [ ] Add more text detection and recognition models
        
        ### Reference
        - [DB_text_minimal](https://github.com/huyhoang17/DB_text_minimal)
        - [pytorchOCR](https://github.com/BADBADBADBOY/pytorchOCR)
        - [MASTER-pytorch](https://github.com/wenwenyu/MASTER-pytorch)
        - [DBNet.pytorch](https://github.com/WenmuZhou/DBNet.pytorch)
        
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown
