Metadata-Version: 2.1
Name: predictit
Version: 1.60.0
Summary: Library/framework for making predictions.
Home-page: https://github.com/Malachov/predictit
Author: Daniel Malachov
Author-email: malachovd@seznam.cz
License: mit
Description: # predictit
        
        [![PyPI pyversions](https://img.shields.io/pypi/pyversions/predictit.svg)](https://pypi.python.org/pypi/predictit/) [![PyPI version](https://badge.fury.io/py/predictit.svg)](https://badge.fury.io/py/predictit) [![Language grade: Python](https://img.shields.io/lgtm/grade/python/g/Malachov/predictit.svg?logo=lgtm&logoWidth=18)](https://lgtm.com/projects/g/Malachov/predictit/context:python) [![Build Status](https://travis-ci.com/Malachov/predictit.svg?branch=master)](https://travis-ci.com/Malachov/predictit) [![Documentation Status](https://readthedocs.org/projects/predictit/badge/?version=master)](https://predictit.readthedocs.io/en/master/?badge=master) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) [![codecov](https://codecov.io/gh/Malachov/predictit/branch/master/graph/badge.svg)](https://codecov.io/gh/Malachov/predictit)
        
        Library/framework for making predictions. It chooses the best of 20 models (ARIMA, regressions, LSTM...) from libraries such as statsmodels, scikit-learn and tensorflow, plus some of its own models. Hundreds of options can be customized (none of them are required), and there are also some Config presets.
        
        The library includes model hyperparameter optimization as well as option (Config variable) optimization. That means it can find the optimal preprocessing (smoothing, dropping non-correlated columns, standardization) and, on top of that, the optimal inner parameters of the models, such as the number of neuron layers.
        
        ## Output
        
        The most common outputs are an interactive plotly graph, a numpy array of results, or deployment to a database.
        
        <p align="center">
        <img src="https://raw.githubusercontent.com/Malachov/predictit/master/output_example.png" width="620" alt="Plot of results"/>
        </p>
        <p align="center">
        <img src="https://raw.githubusercontent.com/Malachov/predictit/master/table_of_results.png" width="620" alt="Table of results"/>
        </p>
        
        The return type of the main predict function depends on the settings in `configuration.py`. It can return the best prediction as an array or all predictions as a dataframe. An interactive HTML plot is also created.
        
        ## Official repo and documentation links
        
        [Repo on github](https://github.com/Malachov/predictit)
        
        [Official readthedocs documentation](https://predictit.readthedocs.io)
        
        ## Installation
        
        Python >=3.6 is required. Python 2 is not supported. Install with
        
            pip install predictit
        
        Sometimes you can run into issues when installing some of the libraries from the requirements (e.g. numpy because of missing BLAS / LAPACK). Two libraries, Tensorflow and pyodbc, are not in the requirements because they are optional and can be troublesome to install. If a library fails to install with pip, check which one does not work, install it manually (searching Stack Overflow usually helps) and repeat.
        
        ## How to
        
        The software can be used in three ways: as a Python library, with command line arguments, or as a normal Python script.
        The main function is `predict` in the `main.py` script.
        There is also a `predict_multiple_columns` function if you want to predict more at once (columns or time frequencies), and a `compare_models` function that tells you which models are best. Unlike `predict`, which uses as much data as possible for training, it evaluates the error criterion on out-of-sample test data, so the errors are more reliable (for example, decision trees just reproduce outputs from the learning set, so their error in `predict` is 0, while in `compare_models` it is accurate). You can then use only the good models in the `predict` function.
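        
        Why the out-of-sample error is more trustworthy can be seen in a minimal sketch. It does not use predictit at all: `fit_memorizer` and `mean_absolute_error` are hypothetical helpers written for this illustration, and the memorizing model only mimics the behaviour of a model that reproduces its learning set.
        
        ```python
        # Illustrative only: a model that memorizes its training set
        # (hypothetical helper, not part of predictit).
        def fit_memorizer(inputs, targets):
            """Return a predictor that answers with the target of the nearest known input."""
            pairs = list(zip(inputs, targets))
        
            def predict(x):
                nearest_input, nearest_target = min(pairs, key=lambda pair: abs(pair[0] - x))
                return nearest_target
        
            return predict
        
        
        def mean_absolute_error(model, inputs, targets):
            """Average absolute difference between model outputs and targets."""
            return sum(abs(model(x) - y) for x, y in zip(inputs, targets)) / len(inputs)
        
        
        # Toy data: y = 2 * x, sampled at integer points
        train_x = [0, 1, 2, 3, 4, 5]
        train_y = [2 * x for x in train_x]
        test_x = [6, 7, 8]
        test_y = [2 * x for x in test_x]
        
        model = fit_memorizer(train_x, train_y)
        
        in_sample_error = mean_absolute_error(model, train_x, train_y)
        out_of_sample_error = mean_absolute_error(model, test_x, test_y)
        
        print(in_sample_error)      # 0.0 - the model has seen every training point
        print(out_of_sample_error)  # 4.0 - unseen inputs expose the real error
        ```
        
        An error measured on the learning set would rank this model as perfect, while the held-out points show that it cannot extrapolate.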
        
        You can set up a prediction with Config. It is capitalized because it is a class.
        
        ### Simple example of using predictit as a python library and function arguments
        
        ```Python
        import predictit
        import numpy as np
        
        predictions_1 = predictit.main.predict(data=np.random.randn(100, 2), predicted_column=1, predicts=3, return_type='best')
        ```
        
        ### Simple example of using as a python library and editing Config
        
        ```Python
        import predictit
        from predictit.configuration import Config
        
        # You can edit Config in two ways
        Config.data = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/daily-min-temperatures.csv'  # You can use local path on pc as well... "/home/name/mycsv.csv" !
        Config.predicted_column = 'Temp'  # You can use index as well
        Config.datetime_column = 'Date'  # Will be used for resampling and result plot description
        Config.freq = "D"  # One day - one value
        Config.resample_function = "mean"  # If more values in one day - use mean (more sources)
        Config.return_type = 'detailed_dictionary'
        Config.debug = 0  # Ignore warnings
        
        # Or
        Config.update({
            'predicts': 14,  # Number of predicted values
            'default_n_steps_in': 12  # Value of recursive inputs in model (do not use too high - slower and worse predictions)
        })
        
        predictions_2 = predictit.main.predict()
        ```
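        
        The two editing styles above (attribute assignment and `update` with a dictionary) can be pictured with a small stand-in class. This is only a sketch of the pattern: `SketchConfig` is a hypothetical name, and rejecting unknown option names is an assumption made for the example, not a documented behaviour of predictit's `Config`.
        
        ```python
        class SketchConfig:
            """Stand-in settings class (hypothetical, not predictit's Config)."""
        
            predicts = 7
            return_type = "best"
        
            @classmethod
            def update(cls, new_values):
                """Set several options at once and reject unknown option names."""
                for name, value in new_values.items():
                    if not hasattr(cls, name):
                        raise AttributeError(f"Unknown option: {name}")
                    setattr(cls, name, value)
        
        
        # Way 1: assign a single attribute on the class itself
        SketchConfig.return_type = "detailed_dictionary"
        
        # Way 2: update several options from a dictionary
        SketchConfig.update({"predicts": 14})
        
        print(SketchConfig.predicts, SketchConfig.return_type)  # 14 detailed_dictionary
        ```
        
        Because the options live on the class and not on an instance, every module that imports the class sees the same values.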
        
        ### Simple example of using `main.py` as a script
        
        Open `configuration.py` (the only script you need to edit, and it is very simple) and do the setup: mainly `used_function` and `data`, or `data_source` and path. Then just run `main.py`.
        
        ### Simple example of using command line arguments
        
        Run the code below in a terminal in the predictit folder.
        Use `python main.py --help` for more information about the parameters.
        
        ```
        python main.py --used_function predict --data 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/daily-min-temperatures.csv' --predicted_column "'Temp'"
        
        ```
        
        ### Explore Config
        
        Type `Config.`, then, if suggestions do not appear automatically, press ctrl + spacebar to see all possible values. To see what an option means, type for example `Config.return_type`, then hover over it with ctrl pressed. This reveals the comment that describes the option (at least in VS Code).
        
        To see all the possible values in `configuration.py`, use
        
        ```Python
        predictit.configuration.print_config()
        ```
        
        ### Example of compare_models function
        
        ```Python
        import numpy as np
        import predictit
        from predictit.configuration import Config
        
        my_data_array = np.random.randn(2000, 4)  # Define your data here
        
        # You can compare it on same data in various parts or on different data (check configuration on how to insert dictionary with data names)
        Config.update({
            'data_all': (my_data_array[-2000:], my_data_array[-1500:], my_data_array[-1000:])
        })
        
        compared_models = predictit.main.compare_models()
        ```
        
        ### Example of predict_multiple_columns function
        
        ```Python
        import pandas as pd
        import predictit
        from predictit.configuration import Config
        
        Config.data = pd.read_csv("https://raw.githubusercontent.com/jbrownlee/Datasets/master/daily-min-temperatures.csv")
        
        # Define list of columns or '*' for predicting all of the columns
        Config.predicted_columns = ['*']
        
        multiple_columns_prediction = predictit.main.predict_multiple_columns()
        
        ```
        
        ### Feature derivation
        
        It is possible to add new data derived from the original, for example a running Fourier transform maximum, the product of two columns, or a rolling standard deviation.
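        
        Two of these derived features can be sketched in plain Python. The helper names `rolling_std` and `multiply_columns` are made up for this illustration and the window length is an arbitrary choice; how predictit computes its derived columns may differ.
        
        ```python
        from statistics import pstdev
        
        
        def rolling_std(values, window):
            """Population standard deviation over each full window of `window` values."""
            return [pstdev(values[i - window:i]) for i in range(window, len(values) + 1)]
        
        
        def multiply_columns(first, second):
            """Element-wise product of two equally long columns."""
            return [a * b for a, b in zip(first, second)]
        
        
        column_a = [1.0, 2.0, 3.0, 4.0, 5.0]
        column_b = [2.0, 2.0, 2.0, 2.0, 2.0]
        
        derived_std = rolling_std(column_a, window=3)           # one value per full window
        derived_product = multiply_columns(column_a, column_b)
        
        print(derived_std)
        print(derived_product)  # [2.0, 4.0, 6.0, 8.0, 10.0]
        ```
        
        Note that a rolling feature is shorter than its source column by `window - 1` values, so it has to be aligned with the original data before both are used together.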
        
        ### Categorical embeddings
        
        It is also possible to use string values in predictions. In Config you can set 'embedding' to 'label', where every unique string is assigned a unique number, or to 'one-hot', which creates a new column for every unique string (can be time consuming).
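        
        The difference between the two encodings is easiest to see on a tiny column. The sketch below is written from scratch for illustration (`label_encode` and `one_hot_encode` are hypothetical helpers), so the exact numbering and column order used by the library may be different.
        
        ```python
        def label_encode(values):
            """Map every unique string to an integer, in order of first appearance."""
            codes = {}
            for value in values:
                codes.setdefault(value, len(codes))
            return [codes[value] for value in values], codes
        
        
        def one_hot_encode(values):
            """Create one 0/1 column per unique string."""
            categories = sorted(set(values))
            return {category: [int(value == category) for value in values] for category in categories}
        
        
        weather = ["sunny", "rainy", "sunny", "cloudy"]
        
        labels, mapping = label_encode(weather)
        one_hot = one_hot_encode(weather)
        
        print(labels)   # [0, 1, 0, 2]
        print(one_hot)  # one column per category: 'cloudy', 'rainy', 'sunny'
        ```
        
        Label encoding keeps a single column, while one-hot encoding adds as many columns as there are unique strings, which is why it can be slow on columns with many distinct values.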
        
        ### Example of Config variable optimization
        
        ```Python
        
        import predictit
        from predictit.configuration import Config
        
        Config.update({
            'data': "https://raw.githubusercontent.com/jbrownlee/Datasets/master/daily-min-temperatures.csv",
            'predicted_column': 'Temp',
            'return_type': 'all_dataframe',
            'optimization': 1,
            'optimization_variable': 'default_n_steps_in',
            'optimization_values': [4, 8, 10],
            'plot_all_optimized_models': 1,
            'print_table': 2  # Print detailed table
        })
        
        predictions_optimized_config = predictit.main.predict()
        
        ```
        
        ### Hyperparameters tuning
        
        To optimize hyperparameters, just set `optimizeit: 1,` and the limits of the model parameters. How to use it is described in the comments in `configuration.py`. It is not a brute-force grid search: a heuristic method based on interval halving is used, but it can still be time consuming. It is recommended to tune only the parameters that are worth it, or to tune in parts.
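        
        As a rough picture of how an interval-halving search differs from a grid, here is the textbook interval halving method for a single parameter. It is a sketch under assumptions: the loss is unimodal on the interval, `toy_loss` stands in for a model error, and the actual heuristic inside predictit may work differently.
        
        ```python
        def interval_halving_minimum(loss, low, high, tolerance=1e-3):
            """Minimize a unimodal `loss` on [low, high], halving the interval each step."""
            middle = (low + high) / 2
            loss_middle = loss(middle)
            while high - low > tolerance:
                quarter = (high - low) / 4
                left, right = low + quarter, high - quarter
                loss_left, loss_right = loss(left), loss(right)
                if loss_left < loss_middle:
                    # Minimum lies in the left half
                    high, middle, loss_middle = middle, left, loss_left
                elif loss_right < loss_middle:
                    # Minimum lies in the right half
                    low, middle, loss_middle = middle, right, loss_right
                else:
                    # Minimum lies in the central half
                    low, high = left, right
            return middle
        
        
        def toy_loss(alpha):
            """Hypothetical error curve with its minimum at alpha = 0.3."""
            return (alpha - 0.3) ** 2
        
        
        best_alpha = interval_halving_minimum(toy_loss, low=0.0, high=1.0)
        print(round(best_alpha, 2))  # 0.3
        ```
        
        Each step costs two evaluations and halves the search interval, so the limits shrink quickly. With real models every evaluation means fitting the model, which is why tuning many parameters at once is still slow.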
        
        ## GUI
        
        A basic GUI is available, but only with a CSV data source.
        Just run `gui_start.py` if you have downloaded the software, or call `predictit.gui_start.run_gui()` if you installed it from PyPI.
        
        ## Data preprocessing, plotting and other Functions
        
        You can of course use any of the library functions separately for your own needs. mydatapreprocessing, mylogging and myplotting are my other projects,
        which are used heavily here. An example follows.
        
        ```Python
        
        from mydatapreprocessing import load_data, data_consolidation, preprocess_data
        from myplotting import plot
        from predictit.analyze import analyze_column
        
        data = "https://blockchain.info/unconfirmed-transactions?format=json"
        
        # Load data from file or URL
        data_loaded = load_data(data, request_datatype_suffix=".json", predicted_table='txs')
        
        # Transform various data into a defined format - pandas dataframe - convert to numeric if possible, keep
        # only numeric data and resample if configured. It returns array, dataframe
        data_consolidated = data_consolidation(
            data_loaded, predicted_column="weight", data_orientation="index", remove_nans_threshold=0.9, remove_nans_or_replace='interpolate')
        
        # Predicted column is on index 0 after consolidation
        analyze_column(data_consolidated.iloc[:, 0])
        
        # Preprocess data. It returns the preprocessed data, but also the last undifferenced value and the scaler
        # for inverse transformation, so unpack them with _
        data_preprocessed, _, _ = preprocess_data(data_consolidated, remove_outliers=True, smoothit=False,
                                                correlation_threshold=False, data_transform=False, standardizeit='standardize')
        
        # Plot inserted data
        plot(data_preprocessed)
        
        ```
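        
        The comment above mentions a last undifferenced value and a scaler returned for the inverse transformation. Why both are needed can be shown with a standalone sketch. All function names here are hypothetical and the sketch does not call mydatapreprocessing; it only illustrates the general idea of undoing standardization and differencing on predicted values.
        
        ```python
        from statistics import mean, pstdev
        
        
        def standardize(values):
            """Scale values to zero mean and unit variance; return the scaler parameters too."""
            center, spread = mean(values), pstdev(values)
            return [(value - center) / spread for value in values], (center, spread)
        
        
        def inverse_standardize(values, scaler):
            """Undo `standardize` with the stored mean and standard deviation."""
            center, spread = scaler
            return [value * spread + center for value in values]
        
        
        def difference(values):
            """First-order differences plus the last original value needed to undo them."""
            return [b - a for a, b in zip(values, values[1:])], values[-1]
        
        
        def inverse_difference(predicted_differences, last_value):
            """Rebuild absolute values from predicted differences by cumulative summation."""
            restored = []
            for step in predicted_differences:
                last_value += step
                restored.append(last_value)
            return restored
        
        
        series = [10.0, 12.0, 15.0, 19.0]
        differenced, last_undifferenced = difference(series)  # [2.0, 3.0, 4.0] and 19.0
        scaled, scaler = standardize(differenced)
        
        # Pretend a model predicted the next two values in the scaled, differenced space
        predicted_scaled = [scaled[-1], scaled[-1]]
        predicted_differences = inverse_standardize(predicted_scaled, scaler)
        predictions = inverse_difference(predicted_differences, last_undifferenced)
        
        print(predictions)  # approximately [23.0, 27.0]
        ```
        
        Without the stored scaler and the last original value, the model output would stay in the transformed space and could not be compared with the real data.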
        
        ## Example of using library as a pro with deeper editing of Config
        
        ```Python
        
        import predictit
        from predictit.configuration import Config
        
        Config.update({
            'data': r'https://raw.githubusercontent.com/jbrownlee/Datasets/master/daily-min-temperatures.csv',  # Full CSV path with suffix
            'predicted_column': 'Temp',  # Column name that we want to predict
        
            'predicts': 7,  # Number of predicted values - 7 by default
            'print_number_of_models': 6,  # Visualize 6 best models
            'repeatit': 50,  # Number of times the calculation is repeated on shifted data to evaluate error criterion
            'other_columns': 0,  # Whether use other columns or not
            'debug': 1,  # Whether print details and warnings
        
            # Choose models that will be computed - remove if you want to use all the models
            'used_models': [
                "AR (Autoregression)",
                "ARIMA (Autoregression integrated moving average)",
                "Autoregressive Linear neural unit",
                "Conjugate gradient",
                "Sklearn regression",
                "Bayes ridge regression one step",
                "Decision tree regression",
            ],
        
            # Define parameters of models
        
            'models_parameters': {
        
                "AR (Autoregression)": {'used_model': 'ar', 'method': 'cmle', 'ic': 'aic', 'trend': 'nc', 'solver': 'lbfgs'},
                "ARIMA (Autoregression integrated moving average)": {'used_model': 'arima', 'p': 6, 'd': 0, 'q': 0, 'method': 'css', 'ic': 'aic', 'trend': 'nc', 'solver': 'nm'},
        
                "Autoregressive Linear neural unit": {'mi_multiple': 1, 'mi_linspace': (1e-5, 1e-4, 3), 'epochs': 10, 'w_predict': 0, 'minormit': 0},
                "Conjugate gradient": {'epochs': 200},
        
                "Bayes ridge regression": {'regressor': 'bayesianridge', 'n_iter': 300, 'alpha_1': 1.e-6, 'alpha_2': 1.e-6, 'lambda_1': 1.e-6, 'lambda_2': 1.e-6},
            }
        })
        
        predictions_configured = predictit.main.predict()
        
        ```
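        
        The `repeatit` option above is described as repeating the calculation on shifted data to evaluate the error criterion. One common way to do that is to move the train/test split back one step at a time and average the errors. The sketch below shows that idea with a naive forecast. It is an assumption about the general technique, the helper names are hypothetical, and it is not predictit's implementation.
        
        ```python
        def naive_forecast(history, horizon):
            """Repeat the last observed value `horizon` times."""
            return [history[-1]] * horizon
        
        
        def repeated_error(series, horizon, repeats):
            """Average absolute error over `repeats` train/test splits, each shifted by one step."""
            errors = []
            for shift in range(repeats):
                split = len(series) - horizon - shift
                history, actual = series[:split], series[split:split + horizon]
                forecast = naive_forecast(history, horizon)
                errors.append(sum(abs(f - a) for f, a in zip(forecast, actual)) / horizon)
            return sum(errors) / repeats
        
        
        series = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
        average_error = repeated_error(series, horizon=2, repeats=3)
        print(average_error)  # 1.5
        ```
        
        Averaging over several shifted splits makes the error less dependent on one lucky or unlucky test window, at the cost of fitting the model once per repetition.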
        
Platform: any
Classifier: Programming Language :: Python
Classifier: Development Status :: 3 - Alpha
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Natural Language :: English
Classifier: Environment :: Other Environment
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Software Development :: Libraries :: Application Frameworks
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: Education
Description-Content-Type: text/markdown
