Machine Learning Orchestration

Pentaho’s data integration and analytics platform ends the ‘gridlock’ associated with machine learning by enabling smooth team collaboration, maximizing limited data science resources and putting predictive models to work on big data faster.

Streamlining the Machine Learning Workflow

Pentaho’s machine learning orchestration makes the process of building and deploying advanced analytics models more efficient. Most enterprises struggle to put models to work because data professionals often operate in silos and the workflow, from data preparation to updating models, creates bottlenecks.

Pentaho’s platform enables collaboration and removes bottlenecks in four key areas:

[Figure: Machine Learning Orchestration Workflow]

1. Prepare Data and Engineer New Features

Pentaho helps data scientists and engineers easily prepare and blend traditional sources such as ERP and EAM systems with big data sources such as sensors and social media. Pentaho also accelerates the notoriously difficult and costly task of feature engineering by automating data onboarding, data transformation and data validation in an easy-to-use drag-and-drop environment.
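
As a rough illustration of what blending, validation and feature engineering involve, here is a minimal sketch in plain Python. It is not Pentaho's API: the sample records, the field names (asset_id, age_years, temp_c) and the helper functions are all hypothetical, and in Pentaho these steps would be configured visually rather than hand-coded.

```python
from statistics import mean

# Hypothetical sample data: asset records from an ERP export and raw sensor readings.
erp_assets = [
    {"asset_id": "A1", "age_years": 4},
    {"asset_id": "A2", "age_years": 9},
]
sensor_readings = [
    {"asset_id": "A1", "temp_c": 61.0},
    {"asset_id": "A1", "temp_c": 65.0},
    {"asset_id": "A2", "temp_c": 80.0},
    {"asset_id": "A2", "temp_c": None},  # invalid reading, dropped by validation
]


def is_valid(reading):
    """Validation step: keep only readings with a numeric temperature."""
    return isinstance(reading["temp_c"], (int, float))


def blend(assets, readings):
    """Join asset records with validated readings and derive a mean-temperature feature."""
    temps = {}
    for r in filter(is_valid, readings):
        temps.setdefault(r["asset_id"], []).append(r["temp_c"])
    return [
        {**a, "mean_temp_c": mean(temps[a["asset_id"]])}
        for a in assets
        if a["asset_id"] in temps  # assets without valid readings are skipped
    ]


features = blend(erp_assets, sensor_readings)
```

Each row in features combines a field from the traditional source (age_years) with a feature derived from the sensor source (mean_temp_c).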

2. Train, Tune, and Test Models

Data scientists often apply trial and error to strike the right balance of complexity, performance and accuracy in their models. With integrations for languages like R and Python, and for machine learning packages like Spark MLlib and Weka, Pentaho allows data scientists to seamlessly train, tune, build and test models faster.
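
The trial-and-error loop described here can be sketched as a small parameter search. The example below is an assumption-laden toy, not Pentaho functionality or any of the named libraries: the "model" is a single temperature threshold, and the training and test rows are made up for illustration.

```python
# Hypothetical labelled data: (mean_temp_c, failed) pairs.
train = [(55.0, 0), (60.0, 0), (63.0, 0), (72.0, 1), (78.0, 1), (81.0, 1)]
test = [(58.0, 0), (66.0, 0), (75.0, 1), (83.0, 1)]


def predict(threshold, x):
    """Toy model: predict failure when the temperature reaches the threshold."""
    return 1 if x >= threshold else 0


def accuracy(threshold, rows):
    """Fraction of rows where the prediction matches the label."""
    hits = sum(predict(threshold, x) == y for x, y in rows)
    return hits / len(rows)


# Tune: try each candidate value and keep the one that scores best on the training rows.
candidates = [50.0, 60.0, 70.0, 80.0]
best_threshold = max(candidates, key=lambda t: accuracy(t, train))

# Test: measure the chosen setting on rows that were not used for tuning.
test_accuracy = accuracy(best_threshold, test)
```

Real models have many more parameters, but the pattern is the same: try candidates, score them on training data, and confirm the winner on held-out data.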

3. Deploy and Operationalize Models

A completely trained, tuned and tested machine learning model still needs to be deployed. Pentaho allows data professionals to easily embed models developed by the data scientist directly in a data workflow. They can leverage existing data and feature engineering efforts, significantly reducing time-to-deployment. With embeddable APIs, organizations can also include the full power of Pentaho within existing applications.
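
To make the idea of embedding a model in a data workflow concrete, the sketch below treats scoring as one more step in a pipeline that already contains a feature engineering step. Everything here is hypothetical (the step functions, the field names, the threshold of 70.0) and stands in for steps that Pentaho would represent in its own workflow designer.

```python
def add_mean_temp(row):
    """Existing feature engineering step, reused unchanged at scoring time."""
    readings = row["temps_c"]
    return {**row, "mean_temp_c": sum(readings) / len(readings)}


def make_scoring_step(threshold):
    """Wrap a trained model (here, a single threshold) as a workflow step."""
    def score(row):
        return {**row, "predicted_failure": row["mean_temp_c"] >= threshold}
    return score


def run_pipeline(rows, steps):
    """Apply each step to every row, in order."""
    for step in steps:
        rows = [step(r) for r in rows]
    return rows


pipeline = [add_mean_temp, make_scoring_step(threshold=70.0)]
scored = run_pipeline(
    [
        {"asset_id": "A1", "temps_c": [61.0, 65.0]},
        {"asset_id": "A2", "temps_c": [80.0, 84.0]},
    ],
    pipeline,
)
```

Because the scoring step consumes the output of the same feature step used during development, no feature logic has to be rewritten for deployment.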

4. Update Models Regularly

With Pentaho, data engineers and scientists can re-train existing models with new data sets or make feature updates using custom execution steps for R, Python, Spark MLlib and Weka. Pre-built workflows can automatically update models and archive existing ones.
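
A retrain-and-archive cycle of this kind might look like the following minimal sketch. It assumes a model stored as a JSON file and a toy training rule; the file layout (model.json, an archive folder), the function names and the data are illustrative assumptions, not Pentaho's pre-built workflows.

```python
import json
import tempfile
from pathlib import Path


def fit_threshold(rows):
    """Toy 'training': midpoint between the hottest healthy and coolest failed asset."""
    healthy = max(x for x, y in rows if y == 0)
    failed = min(x for x, y in rows if y == 1)
    return (healthy + failed) / 2


def retrain_and_archive(model_dir, rows, version):
    """Move the current model into the archive, then write the retrained one."""
    model_dir = Path(model_dir)
    archive_dir = model_dir / "archive"
    archive_dir.mkdir(parents=True, exist_ok=True)
    current = model_dir / "model.json"
    if current.exists():
        old_version = json.loads(current.read_text())["version"]
        current.rename(archive_dir / f"model_v{old_version}.json")
    model = {"version": version, "threshold": fit_threshold(rows)}
    current.write_text(json.dumps(model))
    return model


workdir = tempfile.mkdtemp()
v1 = retrain_and_archive(workdir, [(60.0, 0), (80.0, 1)], version=1)
# New data arrives: retrain, keeping the previous model in the archive.
v2 = retrain_and_archive(workdir, [(60.0, 0), (68.0, 0), (76.0, 1), (80.0, 1)], version=2)
archived = sorted(p.name for p in (Path(workdir) / "archive").iterdir())
```

Keeping the previous version on disk makes it possible to roll back if the retrained model performs worse.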

To learn more, check out the full list of machine learning capabilities and download a free trial of Pentaho.