Machine Learning Capabilities

Streamline the Machine Learning Workflow with Pentaho

Pentaho offers a platform that helps companies apply machine learning algorithms throughout their organization, so that business units and IT can work together toward a common goal: making predictive analytics deliver value to the enterprise.

Review the full set of Pentaho's machine learning capabilities:

Workflow stages: Data and Engineer Features; Train, Tune, and Test; Deploy and Operationalize; Regularly Update

Capability | Description
Data Cleansing and Validation with PDI | Standardize and validate data to ensure that it complies with business rules and standards. De-duplicate, cleanse, and correct inconsistent or redundant data, and view the distribution of data.
Data Services | Quickly blend and visualize fast-moving or rapidly evolving data sets from a virtual table that you can query with simple SQL statements using tools like RStudio.
R Script Executor for PDI | Run an R script as part of a Pentaho Data Integration transformation, removing the burden of operationalizing your models.
Community CPython Executor for PDI | Leverage the Python programming language and its extensive package-based support for scientific computing as part of a data integration pipeline.
Spark MLlib with PDI | Operationalize Spark MLlib applications written in Python, Java, or Scala by using PDI to submit Spark jobs.
Weka Scoring for PDI | Score data as part of a PDI transformation by applying classification, clustering, or regression models constructed in Weka.
Weka Forecasting for PDI | Leverage forecasting models created in Weka's time series analysis and forecasting environment to generate forecasts from incoming data within a PDI transformation.
In-Line Visualizations | Visualize data at any point along the machine learning pipeline to collaborate and iterate on the right features to feed your algorithm.
Metadata Injection | Develop standard templates that dynamically adjust to multiple data sources with varying schemas, reducing development time and accelerating time to value.
Real-time or Batch | Make real-time predictions in low-latency applications, or use batch mode for bulk offline scoring.
Model Versioning | Instate a new version or roll back to a previous version of your model within your transformation using PDI's version control and tracking facilities.
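
To make the data cleansing and validation capability more concrete, here is a minimal Python sketch of the same kind of logic: standardize records, check them against a business rule, remove duplicates, and view the distribution of a field. It is an illustration only and does not use any Pentaho or PDI API. The field names, the validation rule, and the function names are all hypothetical.

```python
from collections import Counter


def standardize(record):
    """Normalize whitespace and letter case so equivalent values compare equal."""
    return {
        "name": " ".join(record.get("name", "").split()),
        "email": record.get("email", "").strip().lower(),
        "country": record.get("country", "").strip().upper(),
    }


def is_valid(record):
    """Hypothetical business rule: a name is present and the email contains '@'."""
    return bool(record["name"]) and "@" in record["email"]


def cleanse(records):
    """Return (clean, rejected).

    clean holds standardized, valid records with duplicates (same email) dropped;
    rejected holds standardized records that failed validation.
    """
    clean, rejected, seen = [], [], set()
    for raw in records:
        rec = standardize(raw)
        if not is_valid(rec):
            rejected.append(rec)
        elif rec["email"] not in seen:  # keep only the first record per email
            seen.add(rec["email"])
            clean.append(rec)
    return clean, rejected


def distribution(records, field):
    """Count how often each value of `field` occurs."""
    return Counter(rec[field] for rec in records)
```

Calling `cleanse(rows)` on a list of dictionaries returns the records that are safe to pass downstream and the ones that need review, and `distribution(clean, "country")` gives a quick count per country. In a PDI transformation, the equivalent work would be done by the transformation's own steps rather than by hand-written code.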
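
The difference between real-time and batch scoring can also be sketched in a few lines of Python. The example below assumes a hypothetical linear model with made-up weights, standing in for a trained model; it does not reflect how Weka Scoring or PDI implement scoring. The names `score_one`, `score_batch`, `WEIGHTS`, and `BIAS` are illustrative.

```python
from typing import Iterable, Iterator, List, Sequence

# Hypothetical model: fixed weights and a bias standing in for a trained model.
WEIGHTS = (0.5, -0.25)
BIAS = 1.0


def score_one(features: Sequence[float]) -> float:
    """Real-time path: score a single record as soon as it arrives."""
    return BIAS + sum(w * x for w, x in zip(WEIGHTS, features))


def score_batch(
    rows: Iterable[Sequence[float]], chunk_size: int = 1000
) -> Iterator[List[float]]:
    """Batch path: score records in chunks for bulk offline processing."""
    chunk: List[float] = []
    for features in rows:
        chunk.append(score_one(features))
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:  # emit the final, possibly smaller, chunk
        yield chunk
```

Both paths apply the same scoring function, so predictions stay consistent; only the way records are fed in changes. A low-latency application would call `score_one` per request, while an offline job would iterate over `score_batch` and write each chunk of scores to storage.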

Try the new capabilities by downloading a trial of Pentaho, or contact a Pentaho expert to schedule a demo.