Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well-suited for developing new machine learning schemes and is used to build predictive models.
How Pentaho Supports Weka:
- Weka Scoring for PDI: Lets you “score” data within a Pentaho Data Integration (PDI) transformation by applying classification, clustering, and regression models built in Weka.
- Weka Forecasting for PDI: Applies forecasting models built in Weka’s time series analysis and forecasting environment to generate predictions for incoming data within a PDI transformation.
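To make the idea of "scoring" concrete, here is a minimal plain-Java sketch of the general pattern: a previously trained model is applied to each incoming row, and the prediction is appended as an extra field. It does not use the Weka or PDI APIs. The names `ScoringSketch`, `Model`, `linearModel`, and `score` are hypothetical, and the fixed-weight linear model merely stands in for a model that would normally be trained and exported from Weka.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: these types are assumptions, not part of Weka or PDI.
final class ScoringSketch {

    /** Stand-in for a trained model: maps one row of numeric features to a prediction. */
    interface Model {
        double predict(double[] features);
    }

    /** Toy linear model with fixed weights, pretending to be pre-trained. */
    static Model linearModel(double intercept, double[] weights) {
        return features -> {
            if (features.length != weights.length) {
                throw new IllegalArgumentException("expected " + weights.length + " features");
            }
            double sum = intercept;
            for (int i = 0; i < weights.length; i++) {
                sum += weights[i] * features[i];
            }
            return sum;
        };
    }

    /** Scores each row: copies the input fields and appends the prediction as a new last field. */
    static List<double[]> score(Model model, List<double[]> rows) {
        List<double[]> scoredRows = new ArrayList<>(rows.size());
        for (double[] row : rows) {
            double[] scored = Arrays.copyOf(row, row.length + 1);
            scored[row.length] = model.predict(row);
            scoredRows.add(scored);
        }
        return scoredRows;
    }

    public static void main(String[] args) {
        Model model = linearModel(1.0, new double[] {2.0, 0.5});
        List<double[]> incoming = Arrays.asList(
                new double[] {1.0, 2.0},
                new double[] {3.0, 4.0});
        for (double[] scored : score(model, incoming)) {
            System.out.println(Arrays.toString(scored));
        }
    }
}
```

In a real transformation the model would be loaded from a file exported by Weka and the rows would come from upstream steps; the sketch only shows the shape of the data flow.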
Check out how R and Weka fit into Pentaho Data Integration with the Data Science Pack.