In addition to Pentaho’s data integration, OLAP, reporting and ad hoc analysis capabilities, we also provide data mining and advanced analytics functionality. Data mining runs data through algorithms to uncover meaningful patterns and correlations that standard analysis and reporting may not reveal. You can use these patterns to understand your business better and, through predictive analytics, to improve future performance. For example, data mining can warn you that a specific customer is likely to pay late, based on an analysis of customers with similar characteristics.
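The late-payment example can be made concrete with a nearest-neighbor estimate: find the past customers most similar to the one in question and report what share of them paid late. The Python sketch below is illustrative only. It does not use the Pentaho or Weka APIs, and the attributes (average days to pay, number of open invoices), the sample records and the function names are all hypothetical.

```python
from math import sqrt

# Hypothetical history: ((average days to pay, open invoices), paid_late)
HISTORY = [
    ((12.0, 1.0), False),
    ((15.0, 2.0), False),
    ((18.0, 1.0), False),
    ((41.0, 5.0), True),
    ((38.0, 4.0), True),
    ((45.0, 6.0), True),
]


def distance(a, b):
    """Euclidean distance between two equal-length attribute tuples."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def late_payment_probability(customer, history, k=3):
    """Share of the k most similar past customers who paid late."""
    if not history:
        raise ValueError("history must contain at least one customer")
    # Sort past customers by similarity and keep the k closest.
    nearest = sorted(history, key=lambda row: distance(customer, row[0]))[:k]
    return sum(1 for _, late in nearest if late) / len(nearest)


if __name__ == "__main__":
    print(late_payment_probability((40.0, 5.0), HISTORY))
```

In practice the attributes would be normalized first so that no single one dominates the distance, and a real deployment would use a trained model rather than a hand-written lookup.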
Pentaho Data Mining is differentiated by its open, standards-compliant nature, use of Weka data mining technology, and tight integration with our core business intelligence capabilities including reporting, analysis and dashboards in the Pentaho BI Suite Enterprise Edition.
Pentaho Data Mining can be deployed as:
- An out-of-the-box solution for immediate deployment to analysts. As far as end-users are concerned, data mining operates entirely in the background: users see results and recommendations through e-mail or web pages, which can include Pentaho Dashboards.
- A set of components that enable Java™ developers to quickly create custom reporting solutions using Java objects or JavaServer Pages (JSPs). These can be tightly integrated with other applications or portals.
- In combination with other components of the Pentaho BI Suite.
Pentaho Data Mining:
- Provides insight into hidden patterns and relationships in your data
- Enables you to exploit these correlations to improve organizational performance
- Provides indicators of future performance
- Enables embedding of recommendations in your applications
- Enables you to take full advantage of a range of data mining algorithms
Features and Functionality:
- Provides a comprehensive set of machine learning algorithms from the Weka project including clustering, segmentation, decision trees, random forests, neural networks, and principal component analysis.
- Output can be viewed graphically, interacted with programmatically, or used as a data source for reports, further analysis, and other processes.
- Filters are provided for discretization, normalization, re-sampling, attribute selection, and transforming and combining attributes.
- Classifiers provide models for predicting nominal or numeric quantities. Learning schemes include decision trees and lists, instance-based classifiers, support vector machines, multi-layer perceptrons, logistic regression, Bayes’ nets, and other advanced techniques.
- Inputs and outputs can be controlled programmatically, enabling developers to create completely custom solutions using the components provided.
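Two of the filters named in the list above, normalization and discretization, are simple enough to sketch directly. The Python below is a minimal illustration under stated assumptions: it shows min-max normalization and equal-width binning, which are common forms of these techniques, and it is not the Weka or Pentaho implementation. The function names are hypothetical.

```python
def normalize(values):
    """Min-max normalization: rescale values linearly into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: map everything to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def discretize(values, bins):
    """Equal-width discretization: map each value to a bin index 0..bins-1."""
    if bins < 1:
        raise ValueError("bins must be at least 1")
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / bins
    # The maximum value would land in bin `bins`, so clamp it to the last bin.
    return [min(int((v - lo) / width), bins - 1) for v in values]


if __name__ == "__main__":
    print(normalize([10, 20, 30]))
    print(discretize([1, 2, 3, 4], 2))
```

Normalization puts attributes on a common scale before distance-based methods are applied, and discretization turns a numeric attribute into a nominal one for learning schemes that expect categories.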