Big Data

Within a single platform, our solution provides big data analytics tools to extract, prepare and blend your data, plus the visualizations and analytics that will change the way you run your business. Regardless of data source, analytic requirement or deployment environment, Pentaho allows you to turn big data into big insights.


Big Data Analytics

Blended Big Data Analytics

A tightly coupled data integration and business analytics platform accelerates the realization of value from blended big data.

  • Full array of analytics, from data access and integration to data visualization and predictive analytics
  • Empowers users to architect big data blends at the source and stream them directly for more complete and accurate analytics
  • Supports the broadest spectrum of big data sources with the Pentaho adaptive big data layer, which takes advantage of the specific and unique capabilities of each source
  • Open, standards-based architecture that is easy to integrate with or extend existing infrastructure

Learn about 4 common big data use cases that deliver immediate results. 


Interactive Analysis, Reporting, Visualizations & Dashboards

Pentaho empowers business users and analysts to easily visualize, analyze, and report on data across multiple dimensions without depending on IT or developers.

  • Interactive analysis, drill-through, lasso filtering, zooming, and attribute highlighting for greater insight
  • Out-of-the-box library of interactive visualizations
  • Extreme-scale in-memory data caching for speed-of-thought analysis of large data volumes
  • Reporting that ranges from self-service interactive reports to high-volume, highly formatted enterprise reports
  • Dashboards from any big data source including enterprise.

Learn more about Pentaho Visualizations and embedded analytics.

High-Volume Data Processing

Shorten development time for big data, and achieve exceptional in-cluster performance.

  • Native connectivity to leading Hadoop, NoSQL and analytic databases
  • Visual designer for MapReduce jobs to reduce development cycles
  • Data preparation, modeling and exploration of unstructured data sets
  • Powerful, multi-threaded data integration engine for fast execution
  • Cluster support, enabling distributed processing of jobs across multiple nodes
  • Unique in-Hadoop execution for extremely fast performance
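For readers unfamiliar with the MapReduce model mentioned above, the following is a minimal, single-process sketch of its three phases (map, shuffle, reduce) using a word count. It is an illustration only: the function names are hypothetical, it does not use Hadoop, and it is not how Pentaho's visual designer generates or runs jobs.

```python
from collections import defaultdict


def map_phase(lines):
    """Map step: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    """Run the three phases in order on an in-memory list of lines."""
    return reduce_phase(shuffle(map_phase(lines)))


if __name__ == "__main__":
    print(word_count(["big data", "big insights"]))
```

On a real cluster the map and reduce steps run in parallel on many nodes and the shuffle moves data between them; the sketch keeps everything in one process so that the data flow is easy to follow.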

Adaptive Big Data Layer

Accelerate access to, and integration with, the latest versions and capabilities of popular big data stores.

  • Ability to access data once, and then process, combine and consume it anywhere
  • Support for the latest Hadoop distributions from Cloudera, Hortonworks, and MapR
  • Simple plug-ins to NoSQL databases such as Cassandra and MongoDB
  • Connections to specialized data stores such as Amazon Redshift and Splunk
  • Greater flexibility and insulation from changes in the big data ecosystem

Powerful Predictive Analytics and Data Mining

Sophisticated analytical modeling empowers organizations to plan for future outcomes by understanding historical business performance.

  • Powerful machine learning algorithms for classification, regression, clustering and association
  • Operationalize data mining models and R scripts as part of data integration workflows in Pentaho's graphical interface
  • Import of third-party models using Predictive Model Markup Language (PMML)
  • Scale modeling processes inside or outside of a Hadoop cluster
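To make the PMML point above concrete, here is a minimal sketch of what a PMML document carries and how a consumer might score with it. It uses only the Python standard library, handles just a linear RegressionModel with numeric predictors, and assumes the PMML 4.2 namespace. The sample model, its field names (ad_spend, visits, revenue) and the helper functions are hypothetical; this is not Pentaho's import mechanism.

```python
import xml.etree.ElementTree as ET

# Hypothetical PMML 4.2 document describing: revenue = 10 + 2*ad_spend + 0.5*visits
SAMPLE_PMML = """
<PMML xmlns="http://www.dmg.org/PMML-4_2" version="4.2">
  <Header description="hypothetical example model"/>
  <DataDictionary numberOfFields="3">
    <DataField name="ad_spend" optype="continuous" dataType="double"/>
    <DataField name="visits" optype="continuous" dataType="double"/>
    <DataField name="revenue" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="ad_spend"/>
      <MiningField name="visits"/>
      <MiningField name="revenue" usageType="target"/>
    </MiningSchema>
    <RegressionTable intercept="10.0">
      <NumericPredictor name="ad_spend" coefficient="2.0"/>
      <NumericPredictor name="visits" coefficient="0.5"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""

# Assumption: the document uses the PMML 4.2 namespace.
NS = {"pmml": "http://www.dmg.org/PMML-4_2"}


def load_linear_model(pmml_text):
    """Return (intercept, {field_name: coefficient}) from a RegressionModel."""
    root = ET.fromstring(pmml_text.strip())
    table = root.find(".//pmml:RegressionTable", NS)
    if table is None:
        raise ValueError("no RegressionTable found in PMML document")
    intercept = float(table.get("intercept", "0"))
    coefficients = {
        predictor.get("name"): float(predictor.get("coefficient"))
        for predictor in table.findall("pmml:NumericPredictor", NS)
    }
    return intercept, coefficients


def score(model, row):
    """Apply the linear model to one input row given as a dict of field values."""
    intercept, coefficients = model
    return intercept + sum(coef * row[name] for name, coef in coefficients.items())


if __name__ == "__main__":
    model = load_linear_model(SAMPLE_PMML)
    print(score(model, {"ad_spend": 3.0, "visits": 4.0}))
```

Because the model is plain XML, it can be trained in one tool and scored in another, which is the interoperability benefit the bullet describes.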

Learn more about Pentaho Predictive Capabilities.