Pentaho Business Analytics for Big Data
Pentaho Business Analytics offers unmatched native support for the most popular big data sources including Hadoop, NoSQL and analytic databases.
Pentaho Kettle for Big Data
Are you a developer, database administrator, analyst or data scientist looking for easier ways to operationalize your big data? If so, visit the Pentaho Big Data Resource Center to:
- Become a part of the Pentaho Big Data Community
- Download the latest release of Pentaho Kettle for Big Data, which includes greatly expanded support for Hadoop and NoSQL data sources
- Access valuable "how to" guides that quickly walk you through using Pentaho to input, output, manipulate and report on data using leading big data platforms
Using Pentaho's code-free visual interfaces will dramatically increase your productivity, eliminate the need for you to learn the technical intricacies of your preferred big data platform, and enable easy co-existence and migration between a wide variety of big data and traditional databases.
Pentaho Big Data Resource Center
What is Big Data Analytics?
Are you new to big data and big data analytics? If so, here's a brief primer.
Every day, vast lakes of data are created and keep growing, fed by sources everywhere: financial and retail transactions, web logs, RFID, sensor networks, social media, telephony, internet indexing, clickstreams, call detail records, ecommerce, medical records and more. This "big data" usually has one or more of the following characteristics:
- Very large data volumes measured in terabytes or petabytes
- Variety of structured, unstructured and semi-structured data
- High velocity, rapidly changing data
Another way to describe big data is as datasets that grow so large that they become awkward or uneconomical to work with using traditional database management and BI tools. Analyzing big data allows analysts, data scientists and now business users to make better decisions using information that was previously inaccessible to them.
The ability to store, aggregate, and combine huge amounts of data, and perform insightful analytics on the results, has finally become more accessible and cost-effective; the technical and economic barriers are falling fast. Companies that use big data to gain business insight and take action will outperform their peers. Pentaho Business Analytics for big data dramatically lowers the technical barriers and shortens the time it takes to deliver a useful solution, helping companies pragmatically operationalize the promise of big data analytics.
Choose Pentaho
Pentaho is the leading solution for big data analytics for the following reasons:
- Fastest time to results for big data access, integration, discovery, analysis and visualization
- Full business analytics solution from data to dashboards
- Speed-of-thought analysis of big data, which is not possible with traditional tools
- Dramatically reduced development costs compared to traditional BI tools
- Lower technical barriers for developers

Learn more about why customers and the leading big data management solution providers have chosen to work with Pentaho to solve their big data analytics requirements.
Pentaho Makes Analytics with Hadoop Easy
Using Pentaho Business Analytics with Hadoop allows easy management, integration, and speed-of-thought analysis and visualization of Hadoop data and enables:
- Quick and easy analytics against big data
- Easier-to-maintain solutions
- Integration of big data tasks into the overall IT/ETL/BI solutions
- ETL engine distributed across the Hadoop cluster
- Support for multiple Hadoop distributions

NoSQL Analytics and Visualization
With deep native support for the most popular emerging NoSQL data sources, including HBase, MongoDB, and HPCC Systems, as well as data from very large XML sources, Pentaho:
- Easily integrates NoSQL data types with existing data architectures
- Offers native and high-performance data access to leading NoSQL data stores
- Provides reporting and analysis on growing amounts of user- and machine-generated data, including web content, documents, social media and log files

Using Relational Analytic Databases with Pentaho
Analytic databases are a rapidly growing category of relational database management systems designed for high scalability and very fast query performance when used as data stores for query-intensive applications such as business intelligence. Pentaho provides performance-optimized native support for many popular analytic databases, and enables data analysis, reporting and data integration through:
- Deep integration with native SQL dialects
- Support for high-performance parallel bulk data loader utilities




