NoSQL Databases
NoSQL databases are a class of database management systems that differ from classic relational database management systems: they are intended to perform at Internet scale, and they reject the relational model in favor of other data models. NoSQL databases are growing rapidly in popularity, driven by their favorable economics and horizontal scalability compared with traditional relational database systems.

Pentaho Makes NoSQL Databases Easy
More and more enterprises are turning to big data solutions such as NoSQL databases to process and gain insight from high-volume, high-velocity and high-variety data, including text, click streams, log files, social media, documents, location data, weather data, and more.
However, using NoSQL databases has its challenges. Steep technical learning curves and a lack of qualified technical staff create barriers to adoption. NoSQL databases lack management, data integration, data orchestration and business analytics tools. In addition, instead of a common SQL query language, each one has its own API and its own methods for data loading and access. Pentaho Business Analytics removes these barriers and provides visual interfaces that enable you to easily access, integrate, visualize, explore and mine your NoSQL database data, as well as orchestrate management of your NoSQL data and other data sources.
Pentaho Business Analytics used with NoSQL databases:
- Easily integrates NoSQL data with data from relational sources
- Offers native and high-performance data access to leading NoSQL data stores
- Provides data access, integration, discovery, analysis and visualization for the growing amounts of data stored in NoSQL databases including web content, documents, social media, and log files
Supported NoSQL Databases
Pentaho's unparalleled native support for NoSQL databases includes:
- Apache Cassandra and DataStax
  - Apache Cassandra - Performs low-latency data operations at extremely high rates, automatically maintains fault-tolerant replicas within and across data centers, and supports near-unlimited incremental scaling of capacity and performance by adding nodes.
  - DataStax Enterprise - Powered by Cassandra: distributed, scalable, enterprise-class NoSQL data management solution.
  - DataStax Community - Powered by Cassandra: Smart Start Installers from DataStax automatically install and configure Cassandra, which guarantees the best possible out-of-the-box performance experience.
- HBase - Provides real-time read/write access to Hadoop data
- MongoDB - A scalable, high-performance, open source, document-oriented database
- HPCC Systems - HPCC (High Performance Computing Cluster) is a massively parallel processing computing platform that solves Big Data problems.
- Elasticsearch - Open source, distributed, RESTful, search engine built on top of Apache Lucene
- XML streaming - Support for streaming of data from XML files of any size
Visual Development
Pentaho provides a rich visual user interface for loading, extracting and transforming data within NoSQL databases. This visual interface allows developers, IT and data scientists to work with NoSQL databases using the much more familiar and less complex extract-transform-load (ETL) approach, instead of having to write code and scripts.

Pentaho MapReduce allows developers, IT and data scientists to work with Hadoop using the much more familiar and less complex extract-transform-load (ETL) approach.
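For readers who want to see what the extract-transform-load pattern involves underneath a visual tool, here is a minimal hand-written sketch in Python. It is not Pentaho's API or generated output: the sample documents, the `accounts` and `activity` tables, and every function name are hypothetical, and an in-memory SQLite database stands in for a relational source. It flattens document-style records and joins them with relational data.

```python
import sqlite3

# Hypothetical records, shaped like documents from a document-oriented store.
documents = [
    {"_id": 1, "user": {"id": 10, "name": "Ada"}, "events": ["click", "view"]},
    {"_id": 2, "user": {"id": 11, "name": "Lin"}, "events": ["view"]},
    {"_id": 3, "user": {"id": 10, "name": "Ada"}, "events": []},
]


def extract(docs):
    """Extract: yield raw documents one at a time."""
    yield from docs


def transform(doc):
    """Transform: flatten a nested document into a flat row."""
    return (doc["_id"], doc["user"]["id"], len(doc["events"]))


def load(conn, rows):
    """Load: write the flattened rows into a relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS activity ("
        "doc_id INTEGER PRIMARY KEY, user_id INTEGER, event_count INTEGER)"
    )
    conn.executemany("INSERT INTO activity VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(conn, docs):
    """Run the three stages in order."""
    load(conn, (transform(d) for d in extract(docs)))


def make_demo_db():
    """Create a hypothetical relational source with one table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (user_id INTEGER PRIMARY KEY, plan TEXT)")
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [(10, "pro"), (11, "free")]
    )
    return conn


def events_per_plan(conn):
    """Join the loaded document data with the relational table."""
    return conn.execute(
        "SELECT a.plan, SUM(act.event_count) "
        "FROM accounts a JOIN activity act ON act.user_id = a.user_id "
        "GROUP BY a.plan ORDER BY a.plan"
    ).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    run_etl(conn, documents)
    print(events_per_plan(conn))
```

Even in this small case, the flattening rules and the join have to be written and maintained by hand, which is the kind of work a visual ETL interface is meant to take over.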
Visual Job Orchestration
Pentaho provides an intuitive visual user interface for orchestration of data processing and data integration jobs for all supported NoSQL database stores as well as traditional relational databases and other data stores. This enables easy configuration of scheduled jobs, as well as more complex job execution logic such as events, triggers and conditional logic.
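To make "conditional logic" in a job concrete, the sketch below models a job as named steps, each with a success branch and a failure branch. This is an illustrative assumption about how such logic can be expressed in code, not a description of how Pentaho implements it: `Step`, `run_job`, `demo_steps` and the step names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Step:
    """One unit of work in a job, with a branch for each outcome."""
    name: str
    action: Callable[[], bool]          # returns True on success
    on_success: Optional[str] = None    # next step if the action succeeds
    on_failure: Optional[str] = None    # next step if the action fails


def run_job(steps: Dict[str, Step], start: str) -> List[str]:
    """Follow success/failure branches from `start`; return the steps run."""
    executed: List[str] = []
    current: Optional[str] = start
    while current is not None:
        if current in executed:
            raise ValueError(f"cycle detected at step {current!r}")
        step = steps[current]
        executed.append(step.name)
        try:
            ok = step.action()
        except Exception:
            ok = False  # an exception counts as a failed step
        current = step.on_success if ok else step.on_failure
    return executed


def demo_steps(sql_load_ok: bool) -> Dict[str, Step]:
    """Build a hypothetical job: two loads, then a report or a notification."""
    return {
        "load_nosql": Step("load_nosql", lambda: True, "load_sql", "notify"),
        "load_sql": Step("load_sql", lambda: sql_load_ok, "report", "notify"),
        "report": Step("report", lambda: True),
        "notify": Step("notify", lambda: True),
    }


if __name__ == "__main__":
    print(run_job(demo_steps(sql_load_ok=True), "load_nosql"))
    print(run_job(demo_steps(sql_load_ok=False), "load_nosql"))
```

When the relational load fails, the job branches to the notification step instead of the report. A visual orchestration tool expresses this same branching as connections between job entries.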
Contact Pentaho to learn more about the supported NoSQL database platforms.
