With the recent announcement of Impala, also known as Cloudera Enterprise RTQ (Real Time Query), I expect the interest in and adoption of Hadoop to go from merely intense to crazy. We applaud Cloudera’s investment in creating Impala, as it is a huge step forward in making Hadoop accessible to existing BI tools.
What is Impala? Simply put, it enables all of the SQL-based BI and business analytics tools that have been built over the past couple of decades to work directly on top of Hadoop, with interactive response times that were not previously attainable on Hadoop and that are many times faster than those of Hive, the existing SQL-like alternative. Impala also provides pretty complete SQL support, including joins and aggregate functions, which are must-haves for analytics.
For enterprises, this analytic query speed and expressiveness are huge: they are now much less likely to need to extract data out of Hadoop and load it into a data mart or warehouse for interactive visualization. Instead, they can use their favorite business analytics tool directly against Hadoop. But of course only Pentaho provides the integrated end-to-end data integration and business analytics capability for both ingesting and processing data inside of Hadoop, as well as interactively visualizing and analyzing Hadoop data.
Over the past few months, Cloudera and Pentaho have been partnering closely at all levels, including marketing, sales and engineering. We are proud of the role we played in assisting Cloudera with validating and testing Impala against realistic BI workloads and use cases. Based on the extremely strong interest we’ve seen, as evidenced by the lines at our booth at the recent Strata big data conference in New York City, the combination of Pentaho’s visual development and interactive visualization for Hadoop with the breakthrough performance of Cloudera Impala is very compelling for a huge number of enterprises.
- Ian Fyfe, Chief Technology Evangelist, Pentaho