The Pentaho Blog

Views on Big Data, Data Integration, and Analytics at the Point of Impact
May 14, 2015 | Comments
This week Pentaho announced native integration of Pentaho Data Integration (PDI) with Apache Spark™; orchestration of Spark jobs will be supported in its upcoming release. Spark is a powerful open source processing engine built around speed, ease of use, and machine learning. This PDI integration builds upon previous projects from Pentaho Labs, furthering the type of efforts that led to support for YARN and the Adaptive Big Data Layer. What does this integration mean for Pentaho customers? And the big data market overall? Ultimately, this builds and expands on Pentaho's foundation of Big Data innovation. For customers,...
April 13, 2015 | Comments
In a just-published report on BI and Analytics, Ventana Research named Pentaho a “hot” vendor. The Ventana Research Value Index is an unbiased and fact-based analytic representation of how well 15 BI vendors’ offerings meet buyers’ requirements for software that enables and supports business intelligence. Big Data meets BI: Ventana Research finds that new technologies collectively known as big data (such as in-memory computing, Hadoop and Data Warehouse appliances) are influencing the evolution of Business Intelligence. Pentaho’s leadership and strength in supporting all data, especially the new emerging big data technologies, ensures that our BI offerings are optimized as...
April 1, 2015 | Comments
Whenever I talk to someone interested in embedding BI into an application, they usually have a good reason for it. They may be a product manager at a SaaS company, looking to competitively differentiate their product with seamless visual analytics. They could be a strategy director at a global enterprise looking to enhance customer relationships by delivering visibility into service and performance through an online portal. These organizations normally have no shortage of vision - they know which analytic needs they want to meet. But when it comes to execution - the same debate almost always pops up: Should we...
March 27, 2015 | Comments
Politics is all that stands in the way of democratizing analytics. Following the whole ‘BI for the masses’ movement, today’s buzz is all about democratizing analytics – giving everyone from Alice in the mailroom to Joe CEO the tools to make data-informed decisions. It’s a lively debate. Entrepreneurial types insist that it’s a ‘do or die’ imperative while the more cautious amongst us liken it to running with scissors. Last Wednesday, I joined the panel of Computing’s “Practical steps towards democratising analytics” web conference chaired by Stuart Sumner to explore the topic in more depth. You can read a recap...
February 18, 2015 | Comments
There has definitely been an evolution of how the industry talks about data. About five years ago the term ‘Big Data’ emerged to define the volume aspect of Big Data. Soon after, the definition of Big Data expanded to a better one that explains what it really is; not just big, but data that moves extremely fast, often lacks structure, varies greatly from existing data, doesn't fit well with more traditional database technologies, and frankly, is best described as “messy”. Fast-forward to 2015 and Pentaho’s announcement of version 5.3 this week to deliver on demand big data analytics at scale...
February 13, 2015 | Comments
Ten years ago we set out to commoditize BI, disrupt the existing old school proprietary vendors and give customers a better choice. It’s been an exciting journey building up a scalable analytic platform, building an open source community, surviving a deep recession, beating up the competition, building a great team, providing great products and services to our customers, and being a major player in the big data market. Some of the key points along the way: 2008 – the recession hits and frankly as painful as that was it actually helped Pentaho as we were the best value proposition in...
February 10, 2015 | Comments
Big Data and the Internet of Things are disrupting entire markets, with machine data blurring the virtual world with the physical world. This market matters: a recent Goldman Sachs report cites an astounding $2 trillion opportunity by 2020 for IoT, with the potential to impact everything from new product opportunities, to shop floor optimization, to factory worker efficiency gains that will power top-line and bottom-line gains. The company that delivers high quality big data solutions fastest and enables customers to connect people, data and things to transform their industries and organizations will win. That is why I am very excited...
January 27, 2015 | Comments
There’s lots of advice out there on building a big data team, from industry or expert analysts and leading publications. But we wanted to see how this is being implemented in real life, so we talked to the real-world big data mavericks – those who've faced the challenge of gaining true business value from big data and succeeded. They shared real-world insights into how they made it happen and the advice they’d give to those ready to take the plunge. (Scroll to the bottom to meet our mavericks.) 1. Clearly define your business goal, and don’t be afraid...
January 22, 2015 | Comments
Pentaho co-founder and CTO James Dixon is who we have to thank for the term ‘Data Lake.’ He first wrote about the Data Lake concept on his blog in 2010, in Pentaho, Hadoop and Data Lakes. After numerous interpretations and feedback, he revisited the concept and definition here: Data Lakes Revisited. Now, in his latest blog, Dixon explores a use case based on the Data Lake concept, calling it The Union of State. Read the blog below to learn how the Union of State can provide the equivalent of a rewind, pause, and forward remote...
December 16, 2014 | Comments
Last year I speculated that the big data ‘power curve’ in 2014 would be shaped by business demands for data blending. Customers presenting at our debut PentahoWorld conference last October, from Paytronix, to RichRelevance, to NASDAQ, certainly proved my speculations to be true. Businesses like these are examples of how increasingly large and varied data sets can be used to deliver high and sustainable ROI. In fact, Ventana Research recently confirmed that 22 percent of organizations now use upwards of 20 data sources, and 19 percent use between 11 – 20 data sources. [1] Moving into 2015, and fired up...
