April 29, 2016
As Mark Twain famously said, “The secret of getting ahead is getting started.” When it comes to Internet of Things (IoT) projects, however, taking that first step can seem daunting. Many of the technologies, processes and skills involved are new and unfamiliar. Yet in today’s hyper-competitive global economy, the longer a company delays reaping the new commercial opportunities, productivity gains and risk reduction that IoT brings, the further it risks lagging behind competitors. At Pentaho we’ve witnessed IoT naturally evolve from big data. Working with some of the world’s most ambitious enterprises on IoT projects and our parent company Hitachi,...
April 13, 2016
We’ve talked about managing and automating the entire analytics data pipeline, from data engineering to data preparation to business analytics. Now, we’re excited to talk about accelerating the data pipeline. Over the last few months, our team has been hard at work, and today, we’re proud to announce the release of Pentaho 6.1. This release is built to bring analytics to you faster, by focusing on back-end features that streamline manual data onboarding and bring more agility and flexibility to all users along the data analytics pipeline. Here are just a few of the features and enhancements we’re proud to...
April 5, 2016
The results are in! Based on our poll of 363 data mavens like you, we’ve found a few trends in how organizations today are approaching data integration. 1. What Kinds of Data? The data types you’re using most frequently for blending are customer profile or demographic data (69% of respondents) and transactional/financial data (59% of respondents). Other frequently used data types include machine or device data, and email or document data. 2. Where Is the Data? Almost 80% of our respondents are blending data living in data warehouses, and 60% are getting data directly from flat files. 52%...
April 1, 2016
Today Pentaho announced the industry’s first native integration with “Liger,” the new open source technology holding the keys to simplifying both big data and IoT analytic use cases. Pentaho has always believed that BI and DI go better together. It is a logical step that we would be the first to support a technology that bridges two strong parts to create a greater whole. With the strength of the “king of the jungle” and the speed of the tiger, Liger delivers lightning-fast analytics for massive amounts and varieties of data. Being the first Big Data Integration and Analytics vendor...
March 28, 2016
From ingestion to analytics, Pentaho's Chuck Yarbrough discusses why Hadoop is hard ahead of Strata + Hadoop World 2016 in San Jose. Let’s face it, Hadoop is hard. Gartner predicts, "Through 2018, 70% of Hadoop deployments will fail to meet cost savings and revenue generation objectives due to skills and integration challenges.” [i] This statement should give most IT groups pause, but what is equally concerning is the fact that many organizations are still struggling to determine how to deliver value from Hadoop in the first place. To combat these key challenges, organizations must develop a clear plan for their Hadoop...
March 24, 2016
Pentaho was a proud sponsor of the Gartner Business Intelligence and Analytics Summit in Grapevine, Texas last week. The Pentaho team had the opportunity to meet impressive practitioners and attended some great sessions on the future of analytics and data integration from several different analysts, including Cindi Howson, Carlie Idoine, Josh Parenteau, Joao Tapadinhas and more. There were a few key themes that resonated throughout the conference, and we wanted to share these with you: 1) The Evolution of Analytics Business intelligence is a dated term at this point. We aren’t saying it’s obsolete; it’s just evolved. Analytics no longer follows...
March 11, 2016
What can you do to find the right information to unlock transformative value? Erin Latham, president of Mo'Mix Solutions, explains how Pentaho ETL tools can help. “...they would spend millions on these huge legacy implementations, yet at the end of the day, it felt like their data was in a Grand Canyon; they couldn’t get it out.” It takes the right people getting the right information at the right time to be able to unlock transformative value. Sadly, many business leaders spend a lot of time and money on legacy ERP systems, but run into obstacles before they can drive...
February 5, 2016
1) When was Pentaho Data Integration (PDI) created and how has it evolved? PDI grew from an idea I had over 10 years ago. As many of you know, PDI started as an open source project and has evolved into what I consider the most popular data integration environment on the planet. PDI has gone through many evolutions, but an important one came about six years ago, after James Dixon, our CTO, coined the term “data lake.” I remember that Hadoop was gaining popularity, and I realized we needed to create a plugin for Hadoop and other big data stores. This led...
January 25, 2016
Due to its simplicity and intuitive nature, demand to learn Python has continued to grow over the last five years, making it the preferred language for deep learning researchers. How serious is this growth? In 2015, CodeEval determined for the fourth year in a row that Python was the #1 most popular coding language, followed by Java, C++, and JavaScript. [1] Python has overtaken French as the most popular language taught in primary schools, according to a new survey released in August 2015 (Information Age). [2] Driven by market and customer demand, Pentaho Labs announced today...
January 14, 2016
It’s been over five years since Pentaho’s CTO, James Dixon, coined the now-ubiquitous term data lake. His metaphor contrasted bottled water, cleansed and packaged for easy consumption, with the natural state of the water source – unstructured, uncleansed, and unadulterated. The data lake represents the entire universe of available data before any transformation has been applied to it. Data isn’t given undue context in order to fit it into existing structures, which could compromise its utility to your business. You can store data at low cost and you can process it at scale...