Current Data Scientist Craze Can’t Last
Randy Bean
NewVantage Partners

Randy Bean is an industry thought leader and author, and CEO of NewVantage Partners, a strategic advisory and management consulting firm that he founded in 2001. He is a contributor to Forbes, Harvard Business Review, MIT Sloan Management Review, and The Wall Street Journal.

This article is by Featured Blogger Randy Bean from his LinkedIn page. Republished with the author’s permission.

It was not quite a decade ago that I received a most interesting invitation from a friend and colleague of many years past. He had recently been appointed to the position of Assistant Secretary of Defense for Research and Engineering, and in his new role had responsibility for how the Department of Defense would leverage ‘Big Data’ to strengthen the nation’s military capabilities.

Part of his mandate was to understand how data and analytics could be used to more effectively deploy military resources, understand enemy movements, and pinpoint military actions with greater accuracy based on the abundance of data available to the Pentagon. There was one hitch, however. As the roomful of military leadership pointed out to me, “We spend 80% of our efforts on data preparation, not on analysis and execution. We are hoping you will educate us on how private industry has successfully addressed this challenge”. 

I am afraid that my answer disappointed them, or perhaps relieved them: the data challenges they faced were common across the private sector too. In fact, leading corporations have struggled with issues of poor data quality, scattered data sources, and the challenges of data preparation for decades. It is against this backdrop that the data industry has undertaken major investments within the past decade to tackle the challenges of data preparation, enabling analysts and business decision makers to access their data with greater accuracy, greater confidence, and more effective results.

In recent years, firms like Trifacta have emerged to assist corporate clients in transforming raw data so that it becomes “analysis-ready”. Trifacta lightheartedly describes its focus as “janitorial work”, but the results are important and mission-critical, as firms need quality data to ensure accurate business analysis and execution. The Trifacta approach engages business analysts, who know their data best, in the data preparation process using tools designed to discover, structure, clean, enrich, and optimize their data assets to drive business value. Trifacta is also employing Artificial Intelligence (AI) and machine learning capabilities to guide users in order to accelerate business insights and achieve faster results.

I recently spoke to one of Trifacta’s financial services clients about their success. ABN AMRO is a Dutch bank founded in 1765 and headquartered in Amsterdam with global offices. At its peak, prior to the financial crisis of 2008-2009, ABN AMRO was the 8th largest bank in Europe, 15th largest worldwide, and operated in 63 countries. Today, ABN AMRO remains a leading European financial institution.  

Marcel Kramer is Head of Data Engineering at ABN AMRO and in this capacity is charged with advancing data and machine learning engineering practices throughout the firm. Kramer explains, “ABN AMRO uses data as a key asset to further reinvent the customer experience, accelerate the sustainability shift, and optimize our processes around compliance and regulations”. Noting the business benefits resulting from improved data quality, Kramer comments, “The better the quality of data and the easier it is to access the data, the time-to-market greatly improves and instantly contributes to your corporate goals”.

ABN AMRO has deployed Trifacta in support of its modern data distribution architecture and cloud-centric platform strategy. ABN AMRO’s Kramer observes, “Tools such as Trifacta help make our business significantly faster in discovering the value they can create with data”. He notes that this enables business constituents to take greater ownership of critical data elements, which provides a foundation for greater dialogue between data owners and data consumers. One way this is realized is by shortening the delivery cycle: business users depend less on IT specialists, an approach often referred to as “democratizing the data”.

In addition to the obvious customer benefits, the new data architecture is fully compliant with the requirements of the General Data Protection Regulation (GDPR). Kramer notes, “Trifacta has an approach called ‘Predictive Transformation’, that allows the user to explore and clean up their data”. The result is that non-technical users can now do more with their data, as they are guided through the process by intelligent suggestions powered by machine learning.

Like so many organizations with legacy IT systems, ABN AMRO has been dependent upon centralized data warehouses. In their new data distribution architecture, ABN AMRO makes raw data easily and securely accessible. Kramer concludes, “Trifacta allows business users to easily and swiftly wrangle this data so that combinations of data lead to new insights about our customers”. Today, ABN AMRO is building a data marketplace where each user of data can find relevant data sources, understand the data, and easily create analytics projects resulting in measurable business outcomes.