Build Engaging Stories on Big Data with SAP Lumira 2.0, Discovery Component

  • by Vinayak Gole, Senior Business Intelligence Consultant, Tata Consultancy Services
  • November 17, 2017
Explore how to connect to data stored in the Cloudera and Hortonworks Hadoop distributions through the discovery component of SAP Lumira 2.0.

Learning Objectives

By reading this article, you will learn:

  • Basic terminology for Big Data distributions
  • Step-by-step options for connecting Cloudera to Lumira 2.0
  • Step-by-step options for connecting Hortonworks Hadoop to Lumira 2.0


Key Concept

The Big Data wave has prompted analytics tools to provide native connectivity to the better-known Big Data distributions, such as Hortonworks and Cloudera. Both Hortonworks and Cloudera provide Hadoop distributions that enable enterprises to become Big Data driven. Connectivity to other Hadoop distributions can be achieved by installing the appropriate Java Database Connectivity (JDBC) or Open Database Connectivity (ODBC) drivers. Modern analytic capabilities also enable self-service BI, which allows business users to work with data directly, with minimal involvement from IT. SAP's self-service BI tool, SAP Lumira, connects to traditional data sources and also integrates with Big Data sources to enable data exploration.
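As a rough illustration of what driver-based connectivity involves, a JDBC driver is addressed through a connection URL that names the driver scheme, host, port, and database. The sketch below only assembles such URLs; it does not open a connection, and it is not how Lumira is configured, since Lumira collects these values through its own dialogs. The helper name jdbc_url and the host sandbox.example.com are hypothetical. The ports are the usual defaults (10000 for HiveServer2, 21050 for Impala) and may differ on a given cluster.

```python
# Usual default ports; an assumption that may not hold on every cluster.
DEFAULT_PORTS = {
    "hive2": 10000,   # HiveServer2 (Apache Hive JDBC driver scheme)
    "impala": 21050,  # Impala (Cloudera Impala JDBC driver scheme)
}


def jdbc_url(scheme, host, database="default", port=None):
    """Build a JDBC connection URL string (hypothetical helper)."""
    if scheme not in DEFAULT_PORTS:
        raise ValueError(f"unsupported scheme: {scheme!r}")
    if port is None:
        port = DEFAULT_PORTS[scheme]
    return f"jdbc:{scheme}://{host}:{port}/{database}"


if __name__ == "__main__":
    # Hypothetical host name, for illustration only.
    print(jdbc_url("hive2", "sandbox.example.com"))
    print(jdbc_url("impala", "sandbox.example.com"))
```

Secured clusters typically need additional URL properties (for example, for Kerberos or SSL), which this sketch leaves out.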

Thanks to Big Data technologies, unstructured data can now be stored, processed, and queried in a cohesive manner. While storing and processing data have been the primary focus for technology companies, querying and analyzing Big Data are now gaining momentum. Native connectivity to Big Data technologies is now a prerequisite for modern data analysis tools that offer advanced charting options.

The SAP Analytics suite offers native connectivity to two of the most popular Hadoop distributions, Hortonworks and Cloudera. SAP Lumira 2.0, the part of the suite that enables self-service data exploration, also offers users direct connectivity to Hadoop Hive and Cloudera Impala. Business users can connect to the data, analyze it, and build stories on it. I provide a step-by-step guide to connecting to both distributions: fetching a sample dataset, creating transformations, building visualizations, and building a story on the data. For the purpose of demonstration, I use virtual machines provided by Cloudera and Hortonworks.

Vinayak Gole

Vinayak Gole is a senior Business Intelligence consultant with 15 years of experience in IT across multiple business domains. As part of the global SAP Analytics Center of Excellence at Tata Consultancy Services, Vinayak has been engaged in architecting solutions on the SAP BusinessObjects suite, including SAP Lumira and SAP BusinessObjects Cloud.
