Tableau on HANA: What you need to know
Customers using SAP HANA often want a simplified architecture for visualization and reporting on HANA, especially for business users who want less dependency on developers. Tableau is one of the most sought-after self-service BI products on the market, and it empowers business users with its practical user interface. In our quest to find the best BI tools that integrate with HANA, we ran some tests at Comerit Labs with Tableau on HANA. We are sharing our findings on Tableau’s ease of use, performance, and object propagation from HANA to Tableau so that decision makers are aware of the pros and cons of a Tableau on HANA landscape.
Our test setup was Tableau Desktop (Version 10.3 Professional Edition) running on a 64-bit Windows machine (Core i7, 8 GB RAM) and consuming a calculation view from HANA (version 2.0 SPS02) over a live ODBC connection. The calculation view contains sales data for automobile spare parts and related electrical components across various states in the US and Canada. The final output of the view had 2.1 million rows and 34 fields (columns): 28 attributes and 6 measures.
Calculation view in HANA:
- The Connection:
Unfortunately, Tableau does not provide an OLAP connection (e.g. MDX) to HANA, which would consume the views as aggregated cubes rather than as the flattened tables that come through the ODBC connection. By default, the ODBC connection connects to the _SYS_BIC schema of the HANA Catalog, which contains the column views of all models created and activated in the Content packages.
Some business users might need time to get used to the naming convention of column views in HANA. It would have been better if the package names (from the Content folder) were displayed instead of the schema names (from the Catalog folder) for easier identification. On the other hand, this also means we can connect to specific schemas in HANA from the Tableau interface and directly consume the underlying schema tables, instead of the column views, for analysis.
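To make the naming convention concrete: an activated model appears in the catalog as a column view named "<package>/<view>" under _SYS_BIC. The Python sketch below builds that name and the kind of aggregated SQL a client could send over ODBC. The package, view, and field names are hypothetical and not taken from our test system, and the helper functions are illustrative only, not part of any Tableau or HANA API.

```python
def column_view_name(package: str, view: str) -> str:
    """Return the quoted catalog name of an activated view in _SYS_BIC."""
    # Activated models are exposed as "<package>/<view>" in the _SYS_BIC schema.
    return f'"_SYS_BIC"."{package}/{view}"'


def aggregate_query(package: str, view: str, dimension: str, measure: str) -> str:
    """Build a simple aggregated SELECT against a column view."""
    source = column_view_name(package, view)
    return (
        f'SELECT "{dimension}", SUM("{measure}") AS "{measure}" '
        f'FROM {source} GROUP BY "{dimension}"'
    )


# Hypothetical package, view, and field names, for illustration only.
print(aggregate_query("comerit.sales", "CV_SPARE_PARTS_SALES", "STATE", "SALES"))
```

Seeing the package path inside the view name may help business users map what they see in Tableau back to the Content folder.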
Where Tableau excels is its ease of use. It is simple to create most basic visualizations, including area charts, waterfall charts, crosstab reports, and geo-based visualizations.
Although it took some tweaking to create the multi-label pie chart (Business Unit name and % share), the Tableau community website has plenty of resources on how to tweak and create the necessary visualization.
If the visualization shows aggregations (e.g. SUM([sales])), it runs smoothly. The exception is a non-aggregated scatter plot used to analyze detailed fields. For example, we created a scatter plot to see the relation between sales and profit for Ontario, Canada. It took about 1.5 minutes for all the individual data points to load on the chart the first time the row and column fields were selected.
However, once the visualization was created, filtering the data and viewing the details of any subset of data points ran smoothly.
We also created an advanced Pareto chart to check whether the 80-20 rule holds in this case. The Pareto chart also took almost 2 minutes to load the first time, after the row and column fields were selected and the calculations were set up.
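One plausible reason for the difference is result size: an aggregated query returns one row per group, while a non-aggregated scatter plot needs one row per record. The Python sketch below illustrates this with an in-memory SQLite table standing in for HANA. The table, its columns, and the 10,000 synthetic rows are assumptions made purely for illustration and are not our test data.

```python
import sqlite3

# SQLite stands in for HANA here; the data is synthetic.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (state TEXT, sales REAL, profit REAL)")
rows = [
    ("Ontario" if i % 2 else "Quebec", 100.0 + i, 10.0 + i % 7)
    for i in range(10_000)
]
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

# Aggregated view: one result row per state.
aggregated = conn.execute(
    "SELECT state, SUM(sales), SUM(profit) FROM sales GROUP BY state"
).fetchall()

# Non-aggregated scatter plot: one result row per record in the filter.
detail = conn.execute(
    "SELECT sales, profit FROM sales WHERE state = 'Ontario'"
).fetchall()

print(len(aggregated), len(detail))  # prints "2 5000"
```

With millions of rows in the source view, the detail query has far more data to transfer and render, which is consistent with the load times we observed.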
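For readers unfamiliar with the calculation behind a Pareto chart, here is a minimal Python sketch of the arithmetic: rank items by sales, accumulate their share of the total, and check how much the top 20% of items contribute. The business unit names and figures are made up for illustration, and this is not the Tableau calculation itself.

```python
def cumulative_shares(sales_by_item):
    """Return (item, cumulative share of total sales), largest item first."""
    total = sum(sales_by_item.values())
    ranked = sorted(sales_by_item.items(), key=lambda kv: kv[1], reverse=True)
    running = 0
    result = []
    for item, value in ranked:
        running += value
        result.append((item, running / total))
    return result


def top_share(sales_by_item, top_fraction=0.2):
    """Share of total sales contributed by the top fraction of items."""
    values = sorted(sales_by_item.values(), reverse=True)
    top_n = max(1, round(len(values) * top_fraction))
    return sum(values[:top_n]) / sum(values)


# Hypothetical sales per business unit.
demo = {"A": 500, "B": 300, "C": 50, "D": 40, "E": 30,
        "F": 25, "G": 20, "H": 15, "I": 12, "J": 8}

for item, share in cumulative_shares(demo):
    print(item, f"{share:.0%}")
print("Top 20% of units:", f"{top_share(demo):.0%}")
```

In this made-up data set, the top two of ten units account for 80% of sales, which is the pattern the Pareto chart is meant to confirm or refute.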
- Propagation of semantics:
To improve query performance, it is best practice to push most of the calculations and filters down to the HANA layer. Here, we found that the input parameters and variables defined in the calculation views were propagated to the data loading screen of Tableau.
Unfortunately, hierarchies created in the HANA layer do not propagate to Tableau in any form; we must create them separately in Tableau. Also, while it is easy to create a multi-level hierarchy (country -> region -> state) in Tableau, creating a parent-child hierarchy (employee -> manager) is not possible.
This Tableau on HANA use case works well when customers want to keep a simple architecture, have a HANA developer who understands the business requirements, and have business users who want to see aggregate-based summary visualizations on the fly. We ran the same scenario with Tableau Server instead of Tableau Desktop and, unfortunately, did not find any performance improvement. HANA can work as a relational as well as an OLAP database, which is why we would have loved to see an OLAP connection rather than only ODBC to leverage HANA’s capabilities. Granted, the hierarchies in HANA do not propagate to the upper layer, but with Tableau’s ease of use, business users will find it easy to explore and tweak the data and create sophisticated self-service dashboards that run on a live connection.
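At the SQL level, HANA accepts input parameter values through a PLACEHOLDER clause on the view, while variables are typically applied as ordinary WHERE filters. The Python sketch below builds such a statement. The view name, column, and the parameter IP_YEAR are hypothetical, and the helper is only an illustration of the syntax, not what Tableau generates internally.

```python
def query_with_input_parameters(view, columns, params):
    """Build a SELECT that passes input parameters via HANA's PLACEHOLDER syntax.

    view    -- already-quoted catalog name of the column view
    columns -- column names to select
    params  -- mapping of input parameter name to value (as strings)
    """
    select_list = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join(
        # Single quotes inside a value are doubled, as in standard SQL.
        "'PLACEHOLDER' = ('$${}$$', '{}')".format(name, str(value).replace("'", "''"))
        for name, value in params.items()
    )
    return f"SELECT {select_list} FROM {view} ({placeholders})"


# Hypothetical view and input parameter names.
print(query_with_input_parameters(
    '"_SYS_BIC"."comerit.sales/CV_SPARE_PARTS_SALES"',
    ["STATE", "SALES"],
    {"IP_YEAR": "2017"},
))
```

Because the parameter is evaluated inside the view, the filtering happens in HANA before any rows reach the client.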