Author's note: After the book was published in 2020, several Tableau friends offered to help me translate it into English. The plan itself came as a surprise, but I was glad to see it happen, and through this collaboration the first two chapters have now been translated. The figures, however, still need my own processing, which will take some time.

Whether or not it is eventually published, I will post the content on this blog as the translation and revision progress.

I hope it helps you all.

——喜乐君 (Michael Wu)

Data Visualization Analysis: Principles and Practice of Tableau

Michael Wu

This book systematically presents the principles and practice of Tableau Prep Builder and Tableau Desktop. With an emphasis on visual analysis and Tableau computing, it introduces in detail how to understand the levels of data, how to collect and prepare data with Tableau Prep Builder, and how to perform agile data analytics and advanced interactions with Tableau Desktop, especially Tableau's various kinds of calculations, thereby realizing unlimited business analytics with limited data.
The method of analyzing by levels of data and question runs through the whole book and is illustrated with examples. It suits not only beginners who wish to learn Tableau systematically, but also intermediate and advanced Tableau analysts.

Contents

Volume 1 From Question to Chart: Tableau Visualization

Chapter 1 Visual Analytics: The Gate of Reason and Intuition to Big Data Times
1.1 Data Pyramid: How Far Is It from Data to Decision-Making?
1.2 Intuition Prior to Reason: Visualization Psychology
1.3 Tableau: Van Gogh of Big Data Times
1.4 Tableau Roadmap

Chapter 2 Data Visualization: Concepts and Basics
2.1 Review: From the Data Perspective of Excel to the Question Perspective of Tableau
2.2 Resolution: Question Resolution and the Method of Level Analysis
2.2.1 Question Resolution: Dimensions Describe the Question and Measures Answer It
2.2.2 Aggregate Resolution: Visualization as an Aggregation Process from Row Level to View Level
2.2.3 Level Resolution: Two Level Categories and Their Relations with Data Preparation and Calculation
2.3 Tableau Basics: Field Categorizations, Aggregation Methods and Visualization Logic
2.3.1 The First Field Categorization and Its Data Types
2.3.2 The Second Field Categorization: Discreteness and Continuousness
2.3.3 Tableau Aggregation: Aggregation Methods for Dimensions and Measures
2.3.4 Tableau: A Path from Basic Visualization to Enhanced Visual Analytics
2.3.5 Summary: Business-Oriented Question Resolution and Tableau Desktop's Visualization Logic

Chapter 3 Data Prep: Cleanse Data and Adjust Structures with Prep Builder
3.1 Basic Operations of Prep Builder
3.2 Elementary Field Collection: Data Cleansing and Filtering
3.2.1 Data Splitting
3.2.2 Data Grouping
3.2.3 Filters
3.2.4 String Cleansing
3.3 Intermediate Structure Adjustment: Data Transformation
3.3.1 Transforming Columns into Rows in Prep Builder and Desktop
3.3.2 Transforming Rows into Columns in Prep Builder
3.4 Advanced Structure Adjustment: Data Aggregation
3.4.1 Necessity and Usage of Aggregation: Single-Level Aggregation
3.4.2 FIXED LOD: Independent-Level Aggregation
3.4.3 Notes on Prep Builder's Aggregation
3.5 Advanced Computing: Rank Calculations in Prep Builder
3.5.1 Single-Dimension Rank Calculation
3.5.2 Rank Calculation of Fields with Partitions
3.5.3 Row-Level Ranking and Dense Ranking
3.6 Prep 2020.3 Upgrade: Incremental Updates and Writing into Databases

Chapter 4 Data Prep: Data Merge and Data Modeling
4.1 Row-Level Merging: Union, Join, and Desktop's Methods
4.1.1 Data Union
4.1.2 Data Join
4.1.3 Similarities and Differences between Union and Join
4.2 View-Level Merging: Data Blending and Desktop's Methods
4.2.1 Blending Data with Desktop
4.2.2 The Logic of Data Blending and Its Similarities and Differences with Join
4.3 The Culmination of Data Merging: From Data Join and Blend to Data Relationships
4.3.1 Relations between Merge, Join and Blend, and Their Advantages and Disadvantages
4.3.2 From Physical Table/Level to Logical Table/Level: The Background and Distinctiveness of Data Relationships
4.4 Constructing Data Models with Data Relationships
4.4.1 Data Relationships: Constructing Matches between Physical Layers of Different Levels
4.4.2 Data Models: Adding Physical Relations to Logical Ones
4.4.3 Improving the Performance of Data Models (I): Relation Types
4.4.4 Improving the Performance of Data Models (II): Referential Integrity
4.4.5 From Data Merging to Data Models
4.5 Merging Data with Prep Builder
4.5.1 Union Data with Prep Builder
4.5.2 Join Data with Prep Builder
4.5.3 Blend Data with Prep Builder: Aggregation + Join
4.6 The Comprehensive Application of Data Prep
4.7 Quickly Merge and Collect Excel Data with Prep Builder
4.8 How to Use Prep Builder Elegantly
4.8.1 Thinking and Questions Prior to Data
4.8.2 Thinking in Levels Is Critical
4.8.3 Specialties: Matching and Cooperating with Other Tools

Chapter 5 Visual Analytics and Exploration
5.1 Three Steps of Tableau Report Visualization
5.1.1 Organizing Fields: Understand the Independent Level Structure in the Database Table
5.1.2 Worksheet: Achieve Data Visualization According to the Level Structure of Fields
5.1.3 Dashboard: Explore Relations between Different Kinds of Data
5.2 Tableau's Relational Analytics in Complicated Business Questions
5.2.1 Multiple Data Analytics: The Uniqueness of Every Row in the Database Table
5.2.2 Instantaneous Computing: Perfect the Analytical Model with Calculated Fields
5.2.3 Explain Data: AI-Driven Intelligent Relational Analytics
5.3 How to Choose a Framework of Visualization Charts
5.3.1 Common Question Types and Charts
5.3.2 From Simple Visualization to Complex Visualization
5.4 Advanced Visualization Functions
5.4.1 Measure Names and Measure Values: Juxtapose and Compare Multiple Measures
5.4.2 Double-Bar Chart: Sales Amount and Profit of Every Subcategory
5.4.3 Stacked and Overlapped Measures: Overlapped Comparison of Multiple Measures
5.4.4 Aggregating and Disaggregating Measures
5.5 Enhanced Analytical Technologies of Visualization
5.5.1 Common Filters and Their Priorities
5.5.2 Sets
5.5.3 Parameters
5.5.4 Groups and Level Structures
5.5.5 Sorting: Ordering Data According to Some Rule
5.5.6 Reference Lines, Reference Bands, Distribution Bands and Box Plots
5.6 Format Setting
5.6.1 Highlight Measure Values Through Label Settings
5.6.2 Advanced Settings of Tooltips
5.6.3 Other Common Settings

Chapter 6 Visualization of Geographical Locations
6.1 Introduction to Tableau's Geographical Analysis
6.2 Symbol Maps and Filled Maps
6.3 Scatter Maps and Heat Maps
6.4 Path Maps
6.5 Spatial Functions
6.6 Combining Maps and Shapes: Custom Shapes and the HEX Function
6.7 Tableau 2020.4 Upgrade: Overlaying Multiple Map Layers

Chapter 7 Chat with Data: Information Presentation and Advanced Interaction
7.1 More than Data: From Sheet to Dashboard
7.1.1 Dashboard: A Visualization Tangram
7.1.2 Precise Design and Layout
7.1.3 Collapsible Toolbars Save More Space
7.1.4 Multi-Device Design and Big-Screen Design
7.2 Story: Build Your DataPoint
7.3 Visualization Interaction: Chat with Data
7.3.1 Multiple Filtering and Common Filters
7.3.2 Pagination and Animation
7.3.3 Highlighting
7.4 Advanced Interaction: Dynamic Parameters and Parameter Actions
7.4.1 Example: Update a Measure with a Parameter
7.4.2 Example: Update a Measure with an Operation
7.4.3 Example: Dynamically Control a Reference Line with a Parameter Action
7.4.4 Example: Expand an Intended Category with a Parameter
7.5 The Summit of Advanced Interactions: Set Actions
7.5.1 Example: Sales Amount Percentage of a Chosen Province
7.5.2 Example: Sales Percentages of a Designated Province's Various Categories of Goods
7.5.3 Example: Sales Differences of Various Provinces Compared to a Designated Province
7.5.4 Example: Sales Trend of a Designated Province by Date
7.5.5 A Critical Principle: Priorities of Tableau's Various Operations
7.5.6 Advanced Example: A Custom Matrix Built with Multiple Set Actions
7.5.7 Skill: Combining Sets, Level Structures and Tooltips
7.6 Making Set Actions More Effective: Incremental Updates and Set Control
7.6.1 Add and Remove Set Actions
7.6.2 Set Control: A Set Becomes a Multi-Value Parameter
7.7 Suggestions for the Usage of Advanced Interactions

Volume 2 From Limited to Unlimited: Tableau Computing

Chapter 8 Tableau Basic Computing: Principles and Method
8.1 Levels of Question and Types of Computing
8.1.1 Learning the Basics of Big Data via Excel: Row-Level Computing and Aggregate Computing
8.1.2 From Excel's Pivot Table to Tableau's View Computing
8.2 Row-Level Functions and Their Uses
8.2.1 Usage Scenarios of Row-Level Functions
8.2.2 String Functions
8.2.3 Date Functions
8.2.4 Number Functions
8.2.5 Type Conversion Functions
8.2.6 An Advanced Kind of String Function: Regex Functions
8.3 Aggregate Functions
8.4 Logical Functions
8.4.1 The IF Function
8.4.2 The IIF Function
8.4.3 The CASE WHEN Function
8.4.4 Other Logic-Simplifying Functions
8.5 Differences and Principles of Row-Level and Aggregate Functions
8.5.1 Advanced Example: Profit Levels and Profit Structure Analysis of Various Categories
8.5.2 Principles: Substantial Differences between Row-Level and Aggregate-Level Expressions

Chapter 9 Tableau Advanced Computing: Table Computing
9.1 Introduction to Multi-Level Analysis and Advanced Computing
9.1.1 A Representative Table Computing Function: WINDOW_SUM
9.1.2 A Representative Special LOD Expression: FIXED LOD
9.1.3 Categories and Differences of General LOD Expressions
9.2 Specialties and Principles of Table Computing
9.2.1 Special Principles of Table Computing
9.2.2 Specialties of Table Computing: How Dimensions Participate in the Computing Process
9.2.3 Two Methods to Designate Directions
9.3 Functions and Examples of Table Computing
9.3.1 The Most Representative Function: LOOKUP and Difference Computing
9.3.2 The RUNNING_SUM Function: Running Sum Computing
9.3.3 Example: Table Computing with LOOKUP and RUNNING_SUM (TC5)
9.3.4 The Most Important Table Computing: The Window Sum Function WINDOW_SUM
9.3.5 An Elementary Example of WINDOW_SUM: Weighted Computing and Sum Percentages (TC6)
9.3.6 An Intermediate Example of WINDOW_SUM: Differences Relative to a Chosen Subcategory
9.3.7 An Advanced Example: Percentage Differences Relative to an Arbitrary Date (TC1)
9.3.8 Table Computing with Parameters
9.3.9 INDEX and RANK: Order Computing
9.3.10 Example: Sales Growth Based on a Common Baseline Date (the INDEX Function) (TC2)
9.3.11 Example: The RANK Function Varying by Date (TC4)
9.3.12 Statistical Table Computing and Third-Party Table Computing
9.3.13 Quick Table Computing
9.4 Settings for Advanced Table Computing
9.4.1 Example: Nested Table Computing of Iterative Aggregations
9.4.2 Example: Principles of Depth Priority for Multi-Direction Fields
9.5 Comprehensive Example: How to Build a Pareto Chart
9.6 Comprehensive Example: Table Computing as a Filter
9.7 New Features of Prep Builder 2020: Computing Ranks at a Specified Level

Chapter 10 Advanced Computing: Special LOD Expression
10.1 Specialties and Principles of LOD Expressions
10.2 Grammar of LOD Expressions
10.3 Three Types of FIXED LOD Expressions
10.3.1 Detail Levels of a Higher Aggregation Degree than the View
10.3.2 Detail Levels of a Lower Aggregation Degree than the View
10.3.3 Aggregations Independent of the View
10.3.4 An Explanation of the Principles of the Three Kinds of Grammar
10.4 INCLUDE/EXCLUDE LOD Expressions
10.4.1 EXCLUDE LOD Realizes Higher-Level Aggregations
10.4.2 INCLUDE LOD Realizes Lower-Level Aggregations
10.4.3 Computing Logic and Priorities of FIXED, EXCLUDE and Table Computing
10.5 How to Choose the Type of Advanced Computing: Level Analysis
10.5.1 Four Steps of Advanced Analysis
10.5.2 A Brief Example of the Four Steps of Advanced Analysis
10.6 Advanced Application: Nested LOD Expressions
10.6.1 Example: Completing a Nested LOD with the Four Steps of Analysis
10.6.2 Variations of Nested LOD Expressions
10.7 Advanced Analysis Model: The Membership RFM Analytical Model
10.7.1 The Membership RFM-L Index System
10.7.2 Common Perspectives of Membership Analysis
10.7.3 Purchase Frequency Analysis of Members (LOD15-1)
10.7.4 Matrix Analysis (LOD15-2)
10.7.5 New Customer Acquisition Rate (LOD15-3)
10.7.6 Customer Numbers by Repurchase Interval in Various Date Spans (LOD15-10)
10.7.7 Yearly Purchase Frequencies of Various Customer Matrices (LOD15-15)
10.8 Cross Purchases of Goods and Shopping Basket Analysis
10.8.1 Example: Customer Numbers by Purchase Count
10.8.2 A Super Example: Cross-Purchase Analysis Based on Shopping Baskets in Orders
10.9 Best Practices of Advanced Computing
10.9.1 Which Locations in the View Decide the Level of Detail
10.9.2 How Various Kinds of Computing Constitute Parts of the View
10.9.3 How to Choose Computing Types and Priorities

Volume 3 From Visualization to Big Data Analyzing Platform

Chapter 11 Tableau Server as A Data Platform
11.1 Agile BI Speeds up the Flow from Data Assets to Valuable Decision-Making
11.2 From Desktop to Server Publishing: Analytical Model Automation
11.3 Prep Outputs to Server or Database Tables: Data Flow Automation
11.4 Data Management: From Complicated Data Preps to Deep Business Analytics

Chapter 12 To Ensure Data Security: Security System of Tableau Server
12.1 A Recommended Permission System for Tableau Server
12.1.1 Set Permissions Based on Groups and Projects
12.1.2 Lock Permissions in Projects (If Necessary)
12.2 Row-Level Data Security Management: User Filters and User Functions
12.3 Permission Evaluation Rules of Tableau Server

Volume 1. From Question to Chart: Tableau Visualization

Chapter 1. Visual Analytics: The Gate of Reason and Intuition to Big Data Times

Keywords: Data Pyramid Model, Analysis and Decision-making, Intuition, Visualization Psychology

The purpose of data analysis is decision-making, which requires data analysts and decision-makers to acquire external data as fast as possible and analyze it effectively. Visual analysis represents data through directly perceived channels such as position, color, length, shape and size, helping us recognize critical information in the data, find the logical relations behind it, and, furthermore, make business decisions.

In this chapter, the author introduces the critical background of data analysis, such as the data pyramid model, the data decision-making process, and visualization psychology, illustrated with real business decision-making processes. In this book, "business analysts" refers to data analysts in business departments, and "IT analysts" to those in information or IT departments.

1.1 Data Pyramid: How Far Is It from Data to Decision-Making?

Thanks to the rapid development of computer science and technology, humans have jumped from an era of scarce data into one of data explosion, and a question follows: how can data and information be transformed into knowledge that assists decision-making?

Before the 1990s, decision support had not yet matured; then relational databases appeared, and data mining and visual analysis emerged. In 1989, the concept of BI (Business Intelligence) appeared in a report by the famous consulting firm Gartner: "concepts and methods to improve business decision making by using fact-based support systems." Since then, the concept of BI has gradually come into official use. Nowadays, Gartner's "Magic Quadrant for Business Intelligence and Analytics Platforms" is the bellwether of the whole industry. The protagonist of this book, Tableau, has appeared in the Leader quadrant for 8 years in succession, and will no doubt remain a leader of the agile business intelligence industry.

Led and driven by the internet economy, many Chinese enterprises have begun to invest more in IT software and hardware and in data collection and storage, leading to exponential growth of enterprise data. But data does not necessarily mean value; it is analysis and decision-making that create value. The author loves the following words of his favorite professor, Peter Drucker:

So far, the only result of our system is data, not information, let alone knowledge. (original text needed)

Then what are data, information and knowledge? A vivid example is given by Zipei Tu (涂子沛) in his book The Big Data [quotation]. "185", "Obama" and the like are only isolated data; they become valid information only when put into specific contexts, such as "Obama is 185cm tall". From a large quantity of data, regular patterns can be found, such as the average height of American adults, and this is where industrial knowledge emerges. So data is only the raw material of analysis, while knowledge is the final product of data analysis and the critical basis for decision support.

Data has no value by itself; value results from the integrated process of data collection, analysis and processing, and human intelligence and experience are the most important catalysts in data analysis. From data to information, and from information to knowledge: these are the three most important layers of the data pyramid. Adding wisdom on top of this three-layer model yields the standard DIKW model, shaped like a pyramid (see Figure 1-1). The model clearly describes the transformation from data into information and from information into knowledge, which can be understood as an increase in "knowledge density". A single A4 page cannot hold one day of business data of a listed company, but it can present a valuable brief (information) to investors.

Figure 1-1 DIKW Data Pyramid Model

An article in The Economist says that "the most valuable thing in the 21st century is not oil, but data." [quotation] The process of data analysis can be likened to oil exploration, extraction and refining, as shown in Figure 1-2.

Data: symbols used to record facts, such as numbers, units and degrees. Before being collected or used for understanding, they are useless: like an unexamined life, or unmined oil underground, unanalyzed data exists but has no worth. Computer systems record data in fields, which correspond to the basic concepts and drag-and-drop logic of Tableau; more details are given in Chapter 2.

Information: information is data combined with logic, most of it structured, for example, "95# gasoline costs 7.6 RMB per liter." Through information we can understand the world and the relationships behind data, which is called "know-what" (knowing what the facts are). The transformation from data into information is like exploring for oil and mining it from underground; its basis is data collection and preparation, covered in Chapters 3 and 4, corresponding to part of the functionality of Tableau Prep Builder and Tableau Desktop.

Knowledge: knowledge consists of opinions extracted from various data and information, so it varies from person to person. Unlike information, knowledge guides business decision-making and action directly and produces value directly, and so can be called "know-how" (knowledge and action in harmony). Just as oil is refined into gasoline and gasoline powers cars, knowledge is the key product of data analysis.

Wisdom: deep understanding and empirical insight constitute wisdom. With it, data can be analyzed, facts can be understood, and causes can be seen; this is sound logic, i.e. "know-why". In every company there are business leaders and managers who can read industry trends from minor data clues, foresee how things will develop from small beginnings, and make good judgements about the future. Behind all such wisdom and insight lies a more abstract and insightful data logic and knowledge system.

In short, data analysis is the process of extracting information from data, summarizing it into knowledge, accumulating insight, and then guiding decision-making.

In enterprises, every layer of the DIKW model has its own data holders. The data layer corresponds to IT personnel (managing and maintaining data), the information layer to analysts (IT analysts or business analysts), the knowledge layer to business managers (who make decisions based on data), and the wisdom layer to senior managers and CEOs (who lead business managers to see the data and the future).

As data explodes, the focus of enterprise data work has shifted from "how to collect more data" to "how to make analyses that contribute more to decision-making." The predominant tension in decision analysis is that the IT analysts who own the data do not understand the business logic, while it is difficult for the business managers who actually make data-driven decisions to master data analysis methods. Recognizing this, more and more enterprises, from burgeoning internet companies to traditional pharmaceutical ones, are transferring data analysis work from IT departments to business departments, and even building dedicated data analysis teams within business departments.

Accordingly, in its 2019 BI analysis report Gartner wrote: "" [original text and quotation needed]. Nowadays this prediction is gradually coming true. Economic crises have urged enterprise leaders to put more weight on data analysis, using it to reduce the cost of trial and error.

For business analysts and managers, visual data analysis is a shortcut into the big data era, because it accords with the basic logic of human decision-making, which is both intuitive and rational.

1.2 Intuition Prior to Reason: Visualization Psychology

Human history stretches back millions of years, of which written history spans less than ten thousand years. Over those millions of years, humans, like other animals, faced sudden dangers at any moment, so judgements had to be fast and agile to avoid death. Even today, in everyday matters of dressing, travelling, dodging cars, family conversation, and even mobile games, more than 80% of decisions still depend on sensory signals from the eyes, ears, nose, tongue and body, without deep thinking by the brain; this is not only safe but also energy-saving. The famous psychologist and Nobel laureate in economics Daniel Kahneman showed clearly in Thinking, Fast and Slow that human decisions always combine intuitive thinking and rational thinking, and he thoroughly examined decision-making in economic activities under this model. Decision-making in the economic field can be considered a goldmine of data analysis, integrating rational thinking and intuitive reflexes into a whole. [quotation needed]

There are many classic visualization examples around us. Human figures are commonly used in the washrooms of airports and high-speed railway stations to distinguish men's rooms from women's, and arrows to indicate directions. At every crossroad in the world, red and green lights direct the traffic. The logo, as a figure, is universally adopted by companies and NPOs, such as Huawei's "petals" and Apple's "apple". Behind these widely used symbols is a typical application of visual elements, fitting consumers' intuitive thinking process.

After becoming civilized, humans kept this "intuitive decision-making system", reducing the cost of decisions about similar, frequently occurring events. Civilized society, meanwhile, is characterized by the rapid development of "rational decision-making": we depend on logic, reason and deep knowledge more than on eyes, ears, nose, tongue and body, going beyond former cognitive limits, developing vaccines to fight viruses and launching rockets to explore outer space. Intuitive judgement and rational thinking together form the human decision-making system, and the basic process of data analysis.

As data explodes in the big data era, data noise grows ever louder, and expressing information rapidly and effectively becomes the key to data analysis. The crosstab and the simple figure lose their power, and analysts must use better representations. Data visualization is the best window: with enough "knowledge density", it shows the highlights of data visually. Intuitive judgement is a guide for rational thinking, and visualization is the interpreting language of the big data era.

Choosing the fittest "preattentive attributes" is the key to visual analytics. Modern psychology refers to signals such as colors and shapes, which trigger psychological reflexes, as "preattentive attributes". They are processed subconsciously, taking only about 0.25 seconds to recognize, and so serve as the best entry point for visual analytics. The main preattentive attributes are shown in Figure 1-3.

Figure 1-3

The first preattentive attribute is position: the most important information should be put in the most prominent places. For example, enterprise data briefs present revenue and growth rates at the top of the first page.

The classic example of "winning by position" is the Hindu-Arabic numerals. After the Indian decimal system and the number 0 were introduced into Europe, Roman numerals were used less and less, one of the main reasons being that Hindu-Arabic numerals can express "big" numbers simply with 0 and the decimal system, making numbers understandable without deep thinking. For example, when expressing "1888", the logic of Hindu-Arabic numerals is that the further left a digit sits, the greater its weight; whenever a place exceeds ten, one is carried to the place on its left. Expressed in Roman numerals, the same number requires the string MDCCCLXXXVIII. The Roman numeral system has no positional carrying system; position cannot tell magnitude, which can only be determined by calculation (see Figure 1-4). Can you imagine preparing a financial report in Roman numerals?
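The contrast can be made concrete with a short Python sketch (an illustrative addition, not from the book). The greedy conversion below shows that Roman numerals merely accumulate symbol values, with no positional carrying, which is why the string for 1888 grows to thirteen symbols while the decimal form stays at four digits:

```python
# Greedy conversion from Hindu-Arabic to Roman numerals.
# Roman notation has no positional carrying system: a number's value
# is the sum of its symbols, not a function of their places.
ROMAN = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]

def to_roman(n: int) -> str:
    """Convert a positive integer to Roman numerals greedily."""
    out = []
    for value, symbol in ROMAN:
        while n >= value:          # repeatedly subtract the largest value
            out.append(symbol)
            n -= value
    return "".join(out)

print(to_roman(1888))  # MDCCCLXXXVIII (13 symbols for a 4-digit number)
```

Reading the result back requires summing every symbol, exactly the "calculation" the paragraph above describes, whereas a decimal reader judges magnitude at a glance from position alone.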

Next to position, the most commonly used attributes are color and size. Size includes length, height, and even angle and area, applied in bar, column, pie and treemap charts (see Chapter 5).

In traditional data analysis, the bar, line and pie charts are the three elementary ones; big data analytics, however, emphasizes the macroscopic traits, distribution patterns and interactive relations of big data samples, hence the more advanced histogram, box plot and scatter charts. All these charts extend information in depth and hierarchy through visual channels such as length, color and shape. As Figure 1-5 shows, in the on-time delivery trend, colors indicate categories, areas indicate quantities, and the lengths of bars indicate delivery time spans, presenting the interrelations of the data.

In the running of an enterprise, decisions are being made, or prepared, anywhere and at any time. Intuitive thinking is fast, but not suited to complicated questions; rational thinking is dependable, but its efficiency is low. Visual analytics simplifies data expression via visual elements, saves scarce cognitive resources for rational thinking, and strikes a balance between the two.

In a sense, visual data analytics is a combination of art and science, of fast and slow thinking. It helps us recognize data clues within varied, fast-moving and huge data, then test hypotheses with experience and thinking, and thus reach the goal of assisting decision-making with data.

Under these conditions, we need a truly agile tool of visual data analysis, so that every business operator can analyze data at any time. Visual agile BI analytics helps more people, especially business analysts and managers, collect and analyze data, thereby improving the efficiency of decision-making.

Tableau follows this trend of the times, developing rapidly from a simple visual analysis tool into an enterprise-level platform for visual data analysis. Tableau Prep Builder helps users prepare data faster and better, Tableau Desktop transforms data into information and knowledge through drag-and-drop analysis, calculations and interactions, and Tableau Server helps enterprises transform individual knowledge into organizational knowledge, adding value to data analysis. Continuous use of BI products such as Tableau enhances our understanding of data and helps enterprises become "data-driven organizations" (see Figure 1-6).

The author has experienced many industries and jobs, and his memories of state-owned enterprises and private retail companies are still fresh. In the face of ever-increasing competition, whether the aim is improving efficiency or reducing cost, in state-owned enterprises or private ones, the gut-feeling "captain's call" has become a highly risky way of making decisions; more and more decisions depend on data analysis to be verified or falsified. Through BI tools, enterprises can transform data into assets, drive business progress with data analysis, and take the lead in business growth and market competition.

1.3 Tableau: Van Gogh of Big Data Times

We know many tools of data storage and analysis: spreadsheets such as Excel and WPS, and database software such as SQL Server, Oracle and PostgreSQL. They appear in all kinds of scenarios, from mobile contact lists to ERP systems. The more data there is, the more difficult data analysis becomes. Amid ever-intensifying competition, data analysis should be handed over to the groups that master the data logic and make the final decisions. Technology is only a helping hand for data exploration; experience and knowledge are the best catalysts for data analysis. As Tableau urges, "whoever raises a question is responsible for answering it." [original text needed]

The only obstacle in front of the business personnel who own data and experience is the lack of proper tools. Simple data analysis tools are no longer satisfying, and SQL is too abstract and difficult to serve as a basic, appropriate tool.

All is ready except what is crucial. As business personnel become the main force of data analysis, we need a brand-new, intuitive, rapid, agile and easy-to-use data processing technique. Tableau originally created VizQL, riding the waves of the times, developing rapidly and becoming widely used. It retains the compound query capabilities of SQL while taking into account business users' need for agility.

"Tableau helps people see and understand data."

VizQL wove a new suit of clothes for SQL that anyone can wear, putting the complicated technique into a black box. As Figure 1-7 shows, simple drags and drops can generate visual charts, deepen analysis, filter data samples, create calculated fields, and even build composite dashboards and publish data stories. This book introduces the creation of simple visualization charts in Chapter 2, and advanced ones in detail in Chapter 5.

Tableau is fully worthy of the name "Van Gogh of the big data times", especially in the field of data visualization.

The author, an office worker without any technical background, once searched a market full of BI products for a proper tool for his job, and finally chose Tableau, which has never disappointed him: it improved the efficiency of his retail analysis and market pricing, and accumulated the material for this book. Now Tableau, once a tool of visual analytics, has grown into an enterprise-level platform of visual data analysis, and keeps making progress in NLP and AI.

On the one hand, Tableau has been continuously extending and enriching its product line.

As Figure 1-8 shows, Tableau added Tableau Prep Builder alongside the Tableau Desktop visualization tool, making up for the shortage of agile data preparation (2018), and extended the Tableau Server platform with data governance and large-scale management (2019). The author's love for Tableau Prep Builder is even more vehement than his early love for Tableau Desktop, since the former has saved him so much time.

Tableau Prep Builder 2020.3 supports writing workflow results into databases, making it a truly agile ETL tool. In every corporate training or implementation program, the author has always urged people to learn Tableau Prep Builder, for the purpose of changing the way they work with data. Compared with complex and profound visual analysis, data preparation always produces useful results immediately.

On the other hand, Tableau, once an agile visual analysis tool, has grown into a platform for data visualization analysis.

As figure 1-9 shows, Tableau supports a wide range of distribution strategies and a large variety of interaction channels. Its modular product combinations, feature iterations that keep pace with the times, and increasingly open APIs have made it one of the most important participants of the big data era.

figure 1-9

In 2019, Tableau began to empower BI with AI, successively releasing such intelligent features as "Ask Data", "Explain Data" and intelligent view recommendations. Many clients fell in love with "Explain Data" at first sight: backed by automatic Bayesian statistics, it helps them query and understand unexpected values and their possible causes in the data, promoting hypothesis and test before decision-making.

This book will introduce multiple upgrades since Tableau 2020.1, including map buffer calculations (in Chapter 6), animations, dynamic parameters and so on. The most important feature of 2020 should be the "data model" in Tableau 2020.2, which upgrades the join- and blend-based way of combining data with aggregation-aware relationships; next come database write-back and the predictive functions in version 2020.3.

1.4 Tableau Roadmap

In choosing and learning BI analytical tools, enterprises and data analysts have obviously different considerations. Enterprises wish to "exchange money for time", seizing market opportunities through effective data-driven decision-making; data analysts wish to "exchange time for money", creating value through analysis and income through value.

As a matter of fact, Tableau is highly accessible yet profound; data analysts can therefore build their professional moat in the field of analysis with Tableau, and their own long-term market competitiveness through continuous learning.

The analyst Yvan Fornes drew a roadmap for rapid Tableau learning. The author made minor modifications, as figure 1-10 shows.

figure 1-10

Many people become ardent at their first encounter with Tableau, even with a sense of déjà vu. But without systematic learning, it may be difficult to upgrade from data statistics to data analytics.

This book is a summary of the author's years of learning and training. In a short time, readers can completely learn and understand its main topics of data connection, visualization concepts, data visualization and calculations, as well as the thinking methods and common models of enterprise analytics.

Please learn according to figure 1-10 and the contents of this book; besides that, the following points deserve attention.

  • The most substantial mental leap from Excel to Tableau is thinking in levels.

The biggest obstacle in learning Tableau is the strong habit of using it to reproduce Excel-style reports. True analytics should visualize data logic with visualization methods, with crosstabs only as an auxiliary aid.

Data analysis emphasizes questions far more than the data itself. Agile tools present questions of different difficulties at different levels, and understanding these levels is the basis of visualization and advanced calculations. Level analysis runs through this book and appears in every chapter; see chapters 2, 5 and 8 for details.
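A tiny illustration of why levels matter, using hypothetical data in pandas: a profit ratio computed at the row level and then averaged differs from the same ratio computed at the view (aggregate) level, which is exactly the distinction the level-analysis method formalizes.

```python
import pandas as pd

# Two hypothetical orders with very different sizes
orders = pd.DataFrame({"sales": [100.0, 400.0], "profit": [50.0, 40.0]})

# Row level: compute a ratio per row, then average the ratios
row_level = (orders["profit"] / orders["sales"]).mean()      # (0.5 + 0.1) / 2 = 0.3

# View level: aggregate first, then compute one ratio
view_level = orders["profit"].sum() / orders["sales"].sum()  # 90 / 500 = 0.18

print(row_level, view_level)
```

The two numbers answer different questions, and choosing the right level is an analytical decision, not a technical one.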

  • Understand the principles first, then grasp the techniques, and then become a master.

"Birds can fly; is it because they have feathers?" It seems so, but it is not. Humans strapped with wings cannot fly into the sky, while planes with steel bodies can. This leap came from humanity's accumulated knowledge of aerodynamics, using the pressure of airflow to create lift.

So it is with learning Tableau. Making simple charts does not mean the ability to analyze data. Only by understanding the principles can one grasp analytical techniques and master the solving of complicated questions. The conceptual basis of this book is detailed in Chapter 2, the analytical process in Chapter 5, and the advanced analytical process in Chapter 6.

  • Techniques are not the difficulty; the difficulty of analysis lies in understanding the business.

Data analyst is said to be one of the ten scarcest professions of the 21st century, not because the techniques are hard to learn, but because the thinking ability to extract knowledge from data and information is hard to cultivate. Every industry and every enterprise has its own systematic analytical framework, indexes and logic. Learning this book is therefore only a starting point of data analysis; the main work is to understand and describe the business logic of industries and enterprises in a data-oriented way with Tableau, which is beyond the scope of this book and must be explored by analysts themselves.

This book tries to introduce, concisely and completely, the Tableau Desktop and Tableau Prep Builder workflow from data preparation and data visualization to data presentation, with an emphasis on the principles of data analysis, data interaction and advanced calculation. It suits not only beginners but also veteran users.

Meanwhile, this book assumes that the localized versions of Tableau Desktop and Tableau Prep Builder have already been installed on your computer; first-time users can obtain a license through the Tableau official website.