

There are some groupings that I think are incorrect or am not sure of just yet, like this one, so to revert I’ll simply uncheck the 330 and remove it from the grouping: Scrolling through the results (changes identified by the paper clip), I can see some wanted adjustments, like this one to Avalon: Since there are so many distinct values (537), I’m going to tell Tableau Prep to take a pass at common character grouping and replacing. Now that I’m happy with my ShipMaker field, I’m going to go through similar steps on the ShipModel field. This is an easy fix in Tableau Prep, simply click on specific field’s drop-down and in the Clean sub-option, selecting Make Uppercase will adjust your values as such: Moving on, I know my boss prefers to see ShipMaker in all caps. Our distinct ShipMaker count is now 62, reflecting the fix made to the inaccurate fields: As you can see, Ford and GMC are fixed and there is no trace of the incorrect values. That’s easy to fix however, I can simply manually group FordGMC into Ford! Once you’re done making changes, clicking Done will essentially commit your groupings. Here we can see that GMCC was automatically re-mapped to GMC (good), but it was unable to combine FordGMC into Ford like I expected: For this field, a common character group and replacement makes the most sense since any “bad” fields are likely a result of bad data entry or concatenation:Īfter I run the common character group and replace cleanse, I can scan through the results and see what Tableau Prep was able to fix for me. Instead of manually grouping and replacing these erroneous values, I’ll let Tableau Prep work its magic and then clean up anything left over. I can see right away that these values should be Ford and Maybach, respectively. If you’re familiar with your data, like I am with Ship Makers, for example, the anomalies will be easy to spot by simply sorting the desired column by count, and then eye-balling the singular values for errors: Since I know this dataset is from a system prone to human error, one of the first things I’ll do is look for abnormal values. Once I have our sales data loaded up, if I click to add step I can see that the Profile pane now shows me a nice summary of the fields: First, click the + sign by Connections and add the Excel file to Tableau Prep, then drag the sheet you want out onto the flow if necessary. Now that I have an idea of what I’m dealing with, I’m going to put this file into a Tableau Prep flow and do some data profiling. Opening the ship_sales_dataset1.xlsx file, I can already see some potential issues: Chances are, since my base files are from a variety of different sources and systems, there are going to be some anomalies that need addressing. Before I can build visualizations on these data sources, I need to make sure my disparate data sets contain valid information and can talk to each other.

csv files with similar elements that I need to combine for analysis in Tableau. The year may be 3015, but data management has frozen in time. Sales for the company have struggled as of late, and I need to dig into both mine and my competitors’ numbers and see just how our models are stacking up in price, sales and specifications. The year is 3015, and I am the Sales Manager for Toyota. Unfortunately, that isn’t happening, and sets of data will always need massaging and wrangling.įor this (and forthcoming) Tableau Prep how-to blog post(s), we’re going to be looking at some spaceship manufacturing, sales and pricing data. While I’m dreaming, I’d also love to have a private island and an unlimited supply of coffee ice cream. I long for the day when data arrives clean – no bogus characters, mismatched naming conventions and or even duplicates. My focus for this blog post will be the variety of formidable data cleansing options available in Tableau Prep (TP for short).
