How do you clean data? The steps in cleaning data are parsing, correcting, standardizing, matching, and consolidating.
Now, what is parsing? Parsing locates and identifies individual data elements in the source files and isolates those elements in the target files. Examples include parsing a name into first, middle, and last name. If you have someone's full name, you may need to split it on whitespace into first name, last name, and so on. If you have a street address, you may need to separate the street number from the street name, and other fields may be comma-delimited, etc.
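To make the parsing step concrete, here is a minimal Python sketch. The function names (`parse_full_name`, `parse_street`, `parse_address_line`) are hypothetical, and the rules are simplifying assumptions: names are split on whitespace, a street number is taken to be a leading run of digits, and address fields are assumed to be comma-delimited. Real names and addresses are messier than this.

```python
def parse_full_name(full_name):
    """Split a whitespace-separated name into first, middle, and last parts."""
    parts = full_name.split()  # split() with no argument collapses repeated whitespace
    if not parts:
        return {"first": "", "middle": "", "last": ""}
    if len(parts) == 1:
        return {"first": parts[0], "middle": "", "last": ""}
    # Everything between the first and last token is treated as the middle name.
    return {"first": parts[0], "middle": " ".join(parts[1:-1]), "last": parts[-1]}


def parse_street(street):
    """Split '123 Main Street' into a street number and a street name."""
    text = street.strip()
    number, _, name = text.partition(" ")
    if number.isdigit():
        return {"number": number, "name": name}
    # No leading number found: keep the whole value as the street name.
    return {"number": "", "name": text}


def parse_address_line(line):
    """Split a comma-delimited line such as 'street, city, state' into fields."""
    return [field.strip() for field in line.split(",")]
```

For example, `parse_full_name("Mary Ann Smith")` gives separate first, middle, and last values that can each be loaded into its own target column.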
Correcting is fixing problems. For example, you may need to fix a zip code, a misspelling, or a typo.
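A small Python sketch of the correcting step might look like the following. Everything here is an assumption made for illustration: the zip codes are taken to be 5-digit US codes that lost their leading zeros, and `KNOWN_MISSPELLINGS` is a hypothetical lookup table that you would build from your own data.

```python
# Hypothetical lookup of misspellings seen in the data and their corrections.
KNOWN_MISSPELLINGS = {
    "Chcago": "Chicago",
    "New Yrok": "New York",
}


def correct_zip(zip_code):
    """Restore leading zeros on a 5-digit zip code; return None if it cannot be fixed."""
    text = str(zip_code).strip()
    if text.isdigit() and len(text) <= 5:
        return text.zfill(5)  # e.g. 2134 was stored as a number and lost its zero
    return None  # flag the value for manual review instead of guessing


def correct_city(city):
    """Replace a known misspelling; leave any other value unchanged."""
    cleaned = city.strip()
    return KNOWN_MISSPELLINGS.get(cleaned, cleaned)
```

Returning `None` for values that cannot be fixed automatically keeps the bad rows visible rather than silently passing them through.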
Standardizing means following a standard set of rules for how things are formatted. For example, a value must be 8 characters long or must follow a specific format, and any value that is not formatted that way must be converted.
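As one illustration of standardizing, the sketch below converts dates that arrive in several layouts into a single 8-character `YYYYMMDD` form. The list of accepted input formats and the function name `standardize_date` are assumptions for this example, not a fixed rule. Your own standard may be different.

```python
from datetime import datetime

# Assumed input layouts; extend this tuple to match what your sources contain.
ACCEPTED_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")


def standardize_date(value):
    """Convert a date string in any accepted layout to 8-character YYYYMMDD."""
    text = value.strip()
    for fmt in ACCEPTED_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime("%Y%m%d")
        except ValueError:
            continue  # this layout did not match, try the next one
    return None  # no layout matched, so the value needs manual attention
```

Note that the order of `ACCEPTED_FORMATS` matters when a value could be read more than one way, so the standard should state which interpretation wins.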
Matching is searching for and matching records within and across the parsed, corrected, and standardized data, based on predefined rules, to eliminate duplications. So really the goal here is deduplication.
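Here is a minimal sketch of rule-based matching in Python. The predefined rule is an assumption chosen for the example: two records are treated as the same entity when their name (ignoring case and extra spaces) and zip code agree. The field names `name` and `zip` and the function names are hypothetical.

```python
from collections import defaultdict


def match_key(record):
    """Build the comparison key defined by the (assumed) matching rule."""
    name = " ".join(record["name"].lower().split())  # ignore case and extra spaces
    return (name, record["zip"].strip())


def find_duplicates(records):
    """Return groups of row indexes whose records share the same match key."""
    groups = defaultdict(list)
    for index, record in enumerate(records):
        groups[match_key(record)].append(index)
    # Only groups with more than one row are duplicates.
    return [indexes for indexes in groups.values() if len(indexes) > 1]
```

Exact-key matching like this only works well after the earlier steps, because parsing, correcting, and standardizing are what make equivalent records produce the same key.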
Consolidating, of course, is combining things into one representation. Once you have done all the previous steps, you may find that something stored as 5 or 6 different rows can be represented as a single row.
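Consolidation can be sketched as merging a group of matched rows into one. The survivorship rule used below, keeping the first non-empty value seen for each field, is an assumption for illustration. Other rules, such as preferring the most recent row, are just as valid.

```python
def consolidate(rows):
    """Merge matched rows into one row, keeping the first non-empty value per field."""
    merged = {}
    for row in rows:
        for field, value in row.items():
            # Fill the field only while it is still missing or empty.
            if merged.get(field) in (None, ""):
                merged[field] = value
    return merged
```

For instance, if one row has the email and another has the phone number, the consolidated row carries both.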
About the Author
Syed Sarfaraz has 17+ years of experience with multiple BI and analytical tools and in software development. He has successfully implemented BI solutions for clients across various projects. He also contributes articles to digital-ranking.com. He focuses on solutions that merge current technologies, applications, and concepts to help each client meet their goals.