
Want To Quickly and Easily Identify New Incoming Data? Keep Reading to Find Out How …

For all the data quality developers – are you sure you’re using your standardization techniques efficiently?

Namecandy.com, a website that compiles and analyses new baby names, suspects that parents are seeking Google whacks: names that return only a single hit. The following are some real first names from the top thousand most popular names; let's see how we can identify them as new contacts. What, or who, are new contacts? For this exercise, names that do not exist in the reference tables are deemed to be new data. The focus here is the Token Labeler transformation and the FirstName reference table.

Use a Token Labeler transformation and apply the FirstName reference table in Inclusive mode.

Reference table in ‘Inclusive’ mode

 

Profile the transformation. The new contacts, which are not in the reference table, are identified appropriately as WORD by the Token Labeler.
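If it helps to see the idea outside the tool, here is a minimal Python sketch of the same logic. It is not Informatica code or an Informatica API, only an analogy: the reference set, the `FIRSTNAME` label, and the function names are all assumptions made for illustration, and the sample names other than 'AKIER' are invented.

```python
# Hypothetical stand-in for the FirstName reference table (illustrative values only).
FIRSTNAME_REFERENCE = {"JOHN", "MARY", "DAVID"}


def label_token(token, reference=FIRSTNAME_REFERENCE):
    """Label a token FIRSTNAME if it is in the reference set, otherwise WORD.

    The FIRSTNAME label is an assumption; WORD mirrors the label that
    unmatched names receive in the profile described above.
    """
    return "FIRSTNAME" if token.strip().upper() in reference else "WORD"


def find_new_contacts(names, reference=FIRSTNAME_REFERENCE):
    """Return the names that are absent from the reference set (the 'new data')."""
    return [name for name in names if label_token(name, reference) == "WORD"]


incoming = ["Mary", "Akier", "john"]
labels = {name: label_token(name) for name in incoming}
new_contacts = find_new_contacts(incoming)
```

In this sketch, anything labeled WORD is a candidate new contact, which is the same signal the profile gives you.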

 

Now, let’s say we want to follow up with all contacts that have a specific first name or last name, for example a first name of ‘AKIER’. We can do that by using the Custom Label feature. Create a new custom label as shown.

Profile again, and you can see a label of ‘new_contact Surname’ with a drilldown value of ‘AKIER’.
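The custom label step can be pictured the same way. The sketch below is again plain Python rather than anything from the product: it assumes a custom label is simply a value-to-label mapping that is checked before the reference set, and that a drilldown lists the values behind a label. That precedence, the `FIRSTNAME` label, and every name in the code are assumptions for illustration.

```python
# Hypothetical stand-in for the FirstName reference table (illustrative values only).
FIRSTNAME_REFERENCE = {"JOHN", "MARY", "DAVID"}

# Hypothetical custom label: the value 'AKIER' is tagged 'new_contact'.
CUSTOM_LABELS = {"AKIER": "new_contact"}


def label_with_custom(token, reference=FIRSTNAME_REFERENCE, custom=CUSTOM_LABELS):
    """Apply custom labels first (an assumption), then the reference set, then WORD."""
    key = token.strip().upper()
    if key in custom:
        return custom[key]
    if key in reference:
        return "FIRSTNAME"
    return "WORD"


def drilldown(tokens, label):
    """Return the normalized values that received the given label."""
    return [t.strip().upper() for t in tokens if label_with_custom(t) == label]


incoming = ["Mary", "Akier", "Zyler"]
follow_up = drilldown(incoming, "new_contact")
```

Here `follow_up` plays the role of the drilldown: it holds only the values you asked to track, while other unknown names still fall through to WORD.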

Please post your comments below, send your comments to Informatica University or join our community discussion at the Informatica University LinkedIn Group.

 

 

 

