We continue our article series on data treatment. This time we focus on one last criterion: unification, that is, removing duplicate records.

 

Why do you need to unify your data?

Unification simplifies the management of your databases and helps you achieve the following goals:

  • To know your users better.
  • To reduce duplicate data.
  • To reduce the size of your databases.

 

What is the origin of the duplicate data?

Having duplicates in your database is a clear indicator of the low reliability of your capture or acquisition processes.

To resolve the duplicates you have detected, follow these steps:

  • Detect the active acquisition sources (forms, external services, imports, …).
  • Review those sources and the logic applied during acquisition to determine how they affect your databases.
  • Apply the necessary corrective measures so that the same reliability problems do not recur.
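
As a rough illustration of the first two steps, the sketch below counts how many repeated records each acquisition source contributes. The records, the field names ("email", "source") and the helper duplicates_by_source are all hypothetical and chosen only for this example. It assumes that the first occurrence of a value is the legitimate one.

```python
from collections import Counter

# Hypothetical records; the "email" and "source" field names are assumptions.
records = [
    {"email": "ana@example.com", "source": "web_form"},
    {"email": "ana@example.com", "source": "csv_import"},
    {"email": "luis@example.com", "source": "web_form"},
    {"email": "ana@example.com", "source": "csv_import"},
]


def duplicates_by_source(rows, key="email"):
    """Count, per source, the records whose key value was already seen earlier."""
    seen = set()
    counts = Counter()
    for row in rows:
        value = row[key]
        if value in seen:
            # This record repeats an earlier one: attribute it to its source.
            counts[row["source"]] += 1
        else:
            seen.add(value)
    return dict(counts)


report = duplicates_by_source(records)
print(report)
```

A source that stands out in such a report is a reasonable place to start reviewing the acquisition logic.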

 

How to unify your data correctly?

Unification depends entirely on the criteria applied during the process. These criteria can vary from one database to another, according to the goals for which the database is being created.

Building a database for a telemarketing campaign is not the same as building one to attract customers to an online store, and the priority fields can be very different. For a telemarketing campaign, the landline or mobile phone number can be the decisive field for detecting duplicates; for an online store, the email address is the key field.

You may need to chain several deduplication criteria to unify duplicate records. Beyond the key field, it can be useful to unify further by the following criteria:

  • Postal address
  • Name and surnames
  • NIF (Spanish tax identification number)

You must be clear about the criteria or fields on which you will base the unification. Changing the unification criteria can produce a completely different database and thus affect the actions you carry out afterwards.
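
The effect of the chosen criteria can be seen in a minimal sketch like the one below. The records, the field names and the unify helper are hypothetical and are not part of any specific tool. The sketch assumes that the first record of each group is the one to keep, and that empty values never count as a match.

```python
# Hypothetical records and field names, chosen only for illustration.
records = [
    {"name": "Ana Ruiz", "email": "ana@example.com", "phone": "600111222"},
    {"name": "Ana Ruiz", "email": "a.ruiz@example.com", "phone": "600111222"},
    {"name": "Luis Gil", "email": "luis@example.com", "phone": ""},
    {"name": "L. Gil", "email": "luis@example.com", "phone": "600333444"},
    {"name": "Eva Sanz", "email": "eva@example.com", "phone": "600555666"},
]


def unify(rows, criteria):
    """Keep the first record among those sharing a value on any criterion."""
    seen = {field: set() for field in criteria}
    kept = []
    for row in rows:
        values = {field: row.get(field, "") for field in criteria}
        # A record is a duplicate if any non-empty value was already seen.
        is_duplicate = any(v and v in seen[f] for f, v in values.items())
        # Register every non-empty value, including those of dropped records,
        # so later records that match a dropped duplicate are caught too.
        for f, v in values.items():
            if v:
                seen[f].add(v)
        if not is_duplicate:
            kept.append(row)
    return kept


online_store = unify(records, ["email"])       # email is the key field
telemarketing = unify(records, ["phone"])      # phone is the key field
combined = unify(records, ["email", "phone"])  # both criteria chained

print(len(online_store), len(telemarketing), len(combined))
```

The same five records yield a different result for each set of criteria, which is why the criteria should be fixed before the unification starts.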

 

What other aspects can affect the unification?

To achieve a good unification of your databases, the field values must be normalized beforehand.

Cleaning and formatting the values of the fields used is essential to ensure that the source data has the quality needed to start deduplication.
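
A small sketch of why normalization matters: values that look different in their raw form can turn out to be the same once cleaned. The sample values and the two helper functions are assumptions for illustration only. Real rules depend on your data; for example, this phone rule does not reconcile numbers written with and without a country prefix.

```python
import re


def normalize_email(value):
    """Trim surrounding whitespace and lowercase the address."""
    return value.strip().lower()


def normalize_phone(value):
    """Keep digits only, so '600 111 222' and '600-111-222' compare equal."""
    return re.sub(r"\D", "", value)


# Hypothetical raw values as they might arrive from different sources.
raw_emails = [" Ana@Example.com", "ana@example.com ", "LUIS@example.com"]
raw_phones = ["600 111 222", "600-111-222", "600.333.444"]

unique_raw = set(raw_emails)  # no duplicates detected without normalization
unique_emails = {normalize_email(e) for e in raw_emails}
unique_phones = {normalize_phone(p) for p in raw_phones}

print(len(unique_raw), len(unique_emails), len(unique_phones))
```

Without normalization, the three raw emails look like three different users; after it, only two remain.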

You can find more information in the entry How to treat your data (I): Format.

 

What will we see next?

In the next chapter, we will review all the criteria analyzed so far, so that you can quickly see the options available to achieve the highest quality in your databases.

 

Best,

UProc team


