With this entry, we conclude our article series “How to treat your data”. In this last entry, we summarize all the treatment criteria analyzed previously, so you can understand the necessary stages of data treatment.
Stages for data treatment
As we mentioned in our previous entries, there are four stages for treating any kind of data:
- Formatting or Cleaning
- Validation
- Completion or Enrichment
- Unification or Deduplication
I. Formatting or Cleaning
Cleaning data removes the noise from a value, so that the resulting value fits the characteristics of the treated field.
For example, if we remove the characters that are not allowed in a phone number, we ensure a correct format and the value can then pass a phone format validation.
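As a minimal sketch of this idea, the following Python snippet strips every character that is not a digit, keeping an optional leading "+". The function name clean_phone and the rule "digits plus an optional leading +" are assumptions made for illustration; they are not the UProc implementation.

```python
import re


def clean_phone(raw: str) -> str:
    """Remove noise from a phone value (illustrative rule, not UProc's)."""
    raw = raw.strip()
    # Assumption: a leading "+" marks an international prefix and is kept.
    prefix = "+" if raw.startswith("+") else ""
    # Drop every character that is not an ASCII digit.
    digits = re.sub(r"[^0-9]", "", raw)
    return prefix + digits


cleaned = clean_phone(" +34 (600) 123-456 ")
print(cleaned)
```

Spaces, parentheses and hyphens are treated as noise here; a real cleaning rule may need to handle extensions or country-specific conventions.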
You can get more information on data cleaning by reading the entry How to treat your data (I): Formatting.
II. Validation
Validating a value involves checking whether it complies with the specific rules of its field type.
There are multiple validations available, depending on the rules we want to enforce. The stricter we are with the value at its origin, the more reliable the treated field will be.
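The difference between a loose and a strict rule can be sketched as follows. Both patterns and the function name is_valid_phone are hypothetical examples chosen for illustration; the digit ranges are assumptions, not rules taken from this series.

```python
import re

# Assumption: a loose rule accepts 7 to 15 digits with an optional "+".
PHONE_LOOSE = re.compile(r"\+?[0-9]{7,15}")
# Assumption: a strict rule requires "+" and a non-zero first digit.
PHONE_STRICT = re.compile(r"\+[1-9][0-9]{7,14}")


def is_valid_phone(value: str, strict: bool = False) -> bool:
    """Check an already cleaned phone value against the chosen rule."""
    pattern = PHONE_STRICT if strict else PHONE_LOOSE
    return pattern.fullmatch(value) is not None


print(is_valid_phone("600123456"))               # loose rule
print(is_valid_phone("600123456", strict=True))  # strict rule
```

The same value can pass the loose rule and fail the strict one, which is why the chosen level of strictness determines how much the treated field can be trusted.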
You can get more information about data validation by reading the entry How to treat your data (II): Validation.
III. Completion or Enrichment
Completing data means adding further data related to an initial value. Enrichment gives you better knowledge of an existing record.
To make sure your source data is correct, we recommend applying at least the minimum validation rules; otherwise the enrichment will return unreliable results that do not match reality.
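The sketch below illustrates why validation should come before enrichment: a record is only enriched when its phone passes a minimal check. The lookup table COUNTRY_BY_PREFIX and the function enrich_phone are hypothetical stand-ins for a real enrichment source, assumed here purely for illustration.

```python
# Assumption: a tiny local lookup table stands in for a real enrichment source.
COUNTRY_BY_PREFIX = {"+34": "Spain", "+33": "France", "+44": "United Kingdom"}


def enrich_phone(record: dict) -> dict:
    """Return a copy of the record with a 'country' field added."""
    enriched = dict(record)
    phone = str(record.get("phone", ""))
    number = phone[1:]
    # Minimal validation: "+" followed only by ASCII digits.
    if not (phone.startswith("+") and number.isascii() and number.isdigit()):
        enriched["country"] = None  # unreliable input, so nothing is added
        return enriched
    enriched["country"] = next(
        (name for prefix, name in COUNTRY_BY_PREFIX.items()
         if phone.startswith(prefix)),
        None,
    )
    return enriched


print(enrich_phone({"name": "Ana", "phone": "+34600123456"}))
print(enrich_phone({"name": "Luc", "phone": "600 123"}))
```

Returning None for values that fail the check keeps the enrichment from attaching data to a value that was never reliable in the first place.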
You can learn more about data enrichment by reading the entry How to treat your data (III): Completion.
IV. Unification or Deduplication
This is the last stage of a treatment, which lets you unify the data by removing the duplicate records from a database.
Deduplication can be applied to data that has not been treated beforehand, although for better results we recommend cleaning and formatting the data at its origin.
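A minimal sketch of deduplication, assuming records are dictionaries and that duplicates are detected by comparing a normalized key (here, the email in lowercase without surrounding spaces). The function name dedupe and the choice of key fields are illustrative assumptions.

```python
def dedupe(records, key_fields=("email",)):
    """Keep the first record for each normalized key (illustrative rule)."""
    seen = set()
    unique = []
    for rec in records:
        # Light cleaning before comparing: trim spaces and ignore case.
        key = tuple(str(rec.get(f, "")).strip().lower() for f in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


records = [
    {"name": "Ana", "email": "Ana@Example.com"},
    {"name": "Ana G.", "email": " ana@example.com "},
    {"name": "Bob", "email": "bob@example.com"},
]
unique = dedupe(records)
print(len(unique))
```

Without the normalization step the first two records would be treated as different, which shows why cleaning and formatting beforehand improves the result.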
You can get more information about unification by reading the entry How to treat your data (IV): Unification.
In the future, we will continue analyzing the most common use cases and the tools available on UProc, and we will publish news about data treatment. Remember that we are at your disposal, through both the chat and the contact form, to answer any question you may have about the service.