You decide how your data is anonymized

Data Privacy Guard provides a wide variety of anonymization algorithms. The methods described below show what is possible for your dataset.

Replace changes all row values of the configured column to a predefined value. This method is an example of a generalization technique: because every row ends up with the same value, the column can no longer be used to identify a person based on personal characteristics.

  • All values made equal

    By using Replace you make sure all unique values are removed and replaced by a standardized value.

  • Mark columns as anonymized

    For example: change the content of all values in a memo column to “Anonymized” to indicate that all privacy-sensitive data has been removed.
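
The idea behind Replace can be sketched in a few lines of Python. This is an illustration only, not the product's implementation; the function name `replace_column` and the sample rows are assumptions made for the example.

```python
def replace_column(rows, column, value="Anonymized"):
    """Return copies of the rows with `column` set to one fixed value."""
    return [{**row, column: value} for row in rows]


# Hypothetical sample data: a memo column with free text.
rows = [
    {"id": 1, "memo": "Called about invoice, mentioned home address"},
    {"id": 2, "memo": "Prefers contact by phone"},
]
anonymized = replace_column(rows, "memo")
```

Every row keeps its other columns, while the memo column holds the same standardized value everywhere.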

Using the Remove method on a column clears all values inside that specific column. This is the simplest method, suited to columns that serve no purpose once the anonymization process has completed.

  • Data minimization

    Removing all values in a column does not affect the structure of your table, while all privacy-sensitive data is removed.
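
As a rough sketch of what Remove does, the snippet below clears a column while leaving the table structure untouched. It assumes that a cleared value is represented as `None` (a NULL in database terms); the function name `remove_column_values` is hypothetical.

```python
def remove_column_values(rows, column):
    """Clear every value in `column` while keeping the column itself."""
    return [{**row, column: None} for row in rows]


# Hypothetical sample data.
rows = [
    {"id": 1, "phone": "555-0101"},
    {"id": 2, "phone": "555-0102"},
]
cleared = remove_column_values(rows, "phone")
```

The column still exists in every row, so anything that depends on the table layout keeps working.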

The Randomize method replaces every value with a randomly selected value that already exists in the column. The random nature of the algorithm ensures unpredictable and different values each time the anonymization process is executed.

  • Keep your data distribution intact

    An additional option is to keep your original data distribution intact. For example, this option guarantees that every name or city in a column appears just as frequently after the completion of the anonymization process as before.

  • Force a non-identical value

    Another additional option is to force the algorithm to never return a value that is equal to the original value.
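
The two options above can be sketched as follows. This is a minimal illustration under stated assumptions, not the product's algorithm: keeping the distribution intact is modeled as a shuffle of the column, and forcing a non-identical value is modeled by excluding the original from the candidates. Both function names are hypothetical, and combining the two options is not covered here.

```python
import random


def randomize(values, force_different=False, rng=None):
    """Replace each value with a random value that already exists in the column."""
    if rng is None:
        rng = random.Random()
    values = list(values)
    distinct = list(dict.fromkeys(values))
    if force_different and len(distinct) < 2:
        raise ValueError("need at least two distinct values to force a change")
    result = []
    for original in values:
        if force_different:
            # Never hand back the value the row started with.
            candidates = [v for v in distinct if v != original]
        else:
            candidates = values
        result.append(rng.choice(candidates))
    return result


def randomize_keep_distribution(values, rng=None):
    """Shuffle the column, so every value occurs exactly as often as before."""
    if rng is None:
        rng = random.Random()
    shuffled = list(values)
    rng.shuffle(shuffled)
    return shuffled
```

A shuffle is only one way to preserve frequencies; it can still return the original value for some rows.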

Randomize Collection is an extension of the Randomize method. With this method, the algorithm is executed not on a single column but on a set of up to three columns. The values of these columns stay together during the anonymization process, which preserves the relationship between the columns for every row.

  • Keep related data together

    This method is often used to keep geographical data together, such as a postal code and its corresponding city.
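
A minimal sketch of the idea, assuming rows are dictionaries: the selected columns are treated as one combined value, and each row receives a combination that already exists in the data. The function name `randomize_collection` and the sample values are hypothetical.

```python
import random


def randomize_collection(rows, columns, rng=None):
    """Randomize several columns together so their values stay paired."""
    if not 1 <= len(columns) <= 3:
        raise ValueError("expected between one and three columns")
    if rng is None:
        rng = random.Random()
    # One tuple per row holding the combined values of the chosen columns.
    combos = [tuple(row[c] for c in columns) for row in rows]
    result = []
    for row in rows:
        donor = rng.choice(combos)
        result.append({**row, **dict(zip(columns, donor))})
    return result


# Hypothetical sample data.
rows = [
    {"id": 1, "postal_code": "1000", "city": "City A"},
    {"id": 2, "postal_code": "2000", "city": "City B"},
    {"id": 3, "postal_code": "3000", "city": "City C"},
]
shuffled = randomize_collection(rows, ["postal_code", "city"])
```

Because a whole combination is copied at once, a postal code never ends up next to a city it did not belong to.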

The Randomize Pool method replaces every value within a column with a randomly selected value from a set of values that you specify. This option comes in handy when, for example, only a few specific values are useful in your testing processes.

  • No limits to the pool size

    There is no limit to the set of values the algorithm can choose from. It is even possible to extract values from an external data source.
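
In outline, Randomize Pool could look like the sketch below. The function name `randomize_pool` and the sample pool are assumptions for illustration; in practice the pool could just as well be loaded from a file or another data source before being passed in.

```python
import random


def randomize_pool(rows, column, pool, rng=None):
    """Replace each value in `column` with a random value from `pool`."""
    pool = list(pool)
    if not pool:
        raise ValueError("the pool must contain at least one value")
    if rng is None:
        rng = random.Random()
    return [{**row, column: rng.choice(pool)} for row in rows]


# Hypothetical sample data: only a few statuses matter for testing.
rows = [{"id": 1, "status": "gold"}, {"id": 2, "status": "trial"}]
pooled = randomize_pool(rows, "status", ["active", "inactive"])
```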

Randomize Interval replaces the numerical or date values in a column with a value based on an interval you define. For example, if you configure Randomize Interval on a date column, you can set the interval to 10 days. When the column is anonymized, every row receives a randomly selected date that lies at most 10 days before or after the original date.

  • Dates and numbers

    This method can be configured on numerical and date columns.

  • Several intervals are possible

    Date intervals can be configured on a scale of days, weeks, months and years.
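
The 10-day example can be sketched as below. This is an illustration under assumptions: the offset is drawn uniformly, an offset of zero is allowed, and only a day-based interval is shown (months and years would need calendar arithmetic). The function names are hypothetical.

```python
import random
from datetime import date, timedelta


def randomize_date(original, max_days, rng=None):
    """Shift a date by a random whole number of days within +/- max_days."""
    if rng is None:
        rng = random.Random()
    offset = rng.randint(-max_days, max_days)
    return original + timedelta(days=offset)


def randomize_number(original, max_delta, rng=None):
    """Shift a number by a random amount within +/- max_delta."""
    if rng is None:
        rng = random.Random()
    return original + rng.uniform(-max_delta, max_delta)


# With an interval of 10 days, the result lies at most 10 days
# before or after the original date.
shifted = randomize_date(date(2024, 3, 15), max_days=10)
```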

Tokenize is a pseudonymization algorithm that replaces all row values of a column with a randomly generated key value. Applying this method makes it possible to maintain relationships between multiple datasets. It is also useful when, for example, your research requires that the original value of a row can be retrieved.

  • Maintain relationships between datasets

    The Tokenize method is especially useful when you need to be able to connect datasets after anonymization, for example based on a primary key like Customer ID.

  • Revert to the original value

    Perhaps your research requires you to always be able to retrieve the original values belonging to a row in the dataset. This pseudonymization method makes it possible.

  • Customizable token format

    By configuring a specific token format, you make sure it always fits your data model.
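
One common way to get these three properties is a lookup table (sometimes called a token vault) that maps each original value to a random token. The sketch below assumes that design; the class name `TokenVault`, the `TKN-` prefix, and the sample IDs are hypothetical and say nothing about how Data Privacy Guard stores its tokens.

```python
import secrets
import string


class TokenVault:
    """Maps original values to random tokens and back again."""

    def __init__(self, prefix="TKN-", length=8):
        self.prefix = prefix      # part of the configurable token format
        self.length = length      # number of random digits after the prefix
        self._to_token = {}
        self._to_original = {}

    def _new_token(self):
        digits = "".join(secrets.choice(string.digits) for _ in range(self.length))
        return self.prefix + digits

    def tokenize(self, value):
        # The same input always yields the same token, so datasets stay linkable.
        if value not in self._to_token:
            token = self._new_token()
            while token in self._to_original:   # avoid handing out a token twice
                token = self._new_token()
            self._to_token[value] = token
            self._to_original[token] = value
        return self._to_token[value]

    def detokenize(self, token):
        """Look up the original value that belongs to a token."""
        return self._to_original[token]


# Two hypothetical datasets that share a Customer ID.
vault = TokenVault()
customers = [{"customer_id": vault.tokenize("C-1001"), "segment": "retail"}]
orders = [{"customer_id": vault.tokenize("C-1001"), "total": 120}]
```

Both datasets receive the same token for the same customer, so they can still be joined after anonymization. Whoever holds the vault can revert a token, which is why this counts as pseudonymization rather than anonymization.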

Using the Hash method, all values in a column are replaced by a random combination of letters and numbers.

  • Completely random hash

    The new value is in no way based on the original; it is generated completely at random.
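
Taken at face value, the description above amounts to generating a fresh random string for every row, which is different from a deterministic hash function such as SHA-256. A minimal sketch under that reading follows; the function names and the default length of 16 characters are assumptions.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def random_hash(length=16):
    """Return a random string of letters and digits, unrelated to any input."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def hash_column(rows, column, length=16):
    """Replace every value in `column` with a newly generated random string."""
    return [{**row, column: random_hash(length)} for row in rows]


# Hypothetical sample data.
rows = [{"id": 1, "email": "a@example.com"}, {"id": 2, "email": "b@example.com"}]
hashed = hash_column(rows, "email")
```

Since the original value is never read, two rows with the same original value generally receive different replacements.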

Truncate does exactly what its name suggests: it removes the entire content of a table. You can only apply this method when you are anonymizing a table within a database.

  • Clear audit and log tables

    This method is often used to quickly empty tables that contain logging and auditing records, when they have no purpose in the processes after anonymization.
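
The effect of Truncate can be demonstrated with an in-memory SQLite database from Python's standard library. SQLite has no `TRUNCATE TABLE` statement, so this sketch uses `DELETE FROM` without a `WHERE` clause instead; other databases offer a dedicated statement. The table name `audit_log` and the helper `truncate_table` are hypothetical.

```python
import sqlite3


def truncate_table(connection, table):
    """Delete all rows from `table` while keeping the table definition."""
    # Table names cannot be bound as parameters, so validate before interpolating.
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    connection.execute(f'DELETE FROM "{table}"')
    connection.commit()


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE audit_log (id INTEGER PRIMARY KEY, message TEXT)")
connection.executemany(
    "INSERT INTO audit_log (message) VALUES (?)",
    [("login",), ("export",)],
)
truncate_table(connection, "audit_log")
remaining = connection.execute("SELECT COUNT(*) FROM audit_log").fetchone()[0]
print(remaining)  # prints 0
```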

Additional customizations

Do you have more specific requirements for anonymizing your data? Customize your anonymization process even further with these additional options!

  • Conditional anonymization

    Configure conditions to specify exactly what needs to be anonymized. For example, exclude specific records containing test data from the anonymization process.

  • Method filter

    To execute only one specific method rather than the entire anonymization process for a dataset, apply a method filter to your configuration.

  • Table filter

    By applying a table filter you can execute the anonymization process on only a specific table in your database.
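
How conditions and filters might interact can be sketched with a small configuration-driven runner. Everything here is hypothetical: the configuration format, the `run` function, and the `is_test` flag are assumptions for the sake of illustration and do not reflect the product's actual configuration syntax.

```python
def replace(value):
    return "Anonymized"


def remove(value):
    return None


METHODS = {"replace": replace, "remove": remove}


def run(tables, steps, method_filter=None, table_filter=None):
    """Apply the configured steps in place, honoring conditions and filters."""
    for step in steps:
        if method_filter is not None and step["method"] != method_filter:
            continue  # method filter: skip steps that use another method
        if table_filter is not None and step["table"] != table_filter:
            continue  # table filter: skip steps for other tables
        condition = step.get("condition", lambda row: True)
        transform = METHODS[step["method"]]
        for row in tables[step["table"]]:
            if condition(row):  # conditional anonymization
                row[step["column"]] = transform(row[step["column"]])


# Hypothetical sample data and configuration.
tables = {
    "customers": [
        {"name": "Alice Jansen", "is_test": False},
        {"name": "Test User", "is_test": True},
    ],
    "notes": [{"memo": "call back"}],
}
steps = [
    {
        "table": "customers",
        "column": "name",
        "method": "replace",
        # Leave records that contain test data untouched.
        "condition": lambda row: not row["is_test"],
    },
    {"table": "notes", "column": "memo", "method": "remove"},
]
run(tables, steps, table_filter="customers")
```

After this run, only the non-test customer is anonymized; the test record and the `notes` table are left as they were.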