Native Duplicate Management in Salesforce

Administrators (and developers) are often faced with the task of cleaning up duplicate data in an org. Contact and Lead objects are the usual suspects. Anyone who has attempted such a cleanup would agree that it is a daunting and unenviable task. Duplicate data weighs down employee performance and application usability, puts customer satisfaction at risk, and skews analytics. So it is a priority for most organizations to put mechanisms in place that prevent duplicate records from being created.

As part of the Spring ’15 release, Salesforce.com rolled out a much-requested feature: “Duplicate Management”. Although it is marketed as a Data.com feature, it does not need a Data.com license. Any customer with at least a Professional Edition subscription can use this new native Duplicate Management feature.

Let’s have a look at the details.

This new feature appears under Setup | Data.com Administration | Duplicate Management. There are two primary concepts you need to grasp in order to set it up: Matching Rules and Duplicate Rules. A matching rule tells Salesforce how to compare records, and a duplicate rule uses one or more matching rules to decide when that comparison runs and what happens when a match is found.

Matching Rules

You will find Matching Rules within the Duplicate Management setup section, which contains the Matching Rule list view. When you create a new Matching Rule, you will notice that this feature is currently supported for the Account, Contact and Lead objects alone.

While this is a limitation, these three are in fact the most common candidates for de-duplication checks. When defining a matching rule, you provide the fields that need to be considered for matching. For instance, if our business rule defines a duplicate account as any two accounts sharing the same phone number, we would define a matching rule on the Account Phone field.

Notice that Salesforce gives us the option of specifying multiple fields to check for matches, as well as the filter logic (AND/OR statements) that links them. Another interesting aspect is that it lets us define how strictly each field is compared. The matching method lets us choose whether to perform an exact match or a fuzzy match. With a fuzzy match, even an approximate match is considered a duplicate. Fuzzy matches are best suited to matching rules based on text fields (for instance, Account Name). Salesforce has published the detailed logic used for fuzzy matching on this page:

https://help.salesforce.com/htviewhelpdoc?err=1&id=matching_rules_matching_methods.htm
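To make the ideas of matching method and filter logic concrete, here is a minimal Python sketch. It is not Salesforce's implementation: the fuzzy comparison uses the similarity ratio from Python's difflib as a stand-in for the algorithms documented on the page above, and the field names, the 0.85 threshold and the "Phone exact OR Name fuzzy" logic are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def exact_match(a: str, b: str) -> bool:
    """Exact method: the two values must be identical."""
    return a == b


def fuzzy_match(a: str, b: str, threshold: float = 0.85) -> bool:
    """Fuzzy method (stand-in): similarity ratio at or above an assumed threshold."""
    ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
    return ratio >= threshold


def is_duplicate_account(new: dict, existing: dict) -> bool:
    """Assumed filter logic: (Phone exact) OR (Name fuzzy)."""
    return exact_match(new["Phone"], existing["Phone"]) or fuzzy_match(
        new["Name"], existing["Name"]
    )


# Illustrative records; names and phone numbers are made up.
existing = {"Name": "Acme Corporation", "Phone": "555-0100"}
typo = {"Name": "Acme Corporatoin", "Phone": "555-0199"}   # near-identical name
same_phone = {"Name": "Globex Inc", "Phone": "555-0100"}   # different name, same phone
unrelated = {"Name": "Globex Inc", "Phone": "555-0142"}

for candidate in (typo, same_phone, unrelated):
    print(candidate["Name"], is_duplicate_account(candidate, existing))
```

With an exact method on Name alone, the misspelled "Acme Corporatoin" would slip through; the fuzzy method is what catches it.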

After you save the matching rule, it does not become available until it is explicitly activated by clicking the Activate button. Activation is not instantaneous, and you are notified by email when the Matching Rule has been activated. In most cases, activation takes less than a minute.

While defining a matching rule, you also get to specify whether the system should treat two blank fields as a match. In our example, since we have marked Match Blank Fields as Y for the Account > Phone field, the system would allow only one account with a blank phone number.
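The effect of the Match Blank Fields setting can be sketched in a few lines of Python. This is only an illustration of the behavior described above. The function name and the treatment of whitespace-only values as blank are assumptions, not Salesforce's documented logic.

```python
from typing import Optional


def is_blank(value: Optional[str]) -> bool:
    """Assumption: None, empty and whitespace-only values all count as blank."""
    return value is None or value.strip() == ""


def field_matches(a: Optional[str], b: Optional[str], match_blank_fields: bool) -> bool:
    """Compare one field of two records, honoring the blank-field setting."""
    if is_blank(a) and is_blank(b):
        # Two blanks match only when the setting is switched on.
        return match_blank_fields
    if is_blank(a) or is_blank(b):
        # A blank never matches a populated value.
        return False
    return a == b


print(field_matches("", None, match_blank_fields=True))   # two blanks, setting on
print(field_matches("", None, match_blank_fields=False))  # two blanks, setting off
```

With the setting on, a second account without a phone number would be flagged as a duplicate of the first, which is why only one such account can exist.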

That’s it; your matching rule is ready for use. Of course, the matching rule has no effect on any data operation unless it is made part of a duplicate rule.

Duplicate Rules

Duplicate Rules are also created in the context of an object and, as described before, you can create them only for the Account, Contact and Lead objects. It is through a duplicate rule that you specify when and how to run a matching rule.

In the context of our example, you could specify whether Salesforce should compare the Account record that the user is trying to create against all records in the system, or only against those records that the user has access to. You do this by setting the Record Level Security parameter on the Duplicate Rule. Note, however, that if a match is found with an existing record that the current user cannot see, the user is shown only the duplicate error and not the existing record. This could lead to some end-user confusion, so it is best to circulate an FAQ or wiki page to your end users before rolling out this feature.
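The interaction between record-level security and what the user actually sees can be sketched as follows. This is an illustrative Python model, not Salesforce code. The function name, the `enforce_sharing` flag and the record IDs are hypothetical, and matching is reduced to an exact comparison on Phone.

```python
def check_duplicates(new_phone, existing, visible_ids, enforce_sharing):
    """Return (is_duplicate, ids_shown_to_user) for a record being created."""
    if enforce_sharing:
        # Compare only against records the user has access to.
        candidates = [r for r in existing if r["Id"] in visible_ids]
    else:
        # Compare against every record in the system.
        candidates = existing
    matches = [r for r in candidates if r["Phone"] == new_phone]
    # Even when a hidden record matches, it is never listed for the user.
    shown = [r["Id"] for r in matches if r["Id"] in visible_ids]
    return bool(matches), shown


accounts = [
    {"Id": "A1", "Phone": "555-0100"},  # hidden from the current user
    {"Id": "A2", "Phone": "555-0142"},  # visible to the current user
]
visible_to_user = {"A2"}

# Matching against all records: a duplicate is reported, but nothing is listed.
print(check_duplicates("555-0100", accounts, visible_to_user, enforce_sharing=False))
```

The first element being true while the list is empty is exactly the situation that can confuse end users: they are told a duplicate exists but cannot see it.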

Also, while defining a duplicate rule, you can specify the application behavior when a user attempts to create a duplicate record. You can have the system prevent record creation with an alert, allow record creation with an alert, or allow record creation without an alert.
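The three behaviors can be modeled as a small decision function. The sketch below is in Python and is only an illustration: the enum members and the function are hypothetical names, not options or APIs exposed by Salesforce.

```python
from enum import Enum
from typing import Optional, Tuple


class OnCreate(Enum):
    """Hypothetical names for the three behaviors described above."""
    BLOCK_WITH_ALERT = 1
    ALLOW_WITH_ALERT = 2
    ALLOW_WITHOUT_ALERT = 3


def handle_create(
    is_duplicate: bool, action: OnCreate, alert_text: str
) -> Tuple[bool, Optional[str]]:
    """Return (record_saved, alert_shown_to_user)."""
    if not is_duplicate:
        return True, None
    if action is OnCreate.BLOCK_WITH_ALERT:
        return False, alert_text
    if action is OnCreate.ALLOW_WITH_ALERT:
        return True, alert_text
    # ALLOW_WITHOUT_ALERT: the duplicate is saved and the user sees nothing.
    return True, None


alert = "You're creating a duplicate record."
for action in OnCreate:
    print(action.name, handle_create(True, action, alert))
```

Only the blocking behavior actually stops the duplicate from being saved; the other two differ only in whether the user is told about it.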

You would then choose the matching rule that the duplicate rule should run, and specify entry criteria, if any. A common use case for entry criteria is to ensure that test records are not considered.

Another interesting aspect of the native Salesforce Data.com Duplicate Management feature is that you can check for duplicates across objects (that is, across the three supported objects).

For instance, a duplicate rule can compare Accounts with Contacts. This could be a valid use case if you are using Person Accounts. A more relevant use case could be to compare Leads with Contacts. The only extra step while configuring a cross-object duplicate rule is that you need to specify a field mapping between the objects.
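A field mapping simply says which field on one object corresponds to which field on the other. The Python sketch below illustrates the idea for Leads and Contacts. The mapping itself (including Lead Phone to Contact MobilePhone) and the function name are assumptions made for the example, and every mapped field is compared with an exact match.

```python
# Hypothetical mapping: Lead field name -> Contact field name.
LEAD_TO_CONTACT = {
    "LastName": "LastName",
    "Email": "Email",
    "Phone": "MobilePhone",
}


def lead_matches_contact(lead: dict, contact: dict, mapping: dict = LEAD_TO_CONTACT) -> bool:
    """True when every mapped Lead field equals its Contact counterpart."""
    for lead_field, contact_field in mapping.items():
        lead_value = lead.get(lead_field)
        # A missing or empty Lead value is treated as "no match" in this sketch.
        if not lead_value or lead_value != contact.get(contact_field):
            return False
    return True


contact = {"LastName": "Rao", "Email": "rao@example.com", "MobilePhone": "555-0100"}
new_lead = {"LastName": "Rao", "Email": "rao@example.com", "Phone": "555-0100"}

print(lead_matches_contact(new_lead, contact))
```

Without the mapping, the comparison would look for a Phone value on the Contact and miss the match that sits in MobilePhone.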

As is the case with Matching Rules, a Duplicate Rule also needs to be activated before it has any effect on the system. Once activated, all record creations are monitored and duplicate errors are raised based on your matching rule and duplicate rule definitions.

On an attempt to create a duplicate record, the user is shown a duplicate alert (the alert text can also be specified while defining the duplicate rule) along with the list of possible duplicate records.

In summary, this is a very nifty new feature from Salesforce, and could save you from having to scout AppExchange for data de-duplication products.

However, do note that this is a “duplicate check on create” feature. It is not a data clean-up feature with which you could run duplicate tests on existing data. It applies only when new Account, Contact or Lead records are created.