How to create deduping Query with Altru
The Deduper application requires a data source to be able to dedupe data. Typically this is a query that outputs the data you want to dedupe. In this short guide we will be walking you through the process of creating a simple query to find duplicates.
Remember this query can be any query you want to process for duplicates. For this example we will be deduping our entire database but you have the option to choose which data to dedupe depending on how you set up your data source.
Step 1: Login into your Social Good Software account and click on the deduper application.
Once you have clicked on the deduper application click on the ODATA Links navigation item on the top right hand side menu.
Then you will see the option to add an ODATA link. Click on the plus option to create a new link with Altru.
You will get a dialog asking you for an ODATA link since we don’t yet have the link. We will click on the external icon to open the Altru information library directly.
Step 2: Create a De-duping query
Clicking on the link will open a new tab with the information library in Altru. Click on the “Add an ad-hoc query” option on the top left hand side navigation.
This will bring up the option to create a query. We are going to be creating a Constituent query to dedupe constituents. Note: If you want to dedupe other data types select the right query type, for example sales query.
Now it’s time to create the query based on our query restrictions.
Let’s review each restriction one at a time.
Individual is equal to yes – We only want to dedupe individual records at this time. If you want to dedupe other types such as organizations create a separate query and process those independently from this query.
Date added is before DATE or is blank – Altru has a restriction to only be able to output 15,000 rows at a time with ODATA links. Because of this restriction we have to create multiple queries. You want to create a query that outputs less than 15,000 rows. In our example it’s any record created before 1/1/2001. Keep in mind this could be different for your organization.
Email address – We want only records that have an email address and it’s the primary email address. Records without an email address are much more difficult to dedupe correctly and Altru typically picks all this up and merges them automatically.
Let’s now setup the output display fields
Let’s review each output field one at a time.
Lookup ID – Quick handy way to view if the record is a duplicate by just seeing the ID.
Email Address – Easily see if the email can provide a contextual action
First name – View the first name
Last name – View the last name
Address – View the address
Constituent record – We need this to be able to merge records together
Now click on the Preview Tab on the top navigation.
You will want to make sure the total result is less than 15,000 rows.
Don’t worry about not having all the records in this query, we will come back and create more queries for the rest of the records. Once you are happy with the output click on the “Set save options”
Make sure you check the “Suppress duplicate rows” checkbox. Don’t worry if you see multiple rows the Deduper application knows how to deal with duplicates.
Double check the output of the query is less than 15,000 rows.
If you see more than 15,000 rows go back and adjust the date to output less than 15,000 rows.
Once you have verified you have less than 15,000 rows then name and save your query.
Name: Duplicate Constituent Records Query (BOT – 2001)
Description: Query to output records created from beginning of time to 01/01/2001
Step 3: Duplicate Query with revised dates
Once you have your query created it’s time to duplicate the query and adjust the dates for the next 15,000 rows in your database by clicking on the copy button.
This will bring up a screen identical to the creation query screen. You will have to change the date range selection and rename the query to reflect the new dates. Select the date restriction and use the Between operator to select two date ranges.
You might be able to select more years depending on the output. For example if your output is only a few hundred rows you can select a wider range of dates. If you are getting too many rows you might have to restrict the date range to be between months and not years.
Click on the “Set save options” and save your query by changing the name
Name: Duplicate Constituent Records Query (2001 – 2003)
Description: Query to output records created from 01/01/2001 to 01/01/2003
At this point you will want to repeat the process in step three until you have done your entire database. This will equal a group of queries. Here is how the process works.
You will end up with a group of queries. You now have the ODATA queries required for the Deduper application. You can now at this point follow the guide on “How to set up ODATA links with Deduper”.