May 20th, 2019
Good things come to those who wait … like you, who waited with bated breath for the follow-up to my initial blog about the secret of a good predictive model.
From that last blog, we ruled out these items as the next “secret” to building predictive models:
- Dimension Reduction
- Feature Engineering
- Modeling Techniques
Again, although mastering the above methods can be impactful, you must put an emphasis on something else. So, after getting clean data, what is Secret #2?
While I’m writing this in relation to non-profit advanced analytics, it is easy to argue these secrets apply to all predictive modeling. The difference between non-profit analytics and other analytics is simply the domain you need to understand. And that is Secret #2: Domain Knowledge.
What does that mean?
How well do you understand the business context of your predictive model?
For that matter, how well do you know the non-profit fundraising programs and their historical objectives?
Based on your infrastructure, can you effectively apply artificial intelligence using the current database system?
Business knowledge, experience with your organization, and specific knowledge of what your data contains are often overlooked when a modeler gets people excited about things like machine learning. Do you have solid experience in non-profit fundraising? Are you familiar with how you have treated your donors in the past and how they have reacted? Being the master of your domain (not in a Seinfeld way) helps to develop the base of your modeling environment.
With this domain knowledge, you already know what your focus could be and what patterns exist in the data that need to be considered. Consider these scenarios:
- Do event attendees look like the best monthly sustainer prospects, or is that because you have only emailed monthly sustainer material to event attendees?
- Are planned giving prospects from your lapsed pool only good if they have given to other charities, or is that because your list plan consists only of other nonprofits and you mailed planned giving information only to lapsed multibuyers (donors that hit against outside lists) from the merge processing?
- A recent spike in giving and/or online actions has totally skewed your data. Based on past experience, are you aware of non-transactional activity that may have caused it (e.g., online investment strategies, emergencies, or increased media exposure)?
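The first scenario is one you can check directly in the data. Here is a minimal sketch, assuming a hypothetical donor table with three flags: whether the donor attended an event, whether they were ever sent sustainer material, and whether they became a sustainer. The field layout, the function name, and the toy records are all invented for illustration and are not real results.

```python
from collections import defaultdict

# Hypothetical toy records: (attended_event, was_solicited, became_sustainer).
# In this made-up data every attendee was solicited, but most non-attendees were not.
donors = [
    (True, True, True),
    (True, True, False),
    (True, True, False),
    (True, True, True),
    (False, False, False),
    (False, False, False),
    (False, False, False),
    (False, True, True),
    (False, True, False),
]


def conversion_rates(records, solicited_only=False):
    """Return the sustainer conversion rate keyed by event attendance."""
    counts = defaultdict(lambda: [0, 0])  # attended -> [converted, total]
    for attended, solicited, converted in records:
        if solicited_only and not solicited:
            continue  # skip donors who never received the offer
        counts[attended][0] += int(converted)
        counts[attended][1] += 1
    return {key: conv / total for key, (conv, total) in counts.items()}


# Naive view: every donor counts, whether or not they were ever asked.
naive = conversion_rates(donors)

# Fairer view: compare only donors who actually received sustainer material.
fair = conversion_rates(donors, solicited_only=True)

print("All donors:     ", naive)
print("Solicited only: ", fair)
```

In this toy data, attendees look far better when every donor is counted, but the gap disappears once the comparison is limited to donors who actually received the material. Knowing who was solicited in the first place is the kind of context that comes from domain knowledge, not from the algorithm.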
As a modeler, you should talk often with business stakeholders and subject matter experts in order to help refine the business problem, define any constraints, and outline metrics for success. Any modeling effort should be a team effort, not an endeavor of one person.
With all that being said, is it beneficial for modeling to be strong in statistics, mathematics, data mining, and coding? Absolutely. Does it help to have a breadth of knowledge of different algorithms and to know the basis of those algorithms (e.g., ordinary least squares vs. maximum likelihood estimation)? Sure. Thanks to increased computing power and user-friendly modeling software, these skills are not as vital as they once were, but they can still help take a model from good to great. But, without clean data and sufficient domain knowledge, you will never make it to “good” in your modeling efforts.
Blog written by Jeff Huberty | Executive Vice President of Analytics and Partner