01 of 06
Before you start…
AI is probably the word you hear and see most often in your company, on social media, and just about everywhere else. And you might be eager to start using Einstein AI to optimise your business without first checking your data quality. It seems uncomplicated in theory, but is it that way in practice? Let’s break it into steps to make it easier:
- Decide what you want to predict. It looks easy, but you need to know what you really want to achieve, and this needs to be clearly defined in terms of concept and measure.
- Get historical data in order. As the saying goes, “The best predictor of future behaviour is past behaviour”. This historical data needs to satisfy some conditions that we will analyse next.
- Turn predictions into action. Now that you have a prediction (a probability), it is time to act on it, e.g. offer a special promotion to all customers with a more than 40% probability of attrition (a small sketch of this follows the list).
- Enhance your actions. With the help of gen AI, you get generative content to support your actions, e.g. Einstein can generate a special promotion email to send to those customers with a more than 40% probability of attrition.
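As a rough illustration of the thresholding step above, here is a minimal Python sketch. The customer records, field names, and the 40% cut-off are placeholders, not an Einstein API; in a real org the probability would come from Einstein and the action would be a flow, task, or email.

```python
# Hypothetical example: pick customers whose predicted attrition
# probability exceeds 40% and flag them for a retention offer.
customers = [
    {"name": "Acme Ltd", "attrition_probability": 0.62},
    {"name": "Globex", "attrition_probability": 0.18},
    {"name": "Initech", "attrition_probability": 0.45},
]

THRESHOLD = 0.40  # the 40% cut-off mentioned above

promo_targets = [c for c in customers if c["attrition_probability"] > THRESHOLD]

for customer in promo_targets:
    # In a real org this would create a task, send an email, and so on.
    print(f"Offer special promotion to {customer['name']}")
```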
If you take a sneak peek into Salesforce Einstein, you’ll see that historical data is key to this process. We can safely say that poor data = poor AI, right? Good data matters even if you don’t use Salesforce Einstein AI, but maintaining data accuracy becomes even more critical once AI is introduced. In this article, we talk about data quality and why it is important to address it when introducing artificial intelligence.
02 of 06
Good data vs. poor data
Data quality refers to the accuracy, completeness, consistency, and reliability of the data in your Salesforce platform. High-quality data ensures that your AI systems can make accurate predictions, recommendations, and insights. Poor data quality can result in incorrect outcomes and erode customer trust. To ensure you have quality data, ask yourself the following questions:
Data quality categories

| Category | Questions to ask |
| --- | --- |
| Accuracy | Are all details correct and dependable, with no margin for error? |
| Completeness | Is the data comprehensive? Is anything missing? Are any records obsolete? |
| Reliability | Is the data extracted from reliable sources? Is it consistent across all systems, without contradictions? |
| Relevancy | Is the data relevant to what you want to achieve? |
| Timeliness | Is the data up-to-date? Is it readily available to use when needed? |
Many things can contribute to poor data
a. There are too many fields to fill
Having a hundred fields to fill in doesn’t necessarily translate into valuable information. Not every field should be mandatory, but when people fill in only what they feel like providing, the result is random and often incomplete records. Several levels of detail and numerous non-mandatory fields lead to a high volume of records with little to no actual data.
Striking a balance with mandatory fields is essential: too many can deter users from completing the form. Categorising data and avoiding excessive use of free text improve standardisation and make segmentation easier.
b. You didn’t consider the usability of your data
It is essential to define the type and format of expected data, such as phone numbers, emails, and addresses, as invalid entries compromise data quality. Data usability is crucial; merely having fields is not enough, they must be user-friendly and exist for a clear purpose.
The focus should be on capturing only essential information, creating patterns, and establishing common-sense guidelines. More fields do not equate to more utility, and the cost-effectiveness of data storage should be a consideration. Validate the fields you do keep, without forgetting the user experience.
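As a rough sketch of what such checks could look like outside Salesforce, for instance on a CSV export of contacts, the snippet below flags invalid email and phone formats. The file name, column names, and patterns are assumptions; inside Salesforce you would typically enforce the same rules with validation rules.

```python
import csv
import re

# Hypothetical patterns; adjust them to your own formats and locales.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
PHONE_RE = re.compile(r"^\+?[0-9 ()-]{7,15}$")

def check_contact(row):
    """Return a list of problems found in one exported contact row."""
    problems = []
    if not EMAIL_RE.match(row.get("Email", "")):
        problems.append("invalid email")
    if not PHONE_RE.match(row.get("Phone", "")):
        problems.append("invalid phone")
    return problems

# "contacts_export.csv" is an assumed export with Id, Email and Phone columns.
with open("contacts_export.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        issues = check_contact(row)
        if issues:
            print(row.get("Id"), ", ".join(issues))
```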
c. You have duplicates
Duplicates interfere with usability and can lead to confusion and inefficiencies in data management. When contact information is duplicated, it becomes challenging to discern which entry is accurate or up-to-date. It is important to have mechanisms that highlight those duplicates and merge them.
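Below is a minimal sketch of how duplicates could be highlighted in an exported data set, assuming pandas and hypothetical Email and LastModifiedDate columns; inside Salesforce itself you would normally rely on duplicate and matching rules.

```python
import pandas as pd

# Hypothetical export of contacts with Email and LastModifiedDate columns.
contacts = pd.read_csv("contacts_export.csv")

# Normalise the key before comparing, otherwise "Ana@x.com" and "ana@x.com "
# look like two different people.
contacts["email_key"] = contacts["Email"].str.strip().str.lower()

# Flag every record that shares its email with another record.
duplicates = contacts[contacts.duplicated(subset="email_key", keep=False)]
print(duplicates.sort_values("email_key"))

# A naive "merge": keep only the most recently modified record per email.
deduped = (
    contacts.sort_values("LastModifiedDate")
    .drop_duplicates(subset="email_key", keep="last")
)
```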
d. You don’t realise you have poor data
Addressing the issue of outdated data is essential. Regularly purging or updating obsolete information keeps storage costs from inflating and ensures that the stored data remains relevant and valuable.
03 of 06
Importance of good-quality data for your organisation
Good-quality data has a positive impact on many areas of business performance, so companies should focus on improving it. According to Salesforce analysis, inaccurate or incomplete data can stall productivity by 20%, which translates into one lost day of work each week. That’s a lot! The average company loses 12% of its revenue due to inaccurate data, and 40% of businesses fail to achieve their targeted benefits because of poor data quality.
See why data quality is paramount to meet your objectives:
| Bad data | Good data |
| --- | --- |
| Lost revenue | Identify cross-sell and upsell opportunities |
| Missing or inaccurate insights | Gain account insights |
| Wasted time and resources | Prospect and target new customers |
| Inefficiency | Increase efficiency |
| Slow info retrieval | Retrieve the correct info fast |
| Poor customer service | Build trust with customers |
| Reputational damage | Score and route leads faster |
| Decreased adoption by users | Increase adoption by users |
04 of 06
The most common data quality issues that can impact Einstein AI
As stated previously, not having quality data can (and will!) impact your software, so watch out for the following points when looking for quality issues.
a. Missing data and skewed information
Incomplete or skewed data distorts the accuracy of predictions. For example, when determining the time to close an opportunity, using opportunities whose duration is completely out of range will skew the result.
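As a minimal sketch of that kind of filtering, assuming a pandas export with a hypothetical days_to_close column, you could drop out-of-range opportunities before the data reaches the model:

```python
import pandas as pd

# Hypothetical export; days_to_close would be CloseDate minus CreatedDate.
opportunities = pd.read_csv("opportunities_export.csv")

# Anything negative or longer than roughly two years is treated as noise here;
# pick limits that make sense for your own sales cycle.
in_range = opportunities["days_to_close"].between(0, 730)

clean = opportunities[in_range]
print(f"Removed {len(opportunities) - len(clean)} out-of-range records")
```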
b. Multilingual formats
Large companies often face challenges with values in different languages, variations in accentuation, and differences between singular and plural forms.
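One common mitigation is to normalise case and accents before values reach the model, so that spelling variants collapse into a single category. A minimal Python sketch (it does not handle singular vs. plural forms):

```python
import unicodedata

def normalise(value: str) -> str:
    """Lower-case a value and strip accents so 'São Paulo', 'SAO PAULO'
    and 'sao paulo' all collapse into the same category."""
    text = unicodedata.normalize("NFKD", value)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return text.strip().lower()

print(normalise("São Paulo"))   # -> "sao paulo"
print(normalise("SÃO PAULO "))  # -> "sao paulo"
```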
c. No data cleaning in Salesforce
Cleaning data within the Einstein platform is essential, but it is even more effective when performed at the source, particularly within Salesforce records. Don’t do data cleansing just because you have Einstein AI; start at the root of the problem.
d. Missing values
There are instances where numerical values are expected, but the presence of letters or blanks disrupts the intended analysis.
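A quick way to surface such entries in an export, assuming pandas and a hypothetical Amount column, is to coerce the column to numbers and inspect what fails:

```python
import pandas as pd

opportunities = pd.read_csv("opportunities_export.csv")

# Coerce the column to numbers; entries such as "10k" or "TBD" become NaN
# instead of silently breaking the analysis.
amounts = pd.to_numeric(opportunities["Amount"], errors="coerce")

# Records that had a value, but not a numeric one, need fixing at the source.
bad_rows = opportunities[amounts.isna() & opportunities["Amount"].notna()]
print(bad_rows[["Id", "Amount"]])
```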
e. Cardinality and ordinality
Challenges arise when dealing with wide ranges, such as scoring from 1 to 100, leading to potential misinterpretation by the Einstein AI model.
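One common mitigation is to group a wide numeric range into a few ordered buckets before modelling. A minimal sketch using a hypothetical 1 to 100 score:

```python
import pandas as pd

scores = pd.DataFrame({"score": [3, 27, 55, 81, 99]})

# Collapse the 1-100 range into a handful of ordered categories so the
# model sees low/medium/high/very high instead of 100 distinct values.
scores["score_band"] = pd.cut(
    scores["score"],
    bins=[0, 25, 50, 75, 100],
    labels=["low", "medium", "high", "very high"],
)
print(scores)
```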
By implementing robust data cleaning practices, filtering out irrelevant records, and standardising information, organisations can enhance the accuracy and reliability of insights derived from Einstein AI.
The best practice is to fix data at the source. However, that is not always possible, and some issues can still be handled while preparing the data set; for example, records with values that skew the data can simply be filtered out.
05 of 06
Best practices for cleaning and transforming data for Salesforce Einstein AI
If you’ve recognised some of the issues described in the previous section, keep these best practices in mind when cleaning your data.
a. Define what you want to predict
When initiating an AI project, define the granularity of your analysis early. Clearly state what you aim to understand or predict. This clarity ensures that you gather and prepare the data at the appropriate level of detail.
b. Don’t see data cleaning as an option
Before delving into the complexities of artificial intelligence, it is imperative to acknowledge that the journey begins with clean and reliable data. Starting with inaccurate or incomplete data will compromise the integrity of any subsequent analysis. Make it a priority.
c. Quantity of data matters
Consider both the quality and the quantity of your data. Think, for example, of a minimum requirement of 400 records when you only create 10 new entries per year: it will take a long time to gather enough history. The amount of data becomes particularly relevant depending on what you aim to predict. For variables in constant flux, such as customer preferences, AI can provide tangible benefits, aiding in tasks like improving contact associations.
d. Identify variables with higher correlation
It is crucial to identify the variables that bring real value. Avoid including irrelevant columns or fields, such as free-text areas or minute quotes. Also avoid columns that are mostly empty, as they contribute nothing to the analysis.
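As a rough illustration, and assuming the training data has been exported to a CSV with a hypothetical numeric "won" outcome column (1 = won, 0 = lost), the sketch below drops mostly empty columns and ranks the remaining numeric ones by their correlation with the target:

```python
import pandas as pd

data = pd.read_csv("training_export.csv")

# Drop columns that are mostly empty: they contribute nothing to the analysis.
threshold = 0.6  # keep columns with at least 60% of values filled in
data = data.dropna(axis="columns", thresh=int(threshold * len(data)))

# Rank the remaining numeric columns by correlation with the assumed target.
correlations = (
    data.corr(numeric_only=True)["won"]
    .drop("won")
    .sort_values(key=abs, ascending=False)
)
print(correlations)
```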
e. Involve the business and get feedback
Ensure the active involvement of business stakeholders throughout the AI implementation process. Their insights and domain knowledge are invaluable in refining the model and aligning it with real-world business objectives.
f. Continuous improvement
Keep in mind that it is difficult to get the model right on the first attempt. Business feedback, adoption of the predictions and actions, data quality refinement, model monitoring, and finding better sources of data are all key to improving your model and creating versions 2, 3, ... N, until the business is comfortable with the predictions.
06 of 06
Preparing data for Salesforce Einstein AI with Stellaxius
If you’re ready to take the next step, know that we at Stellaxius adopt a pragmatic and realistic approach to getting data prepared for Salesforce Einstein, acknowledging that AI is a journey: start small, learn, improve, and start again. Our focus is on delivering transparent and professional solutions that align with the client’s objectives.
Here’s an insight into our methodology for preparing data for Einstein AI:
- We’ll help you understand the importance of being pragmatic in the data preparation process. Instead of aiming for everything, the emphasis is on making necessary adjustments to the data to ensure it serves its purpose effectively.
- When preparing data for Salesforce Einstein AI, you must acknowledge the unpredictability of real-world scenarios. Plans are adjusted to accommodate potential challenges and deviations from the ideal path. This adaptive approach ensures that the data preparation process remains resilient despite uncertainties.
- We value transparency and professionalism throughout the data preparation process. Clear communication with our clients is paramount, ensuring that expectations are set realistically, and any limitations or challenges are communicated openly.
If you’re ready to take the next step, don’t hesitate to contact us and start your AI journey with us. Oh, and of course, don’t forget to subscribe to our Knowledge Center to keep yourself updated on the newest articles.