AI for Insight, Traditional Automation for Accuracy
Artificial intelligence has taken over nearly every conversation about automation and productivity in the last couple of years, and with good reason. In this article I offer some reflections on the considerable benefits of AI in data management, and on when and why traditional data logic and good old-fashioned automation can solve your task better.
Just to clarify, this post is not intended to talk down AI. It was in fact co-written with AI, as most texts are nowadays. It is intended to highlight some of the areas where AI may be a misleading, less accurate, and/or more resource-heavy solution than the practices we have relied on for many years.
AI as a leap in data management and analysis
For decades, automation has been strictly rule-based. This is fantastic for predictable, repetitive tasks, but it shatters the moment it encounters ambiguity.
This is where AI truly shines. Instead of just following explicit instructions, AI models can learn from your data, which opens up entirely new capabilities:
- Understanding Unstructured Data: An AI can read an email, understand the sentiment, and categorize the topic, all without needing a predefined list of keywords or categories.
- Predictive Analytics: Automation can tell you what your sales were last quarter. AI can analyze thousands of variables (market trends, web traffic, seasonal data, even the weather) to build a sophisticated model that predicts what your sales are likely to be next quarter.
- Intelligent Assumptions and Pattern Recognition: An AI can look at "John Smith" and "J. Smith" at the same address, analyze their purchase history, and conclude with 95% certainty that they are the same person.
AI has transformed data management from a simple, reactive process of storing and moving data to an intelligent, proactive discipline focused on interpretation and insight.
How intelligent subjectivity matters in data quality vs quantity
For a long time, the mantra in data was "garbage in, garbage out." Data quality was policed by rigid validation rules: Is the postal code in the correct format? Is the entry_date a valid date? This approach stops obviously bad data, but it does nothing to fix messy or ambiguous data, especially at scale.
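The rigid validation rules described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the five-digit postal code pattern, the year-month-day date layout, and the function names are all hypothetical, and real formats vary by country and system.

```python
import re
from datetime import datetime

# Hypothetical rule: a postal code is exactly five ASCII digits.
POSTAL_CODE_PATTERN = re.compile(r"[0-9]{5}")


def is_valid_postal_code(value: str) -> bool:
    """Return True only if the whole string matches the assumed format."""
    return POSTAL_CODE_PATTERN.fullmatch(value) is not None


def is_valid_entry_date(value: str) -> bool:
    """Return True only if the string parses as a real year-month-day date."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False


def validate_record(record: dict) -> list[str]:
    """Collect the names of the fields that fail their rule."""
    errors = []
    if not is_valid_postal_code(record.get("postal_code", "")):
        errors.append("postal_code")
    if not is_valid_entry_date(record.get("entry_date", "")):
        errors.append("entry_date")
    return errors
```

A record either passes or fails. Rules like these reject a postal code with a letter in it or a date such as 30 February, but they have no opinion on whether two well-formed records describe the same customer.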
We are now drowning in high volumes of data from many different sources, and the quality and formatting often differ from source to source. This is where AI's "intelligent subjectivity" becomes a critical tool.
Think about entity resolution. A traditional database would see MegaCorp Ltd., Mega-Corp Inc., and Mega Corporation as three separate companies. An AI model, on the other hand, can make an assumption based on typical patterns: a subjective, but highly educated, guess. It assesses the similarity of the names, checks other fields like addresses or business numbers, and concludes that these are all the same entity.
This is not a binary, rule-based decision. It's a judgment call, and it’s what allows us to cleanse and harmonize massive, messy datasets, turning an unusable quantity of data into a high-quality "single source of truth."
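To give a feel for what such a judgment call looks like in code, here is a simplified sketch of similarity-based matching. It is not a trained AI model: it uses plain string similarity from Python's standard difflib module as a stand-in. The suffix list, the 0.7/0.3 weighting, the 0.75 threshold, and the business_number field are all assumptions chosen for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical list of legal suffixes to ignore; a real list would be longer.
LEGAL_SUFFIXES = {"ltd", "inc", "llc"}

# Hypothetical cut-off above which two records are treated as one entity.
MATCH_THRESHOLD = 0.75


def normalize_name(name: str) -> str:
    """Lowercase the name, drop punctuation and legal suffixes, join the rest."""
    tokens = re.sub(r"[^a-z0-9]+", " ", name.lower()).split()
    return "".join(token for token in tokens if token not in LEGAL_SUFFIXES)


def match_score(a: dict, b: dict) -> float:
    """Blend name similarity with a business-number check into one score."""
    name_score = SequenceMatcher(
        None, normalize_name(a["name"]), normalize_name(b["name"])
    ).ratio()
    number_a = a.get("business_number")
    same_number = number_a is not None and number_a == b.get("business_number")
    # Assumed weighting: the name counts for 70%, the business number for 30%.
    return 0.7 * name_score + 0.3 * (1.0 if same_number else 0.0)


def are_probably_same(a: dict, b: dict) -> bool:
    """Return True when the blended score reaches the assumed threshold."""
    return match_score(a, b) >= MATCH_THRESHOLD
```

The result is a score rather than a yes or no, and the threshold decides where "probably the same" begins. Moving that threshold changes which records get merged, which is exactly the kind of subjectivity the rest of this article is about.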
Assumptions, hallucinations, and consistency
Here we arrive at the core of our warning. The "intelligent subjectivity" that makes AI so powerful is also its greatest liability. AI models are probabilistic, not deterministic.
A traditional automation script is deterministic. If you run a script that calculates (Column A + Column B), it will give you the exact same answer for the same inputs, every single time, forever. It is 100% reliable and consistent.
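As a minimal sketch of such a script, the example below adds two columns row by row. The function name add_columns and the sample rows are made up for illustration; Decimal is used here so that the sums are exact decimal values rather than binary floating-point approximations.

```python
from decimal import Decimal


def add_columns(rows):
    """Return column A + column B for every row, using exact decimal arithmetic."""
    return [Decimal(a) + Decimal(b) for a, b in rows]


# Hypothetical sample data: each tuple is (column A, column B).
rows = [("19.99", "0.01"), ("100.10", "200.20")]

first_run = add_columns(rows)
second_run = add_columns(rows)

# The same inputs give the same outputs on every run.
assert first_run == second_run
```

There is no model, no training data, and no randomness involved, so there is nothing that could make the second run differ from the first.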
An AI model does not operate this way.
- Assumptions: An AI is trained on data, and it inherits all the biases and assumptions present in that data. If your historical data subtly favored certain demographics, an AI model trained on it will continue to make those biased assumptions when making predictions or classifications.
- Hallucinations: This is the most dangerous pitfall, especially with Generative AI. When asked to fill a gap or summarize a document, an AI doesn't "know" facts. It generates a response that is statistically plausible. This means it can, and will, invent information (a "fact," a number, a data point) that looks completely real but is potentially just made up.
- Consistency: A generative AI might summarize the same report slightly differently each time you ask it. A classification model might rate a customer review as "positive" today but "neutral" tomorrow if the model is updated. This lack of perfect, repeatable consistency is a deal-breaker for many core business processes.
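One source of this variation in generative models is sampling: the output is drawn at random from a probability distribution rather than looked up. The toy simulation below illustrates the effect. It is not a real model, and the probabilities and function names are invented for the example.

```python
import random

# Hypothetical probabilities a sentiment model might assign to one borderline review.
PROBABILITIES = {"positive": 0.48, "neutral": 0.45, "negative": 0.07}


def classify_deterministic(probabilities: dict) -> str:
    """Always pick the most likely label: same input, same output."""
    return max(probabilities, key=probabilities.get)


def classify_sampled(probabilities: dict, rng: random.Random) -> str:
    """Draw one label at random, in proportion to its probability."""
    labels = list(probabilities)
    weights = list(probabilities.values())
    return rng.choices(labels, weights=weights, k=1)[0]


rng = random.Random()
# Repeated calls on the same review can return different labels.
sampled_labels = [classify_sampled(PROBABILITIES, rng) for _ in range(5)]
# The deterministic rule returns the same label every time.
fixed_labels = [classify_deterministic(PROBABILITIES) for _ in range(5)]
```

With probabilities this close, the sampled label will flip between "positive" and "neutral" from one call to the next, while the deterministic rule keeps returning "positive".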
When a task requires unwavering consistency and factual accuracy, relying on a probabilistic AI can be misleading and, in some cases, catastrophic.
When objectivity matters more
This brings us to the all-important "when." If AI is subjective and probabilistic, when do we stick with "good old-fashioned automation"?
The answer is simple: You use traditional automation whenever objectivity, accuracy, and repeatability are non-negotiable.
This is the domain of deterministic logic. You do not want a "creative" or "subjective" model handling these tasks.
- Financial Reporting: You cannot have an AI "hallucinate" figures for your quarterly earnings report or "probabilistically" decide how to balance a ledger. This requires hard, objective math.
- Regulatory Compliance: When a user invokes their "right to be forgotten" under GDPR, you don't want an AI to "subjectively" decide which data to delete. You need a deterministic script that finds and purges every single specified record, every single time, without fail.
- Core Data Operations (ETL): When you are moving data from your production database to your data warehouse, you need an exact, byte-for-byte copy. The integrity of the data is paramount. AI's tendency to "interpret" or "cleanse" is a liability here, not a feature.
- Payroll and Billing: Calculating an employee's paycheck based on hours, tax rates, and deductions is a matter of pure, objective logic. There is no room for interpretation.
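Taking the regulatory compliance case as an example, a deterministic purge could be sketched like this. The table names, the user_id column, and the in-memory SQLite database are assumptions made for the illustration; a real system would have its own schema and would also need to cover backups, logs, and downstream copies.

```python
import sqlite3

# Hypothetical tables that hold personal data, each with a user_id column.
PERSONAL_DATA_TABLES = ("customers", "orders", "support_tickets")


def build_demo_database() -> sqlite3.Connection:
    """Create an in-memory database with a few rows for users 1 and 2."""
    conn = sqlite3.connect(":memory:")
    for table in PERSONAL_DATA_TABLES:
        conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, user_id INTEGER)")
        conn.executemany(f"INSERT INTO {table} (user_id) VALUES (?)", [(1,), (1,), (2,)])
    conn.commit()
    return conn


def purge_user(conn: sqlite3.Connection, user_id: int) -> int:
    """Delete every row belonging to user_id and return how many rows were removed."""
    deleted = 0
    with conn:  # commits on success, rolls back if any statement fails
        for table in PERSONAL_DATA_TABLES:
            # Table names come from the fixed tuple above, never from user input.
            cursor = conn.execute(f"DELETE FROM {table} WHERE user_id = ?", (user_id,))
            deleted += cursor.rowcount
    return deleted


def count_remaining(conn: sqlite3.Connection, user_id: int) -> int:
    """Count the rows that still reference user_id across all listed tables."""
    return sum(
        conn.execute(f"SELECT COUNT(*) FROM {table} WHERE user_id = ?", (user_id,)).fetchone()[0]
        for table in PERSONAL_DATA_TABLES
    )
```

Running purge_user on the demo database and then checking count_remaining gives a result that can be verified and audited: either every listed table is clean for that user, or the transaction rolled back and nothing was changed.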
Takeaway: A Partnership, Not a Replacement
So, what's the takeaway? The choice should not be between AI and automation. The most effective data strategy uses both, each for a different type of task.
Use AI for discovery, interpretation, and handling ambiguity. Use it to understand unstructured data, enrich your customer profiles, and predict future trends.
Use traditional automation for execution, precision, and compliance. Use it to move data, calculate figures, and perform any task that must be 100% accurate and repeatable.