The industry took a decade (most of the 1990s) to absorb the meaning of a ‘datawarehouse’, and just when it got over the war between Kimball’s conformed (shared) dimensions and Inmon’s centralized enterprise warehouse feeding dependent data marts, the term ‘real-time datawarehousing’ was introduced in 2000. The debate has now shifted to three fundamental points:
- What is ‘real-time’?
- Is there ‘real’ business demand for this technology?
- Are there technologies and technical architectures that can enable a ‘real-time datawarehouse’?
I am going to focus on the first two points in this blog and deal with the third point later.
So what is ‘real-time’? Real-time can be defined as having no lag between the occurrence of an event and the time it is recorded and shared. But ‘no lag’ is misleading: what looks like ‘no lag’ to the human eye can be captured as hundreds of snapshots by a high-speed camera. Hence, given these nuances in precision, the terms ‘near real-time’ and ‘on time’ are gaining popularity. ‘Near real-time’ refers to a lag of a couple of minutes to an hour, whereas ‘on time’ is geared to meeting service level agreements (SLAs) and policies where data is anywhere from two hours to a day old.
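To make those bands concrete, here is a minimal Python sketch that labels a data lag using the ranges described above. The function name `classify_freshness` and the exact cut-offs are my own assumptions for illustration; the definitions above leave the one-to-two-hour range unassigned, so the sketch reports it as unclassified rather than guessing.

```python
from datetime import timedelta

# Bands taken from the definitions above; the exact cut-offs are assumptions.
NEAR_REAL_TIME_MAX = timedelta(hours=1)
ON_TIME_MIN = timedelta(hours=2)
ON_TIME_MAX = timedelta(days=1)


def classify_freshness(lag: timedelta) -> str:
    """Label the lag between an event and its availability in the warehouse."""
    if lag < timedelta(0):
        raise ValueError("lag cannot be negative")
    if lag <= NEAR_REAL_TIME_MAX:
        # Up to an hour old: treated here as 'near real-time'.
        return "near real-time"
    if ON_TIME_MIN <= lag <= ON_TIME_MAX:
        # Two hours to a day old: 'on time', typically SLA-driven.
        return "on time"
    # Between one and two hours, or older than a day: not covered by either band.
    return "unclassified"
```

Under these assumptions, a five-minute refresh cycle falls in the ‘near real-time’ band, while a nightly batch load falls in the ‘on time’ band.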
So now that we have settled on what real-time is, is there real demand for these technologies? Before answering that question, we have to step back and think about the speed at which business is conducted, and one can argue that very few businesses need data on a real-time basis. But when the question is posed in terms of business value, a different perspective emerges.
There has always been demand for ‘on time’ datawarehouses, but is there demand for ‘near real-time’ datawarehouses? Yes, there is. A recent article talks about how fashion retailer Elie Tahari is using ‘five-minute’ refresh rates for its retail stores to let customer purchase behavior drive inventory. “One of our regional managers told me that as soon as the system went up she realized we were not buying correctly across size levels per store,” says Aytaman. “Before, they had ordered the same size breakdown for all stores. While the customer would not notice, they’re now more likely to find the sizes and styles they prefer at their location. This is a great improvement for stocking and logistics that also reduces returns.”
Another practical application of ‘near real-time’ datawarehouses is in the area of fraud detection. A couple of years back, Continental implemented a ‘near real-time’ datawarehouse, and in the first year alone $7 million in fraud was recognized and eliminated, and the airline realized a $41 million reduction in costs. Just after the tragic events of 9/11, the team sorted through 35 different data marts and loaded the airline’s booking system into XML so senior airline officials and the FBI could monitor customers booking flights in real time.
Fraud detection hinges on using standard identifiers (e.g. Social Security numbers, credit card accounts, bank accounts, passport numbers, driver’s license IDs, etc.) to analyze transactions and tease out inconsistencies in usage patterns. ‘Near real-time’ datawarehouses are great for this function as they collate information from myriad sources and can churn out exception analyses on a timely basis.
A few years back, I was personally involved in implementing an enterprise datawarehouse for a large insurance company. It took more than two years to get it done, but the benefits were quick and tangible. Of the many reports produced to measure the health of the auto insurance division, a couple focused on using the standard identifiers to highlight claim fraud. For the first time in the company’s long history, they were able to automate a report that could list people using the same Social Security number or driver’s license under different names and addresses, and claiming losses on different automobiles within a short time frame. Needless to say, the executives were more than pleased.
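The logic behind that kind of report can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the actual implementation: the record layout (`identifier`, `name`, `claim_date`), the function name `find_shared_identifiers`, the 90-day window and the sample data are all hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def find_shared_identifiers(claims, window=timedelta(days=90)):
    """Return identifiers claimed under more than one name within `window`."""
    # Group claims by their standard identifier (e.g. a license number).
    by_id = defaultdict(list)
    for claim in claims:
        by_id[claim["identifier"]].append(claim)

    flagged = {}
    for identifier, group in by_id.items():
        group.sort(key=lambda c: c["claim_date"])
        for i, first in enumerate(group):
            # Names seen on claims filed within `window` of this claim.
            names = {
                c["name"]
                for c in group[i:]
                if c["claim_date"] - first["claim_date"] <= window
            }
            if len(names) > 1:
                flagged[identifier] = sorted(names)
                break
    return flagged


# Hypothetical sample data, made up purely for illustration.
SAMPLE_CLAIMS = [
    {"identifier": "ID-001", "name": "A. Smith", "claim_date": date(2005, 1, 10)},
    {"identifier": "ID-001", "name": "B. Jones", "claim_date": date(2005, 2, 1)},
    {"identifier": "ID-002", "name": "C. Lee", "claim_date": date(2005, 1, 5)},
    {"identifier": "ID-002", "name": "C. Lee", "claim_date": date(2005, 3, 1)},
    {"identifier": "ID-003", "name": "D. Roy", "claim_date": date(2004, 1, 1)},
    {"identifier": "ID-003", "name": "E. Kim", "claim_date": date(2005, 6, 1)},
]

print(find_shared_identifiers(SAMPLE_CLAIMS))
```

With this sample, only `ID-001` is flagged: `ID-002` is always used under the same name, and the two claims on `ID-003` are more than 90 days apart. A real report would also compare addresses and vehicles, and would run against the warehouse rather than an in-memory list.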
Most of us have received automated messages from our credit card companies informing us that our card has been blocked because of suspicious activity. And even though we may get ruffled by the extra steps needed to get the card back to a ‘functional’ state, nobody can deny the ‘peace of mind’ of knowing that one will not be liable for some random high-value charges. All of this rests on ‘triggers’ generated through timely analysis of millions and millions of transactions.
In looking through the uses of this technology, there are clear business value themes:
- Cost Reduction – through consolidation of information and automation of analyses
- Risk Identification / Mitigation – by alerting the customers on a timely basis and resolving the exception within a short span of time
- Revenue Opportunities – improving the customer experience by empowering the channels (and customer reps) with accurate information on a timely basis
It is clear that ‘information’ can give a significant ‘advantage’ to a company’s profitability, especially if the company can draw insights faster than its competition. But one needs to identify and size the business value at play before spending a lot of money on the technology. More businesses will ultimately realize the power of ‘near real-time’ data, but can they afford such sophisticated technology? Well, I will save that discussion for my next blog…