Suncorp Group will decommission its six enterprise data warehouses in the coming 18 months to improve the scalability and speed of its analytics.
The group has deployed new analytics software and 200 processor cores, allowing it to search and analyse 100TB of data from disparate sources on demand and in near real-time.
Suncorp’s business intelligence strategy architect Shaun Deierl told the CeBIT Big Data conference yesterday that the new data strategy had been two years in the making.
“The whole philosophy of structuring data at the point of acquiring it doesn’t work,” he said.
“We have a claims system with 2000 tables. We’re really agile, so there [are] changes to the underlying data structure every week.
“For you to have a traditional warehousing space to keep up with that, you never keep up with it.”
Deierl told conference delegates that the group had moved away from traditional extract, transform, load (ETL) processes, redeploying 14 ETL professionals earlier this year.
His 175-person analytics team used IBM’s InfoSphere Change Data Capture (CDC) technology to scrape data from various backend systems, including mainframe and midrange platforms.

Data was then fed into an IBM Netezza analytics platform and queried using Java models.
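To illustrate the querying end of that pipeline, the sketch below runs an ad-hoc aggregation against a Netezza database over standard JDBC. It is a minimal, hypothetical example rather than Suncorp’s actual code: the host, credentials and CLAIMS_EVENTS table are invented for illustration, while the driver class and connection string follow the published Netezza JDBC conventions.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    // Minimal sketch of an ad-hoc query against a Netezza appliance via JDBC.
    // Host, credentials and the CLAIMS_EVENTS table are hypothetical.
    public class ClaimsQuery {
        public static void main(String[] args) throws Exception {
            Class.forName("org.netezza.Driver");
            String url = "jdbc:netezza://nz-host:5480/ANALYTICS";
            try (Connection conn = DriverManager.getConnection(url, "user", "password");
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery(
                         "SELECT claim_type, COUNT(*) AS n " +
                         "FROM CLAIMS_EVENTS GROUP BY claim_type")) {
                // Print a simple breakdown of claim volumes by type.
                while (rs.next()) {
                    System.out.println(rs.getString("claim_type") + ": " + rs.getLong("n"));
                }
            }
        }
    }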
Suncorp also evaluated Oracle and SAS platforms, and installed Hadoop big data technology early last year, but was “struggling to find the problem to solve” with the latter.
Deierl noted that the move away from data warehousing took “courage” – one of Suncorp’s six core values – especially given the amount of investment behind each of the warehouses.
He said the system had so far been used to analyse the insurance claims experience at about a tenth of the cost.
“Return on investment is really hard to say because we provide the data but you’ve got to use it,” he noted.
“We’ve been delivering for the past 12 months … Every two weeks, we have to prove [the system’s value].”
Deierl said Suncorp had not forced business units to move away from data warehouses but aimed to attract them to the new architecture instead.
The business intelligence team planned to cease supporting various systems in step with Suncorp’s simplification plan.
It would then treat each decommissioned data warehouse as one of several data sources to be scraped with CDC.
“Some of these warehouses are six terabytes and they don’t change much, and most of it is indexing,” Deierl noted.
“We take the warehouse as a source, so we hook CDC up to it and the warehouse is basically a proxy for all the systems that we want to turn off.”
Deierl said Suncorp ultimately hoped to treat data the way it treated IT infrastructure, moving it from closed, proprietary systems to ubiquity.
“If you have a Google-like experience where you type in a word and it then does that without having data modellers in between, it’s brilliant,” he said.