Implemented big data platform
Enhanced business intelligence
Unified data source
Streamlined processes
The challenge
Streamlining data access for analytical efficiency
A leading insurer struggled with its internal system for collecting policy quote data. The insurer held valuable data across its lines of business, but the data's sheer size (over 67 TB compressed) and complexity (more than 12,000 nested XML fields) made it unusable for effective analysis.
Key challenges
Lack of automated data delivery for analytical teams
Difficulty in leveraging historical data due to normalization
Overwhelming data size and complexity hindered analytical use
Conflicts between the new data strategy and inconsistent business rules
No production-grade, extensible system, and insufficient governance
Existing extractors were inefficient, requiring additional manual effort
The solution
Integrated data access and processing solution
Data ingestion and harmonization
Ingested 68 TB of data into Hadoop
Optimized storage with Avro and Parquet
Partitioned data for better access
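As a rough illustration of partitioning for better access, the sketch below builds a directory path per record using the key=value layout that Hive-style tools commonly read. The case study does not name the business parameters that were used, so `line_of_business`, `year`, and `month`, as well as the `/data/quotes` root, are assumptions made for this example only.

```python
from pathlib import PurePosixPath


def partition_path(root, line_of_business, quote_date):
    """Build a key=value partition directory for one quote record.

    quote_date is expected as an ISO string such as "2019-03-14".
    The partition keys are hypothetical, not taken from the project.
    """
    year, month, _day = quote_date.split("-")
    return str(
        PurePosixPath(root)
        / f"line_of_business={line_of_business}"
        / f"year={year}"
        / f"month={month}"
    )


path = partition_path("/data/quotes", "auto", "2019-03-14")
print(path)  # /data/quotes/line_of_business=auto/year=2019/month=03
```

With a layout like this, a query that filters on one line of business and one month only needs to read the matching directories instead of scanning the full data set.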
Data extraction and querying
Flask UI for automation
XML to CSV conversion
MapReduce and Spark for processing
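The XML to CSV conversion can be pictured with a minimal sketch that flattens nested elements into dotted column names. This is not the project's converter: it uses only the Python standard library rather than MapReduce or Spark, the sample documents and field names are invented, and repeated sibling elements would overwrite each other in this simplified version.

```python
import csv
import io
import xml.etree.ElementTree as ET


def flatten(element, prefix=""):
    """Map each leaf element to a dotted column name, e.g. quote.vehicle.make."""
    path = f"{prefix}.{element.tag}" if prefix else element.tag
    children = list(element)
    if not children:
        return {path: (element.text or "").strip()}
    row = {}
    for child in children:
        row.update(flatten(child, path))
    return row


def xml_to_csv(xml_documents):
    """Convert XML strings to CSV text with one row per document."""
    rows = [flatten(ET.fromstring(doc)) for doc in xml_documents]
    # The union of all leaf paths becomes the header; missing fields stay empty.
    columns = sorted({name for row in rows for name in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical quote documents, used only for illustration.
SAMPLE = [
    "<quote><id>Q1</id><vehicle><make>Volvo</make><year>2018</year></vehicle></quote>",
    "<quote><id>Q2</id><premium>412.50</premium></quote>",
]

csv_text = xml_to_csv(SAMPLE)
print(csv_text)
```

Taking the union of leaf paths as the header is one way to cope with documents that populate different subsets of fields, which is likely when the schema has thousands of optional nested fields.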
Implementation approach
1. Data partitioning
Partitioned data in Hadoop by business parameters
Enabled future scalability
Optimized storage
2. Custom UI development
Created data extraction UI
Simplified analytics for teams
Converted XML to CSV for easier use
3. End-to-end integration
Integrated with existing system
Enabled seamless automation
Built for scalability
The impact
Redefining data access and usage
Centralized data
Unified big data platform
Managed 1 TB of data
Supported MDM
Enhanced analytics
Unified data source
Simplified data filtering
Streamlined access
Optimized processes
Replaced multi-layered architecture
Simplified data delivery
Enabled future scalability
Looking ahead
Expand platform capacity
Accommodate new data sources for future growth
Enhance processing
Improve real-time data processing capabilities
Integrate advanced analytics
Leverage advanced tools for deeper insights