
Building an AWS-powered Customer Data Platform with Next Best Action for a leading footwear retailer
Challenge
A leading multi-brand footwear retailer lacked a modern data platform capable of running the machine learning and predictive use cases it needed to drive revenue growth. As part of its long-term data strategy, the retailer aimed to collect and analyze customer data to gain deeper insights.
These insights were expected to enable personalized shopping experiences, optimized inventory management, and predictive analytics for future trends, ultimately supporting business growth and customer loyalty.
Solution
Fractal implemented its in-house accelerator, “Customer Genomics,” which is built on the open-source Apache Spark framework.
The solution encompassed several key components:
- Data management: An Amazon S3 data lake stored raw data, and AWS Lake Formation was used for structured and processed data. Ingestion tools such as AWS Transfer Family (SFTP) facilitated efficient data transfer.
- Data processing and transformation: AWS Glue and dbt were used to process and transform data.
- Data warehousing and modeling: Snowflake served as the data warehouse, providing high performance and scalability. dbt was integrated for data modeling, which made updates and maintenance easier.
- Security and compliance: AWS IAM and KMS were used for identity management and encryption. VPC, Security Groups, and Audit Logs ensured network security and monitoring.
- Governance and cataloging: A data catalog was implemented with Snowflake and AWS Glue Crawler for metadata management.
- Development and operations: A CI/CD pipeline was established using CodeCommit, CloudFormation, and CodePipeline. Logging and tracking were managed with CloudWatch, CloudTrail, and X-Ray for operational transparency.
- Model development and training: Amazon EMR Serverless and Amazon SageMaker were used for scalable data science model development and training.
- Orchestration and automation: Workflows were automated with Apache Airflow for pipeline orchestration and scheduling.
- Scalability and performance: The architecture supported virtually unlimited storage with Amazon S3 and could process large datasets with Amazon EMR and AWS Glue.
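To make the orchestration idea concrete, the sketch below shows how pipeline stages might be ordered by their dependencies before being scheduled. It is only an illustration: the stage names and the dependency chain are hypothetical and are not taken from the client's implementation, and it uses Python's standard library rather than Airflow itself, which would express the same dependencies as tasks in a DAG.

```python
from graphlib import TopologicalSorter

# Hypothetical stage names (assumptions for illustration only).
# Each stage maps to the set of stages that must finish before it starts.
PIPELINE = {
    "ingest_sftp_to_s3": set(),
    "transform_glue": {"ingest_sftp_to_s3"},
    "load_snowflake": {"transform_glue"},
    "model_dbt": {"load_snowflake"},
    "train_sagemaker": {"model_dbt"},
    "score_next_best_action": {"train_sagemaker"},
}


def execution_order(pipeline):
    """Return the stages in an order that respects every dependency."""
    return list(TopologicalSorter(pipeline).static_order())


if __name__ == "__main__":
    # Print the stages in the order an orchestrator would run them.
    for step, stage in enumerate(execution_order(PIPELINE), start=1):
        print(f"{step}. {stage}")
```

An orchestrator applies the same principle at scale: each stage declares what it depends on, and the scheduler derives a valid run order instead of relying on hand-maintained scripts.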
Results
Our solution helped the client improve campaign targeting, reaching the appropriate audience with tailored product recommendations. The client also achieved significant savings on infrastructure costs with AWS. Deployment time was reduced from hours to minutes, and automated backups saved hours per week.