In the bustling world of e-commerce where countless products and customer behavioral data are growing exponentially, hyper-personalized recommendations that understand each customer's preference and recommend the right product are no longer a choice, but a must for e-commerce growth. Using our Hyper-Personalization AI, Korean online fashion mall BABATHE.COM achieved an astonishing 30 percent CTR increase within two months.
Our Hyper-Personalization AI started with the goal of becoming each customer's personal shopping agent, going beyond typical passive recommendation methods, and its performance has steadily improved across numerous customer scenarios. For AI recommendations to genuinely act as "personal" shopping agents, we trained the model on each customer's shopping journey, optimized in real time, together with detailed product information. After a long period of in-depth research and optimization, it became a model that predicts customers' varied situations and hidden needs.
More than 80 percent of BABATHE.COM's customers are fashion-forward women with high purchasing power and clear brand and style preferences. Understanding the nuanced intentions and needs of each customer therefore became our focal point. Through continuous refinement of both quantitative and qualitative analyses, our model improved significantly. The result was a 30 percent increase in CTR in less than three months.
In this article, I invite you to delve into the core of our algorithm design process, exploring key considerations and gaining insights from the OMNICOMMERCE personalization team. Committed to achieving top-tier recommendation results, the team shares their expertise on navigating the evolving landscape of personalized e-commerce experiences.
1. What recommendations do customers want?
We have a lot of different thoughts while shopping. So how does AI know what products to recommend? This is done by iteratively hypothesizing, testing, analyzing, learning, and optimizing based on data about customer preferences and needs, which vary across e-commerce stores and brands. As our team worked to perfect the optimal recommendation algorithm for different customers and scenarios, we analyzed target metrics and recommendation outcome data to identify key recommendation strategies that worked well, and ultimately settled on the following three recommendation approaches.
1. Preference-based recommendations
The purchase patterns of BABATHE.COM customers showed that brand, price, and discount rate were the top factors influencing purchase decisions. Products from brands with a similar style and at a similar price level were more likely to be purchased, especially by discount-sensitive customers: 60% of customers responded to products priced within roughly $100 of the product they were currently viewing. We also prioritized brands that customers were more responsive to and brands with actual purchase histories, which significantly increased the hit rate of subsequent recommendations.
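As a rough illustration of the price-band and brand-priority idea, here is a minimal sketch. The `Product` fields, the `preference_candidates` helper, and the exact sort order are assumptions made for this example; only the roughly $100 window comes from the analysis above, and the production model is certainly more involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    product_id: str
    brand: str
    price: float          # price in USD (assumed unit)
    discount_rate: float  # 0.0 to 1.0


def preference_candidates(current, catalog, preferred_brands, price_window=100.0):
    """Keep products priced within price_window of the current product,
    then put preferred brands first and deeper discounts next."""
    in_band = [
        p for p in catalog
        if p.product_id != current.product_id
        and abs(p.price - current.price) <= price_window
    ]
    # False sorts before True, so preferred brands come first.
    return sorted(
        in_band,
        key=lambda p: (p.brand not in preferred_brands, -p.discount_rate),
    )


# Hypothetical catalog, for illustration only.
current = Product("p0", "BrandA", 250.0, 0.0)
catalog = [
    current,
    Product("p1", "BrandB", 300.0, 0.30),
    Product("p2", "BrandA", 220.0, 0.10),
    Product("p3", "BrandC", 480.0, 0.50),  # outside the price window
    Product("p4", "BrandA", 340.0, 0.20),
]
ranked = preference_candidates(current, catalog, preferred_brands={"BrandA"})
print([p.product_id for p in ranked])
```

In this toy version, the preferred brand always outranks the discount rate; a learned model would weigh the two signals per customer instead.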
2. Recommending products of interest using product attributes
Depending on the customer scenario, it can be particularly effective to include some similar products in the recommendation results. In this case, we leveraged OMNICOMMERCE Tagging AI, which automatically extracts 900+ product attributes (apparel category, color, material, length, fit, design, etc.) for each product, to recommend highly similar products while also taking factors such as popularity and price into account. In some cases, we also combined Tagging AI with styling data from the OMNICOMMERCE database to recommend products that match the product the customer is viewing.
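To make attribute-based similarity concrete, the sketch below scores products by the Jaccard overlap of their attribute tags. This is a deliberately simple stand-in: the tag sets are made up, and Jaccard similarity is an assumption of this example, not a description of how Tagging AI output is actually compared.

```python
def jaccard(a: set, b: set) -> float:
    """Share of attributes two products have in common."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_products(target_id, attributes, top_k=3):
    """Return (product_id, score) pairs most similar to target_id."""
    target = attributes[target_id]
    scored = [
        (jaccard(target, tags), pid)
        for pid, tags in attributes.items()
        if pid != target_id
    ]
    # Highest score first; product ID breaks ties deterministically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(pid, round(score, 2)) for score, pid in scored[:top_k]]


# Hypothetical attribute tags, for illustration only.
attributes = {
    "dress-1": {"dress", "black", "silk", "midi", "slim-fit"},
    "dress-2": {"dress", "black", "silk", "maxi", "slim-fit"},
    "dress-3": {"dress", "red", "cotton", "midi", "loose-fit"},
    "coat-1": {"coat", "black", "wool", "long", "loose-fit"},
}
print(similar_products("dress-1", attributes))
```

With hundreds of attributes per product, weighting attributes by importance (for example, category above color) would likely matter more than the choice of similarity function.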
3. Relevant product recommendations based on shopping context
"Relevance" encompasses a variety of conditions. We took into account what other customers with similar tastes have viewed or purchased, the order and type of products a customer previously viewed, and what is popular in the same category, and we prioritized the products we thought were most likely to appeal to the customer right now. In short, we focus on understanding customers' intentions (why they are looking at the product, what products they viewed just before, and how many products they viewed) to help them continue their shopping journey.
This core recommendation logic performs well because data collection, data processing, and model building are all focused entirely on the customer. In the following sections, I will introduce the process step by step.
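One simple way to picture context-based relevance is a co-view table weighted by how recently the customer viewed each product. The sketch below is an assumption-laden toy: the session data, the linear recency weight, and the helper names are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_co_views(sessions):
    """Count how often two products appear in the same browsing session."""
    co_views = defaultdict(Counter)
    for session in sessions:
        for a, b in combinations(sorted(set(session)), 2):
            co_views[a][b] += 1
            co_views[b][a] += 1
    return co_views


def recommend_from_context(recent_views, co_views, top_k=3):
    """Score products co-viewed with the customer's recent views.
    recent_views is ordered oldest first; later views get a larger weight."""
    seen = set(recent_views)
    scores = Counter()
    for weight, pid in enumerate(recent_views, start=1):
        for other, count in co_views.get(pid, {}).items():
            if other not in seen:
                scores[other] += weight * count
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [pid for pid, _ in ordered[:top_k]]


# Hypothetical sessions from other customers.
sessions = [["a", "b", "c"], ["a", "b"], ["b", "d"], ["c", "d"]]
co_views = build_co_views(sessions)
print(recommend_from_context(["a", "b"], co_views))
```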
2. How to collect customer data?
To start hyper-personalized recommendations, you must first insert an event SDK into each service page of the e-commerce store and collect user behavior data. OMNICOMMERCE currently provides an SDK for inserting events.
Based on this data, recommendations are generated with the AI algorithms best suited to that e-commerce service. The key is deciding which information in the vast event data is meaningful enough to be used as training data. The information utilized is as follows.
Examples of Information Collected and Utilized
- Product information
Product ID, product name, product price, product gender, brand name, number of product likes, number of product reviews, rating of product reviews, etc.
- User information
Behavioral information on each customer service page including product information, gender, age group, etc.
- Others
Ranking of most purchased products, products in cart together, products viewed together, ranking of most viewed products, ranking of most combined products, products purchased together, etc.
Among the collection events listed in the example above, the importance of each piece of data may vary depending on the characteristics and data of each e-commerce store, the scenario in which you want to apply recommendations, and the business metrics you want to target. The collected events, related product information, and user information are used to train the hyper-personalized recommendation AI and to validate its performance after deployment. In the training phase, the importance of each feature, the update frequency (real-time update, batch processing, etc.), and the pre- and post-processing methods are determined.
For example, product view events and search events must be applied on the fly to recommend products that the customer is likely to view or add to the cart next, so they have to be delivered in real time and fed into the model. In contrast, information associated with an event, such as reviews and ratings, often changes little in the short term, so it can be processed more efficiently on an hourly or daily basis than in real time. Even after the recommendation service is deployed, the model is continuously retrained and monitored through this data pipeline, which is configured to achieve optimized performance in the most efficient structure for the customer's environment and data.
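The split between real-time and batch updates can be sketched as a small routing step. The `Event` fields, event type names, and the `EVENT_UPDATE_MODE` table below are hypothetical and do not reflect the actual OMNICOMMERCE SDK schema; unknown event types default to batch purely as a choice of this example.

```python
from dataclasses import dataclass
from enum import Enum


class UpdateMode(Enum):
    REAL_TIME = "real_time"
    BATCH = "batch"


# Assumed mapping: views and searches feed the model immediately,
# slowly changing signals are aggregated hourly or daily.
EVENT_UPDATE_MODE = {
    "product_view": UpdateMode.REAL_TIME,
    "search": UpdateMode.REAL_TIME,
    "review": UpdateMode.BATCH,
    "rating": UpdateMode.BATCH,
}


@dataclass(frozen=True)
class Event:
    event_type: str
    user_id: str
    product_id: str
    timestamp: float  # seconds since epoch


def route_events(events):
    """Split events into a real-time stream and a batch queue."""
    real_time, batch = [], []
    for event in events:
        mode = EVENT_UPDATE_MODE.get(event.event_type, UpdateMode.BATCH)
        (real_time if mode is UpdateMode.REAL_TIME else batch).append(event)
    return real_time, batch


events = [
    Event("product_view", "u1", "p1", 1.0),
    Event("rating", "u1", "p1", 2.0),
    Event("search", "u2", "p7", 3.0),
    Event("wishlist", "u2", "p7", 4.0),  # unknown type, falls back to batch
]
real_time, batch = route_events(events)
print(len(real_time), len(batch))
```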
3. How to build a great personalization AI?
At the stage of incorporating quality data and customer insights into the model, we extract meaningful data from the large volume of collected data and adjust the importance weights used for training. This is the most essential step, and after numerous experiments we built a personalized recommendation model optimized for BABATHE.COM.
ETL & Learning Pipeline
In the ETL stage, the data required for training is generated by distributed processing of the data collected by the SDK. The data, generated in batches at regular intervals, is divided between an offline and an online feature store. The offline store holds the data required for model training: various features are calculated for customers and products and actively used for learning. The online feature store holds only the most recent data from the offline feature store that is needed for serving recommendations, so that the API can look up features faster. Both the ETL process that turns collected SDK event data into the offline and online stores and the model training process are automated with Kubeflow and run repeatedly at regular intervals.
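The relationship between the two stores can be illustrated with plain Python data structures: the offline store keeps the full history for training, while the online store keeps only the latest feature values per entity for fast lookup. The row layout and the `materialize_online_store` helper are assumptions of this sketch, not the team's actual storage format.

```python
def materialize_online_store(offline_rows):
    """Build the online store from offline rows.
    offline_rows: iterable of (entity_id, timestamp, features) tuples.
    Only the most recent features per entity are kept."""
    latest = {}
    for entity_id, timestamp, features in offline_rows:
        if entity_id not in latest or timestamp > latest[entity_id][0]:
            latest[entity_id] = (timestamp, features)
    return {entity_id: features for entity_id, (_, features) in latest.items()}


# Hypothetical offline history: every batch run appends new rows.
offline_rows = [
    ("u1", 1, {"views_7d": 3}),
    ("u1", 5, {"views_7d": 8}),
    ("p9", 2, {"ctr": 0.04}),
    ("u1", 3, {"views_7d": 5}),
]
online_store = materialize_online_store(offline_rows)
print(online_store["u1"])
```

A serving API would then read `online_store[entity_id]` with a single key lookup instead of scanning history.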
Model Architecture
The OMNICOMMERCE hyper-personalized recommendation model consists of two stages: candidate generation and ranking. In the candidate generation stage, various machine learning and deep learning models are used to extract candidates worthy of recommendation to the user, and the ranking stage adjusts the ranking of the extracted candidates. In addition, customer-specific product filtering is added to output the final recommendation results.
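The two-stage flow (candidate generation, ranking, then customer-specific filtering) can be sketched with plain functions. The generators, scores, and filter rule below are toy stand-ins chosen for this example; in the real system each stage is a trained model or a business rule.

```python
def recommend(user, generators, rank_score, is_allowed, top_k=3):
    """Two-stage recommendation: gather candidates, rank, then filter."""
    # Stage 1: candidate generation, merging the output of every generator.
    candidates = set()
    for generate in generators:
        candidates.update(generate(user))
    # Stage 2: ranking, highest score first (product ID breaks ties).
    ranked = sorted(candidates, key=lambda pid: (-rank_score(user, pid), pid))
    # Final step: customer-specific filtering.
    return [pid for pid in ranked if is_allowed(user, pid)][:top_k]


# Toy stand-ins for the real models (all hypothetical).
def popular_items(user):
    return ["p1", "p2", "p3"]


def similar_to_last_view(user):
    return ["p3", "p4"]


SCORES = {"p1": 0.2, "p2": 0.9, "p3": 0.7, "p4": 0.5}


def rank_score(user, product_id):
    return SCORES[product_id]


def not_already_purchased(user, product_id):
    return product_id not in user["purchased"]


user = {"id": "u1", "purchased": {"p2"}}
result = recommend(
    user, [popular_items, similar_to_last_view], rank_score, not_already_purchased
)
print(result)
```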
Models
There are a wide variety of well-documented and popular personalized recommendation models. From traditional tree-based models to deep learning-based models, there are many choices depending on the user's situation and data characteristics. We are implementing and testing various algorithms to make recommendations that are appropriate for the purpose of the recommendation area in e-commerce.
For example, we use a sequence-based model for product detail pages, a graph neural network for shopping cart pages, and a text-based model for search recommendations. We also ensemble the strategies of multiple candidate generators and apply a ranking model on top.
While all of these models show generally good recommendation performance when evaluated on refined benchmark datasets and reported in papers, simple implementations of them often fall short on unrefined real-world data. Our team has tried various methods to optimize performance and to minimize the data noise and imperfections that occur in real-world commercial service scenarios, and we share some of the effective ones below.
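There are many ways to combine the output of several candidate generators. One common and easy-to-implement option is reciprocal rank fusion (RRF), sketched below; it is shown only as an illustration and is not necessarily the ensemble method the team uses. The constant `k=60` is the value commonly used with RRF, and the input lists are made up.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Merge several ranked candidate lists into one.
    Each product scores 1 / (k + rank) per list it appears in."""
    scores = defaultdict(float)
    for ranked in ranked_lists:
        for rank, pid in enumerate(ranked, start=1):
            scores[pid] += 1.0 / (k + rank)
    return sorted(scores, key=lambda pid: (-scores[pid], pid))


# Hypothetical outputs of two candidate generators.
sequence_model_candidates = ["p1", "p2", "p3"]
graph_model_candidates = ["p3", "p1", "p4"]
fused = reciprocal_rank_fusion([sequence_model_candidates, graph_model_candidates])
print(fused)
```

Products that several generators agree on rise to the top, which is the basic intuition behind ensembling candidates before the ranking model sees them.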
A. Dataset Reformation
The data used in personalized recommendation has long suffered from a problem called sparsity: the total number of clicks is very small compared to the number of users or products. The test datasets used to report performance in papers do not include all users and products, only those filtered by clicks. In addition, real data contains diverse click patterns and rapid shifts in user interest over time, so noise outside the behavior patterns of interest has to be removed properly. In short, after filtering users and products, we also filtered the click data itself so that our model learns recommendation strategies that hold up in real-world recommendations. In practice, reorganizing the dataset in this way allowed us to keep improving performance.
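A minimal sketch of this kind of dataset reformation follows. It first drops users and products with too few clicks, repeating until the data is stable, and then removes immediate repeat clicks as one example of click-level noise. The thresholds and the specific noise rule are assumptions of this example; the actual filters used for BABATHE.COM are not described here.

```python
from collections import Counter


def filter_sparse(clicks, min_user_clicks=2, min_item_clicks=2):
    """Drop users and products with too few clicks, repeating until stable.
    clicks: list of (user_id, product_id) pairs in time order."""
    current = list(clicks)
    while True:
        user_counts = Counter(user for user, _ in current)
        item_counts = Counter(item for _, item in current)
        kept = [
            (user, item) for user, item in current
            if user_counts[user] >= min_user_clicks
            and item_counts[item] >= min_item_clicks
        ]
        if len(kept) == len(current):
            return kept
        current = kept


def drop_repeat_clicks(clicks):
    """Remove a click when the same user clicked the same product just before."""
    cleaned, last_item = [], {}
    for user, item in clicks:
        if last_item.get(user) != item:
            cleaned.append((user, item))
        last_item[user] = item
    return cleaned


# Hypothetical click log.
clicks = [
    ("u1", "a"), ("u1", "a"), ("u1", "b"),
    ("u2", "a"), ("u2", "b"),
    ("u3", "c"),  # a one-click user on a one-click product
]
training_clicks = drop_repeat_clicks(filter_sparse(clicks))
print(training_clicks)
```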
B. Hyperparameter Optimization
Deep learning-based models, as well as recommendation models, inherently have a large number of hyperparameters. From the overall model design to the learning process, there are many control variables that ultimately affect the recommendation performance. Most papers provide a search space for only a subset of the total hyperparameters and do not specify the optimal settings among them. In addition, different datasets show different performance trends for different hyperparameters.
For these model design choices, it is essential to compare their performance in a real online environment and select the optimal values. To this end, we compared the performance of various hyperparameters and model architectures in the A/B test environment supported by our Hyper-Personalization AI solution, and obtained a tuned model optimized for BABATHE.COM.
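The mechanics of such an online comparison can be sketched in a few lines: assign each user to a variant deterministically, then compare CTR between variants. The hashing scheme, the experiment name, and the click counts below are illustrative assumptions only; they are not BABATHE.COM data or the solution's actual A/B testing API.

```python
import hashlib


def assign_variant(user_id, variants, experiment="ranker-hparams"):
    """Deterministically map a user to one variant of an experiment."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]


def ctr(clicks, impressions):
    """Click-through rate; 0.0 when there were no impressions."""
    return clicks / impressions if impressions else 0.0


def relative_lift(control_ctr, treatment_ctr):
    """Relative CTR change of the treatment over the control."""
    return (treatment_ctr - control_ctr) / control_ctr


variants = ["control", "tuned"]
print(assign_variant("user-42", variants))

# Made-up counts, chosen only to show the arithmetic.
control_ctr = ctr(clicks=200, impressions=10_000)
tuned_ctr = ctr(clicks=260, impressions=10_000)
lift = relative_lift(control_ctr, tuned_ctr)
print(f"{lift:.0%}")
```

Hashing the user ID together with the experiment name keeps each user in the same variant across visits while letting different experiments split traffic independently. A real comparison would also need a significance test before declaring a winner.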
Recommendation results in BABATHE.COM
Conclusion
The OMNICOMMERCE personalization team has been able to derive essential meaning from a large amount of data and optimize a high-performing recommendation model through numerous experiments, achieving results that far exceeded our targeted KPIs. Even within e-commerce, there are different data and algorithmic strategies to consider for different areas of application, such as the main page, product detail page, and shopping cart page, so we are excited to see the results of new experiments and performance improvements in the future.
Many AI recommendation solutions are currently marketed as hyper-personalized recommendation, but their actual working principle is often nothing more than simple sequence-based recommendation, such as showing more similar products or more popular products. Because customers come to the store with very different needs, we realized that differentiated recommendation results require deeply understanding and predicting the needs of each individual customer (even those who are browsing without specific needs) and subtly adjusting the results accordingly.
OMNICOMMERCE Hyper-Personalization AI is ready to be applied within the fashion domain to brands that have their own product hierarchy, such as sports brands with product segments for casual, amateur, and professional customers. It is also ready for domains beyond fashion, such as food, furniture, and beauty, to provide a completely new personalized shopping experience. Stay tuned for the next steps in our journey to grow even bigger on the global stage!