TY - JOUR
T1 - Using machine learning to develop customer insights from user-generated content
AU - Mustak, Mekhail
AU - Hallikainen, Heli
AU - Laukkanen, Tommi
AU - Plé, Loïc
AU - Hollebeek, Linda
AU - Aleem, Majid
PY - 2024/11
Y1 - 2024/11
AB - Uncovering customer insights (CI) is indispensable for contemporary marketing strategies. The widespread availability of user-generated content (UGC) presents a unique opportunity for firms to gain a nuanced understanding of their customers. However, the size and complexity of UGC datasets pose significant challenges for traditional market research methods, limiting their effectiveness in this context. To address this challenge, this study leverages natural language processing (NLP) and machine learning (ML) techniques to extract nuanced insights from UGC. By integrating sentiment analysis and topic modeling algorithms, we analyzed a dataset of approximately four million X posts (formerly tweets) encompassing 20 global brands across industries. The findings reveal primary brand-related emotions and identify the top 10 keywords indicative of brand-related sentiment. Using FedEx as a case study, we identify five prominent areas of customer concern: parcel tracking, small business services, the firm's comparative performance, package delivery dynamics, and customer service. Overall, this study offers a roadmap for academics to navigate the complex landscape of generating CI from UGC datasets. It thus raises pertinent practical implications, including boosting customer service, refining marketing strategies, and better understanding customer needs and preferences, thereby contributing to more effective, more responsive business strategies.
KW - Customer insights
KW - User-generated content
KW - UGC
KW - Sentiment analysis
KW - Topic modeling
KW - Artificial intelligence
KW - Machine learning
KW - Natural language processing
KW - NLP
KW - Marketing
KW - Big data
U2 - 10.1016/j.jretconser.2024.104034
DO - 10.1016/j.jretconser.2024.104034
M3 - Article
SN - 1873-1384
VL - 81
JO - Journal of Retailing and Consumer Services
JF - Journal of Retailing and Consumer Services
M1 - 104034
ER -