Development for Sentiment and Content Classification on Social Big Data Using DEEP Q Network Embedding

Nilam Deepak Padwal, Kamal Alaskar

Abstract

User-generated content on platforms such as Twitter is growing rapidly, which has increased the demand for robust, scalable models that detect spam, fake news, and sentiment trends in real time. In this work we introduce a hybrid approach that combines deep learning and reinforcement learning to distinguish spam or fake Twitter posts from genuine ones using sentiment cues. Using a large-scale dataset of more than 788K distinct English-language tweets, we design and compare three models: a Bidirectional Long Short-Term Memory (BiLSTM) neural network, a Deep Q-Network (DQN) with LSTM embeddings, and a DQN with RoBERTa contextual embeddings. The BiLSTM model achieved the best scores on the standard metrics, with an accuracy of 81.86% and a macro F1-score of 0.8186. The DQN with LSTM embeddings was limited by overfitting and poor generalization (accuracy: 69.5%). In contrast, the RoBERTa-DQN model balanced precision and recall better, reaching a test accuracy of 74.5% and a macro F1-score of 0.7387. This result indicates the value of combining contextualized transformer embeddings with reinforcement learning for sentiment-aware spam detection. We also assess model interpretability through probing experiments on real and synthetic messages and show that the system can identify key linguistic cues (e.g., “urgent”, “free”, “compromised”) found in spam or phishing content. Our findings underline the viability of hybrid architectures for real-time social media monitoring and set the stage for future research in real-time content moderation, sarcasm detection, and adaptive NLP systems.
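
For readers comparing the reported accuracy and macro F1-score figures, the following is a minimal sketch of how the two metrics are commonly computed. The function names and the toy label lists are illustrative assumptions and are not taken from the article; the abstract does not state the class balance of the dataset or the exact evaluation code used.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (each class counts equally)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn  # F1 = 2TP / (2TP + FP + FN)
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels: 1 = spam/fake, 0 = genuine.
# Balanced classes with symmetric errors: both metrics coincide.
balanced_true = [1, 1, 1, 0, 0, 0]
balanced_pred = [1, 1, 0, 0, 0, 1]

# Imbalanced classes, classifier always predicts "genuine":
# accuracy looks acceptable while macro F1 exposes the missed class.
skewed_true = [0] * 8 + [1] * 2
skewed_pred = [0] * 10

if __name__ == "__main__":
    print(accuracy(balanced_true, balanced_pred), macro_f1(balanced_true, balanced_pred))
    print(accuracy(skewed_true, skewed_pred), macro_f1(skewed_true, skewed_pred))
```

In this sketch the two metrics agree when classes are balanced and errors are symmetric, and diverge when one class is rarely predicted. Whether either situation applies to the reported results cannot be determined from the abstract alone.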
