Abstract
Data scarcity and stylistic heterogeneity pose major challenges for emotion intensity classification. This paper presents a cross-dataset augmentation framework that leverages prompt-conditioned generative models alongside deterministic and heuristic transformations to synthesize target-style examples for improved transfer learning. We introduce a unified taxonomy of augmentation strategies—Heuristic Lexical Perturbation (HLA), Prompt-Conditioned Generative Augmentation (CGA), Sequential Hybrid Pipeline (SHA), Rule-Guided Style Adaptation (DSGA), and Enhanced Hybrid Augmentation (EHA)—and detail an interpretability-oriented prompt engineering approach that conditions LLMs on authentic target exemplars and stylistic features extracted from the target dataset.
Augmented datasets were evaluated using multi-dimensional quality metrics (transformation quality, stylistic consistency, BLEU/chrF, Self-BLEU, uniqueness) and downstream classification via two-phase BERT-LSTM training with rigorous statistical testing. With source-dataset pretraining followed by target-dataset fine-tuning, CGA achieved the highest single-method gains in F1 and accuracy (F1 = 0.8816; accuracy = 0.8819, 95% CI recalculated). HLA and SHA exhibited improved cross-domain stability, suggesting stronger domain-generalizable features. We observe systematic trade-offs between fluency, lexical diversity, and emotion fidelity: high surface similarity often correlates with classifier performance but does not fully capture affective authenticity.
We discuss methodological pitfalls, propose best practices for emotion-aware augmentation, and provide reproducible artifacts (prompts, example transformations, evaluation scripts) to facilitate further research in affective NLP.
| Original language | English |
|---|---|
| Title of host publication | Generative AI in Intelligent Systems and Applications: Unleashing the Potential |
| Publisher | Springer Verlag |
| Publication status | Published - 2026 |
Keywords
- Cross-domain Transfer Learning
- Emotion-aware NLP
- LLM-based Data Augmentation
- Text Generation Evaluation
Fingerprint
Dive into the research topics of 'Leveraging Generative Artificial Intelligence for Enhanced Data Augmentation in Emotion Intensity Classification: A Comprehensive Framework for Cross-Dataset Transfer Learning'. Together they form a unique fingerprint.