Jensen-Tsallis Divergence for Supervised Classification under Data Imbalance

In many real-world classification problems, such as those in healthcare, fraud detection, or predictive maintenance, data imbalance is a major challenge. Conventional loss functions such as categorical cross-entropy (CE) can yield models that overfit to the majority classes and produce overconfident predictions that fail to generalize.

In this work, we propose the Jensen-Tsallis Divergence (JTD) as a new loss function for deep learning models trained on imbalanced datasets. It generalizes the well-known Jensen-Shannon Divergence (JSD) by incorporating Tsallis entropy, introducing a tunable parameter q that directly influences regularization strength.
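To make the generalization concrete, here is a minimal sketch, not the paper's implementation. It assumes the standard Tsallis entropy S_q(p) = (1 - Σ p_i^q) / (q - 1) and an equal-weight Jensen-type divergence built from it; the exact weighting used in the paper may differ. The function names `tsallis_entropy` and `jensen_tsallis` are hypothetical, chosen only for illustration.

```python
import math

def tsallis_entropy(p, q):
    """Tsallis entropy of a probability vector p, assuming q > 0.

    S_q(p) = (1 - sum_i p_i**q) / (q - 1); at q = 1 this falls back to
    the Shannon entropy in nats, which is the limit of S_q as q -> 1.
    """
    if abs(q - 1.0) < 1e-8:
        return -sum(x * math.log(x) for x in p if x > 0)
    return (1.0 - sum(x ** q for x in p)) / (q - 1.0)

def jensen_tsallis(p, r, q):
    """Equal-weight Jensen-type divergence built on S_q (an assumed form).

    JT_q(p, r) = S_q((p + r) / 2) - (S_q(p) + S_q(r)) / 2.
    With q = 1 this is the Jensen-Shannon divergence.
    """
    m = [(a + b) / 2 for a, b in zip(p, r)]
    return tsallis_entropy(m, q) - 0.5 * (tsallis_entropy(p, q) + tsallis_entropy(r, q))

# Illustrative use as a per-sample loss: one-hot label vs. softmax output.
label = [0.0, 1.0, 0.0]
prediction = [0.1, 0.8, 0.1]
for q in (0.5, 1.0, 2.0):
    print(q, jensen_tsallis(label, prediction, q))
```

In this form, setting q = 1 gives JSD, and the divergence is zero when the two distributions coincide.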

Why Jensen-Tsallis?

Built-in regularization: JTD discourages overconfident output distributions, leading to better generalization.
Tunable flexibility: The q parameter controls how strongly the model penalizes high-confidence predictions.
Superior performance: In experiments on artificially imbalanced versions of MNIST, CIFAR-10, SVHN, and Fashion-MNIST, JTD consistently outperforms CE, JSD, and focal loss.

Theoretical Insight

We demonstrate that JTD introduces an intrinsic confidence penalty on output predictions. By adjusting q, we control how conservative the model becomes, thus reducing overfitting.
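One way to see where such a penalty could come from is sketched below. It assumes an equal-weight Jensen-type form, JT_q(y, p) = S_q((y + p)/2) - (S_q(y) + S_q(p))/2, which may not match the paper's exact definition or decomposition. For a one-hot label y, S_q(y) = 0, so the loss splits into a fit term S_q((y + p)/2) and a term -S_q(p)/2 that is lowest for uncertain predictions. The helper `jtd_terms` is hypothetical and only illustrates this split for a two-class problem.

```python
def tsallis_entropy(p, q):
    # S_q(p) = (1 - sum_i p_i**q) / (q - 1), used here for q > 0 and q != 1
    return (1.0 - sum(x ** q for x in p)) / (q - 1.0)

def jtd_terms(confidence, q):
    """Split the assumed equal-weight divergence for a two-class problem.

    The true class is index 0 and `confidence` is the probability the
    model assigns to it. Returns (fit, penalty) with loss = fit + penalty.
    """
    y = [1.0, 0.0]                          # one-hot label, so S_q(y) = 0
    p = [confidence, 1.0 - confidence]      # model output
    m = [(a + b) / 2 for a, b in zip(y, p)]
    fit = tsallis_entropy(m, q)             # shrinks as p approaches y
    penalty = -0.5 * tsallis_entropy(p, q)  # rises toward 0 as p becomes more peaked
    return fit, penalty

for q in (0.5, 2.0):
    for c in (0.6, 0.9, 0.99):
        fit, penalty = jtd_terms(c, q)
        print(f"q={q} confidence={c:.2f} fit={fit:.4f} penalty={penalty:.4f}")
```

Under these assumptions, the second term works against the first as confidence grows, and q changes how steeply each term varies with confidence.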

Learning Curve Comparison

Figure: Accuracy learning curves. Test vs. train accuracy using CE, JSD, and JTD with various values of q. JTD improves generalization without overfitting.

Regularization Term Behavior

Figure: Regularization effect. Behavior of the JTD regularization term as a function of model confidence. Larger q values shift the regularization curve and enhance robustness.

Squicciarini, A., Trigano, T., Luengo, D.
Jensen-Tsallis Divergence for Supervised Classification under Data Imbalance, Machine Learning (Springer, ECML PKDD 2025), 114(7), 162 (2025). https://doi.org/10.1007/s10994-025-06791-4
