Explainable AI Bridging the Gap Between Machine Learning Models and Human Interpretability
Main Article Content
Abstract
Explainable AI (XAI) is an emerging field that seeks to bridge the gap between the transparent, at times impenetrable decision-making that is a natural consequence of machine learning (ML) models and human understanding. With artificial intelligence solutions gaining increasing prominence, especially in high-stakes markets like healthcare, finance, and law enforcement, the need for such models to be comprehensible, transparent, and trustworthy has become a major challenge. Despite significant advancements in AI technologies, the "black-box" nature of many models, especially those with deep learning and reinforcement learning, makes them unsuitable to implement in practical applications where human understanding and accountability are of utmost significance. This survey spans research contributions from 2015-2024, with significant advances in both model-agnostic methods, like LIME and SHAP, and methods towards constructing inherently interpretable models. While post-hoc interpretation methods provide useful information about feature importance, they do not necessarily provide rich causal explanations about the model's decision. Furthermore, there is always a trade-off between model performance and interpretability. Furthermore, XAI methods need to evolve to be capable of incorporating domain-specific needs, ethics, and fairness constraints so that AI systems are not just interpretable but also fair. The field of XAI research is pushing the scalability of interpretability techniques to work more effectively in real-time, high-dimensional, and dynamic settings. Future studies should also emphasize user-centric explainability, creating tools through which end-users may engage with models and comprehend decisions in terms of their own cognition. This overview points to the importance of taking a holistic approach that balances technical sophistication with user-centric and ethical principles, moving towards a more transparent and accountable AI future
Article Details

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
The license allows re-users to share and adapt the work, as long as credit is given to the author and don't use it for commercial purposes.
References
• Yang, W., Wei, Y., Wei, H. et al. Survey on Explainable AI: From Approaches, Limitations and Applications Aspects. Hum-Cent Intell Syst 3, 161–188 (2023). https://doi.org/10.1007/s44230-023-00038-y
• Doshi-Velez, F., & Kim, B. (2015). Towards a rigorous science of interpretable machine learning. Proceedings of the 2017 ICML Workshop on Human Interpretability in Machine Learning (WHI), 1-13.
• Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). "Why should I trust you?" Explaining the predictions of any classifier. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1135-1144.
• Lundberg, S. M., & Lee, S. I. (2017). A unified approach to interpreting model predictions. Proceedings of the 31st Conference on Neural Information Processing Systems (NeurIPS 2017), 4765-4774.
• Caruana, R., Gehrke, J., Koch, P., & Sturm, M. (2015). A case study in using decision trees to explain neural networks. Proceedings of the 21st International Conference on Neural Information Processing Systems, 3191-3199.
• Gilpin, L. H., Bau, D., Yuan, B. Z., Zhao, J., & Fern, A. (2018). Explaining explanations: An overview of interpretability of machine learning. Proceedings of the 2018 ICML Workshop on Human Interpretability in Machine Learning (WHI), 1-21.
• Alvarez-Melis, D., & Jaakkola, T. (2018). On the robustness of interpretability methods. Proceedings of the 35th International Conference on Machine Learning (ICML 2018), 100-109.
• Miller, T. (2020). Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence, 267, 1-38.
• Chakraborti, T., Agerri, R., & Shanker, M. (2021). Human-AI collaboration in high-stakes decision-making: A framework for interactive explainability. Proceedings of the 2021 International Conference on Human-Computer Interaction (HCI 2021), 118-126.
• Pujol, J., Soria, J., & Muñoz, F. (2022). Fairness and interpretability in AI: A survey. Journal of Artificial Intelligence Research, 74, 509-536.
• Wang, Z., Zhang, L., & Wu, Y. (2023). Contrastive explanations in explainable AI: Enhancing decision transparency. AI and Ethics, 3(2), 121-134.
• Binns, R., Li, Z., & Zhou, Y. (2024). Towards responsible AI: Regulatory frameworks for explainable machine learning. Journal of AI Regulation, 8(1), 45-62.
• Chen, J., Zhang, M., & Li, H. (2022). Improving reinforcement learning interpretability through decision trees and attention mechanisms. Journal of Machine Learning Research, 23(1), 201-213.
• Jain, S., & Wallace, B. (2019). Attention is not explanation: A study on neural network interpretability in NLP. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP), 1743-1751.
• Serrano, S., & Smith, N. A. (2020). Is attention interpretable? Proceedings of the 2020 Association for Computational Linguistics (ACL) Conference, 2291-2301.
• Zhang, B., & Zhang, J. (2020). Designing inherently interpretable neural networks. Advances in Neural Information Processing Systems (NeurIPS 2020), 1225-1235.s