About
👋 Hi there, I’m Bashar Talafha
I am a PhD candidate in the School of Information at the University of British Columbia (UBC), under the supervision of Dr. Muhammad Abdul-Mageed. My research spans Artificial Intelligence, Machine Learning, Deep Learning, Natural Language Processing, Speech Technologies, Large Language Models (LLMs), Multimodal LLMs, Sentiment and Social Media Analysis, and Multilingual/Multidialectal Information Processing. I hold MSc and BSc degrees in Computer Science from the Jordan University of Science and Technology. Prior to beginning my PhD, I worked as a Senior Data Scientist and NLU Team Leader in the Artificial Intelligence department at Mawdoo3 Ltd. in Amman, Jordan. Earlier, I worked as a Researcher and Machine Learning Engineer at Samsung Electronics in Amman.
Publications:
Talafha, B., Kadaoui, K., Magdy, S. M., Habiboullah, M., Chafei, C. M., El-Shangiti, A. O., Zayed, H., Alhamouri, R., Assi, R., Alraeesi, A., & Mohamed, H. (2024). Casablanca: Data and Models for Multidialectal Arabic Speech Recognition. arXiv preprint arXiv:2410.04527.
Jarrar, M., Hamad, N., Khalilia, M., Talafha, B., Elmadany, A., & Abdul-Mageed, M. (2024). WojoodNER 2024: The Second Arabic Named Entity Recognition Shared Task. arXiv preprint arXiv:2407.09936.
Talafha, B., Waheed, A., & Abdul-Mageed, M. (2023). N-Shot Benchmarking of Whisper on Diverse Arabic Speech Recognition. arXiv preprint arXiv:2306.02902.
Waheed, A., Talafha, B., Sullivan, P., Elmadany, A., & Abdul-Mageed, M. (2023). VoxArabica: A Robust Dialect-Aware Arabic Speech Recognition System. arXiv preprint arXiv:2310.11069.
Jarrar, M., Abdul-Mageed, M., Khalilia, M., Talafha, B., Elmadany, A., Hamad, N., & Omar, A. (2023). WojoodNER 2023: The First Arabic Named Entity Recognition Shared Task. arXiv preprint arXiv:2310.16153.
Ebrahimi, A., Mager, M., Wiemerslage, A., Denisov, P., Oncevay, A., Liu, D., Talafha, B., … & Kann, K. (2022, August). Findings of the Second AmericasNLP Competition on Speech-to-Text Translation. In NeurIPS 2022 Competition Track (pp. 217-232). PMLR.
Za’ter, M. E., & Talafha, B. (2022). Benchmarking and Improving Arabic Automatic Image Captioning Through The Use Of Multi-Task Learning Paradigm. arXiv preprint arXiv:2202.05474.
Talafha, B., Za’ter, M. E., Suleiman, S., Al-Ayyoub, M., & Al-Kabi, M. N. (2021, November). Sarcasm detection and quantification in Arabic tweets. In 2021 IEEE 33rd International Conference on Tools with Artificial Intelligence (ICTAI) (pp. 1121-1125). IEEE.
Talafha, B., Abuammar, A., & Al-Ayyoub, M. (2021). ATAR: Attention-based LSTM for Arabizi transliteration. International Journal of Electrical and Computer Engineering (IJECE), 11(3), 2327-2334.
Seelawi, H., Tuffaha, I., Gzawi, M., Farhan, W., Talafha, B., Badawi, R., … & Al-Natsheh, H. (2021, April). ALUE: Arabic language understanding evaluation. In Proceedings of the Sixth Arabic Natural Language Processing Workshop (pp. 173-184).
Talafha, B., Ali, M., Za’ter, M. E., Seelawi, H., Tuffaha, I., Samir, M., … & Al-Natsheh, H. T. (2020). Multi-dialect Arabic BERT for country-level dialect identification. arXiv preprint arXiv:2007.05612.
Farhan, W., Talafha, B., Abuammar, A., Jaikat, R., Al-Ayyoub, M., Tarakji, A. B., & Toma, A. (2020). Unsupervised dialectal neural machine translation. Information Processing & Management, 57(3), 102181.
Talafha, B., Al-Ayyoub, M., Abuammar, A., & Jararweh, Y. (2019, November). Outperforming State-of-the-Art Systems for Aspect-Based Sentiment Analysis. In 2019 IEEE/ACS 16th International Conference on Computer Systems and Applications (AICCSA) (pp. 1-5). IEEE.
Talafha, B., Mohammad, A. S., Al-Ayyoub, M., Jararweh, Y., & Juola, P. (2019, November). Using a Hierarchical Softmax Based on the Huffman Coding Tree for Authenticating Arabic Tweets. In 2019 IEEE/ACS 16th International Conference on Computer Systems and Applications (AICCSA) (pp. 1-5). IEEE.
Talafha, B., Farhan, W., Altakrouri, A., & Al-Natsheh, H. (2019, August). Mawdoo3 AI at MADAR shared task: Arabic tweet dialect identification. In Proceedings of the Fourth Arabic Natural Language Processing Workshop (pp. 239-243).
Talafha, B., Fadel, A., Al-Ayyoub, M., Jararweh, Y., Mohammad, A. S., & Juola, P. (2019, August). Team JUST at the MADAR shared task on Arabic fine-grained dialect identification. In Proceedings of the Fourth Arabic Natural Language Processing Workshop (pp. 285-289).
Ragab, A., Seelawi, H., Samir, M., Mattar, A., Al-Bataineh, H., Zaghloul, M., Mustafa, A., Talafha, B., Freihat, A. A., & Al-Natsheh, H. (2019, August). Mawdoo3 AI at MADAR shared task: Arabic fine-grained dialect identification with ensemble learning. In Proceedings of the Fourth Arabic Natural Language Processing Workshop (pp. 244-248).
Al-Smadi, M., Talafha, B., Al-Ayyoub, M., & Jararweh, Y. (2019). Using long short-term memory deep neural networks for aspect-based sentiment analysis of Arabic reviews. International Journal of Machine Learning and Cybernetics, 10(8), 2163-2175.
Al-Sadi, A., Talafha, B., Al-Ayyoub, M., Jararweh, Y., & Costen, F. (2019). JUST at ImageCLEF 2019 Visual Question Answering in the Medical Domain. In CLEF (Working Notes).
Talafha, B., & Al-Ayyoub, M. (2019, January). IoH-RCNN: Pursue the Ingredients of Happiness using Recurrent Convolutional Neural Networks. In AffCon@AAAI.
Talafha, B., & Al-Ayyoub, M. (2018). JUST at VQA-Med: A VGG-Seq2Seq Model. In CLEF (Working Notes).
Mohammad, A. S., Qwasmeh, O., Talafha, B., Al-Ayyoub, M., Jararweh, Y., & Benkhelifa, E. (2016, December). An enhanced framework for aspect-based sentiment analysis of Hotels’ reviews: Arabic reviews case study. In 2016 11th International conference for internet technology and secured transactions (ICITST) (pp. 98-103). IEEE.
Albadarneh, J., Talafha, B., Al-Ayyoub, M., Zaqaibeh, B., Al-Smadi, M., Jararweh, Y., & Benkhelifa, E. (2015, December). Using big data analytics for authorship authentication of Arabic tweets. In 2015 IEEE/ACM 8th International Conference on Utility and Cloud Computing (UCC) (pp. 448-452). IEEE.
Al-Smadi, M., Talafha, B., Qawasmeh, O., Alandoli, M. N., Hussien, W. A., & Guetl, C. (2015, October). A hybrid approach for Arabic named entity disambiguation. In Proceedings of the 15th International Conference on Knowledge Technologies and Data-driven Business (pp. 1-4).
Al-Smadi, M., Qawasmeh, O., Talafha, B., & Quwaider, M. (2015, August). Human annotated Arabic dataset of book reviews for aspect based sentiment analysis. In 2015 3rd International conference on future internet of things and cloud (pp. 726-730). IEEE.
Patents:
- Multilingual translation device and method
- Method and apparatus for processing language based on trained network model
Honors & Awards:
- Four-Year Fellowship (4YF), awarded four times, UBC / Ph.D. Award
- International Tuition Award, awarded four times, UBC / Ph.D. Award
- President’s Academic Excellence Initiative PhD Award, awarded four times, UBC / Ph.D. Award
- R Howard Webster Foundation Fellowship, UBC / Ph.D. Award
- 1st place in the NADI shared task 1, COLING 2020
- 2nd place in the MADAR shared task 2, ACL 2019
- 3rd place in the MADAR shared task 1, ACL 2019
- 4th place in the CL-AFF shared task, AAAI 2019
- Distinguished Employee Award, Samsung / Samsung R&D Institute (SRJO)
- Best Project Award, Samsung / Samsung R&D Institute (SRJO)
- Best Graduation Project Award, JUST