Samuel Cahyawijaya

The Hong Kong University of Science and Technology (HKUST)


A passionate machine learning researcher. I enjoy working on data technology, machine learning, and autonomous systems. Experienced in data analysis, forecasting, artificial intelligence, machine learning, and deep learning.

I want to gain a deeper understanding of how the human mind works, and I have a passion for building one myself.

You can take a look at my CV here.

If you are interested in collaborating or discussing ideas, please feel free to reach out to me via email.


Sep 19, 2023 So proud of our latest IndoNLP :indonesia:'s collaboration project! 🚀 Introducing NusaWrites, our groundbreaking project accepted at AACL 2023. 📚🌍 Dive deep into our analysis of corpora collection strategy and explore a comprehensive language modeling benchmark for underrepresented and extremely low-resource 🇮🇩 local languages.
May 4, 2023 EACL 2023 Outstanding Paper Award for NusaX: Multilingual Parallel Sentiment Dataset for 10 Indonesian Local Languages.
May 1, 2023 NusaCrowd is published in ACL Findings 2023. So proud of our IndoNLP :indonesia: community! From a joint collaboration to 100+ datasets.

Selected Publications

  1. Samuel Cahyawijaya, Holy Lovenia, Tiezheng Yu, Willy Chung, and Pascale Fung.
    arXiv preprint arXiv:2305.13627. 2023.
  2. Ruochen Zhang, Samuel Cahyawijaya, Jan Christian Blaise Cruz, and Alham Fikri Aji.
    arXiv preprint arXiv:2305.14235. 2023.
  3. Samuel Cahyawijaya, Holy Lovenia, Alham Fikri Aji, Genta Indra Winata, Bryan Wilie, Rahmad Mahendra, Christian Wibisono, Ade Romadhony, Karissa Vincentio, Fajri Koto, and others.
    In Findings of the Association for Computational Linguistics: ACL 2023, Toronto, Canada, 9-14 July 2023. 2023.
  4. Genta Indra Winata, Alham Fikri Aji, Samuel Cahyawijaya, Rahmad Mahendra, Fajri Koto, Ade Romadhony, Kemal Kurniawan, David Moeljadi, Radityo Eko Prasojo, Pascale Fung, Timothy Baldwin, Jey Han Lau, Rico Sennrich, and Sebastian Ruder.
    In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics. May 2023.
  5. Muhammad Farid Adilazuarda, Samuel Cahyawijaya, and Ayu Purwarianti.
    In Tiny Papers of Eleventh International Conference on Learning Representations (ICLR), Kigali, Rwanda, 5 May 2023. May 2023.
  6. Samuel Cahyawijaya, Bryan Wilie, Holy Lovenia, Huan Zhong, MingQian Zhong, Yuk-Yu Nancy Ip, and Pascale Fung.
    In Proceedings of the 13th International Workshop on Health Text Mining and Information Analysis (LOUHI). Dec 2022.
  7. Alham Fikri Aji, Genta Indra Winata, Fajri Koto, Samuel Cahyawijaya, Ade Romadhony, Rahmad Mahendra, Kemal Kurniawan, David Moeljadi, Radityo Eko Prasojo, Timothy Baldwin, Jey Han Lau, and Sebastian Ruder.
    In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2022, Dublin, Ireland, May 22-27, 2022. Dec 2022.
  8. Samuel Cahyawijaya, Tiezheng Yu, Zihan Liu, Xiaopu Zhou, Tze Wing Tiffany Mak, Nancy Y. Ip, and Pascale Fung.
    In Proceedings of the 21st Workshop on Biomedical Language Processing, BioNLP@ACL 2022, Dublin, Ireland, May 26, 2022. Dec 2022.
  9. Samuel Cahyawijaya, Genta Indra Winata, Bryan Wilie, Karissa Vincentio, Xiaohong Li, Adhiguna Kuncoro, Sebastian Ruder, Zhi Yuan Lim, Syafri Bahar, Masayu Leylia Khodra, Ayu Purwarianti, and Pascale Fung.
    In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021, Virtual Event / Punta Cana, Dominican Republic, 7-11 November, 2021. Dec 2021.
  10. Bryan Wilie, Karissa Vincentio, Genta Indra Winata, Samuel Cahyawijaya, Xiaohong Li, Zhi Yuan Lim, Sidik Soleman, Rahmad Mahendra, Pascale Fung, Syafri Bahar, and Ayu Purwarianti.
    In Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing, AACL/IJCNLP 2020, Suzhou, China, December 4-7, 2020. Dec 2020.