
Wojciech Kusa

Natural Language Processing ∎ Machine Learning
I am a Senior Research Engineer at Allegro ML Research, where I work at the intersection of large language models and machine translation.
I obtained my PhD from TU Wien under the supervision of Allan Hanbury and Petr Knoth. My thesis addressed automating the systematic literature review workflow by improving datasets, evaluation methods, and automation approaches. I was a Marie Skłodowska-Curie Research Fellow under the EU Horizon 2020 project DoSSIER, focusing on domain-specific systems for information extraction and retrieval. Before that, I worked at Samsung R&D and interned at Sony CSL and UNINOVA.
My current research interests include:
  • Multi-modal and contextualised machine translation,
  • Clinical and biomedical NLP,
  • AI for scientific research discovery.
I review for, among others, SIGIR, ECIR, ACL Rolling Review, NeurIPS D&B, and COLING. Additionally, I am on the organising committee for the International Collaboration for the Automation of Systematic Reviews (ICASR) and was a program committee member for the ALTARS 2024 workshop.
I am an avid sailor and wayfarer. In 2019-2020, I had an incredible experience living on a sailing yacht with three friends. We set sail from England and covered a whopping 5,000 nautical miles, all the way to the stunning shores of Greece. If you're interested, we've documented our adventures on our YouTube channel.

News

Jul 01, 2024
May 07, 2024
I have successfully defended my PhD thesis! 🎉🎉🎉
Jan 04, 2024
I'm visiting UCL to work on LLM evaluation. I will be joining the Web Intelligence group and collaborating with Aldo Lipani.
Sep 23, 2023
Delighted to announce that our paper, CSMeD: Bridging the Dataset Gap in Automated Citation Screening for Systematic Literature Reviews, has been accepted to the NeurIPS 2023 Track on Datasets and Benchmarks.
Sep 04, 2023
I'm helping to organise this year's ICASR workshop. I'll also be presenting our work on citation screening metrics and datasets.
Aug 07, 2023
CRUISE-Screening was accepted as a demo at CIKM 2023. See you in Birmingham!

Selected publications

SDP
An Analysis of Tasks and Datasets in Peer Reviewing
Moritz Staudinger, Wojciech Kusa, Florina Piroi, Allan Hanbury
Proceedings of the Fourth Workshop on Scholarly Document Processing
ACL Findings
AustroTox: A Dataset for Target-Based Austrian German Offensive Language Detection
Pia Pachinger, Janis Goldzycher, Anna Maria Planitzer, Wojciech Kusa, Allan Hanbury, Julia Neidhardt
Findings of the Association for Computational Linguistics: ACL 2024
ICTIR
Normalised Precision at Fixed Recall for Evaluating TAR
Wojciech Kusa, Georgios Peikos, Moritz Staudinger, Aldo Lipani, Allan Hanbury
The 14th International Conference on the Theory of Information Retrieval
NeurIPS
CSMeD: Bridging the Dataset Gap in Automated Citation Screening for Systematic Literature Reviews
Wojciech Kusa, Óscar E. Mendoza, Matthias Samwald, Petr Knoth, Allan Hanbury
37th Conference on Neural Information Processing Systems Track on Datasets and Benchmarks
CIKM
CRUISE-Screening: Living Literature Reviews Toolbox
Wojciech Kusa, Petr Knoth, Allan Hanbury
32nd ACM International Conference on Information and Knowledge Management
JBI
Effective Matching of Patients to Clinical Trials using Entity Extraction and Neural Re-ranking
Wojciech Kusa, Óscar E. Mendoza, Petr Knoth, Gabriella Pasi, Allan Hanbury
Journal of Biomedical Informatics, 104444
ICTIR
Outcome-based Evaluation of Systematic Review Automation
Wojciech Kusa, Guido Zuccon, Petr Knoth, Allan Hanbury
The 13th International Conference on the Theory of Information Retrieval
SIGIR
VoMBaT: A Tool for Visualising Evaluation Measure Behaviour in High-Recall Search Tasks
Wojciech Kusa, Aldo Lipani, Petr Knoth, Allan Hanbury
The 46th International ACM SIGIR Conference on Research and Development in Information Retrieval
ISWA
An Analysis of Work Saved over Sampling in the Evaluation of Automated Citation Screening in Systematic Literature Reviews
Wojciech Kusa, Aldo Lipani, Petr Knoth, Allan Hanbury
Intelligent Systems with Applications, pp. 200193, 2023, ISSN: 2667-3053
NeurIPS
BigBIO: A Framework for Data-Centric Biomedical Natural Language Processing
Jason Alan Fries, Leon Weber, Natasha Seelam, Gabriel Altay, Debajyoti Datta, Samuele Garda, Myungsun Kang, Ruisi Su, Wojciech Kusa, Samuel Cahyawijaya, Fabio Barth, Simon Ott, Matthias Samwald, Stephen Bach, Stella Biderman, Mario Sänger, Bo Wang, Alison Callahan, Daniel León Periñán, Théo Gigant, Patrick Haller, Jenny Chim, Jose David Posada, John Michael Giorgi, Karthik Rangasai Sivaraman, Marc Pàmies, Marianna Nezhurina, Robert Martin, Michael Cullan, Moritz Freidank, Nathan Dahlberg, Shubhanshu Mishra, Shamik Bose, Nicholas Michio Broad, Yanis Labrak, Shlok S. Deshmukh, Sid Kiblawi, Ayush Singh, Minh Chien Vu, Trishala Neeraj, Jonas Golde, Albert Villanova del Moral, Benjamin Beilharz
Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS) Track on Datasets and Benchmarks
SDP
Benchmark for Research Theme Classification of Scholarly Documents
Óscar E. Mendoza, Wojciech Kusa, Alaa El-Ebshihy, Ronin Wu, David Pride, Petr Knoth, Drahomira Herrmannova, Florina Piroi, Gabriella Pasi, Allan Hanbury
Third Workshop on Scholarly Document Processing at COLING 2022
SIGIR
ORCAS-I: Queries Annotated with Intent using Weak Supervision
Daria Alexander, Wojciech Kusa, Arjen P. de Vries
The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
ECIR
Automation of Citation Screening for Systematic Literature Reviews using Neural Networks: A Replicability Study
Wojciech Kusa, Petr Knoth, Allan Hanbury
44th European Conference on Information Retrieval