Ethics in machine learning publications: Peer-review analysis using NLP methods

School of Science | Master's thesis
Electronic archive copy is available via Aalto Thesis Database.

Mcode

SCI3097

Language

en

Pages

44+1

Abstract

Peer review is a critical component of the scientific publishing process, since its results directly influence the decision to publish a research work. It is therefore crucial to maintain ethical standards in peer review. This work focuses on one important aspect: the appropriate use of citation recommendations in reviews. Using NLP methods, the study developed a classification model that identifies reviews containing unjustified citation recommendations. To train the model, reviews from ICLR 2021, a top-tier machine learning conference, were manually annotated. Among the classifiers tested, the Multinomial Naive Bayes classifier performed best, achieving an F1-score of 82%, a precision of 70%, and a recall of 100% for the target class. Data augmentation techniques and regularization strategies were also explored to compensate for the dataset's limited size. The classifier could serve as an assistive tool for conference organizers and reviewers. The results provide a starting point for developing a comprehensive solution to ensure adherence to quality and ethical guidelines in peer review.
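
As a quick consistency check on the reported metrics, the F1-score is the harmonic mean of precision and recall. The sketch below recomputes it from the two rounded values stated in the abstract. The function name `f1_from_precision_recall` is a hypothetical helper introduced here for illustration and is not taken from the thesis.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the F1-score, i.e. the harmonic mean of precision and recall."""
    if precision + recall == 0:
        # Convention: F1 is defined as 0 when both precision and recall are 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values reported in the abstract for the target class.
reported_precision = 0.70
reported_recall = 1.00

f1 = f1_from_precision_recall(reported_precision, reported_recall)
print(f"F1 = {f1:.2f}")
```

With a precision of 0.70 and a recall of 1.00, the harmonic mean is 1.4 / 1.7, which rounds to the 82% stated in the abstract.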

Supervisor

Jung, Alex

Thesis advisor

Tian, Yu
