Journal Article · Integrating Materials and Manufacturing Innovation · September 1, 2024
Advances in materials science require leveraging past findings and data from the vast published literature. While some materials data repositories are being built, they typically rely on newly created data in narrow domains because extracting detailed data ...
Journal Article · Vaccine · April 2024
We present VaxConcerns, a taxonomy for vaccine concerns and misinformation. VaxConcerns is an easy-to-teach taxonomy of concerns and misinformation commonly found among online anti-vaccination media and is evaluated to produce high-quality data annotations ...
Conference · 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings · January 1, 2024
Data selection techniques, which adaptively select datapoints inside the training loop, have demonstrated empirical benefits in reducing the number of gradient steps to train neural models. However, these techniques have so far largely been applied to clas ...
Conference · Findings of the Association for Computational Linguistics: NAACL 2024 - Findings · January 1, 2024
One way to personalize chatbot interactions is by establishing common ground with the intended reader. A domain where establishing mutual understanding could be particularly impactful is vaccine concerns and misinformation. Vaccine interventions are forms ...
Conference · Findings of the Association for Computational Linguistics: NAACL 2024 - Findings · January 1, 2024
Sentence embedding models are typically trained using contrastive learning (CL), either using human annotations directly or by repurposing other annotated datasets. In this work, we explore the recently introduced paradigm of generating CL data using gener ...
Chapter · January 1, 2024
Vaccine concerns are an ever-evolving target, and can shift quickly as seen during the COVID-19 pandemic. Identifying longitudinal trends in vaccine concerns and misinformation might inform the healthcare space by helping public health efforts strategicall ...
Conference · Proceedings of the Annual Meeting of the Association for Computational Linguistics · January 1, 2024
This paper investigates the use of large language models (LLMs) for extracting sample lists of polymer nanocomposites (PNCs) from full-length materials science research papers. The challenge lies in the complex nature of PNC samples, which have numerous at ...
Conference · Proceedings of the Annual Meeting of the Association for Computational Linguistics · January 1, 2024
With the proliferation of LLM-integrated applications such as GPT-s, millions are deployed, offering valuable services through proprietary instruction prompts. These systems, however, are prone to prompt extraction attacks through meticulously designed que ...
Journal Article · Plast Reconstr Surg Glob Open · September 2023
ChatGPT is a cutting-edge language model developed by OpenAI with the potential to impact all facets of plastic surgery from research to clinical practice. New applications for ChatGPT are emerging at a rapid pace in both the scientific literature and popu ...
Conference · Conference on Human Factors in Computing Systems - Proceedings · April 19, 2023
Human data labeling is an important and expensive task at the heart of supervised learning systems. Hierarchies help humans understand and organize concepts. We ask whether and how concept hierarchies can inform the design of annotation interfaces to impro ...
Conference · EACL 2023 - 17th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference · January 1, 2023
Identifying the difference between two versions of the same article is useful to update knowledge bases and to understand how articles evolve. Paired texts occur naturally in diverse situations: reporters write similar news stories and maintainers of autho ...
Conference · EACL 2023 - 17th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference · January 1, 2023
Many adversarial attacks in NLP perturb inputs to produce visually similar strings ('ergo' → 'εrgo') which are legible to humans but degrade model performance. Although preserving legibility is a necessary condition for text perturbation, little work has b ...
Conference · Proceedings of the Annual Meeting of the Association for Computational Linguistics · January 1, 2023
Since the introduction of the SemEval 2020 Task 11 (Martino et al., 2020a), several approaches have been proposed in the literature for classifying propaganda based on the rhetorical techniques used to influence readers. These methods, however, classify on ...
Conference · EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings · January 1, 2023
Trustworthy language models should abstain from answering questions when they do not know the answer. However, the answer to a question can be unknown for a variety of reasons. Prior research has focused on the case in which the question is clear and the a ...
Journal Article · Transactions of the Association for Computational Linguistics · March 18, 2022
Many facts come with an expiration date, from the name of the President to the basketball team LeBron James plays for. However, most language models (LMs) are trained on snapshots of data collected a ...
Open Access
Conference · NLP-Power 2022 - 1st Workshop on Efficient Benchmarking in NLP, Proceedings of the Workshop · January 1, 2022
With many real-world applications of Natural Language Processing (NLP) comprising long texts, there has been a rise in NLP benchmarks that measure the accuracy of models that can handle longer input sequences. However, these benchmarks do not consider t ...
Conference · Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 · January 1, 2022
An abundance of datasets and availability of reliable evaluation metrics have resulted in strong progress in factoid question answering (QA). This progress, however, does not easily transfer to the task of long-form QA, where the goal is to answer question ...
Conference · CEUR Workshop Proceedings · January 1, 2021
The PAN 2021 authorship verification (AV) challenge focuses on determining whether two texts are written by the same author, specifically when faced with new, unseen authors. In our approach, we construct a Siamese network initialized with pretrained BE ...