Learning to Ask Like a Physician

E. Lehman, V. Lialin, et al., Clinical NLP 2022

Paper link

Full list of authors: Eric Lehman, Vladislav Lialin, Katelyn Y. Legaspi, Anne Janelle R. Sy, Patricia Therese S. Pile, Nicole Rose I. Alberto, Richard Raymund R. Ragasa, Corinna Victoria M. Puyat, Isabelle Rose I. Alberto, Pia Gabrielle I. Alfonso, Marianne Taliño, Dana Moukheiber, Byron C. Wallace, Anna Rumshisky, Jennifer J. Liang, Preethi Raghavan, Leo Anthony Celi, Peter Szolovits

Existing question answering (QA) datasets derived from electronic health records (EHR) are artificially generated and consequently fail to capture realistic physician information needs. We present Discharge Summary Clinical Questions (DiSCQ), a newly curated question dataset composed of 2,000+ questions paired with the snippets of text (triggers) that prompted each question. The questions are generated by medical experts from 100+ MIMIC-III discharge summaries. We analyze this dataset to characterize the types of information sought by medical experts. We also train baseline models for trigger detection and question generation (QG), paired with unsupervised answer retrieval over EHRs. Our baseline model is able to generate high-quality questions in over 62% of cases when prompted with human-selected triggers. We release this dataset (and all code to reproduce baseline model results) to facilitate further research into realistic clinical QA and QG: this https URL.

Schematic of the pipeline process used to generate and answer questions


    title = "Learning to Ask Like a Physician",
    author = "Lehman, Eric  and
      Lialin, Vladislav  and
      Legaspi, Katelyn Edelwina  and
      Sy, Anne Janelle  and
      Pile, Patricia Therese  and
      Alberto, Nicole Rose  and
      Ragasa, Richard Raymund  and
      Puyat, Corinna Victoria  and
      Tali{\~n}o, Marianne Katharina  and
      Alberto, Isabelle Rose  and
      Alfonso, Pia Gabrielle  and
      Moukheiber, Dana  and
      Wallace, Byron  and
      Rumshisky, Anna  and
      Liang, Jennifer  and
      Raghavan, Preethi  and
      Celi, Leo Anthony  and
      Szolovits, Peter",
    booktitle = "Proceedings of the 4th Clinical Natural Language Processing Workshop",
    month = jul,
    year = "2022",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.clinicalnlp-1.8",
    doi = "10.18653/v1/2022.clinicalnlp-1.8",
    pages = "74--86",