

    For manual annotations, we performed an agreement analysis on a subset of the training sentence set. Four clinical researchers [MK, KK, JP, THB] annotated a total of 100 sentences. The Kappa score was 0.89 (see supplementary material 2). Based on the strong agreement between annotators, the remaining 208 sentences in the training set were annotated by a single clinician [KK]. The 100-note test set was annotated by the research nurse [MF]. The objective of annotation was to identify whether a note had a positive or negative mention of a bone scan.
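The text does not state which Kappa variant was computed. As an illustration only, the sketch below implements Fleiss' kappa, a common choice when more than two raters label the same items; the function name and the toy labels are assumptions, not the study's data or code.

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of items, each a list of one label per rater.

    Assumes every item was labelled by the same number of raters and that
    at least two categories occur overall (otherwise the denominator is 0).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    categories = sorted({label for item in ratings for label in item})

    # counts[i][j] = number of raters who assigned item i to category j
    counts = [[item.count(c) for c in categories] for item in ratings]

    # Overall proportion of assignments falling in each category
    p_j = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(len(categories))
    ]

    # Observed agreement per item, then averaged over items
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_items

    # Agreement expected by chance
    p_e = sum(p * p for p in p_j)

    return (p_bar - p_e) / (1 - p_e)
```

With four raters, each inner list would hold four labels such as "positive" or "negative" for one sentence.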
    The NLP pipeline first pre-processed each clinical note, which entailed splitting the note into individual sentences, removing capitalization, numbers and punctuation, and excluding words shorter than three letters, except the word “no” and the abbreviation “NM” (Nuclear Medicine). Through this process, a note corresponded to a list of sentences and a sentence corresponded to a list of words. “Bone scan” was the only target key term.
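The pre-processing steps above can be sketched as follows. This is a minimal illustration, not the study's code: the sentence splitter is a naive regular expression, and the function name `preprocess_note` is hypothetical.

```python
import re

# Short words kept despite the length filter ("NM" is lowercased to "nm")
KEEP_SHORT = {"no", "nm"}


def preprocess_note(note):
    """Turn a clinical note into a list of sentences, each a list of words."""
    # Naive sentence split: break on whitespace that follows ., ! or ?
    sentences = re.split(r"(?<=[.!?])\s+", note.strip())

    result = []
    for sentence in sentences:
        text = sentence.lower()                 # remove capitalization
        text = re.sub(r"[^a-z\s]", " ", text)   # remove numbers and punctuation
        words = [
            w for w in text.split()
            if len(w) >= 3 or w in KEEP_SHORT   # drop words under three letters
        ]
        if words:
            result.append(words)
    return result
```

For example, `preprocess_note("No bone scan was done. NM Bone Scan on 3/4 showed 2 lesions!")` yields two word lists, the first beginning with "no" and the second with "nm".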
    The rule-based method applied a set of syntax rules to predict whether a sentence contained information related to a bone scan. The model used the ConText algorithm developed by Chapman et al. [23]. ConText is an algorithm derived from the NegEx algorithm, which identifies negated findings in free text. Using regular expressions, it determines whether information in clinical reports is negated, hypothetical, historical, or experienced by someone other than the patient. For this study, if bone scan information was negated, hypothetical or historical, then we concluded the patient did not receive a bone scan for this note. In addition, if no modifier could be applied to the sentence then, by default, we classified the sentence as negated. We used 90% of the training dataset to build the rules manually and the remaining 10% to validate the model. This iterative process of rule building was used to develop the model.
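The decision rule described above can be illustrated with a much simplified sketch. It is not the ConText implementation: real ConText also tracks the scope and direction of each trigger, which is ignored here. The trigger phrases and the "affirmed" category are hypothetical placeholders, since the study's actual rules are not given in the text.

```python
import re

# Hypothetical trigger patterns per modifier category (assumptions)
TRIGGERS = {
    "negated": [r"\bno\b", r"\bnot\b", r"\bwithout\b"],
    "hypothetical": [r"\bif\b", r"\bconsider\b", r"\bwill\b"],
    "historical": [r"\bprevious\b", r"\bhistory\b"],
    "affirmed": [r"\bshowed\b", r"\bperformed\b", r"\bobtained\b"],
}
TARGET = re.compile(r"\bbone scan\b")

# Modifiers that lead to a negative label for the sentence
BLOCKING = {"negated", "hypothetical", "historical"}


def find_modifiers(sentence):
    """Return the set of modifier categories whose triggers occur in the sentence."""
    return {
        label
        for label, patterns in TRIGGERS.items()
        if any(re.search(p, sentence) for p in patterns)
    }


def classify_sentence(sentence):
    """Label a sentence 'positive', 'negative', or 'no_mention' of a bone scan."""
    text = sentence.lower()
    if not TARGET.search(text):
        return "no_mention"
    modifiers = find_modifiers(text)
    # No modifier at all defaults to negative, as described in the text
    if not modifiers or modifiers & BLOCKING:
        return "negative"
    return "positive"
```

Because scope is ignored, a trigger anywhere in the sentence affects the label, so this sketch is stricter than a scoped rule set would be.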
    2.9. Convolutional neural network method
    After notes were pre-processed, we used the word2vec method implemented in Gensim [24] to form word embeddings [25]. Word2vec creates, for each word in the corpus, a vector representing the semantic context of that word. Words that share common contexts in the corpus are assumed to have similar vectors. The word2vec method is a self-supervised machine learning method that trains a 2-layer neural network to form word embeddings. Word2vec has two different architectures (skip-gram and Continuous Bag of Words (CBOW)) and two different algorithms (hierarchical softmax and negative sampling). We chose to generate vectors with a dimension of 300. We tried multiple configurations (described in supplementary material 3) and found that for our dataset the best configuration was a combination of the CBOW architecture and the hierarchical softmax algorithm. We also tried different window widths (i.e. the maximum distance between the current and predicted word within a sentence) and we chose a window width of 5.
    From the word embeddings, we created a two-dimensional matrix for each sentence where each row corresponded to a word in the sentence and each column to a dimension of the vector. Using this matrix, we applied the convolutional neural network (CNN) method to classify sentences [26]. CNN methods require a uniform size matrix as input. We therefore calculated the maximum sentence size in the notes, which was 361. If the size of a sentence was smaller than 361, then we padded the sentence with zeros. Finally, each sentence corresponded to a matrix of 300 × 361.
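The padding step can be sketched in plain Python as below. The function name, the dictionary-based embedding lookup, and the treatment of out-of-vocabulary words as zero vectors are assumptions made for illustration. The sketch follows the row-per-word layout described above, so the nested list has 361 rows of 300 values each.

```python
EMBED_DIM = 300   # vector dimension chosen in the study
MAX_LEN = 361     # longest sentence found in the notes


def sentence_to_matrix(words, embeddings, max_len=MAX_LEN, dim=EMBED_DIM):
    """Build a fixed-size matrix (list of rows) for one sentence.

    words:      list of tokens for the sentence
    embeddings: mapping from token to a list of `dim` floats (hypothetical)
    """
    zero = [0.0] * dim
    # One row per word; unknown words fall back to a zero vector (assumption)
    rows = [list(embeddings.get(w, zero)) for w in words[:max_len]]
    # Pad with zero rows until the matrix has max_len rows
    rows.extend([0.0] * dim for _ in range(max_len - len(rows)))
    return rows
```

Assuming Gensim 4.x, the embedding configuration described above would roughly correspond to `Word2Vec(sentences, vector_size=300, window=5, sg=0, hs=1, negative=0)`; that call is not part of the sketch and the remaining parameters used in the study are not stated here.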
    The model architecture was implemented with the library TensorFlow [27] and was trained on the training data set. We tuned the model using the strategy described by Zhang and Wallace [28]. We used