
Letters

Electronic patient records in general practice

BMJ 2001; 323 doi: https://doi.org/10.1136/bmj.323.7322.1184a (Published 17 November 2001) Cite this as: BMJ 2001;323:1184

Published methods of measuring the accuracy of electronic records do exist

  1. Philip J Bayliss Brown, honorary senior lecturer in medical informatics (Phil@hicomm.demon.co.uk)
  1. Department of Diabetes and Endocrinology, St Thomas's Hospital, King's College, London SE1 7EH

    EDITOR—Hassey et al have highlighted the importance of ensuring that electronic records are accurate.1 In their study they explored a method of measuring the validity and utility of electronic records in general practice, including whether the coding of 15 marker diagnoses was a true reflection of the actual prevalence.

    They are, however, wrong in their assertion that no published accounts of measuring the validity of electronic record contents exist. Hogan and Wagner performed a literature review and compared 20 articles that met certain quality criteria.2 They recommended (as did Hassey et al) that measures of completeness (sensitivity or detection rate) and correctness (positive predictive value) were valuable. These measures have also been shown to be valuable in measuring the quality of data retrieval.3

    Other measures derived from 2×2 contingency tables are less likely to be helpful because the large total number of records is dominated by true negatives. To compensate for this, Hassey et al propose two new descriptive statistics. Previous reports have instead used Cohen's κ,4 which measures the strength of agreement between the observed retrieval and the gold standard relative to the agreement that would be expected by chance. Cohen's κ has the advantage of being a well validated single index and has been shown to be useful for measuring data retrieval from electronic records, where values of >0.9 can be achieved.3
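
    For readers who wish to reproduce such calculations, the minimal sketch below shows how sensitivity (completeness), positive predictive value (correctness), and Cohen's κ can be derived from a 2×2 validation table. The counts are illustrative placeholders chosen only to mimic a register dominated by true negatives; they are not the data of Hassey et al.

```python
# Minimal sketch: agreement measures for a 2x2 validation table.
# Counts a, b, c, d are illustrative placeholders, not data from Hassey et al:
#   a = true positives  (condition coded and genuinely present)
#   b = false positives (coded but not present)
#   c = false negatives (present but not coded)
#   d = true negatives  (neither coded nor present)

def validation_measures(a, b, c, d):
    n = a + b + c + d
    sensitivity = a / (a + c)   # completeness (detection rate)
    ppv = a / (a + b)           # correctness (positive predictive value)

    # Cohen's kappa: observed agreement corrected for chance agreement
    p_observed = (a + d) / n
    p_expected = ((a + b) * (a + c) + (c + d) * (b + d)) / (n * n)
    kappa = (p_observed - p_expected) / (1 - p_expected)
    return sensitivity, ppv, kappa

# Example with made-up counts dominated by true negatives, as in practice registers
sens, ppv, kappa = validation_measures(a=230, b=5, c=10, d=13000)
print(f"sensitivity={sens:.3f}  PPV={ppv:.3f}  kappa={kappa:.3f}")
```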

    When Cohen's κ is applied to the data of Hassey et al, it highlights similar priority areas of concern, where the value is <0.9 (obesity 0.04, hypothyroidism 0.89, iron deficiency anaemia 0.86, asthma 0.86). Prescriptions generated were also compared with those dispensed by a local pharmacy. Of the computer generated prescriptions, 99.7% were reported to be valid, but of the 10 handwritten prescriptions only 80% were accurately recorded. Perhaps a more suitable design would have been to check, in a sample, how many of the prescriptions reflected the correct dose and frequency. Hassey et al claim that the principal innovation of the study was the use of Read codes as the test for the true presence of a diagnosis, despite Gray et al's earlier account of identifying patients with ischaemic heart disease by a similar technique and reporting exactly the same sensitivity (96%).5 The approach used by Hassey et al in triangulating disease codes with treatments and other findings has merit, but due consideration should have been given to the existing literature.

    References

    1.
    2.
    3.
    4.
    5.

    Methods of evaluation of electronic patient records entail dangers

    1. Robert G Newcombe, senior lecturer in medical statistics (Newcombe@cf.ac.uk),
    2. Douglas G Altman, professor of statistics in medicine (d.altman@icrf.icnet.uk),
    3. Trevor N Bryant, deputy director (T.N.Bryant@soton.ac.uk)
    1. University of Wales College of Medicine, Cardiff CF14 4XN
    2. ICRF Medical Statistics Group, Centre for Statistics in Medicine, Institute of Health Sciences, Oxford OX3 7LF
    3. Information and Computing Division, School of Medicine, University of Southampton, MailPoint 820, Southampton General Hospital, Southampton SO16 6YD

      EDITOR—Hassey et al indicate the need to validate electronic patient records in primary care.1 Although such findings are appropriately expressed as percentages, as in this article, their EPR-Val toolkit yields incorrect confidence intervals. For the diabetes data the calculated 95% confidence intervals are incorrect on two counts. Incorrect use of the table total as the denominator in calculating standard errors results in intervals that are too narrow for sensitivity and positive predictive value. Furthermore, the traditional method is inferior, especially for proportions near 100%. The table shows their results recalculated using the traditional method and the preferred Wilson method.2 3

      Table: Results of Hassey et al calculated by three methods [table not reproduced here]

      Even with large samples the traditional method can give impossible values exceeding 100%, as for the positive predictive value here. The preferable Wilson method is available in confidence interval analysis software4 and for Microsoft Excel (www.uwcm.ac.uk/epidemiology_statistics/research/statistics/newcombe.htm). We are disturbed by the dissemination of the inadequately tested EPR-Val software, which should be withdrawn immediately from bmj.com. Potential users should check new software using data with known answers, as errors are quite common.5
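
      As an illustration of the difference the letter describes, the sketch below contrasts the traditional (Wald) interval with the Wilson score interval for a proportion near 100%. The counts are hypothetical, chosen only to show that the traditional interval can exceed 100% while the Wilson interval cannot; note also that the denominator must be the number of cases on which the proportion is based (for example, true positives plus false negatives for sensitivity), not the table total.

```python
import math

Z = 1.96  # multiplier for a 95% confidence interval

def wald_interval(r, n):
    """Traditional (Wald) interval: p +/- z*sqrt(p(1-p)/n).
    Can fall below 0% or exceed 100% when p is near a boundary."""
    p = r / n
    se = math.sqrt(p * (1 - p) / n)
    return p - Z * se, p + Z * se

def wilson_interval(r, n):
    """Wilson score interval (the method preferred in the letter); always stays within [0, 1]."""
    p = r / n
    denom = 1 + Z * Z / n
    centre = p + Z * Z / (2 * n)
    half = Z * math.sqrt(p * (1 - p) / n + Z * Z / (4 * n * n))
    return (centre - half) / denom, (centre + half) / denom

# Illustrative only: 238 of 240 coded cases confirmed (not the study's actual counts).
r, n = 238, 240
print("Wald:  ", [f"{100 * x:.1f}%" for x in wald_interval(r, n)])    # upper limit exceeds 100%
print("Wilson:", [f"{100 * x:.1f}%" for x in wilson_interval(r, n)])  # upper limit stays below 100%
```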

      Furthermore, some of the measures displayed are redundant, whereas others, especially accuracy, are potentially misleading. The quoted accuracy of 99.9% conceals the fact that about one in 60 people diagnosed with diabetes is not coded as such on the database. There is a danger in using terms such as sensitivity, specificity, and predictive value, familiar from the clinical or screening context, in the validation of data. In the clinical or screening setting the gold standard is implicitly whether the individual really has the disease; in the context of data validation these quantities measure only how well two parts of the record agree. Some of the 13 302 patients whose records do not indicate “diabetes” would have diagnosable disease if it were sought using systematic diagnostic criteria. We are concerned that clinicians and managers may believe that such figures indicate that the practice has successfully identified all prevalent diabetic patients and is managing them proactively.
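
      A small worked example may make the arithmetic behind this point concrete. The counts below are hypothetical, patterned only on the letter's description (roughly one in 60 diagnosed patients uncoded, and a register dominated by records without a diabetes code); they are not the study's actual figures.

```python
# Hypothetical register patterned on the letter's description, not the actual study counts:
# roughly 1 in 60 patients with diagnosed diabetes lacks a diabetes code, and the
# overwhelming majority of records are true negatives.
true_pos, false_neg = 236, 4        # 4/240 = 1 in 60 diagnosed patients uncoded
false_pos, true_neg = 2, 13302      # records without a diabetes code

n = true_pos + false_neg + false_pos + true_neg
accuracy = (true_pos + true_neg) / n                 # dominated by true negatives -> ~99.96%
sensitivity = true_pos / (true_pos + false_neg)      # the 1-in-60 gap shows up here -> ~98.3%
print(f"accuracy={accuracy:.4f}  sensitivity={sensitivity:.4f}")
```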

      The study showed that many diagnosed cases of asthma, iron deficiency anaemia, hypothyroidism, and ischaemic heart disease are not adequately identifiable within present standards of record keeping. It is helpful to show such deficiencies, complete the audit cycle, and correct them. But the converse is false: high sensitivity and specificity do not imply that all is well. High “accuracy” certainly does not. Even with improved consistency of record keeping for asthma, etc, practices could still have many patients with unidentified disease, just as for diabetes.

      References

      1.
      2.
      3.
      4.
      5.

      Authors' reply

      1. Alan Hassey, general practitioner,
      2. David Gerrett, senior research fellow,
      3. Ali Wilson, senior associate lecturer
      1. Fisher Medical Centre, Millfields, Skipton BD23 1EU
      2. School of Health and Community Studies, University of Derby, Derby DE22 3HL
      3. Research School of Medicine, University of Leeds, Leeds LS2 9LN

        EDITOR—We agree with Bayliss Brown that sensitivity and positive predictive value are useful measures of the validity of electronic patient record systems. We gave references in our paper to other studies that have used these methods, and we believe that we have given due consideration to the relevant literature.

        We did not use Cohen's κ because we were concerned about its reliance on symmetric marginal distributions and about difficulties in interpreting the statistic.1 This was recognised by Cohen in the qualifying statistic κmax, calculated from the marginal totals of each column and row and the total number of observations. κmax is the maximum value that κ can achieve given those marginals. Thus 1−κmax is the proportion of agreement, beyond chance, that cannot be achieved as a consequence of differing marginals. There are no accepted standards for balancing and interpreting κ, κmax, and 1−κmax. Further difficulties have been noted in the literature, which increased our reluctance to use the statistic.2 We thought that the use of κ would have led us into difficulties of interpretation when our goal was to provide a simple, easily usable tool.
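
        For completeness, the minimal sketch below follows Cohen's standard definitions of κ and κmax for a 2×2 table: expected agreement comes from the products of the marginal proportions, and the best attainable agreement is limited by the smaller of each pair of marginals. The counts are illustrative only, not data from the study.

```python
# Minimal sketch of Cohen's kappa and kappa_max for a 2x2 table, following
# Cohen's standard definitions; the counts are illustrative, not study data.

def kappa_and_kappa_max(a, b, c, d):
    n = a + b + c + d
    # marginal proportions of the two categories, for rows (coding) and columns (gold standard)
    row = [(a + b) / n, (c + d) / n]
    col = [(a + c) / n, (b + d) / n]

    p_obs = (a + d) / n                                   # observed agreement
    p_exp = sum(r * g for r, g in zip(row, col))          # chance agreement from the marginals
    p_max = sum(min(r, g) for r, g in zip(row, col))      # best agreement the marginals allow

    kappa = (p_obs - p_exp) / (1 - p_exp)
    kappa_max = (p_max - p_exp) / (1 - p_exp)
    return kappa, kappa_max

k, k_max = kappa_and_kappa_max(a=230, b=5, c=10, d=13000)
print(f"kappa={k:.3f}  kappa_max={k_max:.3f}  unreachable share (1 - kappa_max)={1 - k_max:.3f}")
```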

        We are grateful to Newcombe et al for showing an error in our calculation of confidence intervals in the EPR-Val toolkit. We have already corrected the error and an updated version of the toolkit (EPR-Val2) is now available on bmj.com (www.bmj.com/cgi/content/full/322/7299/1401/DC1).

        We believe that there is no single best measure of validity for electronic patient records. Terms such as “accuracy” may be misleading, but the validity of electronic patient records has previously been reported using sensitivity and positive predictive value as measures of completeness and accuracy, respectively.3 We recommend that future studies measuring the validity of electronic patient records say exactly what they mean by validity and state which measures they have calculated from their data. We have provided the EPR-Val2 toolkit to facilitate this process.

        We do not claim that measures of the validity of electronic patient records reflect the true prevalence of any diagnostic condition in the community, nor the effectiveness of our clinical management for these conditions. Our survey was designed to measure only the validity of the data we hold in the clinical records. The derived statistics TPFN ratio and DBFind10 000 are included to help healthcare workers understand how many true cases of the test condition remain undiagnosed in the database and help quantify the benefits of validating a clinical database for those conditions. Time will tell whether future researchers will find these measures useful.

        References

        1.
        2.
        3.
