Abstract
To investigate whether forensic evaluations can be performed reliably using telemedicine, we compared the results on a standard competency assessment instrument using telemedicine (TM) and live interviews (LI). Two board-certified forensic psychiatrists used the Georgia Court Competency Test (GCCT) to evaluate 21 forensic psychiatric inpatients. Half of the patients were randomly assigned to a telemedicine interview and half were assigned to a live interview. Total scores on the GCCT were similar for both raters, indicating high levels of agreement between telemedicine and live interviews. Patient and provider satisfaction were measured and indicated that, although patients did not express a preference for a particular interview modality, providers reported greater satisfaction with live interviews. Findings suggest that one aspect of competency to stand trial can be reliably evaluated using telemedicine and that patients perceive telemedicine as an acceptable alternative to a standard live interview. The limited sample size precludes definite conclusions and further studies involving a larger forensic study population are warranted.
Forensic mental health professionals are frequently called on to provide an informed opinion regarding a defendant's competency to stand trial. In fact, trial competency issues are raised substantially more frequently than other types of legal issues, such as sanity.1 According to national estimates, questions regarding a defendant's competency to stand trial are introduced in five to eight percent of all cases.2 Approximately 16 percent of defendants referred for competency evaluation are ultimately found to be incompetent,3 although the prevalence of incompetence is higher among those with specific psychiatric comorbidity.4 Studies have demonstrated high levels of agreement between clinicians who make determinations of a defendant's competency to stand trial, with reliability estimates generally reported to be above 80 percent.1 Interrater agreement is generally higher when standardized competency assessments are employed5 and research using the Georgia Court Competency Test (GCCT) indicates high levels of interrater agreement.6
Because of their remote locations and reliance on specialty providers, correctional settings have developed and implemented extensive telecommunications networks to provide health care services to defendants, and several extensive descriptions of these programs have been published.7,8 Telemedicine, provided through videoconferencing and other modalities, is the delivery of medical services and the exchange of medical information when distance separates participants.9 The National Library of Medicine refers to telemedicine as it applies to psychiatry as telepsychiatry and defines it as the use of electronic communication and information technologies to provide or support clinical psychiatric care at a distance.10
Recent investigations have strongly supported the efficiency, cost-effectiveness, and diagnostic reliability of telepsychiatry.11–15 Investigations of clinical psychiatric assessments conducted using telepsychiatry indicate high levels of reliability for standardized psychiatric assessments. Using standardized rating scales for obsessive-compulsive, depressive, and anxiety symptoms in a sample of psychiatric outpatients with obsessive-compulsive disorder, Baer and colleagues11 reported high levels of interrater reliability for telemedicine (TM) and live interviews (LI).
Similarly, Baigent and associates12 used the Brief Psychiatric Rating Scale (BPRS) to evaluate patients randomly assigned to a live interview with patient, observer, and interviewer present; a telemedicine interview with patient and observer in person and the interviewer at remote location; or a telemedicine interview with the patient alone and the observer and interviewer at remote location. Results were generally positive, indicating equivalent reliability in the telepsychiatry and live settings, although several behavioral ratings and “degree of concern” for patients were slightly less reliable in the telepsychiatry setting. Research examining the diagnostic reliability of the Structured Clinical Interview for DSM-III-R (SCID) yielded high concordance rates between telemedicine and face-to-face interviews.14,15
Published reports examining the use of telepsychiatry in forensic psychiatric populations have until recently been limited to program descriptions,16–18 case reports,19 and comparisons of patient satisfaction.20 Yellowlees19 described the successful use of telemedicine to perform psychiatric assessments for magisterial hearings in compliance with Australia's Mental Health Act. Research evaluating the reliability of forensic telepsychiatry evaluations has only recently been conducted, and we are aware of only one other study evaluating the interrater reliability of forensic evaluations. Lexcen and colleagues21 examined the interrater reliability of the BPRS and the MacArthur Competence Assessment Tool–Criminal Adjudication in three conditions: (1) in-person administration, observation via telemedicine; (2) telemedicine administration, in-person observation; and (3) in-person administration and observation. Results indicated high levels of interrater reliability between conditions, providing support for the comparability of telemedicine interviews and those performed in person.
A review of the relevant research literature points to the widespread use of telemedicine to conduct clinical psychiatric evaluations. Research indicates that clinical assessments conducted via telepsychiatry are generally as reliable as those conducted in person for the majority of adult psychiatric patients.9,11–15 Although telepsychiatry is frequently used in correctional and forensic mental health settings, few studies have evaluated the reliability of forensic telepsychiatry evaluations. Therefore, the primary purpose of this study is to expand the telepsychiatry literature to include an empirical evaluation of telemedicine in a forensic mental health setting. To that end, we investigated the reliability of a standardized competency assessment tool using a telemedicine format in a sample of pretrial forensic inpatients.
Materials and Methods
Participants
The study was conducted in accordance with approvals by the institutional review boards at Tulane University and the Louisiana Department of Health and Human Services. Since the protocol involved research in a highly vulnerable population (i.e., psychiatric patients and prisoners), study procedures were conducted under the oversight of an independent prison representative. The procedures were introduced and explained to patients by their treating psychiatrist. Patients were informed that the competency assessment was for research purposes only and that results of the interview would not be shared with hospital staff or become part of the patients’ clinical records. Participants provided written informed consent before enrollment. Ten patients refused to participate in the study, citing either concerns about privacy (e.g., a study physician was also a member of the original sanity commission) or expressing a lack of interest. Because of site-specific IRB regulations, participants were not allowed to receive any incentives for participation. Twenty-one inpatients from the forensic division of the Eastern Louisiana Mental Health System (ELMHS), a 235-bed maximum security forensic hospital, participated in the study. Twelve of the 21 participants were pretrial (i.e., deemed incompetent to proceed) and 9 were in the postadjudication phase (i.e., adjudicated NGRI). The average age for participants was 42 years (range, 23–62 years). Six participants had nonviolent index offenses (arson, burglary/theft, drug possession), whereas 15 had violent index offenses (battery, second degree murder, rape). The GCCT portions of the interviews were approximately 30 minutes in duration.
All forensic patients at ELMHS were eligible to participate as long as they met the following inclusion criteria as determined by their treating psychiatrist: (1) able to provide written informed consent; (2) eligible for participation in telemedicine clinic (i.e., no unit restrictions involving transportation); and (3) at low risk for severe violence or elopement. Patients of the two study raters were not eligible for inclusion, as we wanted to ensure raters’ blindness to patients’ legal status and clinical history. Patients who were legally interdicted, who were under legally appointed guardianship, or who had an index offense involving first degree murder were also excluded because of the increased security and legal mandates involved in their clinical care.
Materials
Competency to Stand Trial
The Georgia Court Competency Test–Mississippi State Hospital revision (GCCT-MSH)22 was employed to provide an objective index of competency to stand trial. The GCCT-MSH is a 21-item interview used to evaluate a respondent's competency to stand trial and is one component of a functional psychiatric assessment of competency. The GCCT is relatively brief to administer and has a wealth of psychometric data supporting its use (see Ref. 23 for a cogent review). The GCCT-MSH was utilized in the present study because it is routinely used at ELMHS to measure competency, is a psychometrically sound instrument,23,24 and minimizes the respondent's burden and the overall effort required of participants.
The GCCT-MSH consists of two sections that evaluate various components of competency to stand trial, including the basic aspects and functions of the court and legal system. The first seven items require respondents to identify the location of various participants in the courtroom by pointing to a schematic drawing. (In the telemedicine condition the schematic was enlarged and dry mounted on white poster board to permit standard administration and allow the remote rater to view the participants’ responses.) Seven additional items require a description of the basic functions these individuals perform. Respondents are then asked seven additional questions designed to evaluate the extent to which they understand behavioral expectations in the courtroom, are capable of assisting counsel, can describe the charges against them, and appreciate the consequences of adjudication. Space is also provided to record behavioral observations and raters’ overall clinical impressions. As in any comprehensive clinical assessment, behavioral observations and clinical impressions are relied upon in the scoring and interpretation of GCCT items. Total scores for the GCCT are computed by summing the 21 individual items and multiplying the raw score sum by 2. Scores can range from 0 to 100. Scores above 69 are considered passing and indicative of competence to stand trial; scores ranging from 60 to 69 are considered marginal and should be evaluated in the context of the respondent's specific clinical and legal considerations; scores below 60 are generally considered failing and indicative of incompetence to stand trial.
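The scoring rule and cutoffs described above can be illustrated with a minimal sketch. The function names are hypothetical, and the point value assigned to each item is not specified in the text, so the item scores are treated here as already-rated inputs; only the doubling of the raw sum and the three score bands come from the description above.

```python
def gcct_total(item_scores):
    """Sum the 21 rated items and double the raw sum (rule stated in the text)."""
    if len(item_scores) != 21:
        raise ValueError("the GCCT-MSH has 21 items")
    return 2 * sum(item_scores)


def gcct_band(total):
    """Map a total score to the interpretive band given in the text."""
    if total > 69:
        return "passing"
    if total >= 60:
        return "marginal"
    return "failing"


# Hypothetical ratings for one respondent (item values are assumptions).
ratings = [2] * 21
total = gcct_total(ratings)
print(total, gcct_band(total))
```

A marginal result is meant to be interpreted alongside the respondent's clinical and legal circumstances, so a function of this kind would serve only as a bookkeeping aid.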
Raters were required to attend two hour-long training sessions conducted by a forensic psychologist at ELMHS with extensive GCCT experience. They performed two live practice evaluations under the direct supervision of the forensic psychologist.
Satisfaction
To evaluate patient satisfaction, we constructed several items designed to assess general satisfaction with the interview (items available upon request from the first author). The on-site rater verbally administered the satisfaction items to patients after administration of the GCCT to avoid misunderstandings due to differences in participants’ reading ability. Items consisted of 10 statements such as “Overall, were you satisfied with your evaluation today?” “Were you able to see and hear the doctor clearly?” “Was there enough time to tell the doctor all of your concerns/questions?” A dichotomous response format (yes/agree, 1; or no/disagree, 0) was used for patients to simplify response options, as it was noted during pretesting that patients experienced difficulty responding to survey items using a Likert scale response format. Patients also had the option of responding “not sure” (9) to any item. The mean was calculated by summing the score on each of the 10 statements (items with a score of 9 were treated as missing) and dividing by the total number of items.
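As a rough sketch of the patient satisfaction score, the calculation can be written as follows. The function name is hypothetical, and the sketch follows a literal reading of the description: "not sure" responses (coded 9) are dropped from the sum, and the sum is divided by the total number of items. Dividing by the number of answered items instead would be a different, equally plausible reading that the text does not rule out.

```python
NOT_SURE = 9  # code used for "not sure" responses, treated as missing


def patient_satisfaction(responses, n_items=10):
    """Mean satisfaction: sum of yes (1) / no (0) responses over all items."""
    if len(responses) != n_items:
        raise ValueError("expected one response per item")
    answered = [r for r in responses if r != NOT_SURE]
    # Literal reading of the text: the denominator is the total item count.
    return sum(answered) / n_items


# Hypothetical patient: eight "yes", one "no", one "not sure".
print(patient_satisfaction([1, 1, 1, 1, 1, 1, 1, 1, 0, NOT_SURE]))
```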
Provider satisfaction was assessed by having providers respond to five statements such as, “I was comfortable using telemedicine to evaluate this patient,” “I was able to evaluate competency adequately in the patient using telemedicine,” and “Evaluating patients using telemedicine is an efficient use of my time.” Response categories ranged from strongly agree (1) to strongly disagree (5). Scoring was reversed for one item in which agreement indicated dissatisfaction (e.g., “My ability to establish/maintain rapport with this patient was adversely impacted.”). The total score for the provider satisfaction scale was calculated by summing the score for each of the five statements and dividing by the total number of statements.
Equipment
The videoconferencing equipment used was the Polycom ViewStation, model VSX 5000 (Polycom Inc., Pleasanton, CA) which sits on top of a standard television monitor. The ViewStation possesses scanning and zoom functions. The primary video output device was a 27-inch color television monitor equipped with a window feature. A Polycom Digital Tabletop Microphone was used for audio transmission. Both locations were equipped with identical audiovisual components. Although picture-in-picture television viewing was available, this feature was not used due to the possibility of patients being distracted by seeing themselves on the video monitor. The total cost of the videoconferencing equipment including two Polycom ViewStations ($3,300 each) and two Sony 27-inch televisions ($350 each) was $7,300.
Audiovisual connections between locations were established through an IP-based network using the H.323 videoconferencing protocol for LAN-based multimedia communications. Transmission was conducted using 768-kbps bandwidth over a private hospital-based Ethernet network. There were no costs associated with the IP-based lines, since these costs were already assumed by using the shared network. The cost of establishing an independent IP line is estimated to be approximately $500 per month.
Procedure
The study was conducted from December 2004 to March 2005 and was open to all forensic patients at ELMHS. Participants in the study were randomly assigned to the live interview or the telemedicine interview. In the live interview, the primary rater (JFA) conducted the interview while in the room with the patient and the secondary rater (JWP). In the telemedicine condition, the primary rater (JFA) conducted the interview from the remote location (Tulane School of Medicine) via television monitor, while the patient and the secondary rater (JWP) sat together on-site at the forensic hospital. Therefore, the primary rater conducted all interviews in both of the study conditions.
Consistent with the methodology used by Baer and colleagues,11 who employed a simultaneous video reliability interview, all participants were interviewed by the same rater who evaluated the patient by using a standardized competency assessment instrument (Georgia Court Competency Test-Mississippi State Hospital revision; GCCT-MSH22). We elected to have the same rater interview participants in both conditions to minimize rater variance across conditions and to decrease participant burden. Both raters have extensive clinical experience in forensic psychiatry and in performing standardized competency evaluations, including the GCCT-MSH.
Live Interview Condition
Participants were assessed by both raters, who were physically present during the interview. Consistent with clinical procedures in place at ELMHS, the patient's social worker, a nurse, or security officer was also present. Participants were interviewed by rater 1 (JFA) in a private office on hospital grounds. Rater 2 (JWP) was present in the room, but did not participate in the interview. Both raters made independent ratings without conferring with one another, as verified by random audiotaping. Upon completion of the interview, rater 2 verbally administered the satisfaction survey to participants, since pilot testing indicated that participants had difficulty comprehending the written material. Both raters also completed the clinician satisfaction surveys. GCCT ratings and satisfaction surveys were placed in separate envelopes and given to the study coordinator.
Telemedicine Condition
Participants were rated by both raters, although in this condition rater 1 (JFA) was at an off-site location and used video teleconferencing to conduct the interview. Rater 2 (JWP) was present in the room with the participant. Participants were transported from their housing unit to a nearby office on hospital grounds. In addition to rater 2, one member of the hospital staff (i.e., nurse, social worker, or security guard) was present in the room with the patient. Rater 1, who conducted the interview via video, was located at Tulane University School of Medicine, approximately 140 miles from the patient's location. As in the LI format, rater 2 (JWP) did not participate in interviewing the patient, although he administered the satisfaction survey to the patient at the conclusion of the interview. Both raters completed clinician satisfaction surveys, which were placed in separate envelopes for the study coordinator.
Results
Both raters provided a total competency score for each patient according to standard GCCT scoring guidelines. The average value of each rater's score was calculated under each of the two experimental conditions, as were the correlations between the raters’ scores. Comparisons of mean values across assessment conditions are not meaningful, since participants were assessed on only one occasion, using either the telemedicine or the live interview format. Results of within-group comparisons are shown in Table 1.
To examine the pattern of scores under varying assessment conditions (LI versus TM), Pearson's correlations were computed. The correlations between raters in both conditions were very high, indicating that similarities in scores were robust to interview modality. In other words, raters were consistent within assessment conditions and tended to provide similar ratings when present in the room with the patient (LI condition) and when one rater was located offsite (TM condition).
Mean rating differences were also compared, to examine whether the agreement of ratings varied as a function of interview modality. The absolute value of the difference between the GCCT-MSH scores from raters 1 and 2 was calculated for each patient. The mean absolute differences from the study conditions (see Table 1) were then compared using a two-tailed independent-samples t test, and no significant difference was found (t(19) = 0.32, p = .75). Results indicate that the level of disagreement between the raters could not be attributed to interview modality.
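The comparison described above can be sketched as follows. All scores in the example are hypothetical and are not study data, the function names are illustrative, and the split of 10 and 11 patients per condition is an assumption chosen only to be consistent with 19 degrees of freedom. The sketch uses the pooled-variance form of the independent-samples t test; the text does not state which variant was used.

```python
import math
from statistics import mean, variance


def abs_differences(rater1, rater2):
    """Per-patient absolute difference between the two raters' total scores."""
    return [abs(a - b) for a, b in zip(rater1, rater2)]


def pooled_t(x, y):
    """Independent-samples t statistic (pooled variance) and its df."""
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(x) - mean(y)) / se, df


# Hypothetical GCCT-MSH totals (NOT study data).
li_r1 = [88, 92, 76, 64, 90, 84, 58, 96, 80, 72]
li_r2 = [86, 92, 78, 66, 90, 82, 60, 94, 80, 74]
tm_r1 = [90, 70, 84, 56, 94, 78, 88, 62, 82, 96, 74]
tm_r2 = [88, 72, 84, 58, 92, 78, 86, 62, 84, 96, 76]

li_diff = abs_differences(li_r1, li_r2)
tm_diff = abs_differences(tm_r1, tm_r2)
t_stat, df = pooled_t(li_diff, tm_diff)
print(f"mean |diff| LI={mean(li_diff):.2f}, TM={mean(tm_diff):.2f}, "
      f"t({df})={t_stat:.2f}")
```

The two-tailed p value would then be read from a t distribution with the reported degrees of freedom, for example with a statistics package.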
To summarize strength of association, we calculated the value of omega squared (ω²), which was .102. Therefore, the method of assessment (LI versus TM) accounted for 10.2 percent of the variance in the differences between raters’ scores. Using the guidelines provided by Kirk,25 this level of association between modality (live interview compared with telemedicine) and degree of rater agreement would be considered medium in size if the underlying ANOVA were significant. However, since the result of the ANOVA used to derive the omega-squared value was not significant (F(1,19) = 3.39, p = .08), no reliable association between modality and rater agreement was found. This result further supports the conclusion that raters’ scores were not influenced by the format of the assessment.
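The reported effect size can be reproduced from the reported F statistic with a short sketch. The text does not state the formula used, so this assumes the standard omega-squared estimate for a one-way fixed-effects ANOVA expressed in terms of F; the function name is hypothetical.

```python
def omega_squared_from_f(f_value, df_between, n_total):
    """Omega squared for a one-way fixed-effects ANOVA, computed from F.

    Assumed formula: df_b * (F - 1) / (df_b * (F - 1) + N).
    """
    effect = df_between * (f_value - 1.0)
    return effect / (effect + n_total)


# Values reported in the text: F(1,19) = 3.39 with N = 21 participants.
print(round(omega_squared_from_f(3.39, 1, 21), 3))  # prints 0.102
```

Under this assumption the result matches the reported value of .102, which suggests that the effect size and the F statistic are internally consistent.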
The demographic characteristics of participants are presented in Table 2 and indicate that background characteristics were similar across study conditions. Using an α level of .05, no significant differences were found between patients in the LI condition and those in the TM condition in age, t(19) = 1.66, p = .11; gender, χ²(1, n = 21) = .06, p = .8; ethnicity, χ²(1, n = 21) = .40, p = .52; or psychiatric diagnosis, χ²(1, n = 21) = .15, p = .7. Two defendants were rated as incompetent to stand trial as a result of the LI, and three were rated as incompetent by the TM interview.
Patient Satisfaction
A score of 1 indicated 100 percent satisfaction. Patients reported high levels of satisfaction for both interview modalities. The average satisfaction score (standard deviations in parentheses) was .85 (.10) for the LI and .83 (.12) for the TM interview. No significant difference in satisfaction scores was found between conditions using a two-tailed independent-samples t test, t(19) = −0.48, p = .63. Thus, patient satisfaction was quite similar for the two interview conditions.
Provider Satisfaction
Provider satisfaction scores ranged from 2.25 to 5.00. A score of 5 indicated 100 percent satisfaction. Provider ratings for the LI and TM conditions were found to differ significantly by a two-tailed independent-samples t test (t(19) = −3.79, p = .009). Providers reported slightly less overall satisfaction with the TM interview (mean 3.93, SD 0.71) compared with the LI (mean 4.79, SD 0.12).
Discussion
The results of this study demonstrate that telemedicine appears to be a reliable method of assessing competency to stand trial among pretrial forensic psychiatric patients. Results are consistent with other studies demonstrating acceptable levels of interrater reliability using standardized psychiatric scales.9,11,12,15,26 This study adds to the burgeoning empirical literature on telepsychiatry by examining the reliability of clinical assessments performed by telemedicine in a forensic psychiatric setting.
Although it has been suggested that telepsychiatry encounters may be less reliable than in-person interviews since important aspects of the clinical assessment process, such as rapport building and behavioral observation, are constrained,12,27,28 controlled research has not borne this out.11,13,16 Conversely, anecdotal information implies that the interposition of videoconferencing equipment places patients and providers on equal footing, thereby altering the power differential that often arises in traditional office encounters.10 Results from the present study demonstrate that similar competency ratings were obtained between observers, regardless of whether the interview was conducted in person or by telemedicine. Therefore, among raters, we observed no degradation in the reliability of competency ratings under conditions in which the primary rater was not present in the room with the patient.
The present study provides preliminary support for the utility of telemedicine in the evaluation of competency to stand trial and is consistent with the results of a recent meta-analysis by Hyler and colleagues.29 Correlations between the raters in the present study were very high in both conditions, indicating that when one evaluator gave a high rating the other evaluator did so, too, and that when one gave a low rating, the other also did so. The format of the interview (live versus telemedicine) did not systematically affect this pattern. If conducting the evaluation by telemedicine had introduced an additional source of rater variance, reductions in the correlations between raters in this condition would have been observed.
Satisfaction
Satisfaction with both interview modalities was evaluated through the use of a brief self-report scale. Patients expressed no preference for one interview modality over another and were as satisfied with the telemedicine as with the live interview. These results mirror those previously reported (see Ref. 30 for a cogent review). We are aware of the potential influence of social desirability on these results; however, we attempted to account for this by having staff unknown to patients perform the competency evaluations. In addition, we informed participants at the outset that data collected during research sessions were for research purposes only and were kept separate from their clinical records.
Clinicians reported greater satisfaction with live interviews than with telemedicine. This trend has been reported by other researchers.31,32 Technical difficulties (e.g., scheduling, audio/verbal lag, feedback and echo, poor audiovisual quality, and environmental inadequacies such as lighting and positioning of equipment) as well as interpersonal barriers (e.g., inability to establish rapport, difficulty evaluating interpersonal or nonverbal cues) have been cited by other researchers32,33 as reducing overall provider satisfaction. To identify potential reasons for reduced satisfaction in the present study, we examined the raters’ written comments for cases in which the overall satisfaction score was less than four. In three cases, the audiovisual quality of the transmission itself (e.g., transmission lags, sound quality, and clarity) was cited. In three cases, interpersonal communication difficulties were mentioned (e.g., the patient had trouble hearing the remote physician, the provider had trouble understanding the patient's distorted or rambling speech, the participant was grossly psychotic). On three occasions, the provider present in the room with the patient expressed less satisfaction with telemedicine in response to the patient's preference for live interactions. In future studies, these speculative explanations could be examined in greater detail using a larger sample of raters.
Study Limitations
The limitations of the study are the small sample size and the reliance on a single measure to evaluate competency to stand trial. We acknowledge that the limited sample size precludes definite conclusions and further studies involving a larger forensic study population are warranted. We would like to point out that due to the potential for Type-II error using an omnibus test, we examined the data by alternative methods (i.e., Pearson's correlation, effect size). In so doing, our goal was to examine the relationship between ratings and study condition, using multiple methods to evaluate statistical convergence. Data from the t test, correlational method, and measures of effect size in conjunction with the omnibus test (ANOVA) all support the finding that there were no notable differences between study conditions. In addition, the present results are consistent with those of other studies examining the reliability of telemedicine compared with in-person interviews.9,11,15,30,33–36 We want to reiterate, however, that the results of this study are preliminary and should not be broadly applied without further empirical validation.
Interviews using standardized tools for competency evaluation, such as the GCCT, constitute only one facet of forensic mental health and should not be considered in isolation in routine practice. A full assessment of competency to stand trial would be likely to include the use of additional assessment instruments, including a comprehensive clinical evaluation involving a much more thorough evaluative process. In addition, the choice of the GCCT to evaluate competency to stand trial was made for primarily practical reasons. There are several other competency assessment tools in the literature that evaluate equally important aspects of competency to stand trial,37 such as decisional (e.g., IFI, FIT) or adjudicative competence (e.g., MacCAT-CA), which were not used. Expanding on the preliminary work offered by this study, future researchers may wish to examine the feasibility and cost-effectiveness of a comprehensive competency evaluation conducted by telemedicine, including record reviews and additional clinical assessments, perhaps using other standardized competency tools.
The limitations of the study notwithstanding, the data support recent findings using other structured competency tools21 and demonstrate the reliability of standardized competency evaluations conducted via telemedicine. Although limited in scope, the findings are promising for conducting standardized competency evaluations using telepsychiatry. These preliminary findings need to be confirmed with a larger sample size and should be examined in light of other standardized approaches to the evaluation of competency to stand trial. Future studies should be undertaken to investigate whether other aspects of forensic competency evaluations are also appropriate for telepsychiatry. Given the widespread development of telemedicine in correctional institutions in the United States8 and the encouraging preliminary results of the current study, additional empirical research examining the use of telemedicine in correctional and forensic settings is warranted. Empirical examinations should investigate the appropriateness of telemedicine for other types of forensic psychiatric assessments, such as clinical evaluations; the establishment of psychiatric diagnoses; specific evaluations, such as those to determine sanity or dangerousness; and the detection of malingering.
Although we did not evaluate cost comparisons for live versus telemedicine formats, the cost effectiveness of telemedicine in correctional settings has been well documented.8 If telemedicine technology is found to be useful for other types of psychiatric evaluations or the delivery of care in forensic settings, it may prove to be cost effective and widely applicable because of its ability to bring expert psychiatric consultation to an underserved population.
Legal and Ethics-Related Considerations
The prolific growth of telemedicine in forensic settings has been evident over the past decade with over 50 percent of state correctional institutions and 39 percent of federal institutions using some form of telemedicine.38 Seventy-three percent of these programs involved mental health care/telepsychiatry. Although the research is in its infancy, acceptance of telepsychiatry by the legal and mental health community has been cited as one of the barriers to the widespread use of telemedicine.38–40 Other concerns are the legal and ethics-related issues raised by forensic telepsychiatry, such as client privacy and confidentiality, professional liability and medical malpractice, scope of practice and medical licensure, and the legal admissibility of forensic telepsychiatry evaluations.40–43 Telepsychiatry evaluations have been routinely used in both civil (e.g., involuntary commitment hearings) and criminal (e.g., competency to stand trial, sanity, expert testimony) cases.19,38,43 There has been only one report of a case in which the use of videoteleconferencing was an issue on appeal.43,44 In that case, the court ruled that the use of videoteleconferencing during a mental competency hearing did not violate due process and that there was no legal basis for appeal based on interview modality. The development of consistent and appropriate guidelines and evidence-based standards governing the use of telepsychiatry should be considered in light of the potential benefits offered.
Acknowledgments
We acknowledge the assistance of the forensic psychiatry staff of ELMHS and the Tulane Health Sciences Center in recruiting research participants, without whom the study could not have been conducted. We also thank Alan Newman, MD, for his assistance with the telemedicine equipment, and Denny Nelson for lending his invaluable telemedicine and telecommunications expertise. We thank the ELMHS clinical staff, particularly Elaine Guillot, for administrative support, and Darla Burnett, PhD, for assistance in training the study raters. The authors thank all of the anonymous reviewers who provided helpful comments on two earlier versions of the manuscript.
- American Academy of Psychiatry and the Law