Original article
Vol. 155 No. 5 (2025)
Supervisors’ self-assessment of feedback skills: a psychometric validation study of the English version of the SwissSETQ questionnaire for supervisors
Cite this as: Swiss Med Wkly. 2025;155:4178

Published: 30.05.2025
Summary
STUDY AIMS: We created an instrument to assess supervisors' perspectives on their own feedback behaviour towards residents and investigated its validity. The instrument is based on the SETQsmart, a Dutch instrument for assessing the quality of supervision in clinical training, and on the SwissSETQ, its German adaptation for residents. It is in English to ensure relevance across all Swiss language regions. The study specifically sought to replicate the factor structure of the original trainee questionnaire for supervisors, to verify the alignment of the SwissSETQ and SETQsmart domains with that factor structure, and to evaluate the psychometric properties of the English version.
METHODS: The original SwissSETQ was translated into English, preserving the Swiss context and local language usage, and the questionnaire was adjusted to reflect the supervisor's perspective. The translated questionnaire was then distributed to supervisors at all Swiss cardiology training sites, and data were collected via the SoSci Survey platform between March and April 2024. The statistical analysis, including exploratory factor analysis (EFA) with promax rotation, Bartlett's test of sphericity, the Kaiser-Meyer-Olkin (KMO) coefficient and psychometric evaluation, was conducted in R.
RESULTS: Of approximately 600 cardiology supervisors in Switzerland, 207 responded, leaving 135 valid cases after data cleaning. The factor analysis identified three factors: Teaching structure, Attitude of the supervisor and Role modelling, resulting in a shortened 23-item questionnaire. The KMO coefficient was 0.83 and Bartlett's test was significant, confirming the data's suitability for factor analysis. The factors demonstrated high internal consistency, with Cronbach's α values of 0.89, 0.77 and 0.87, respectively. The partial credit model indicated the need for a revised 5-point Likert scale to achieve a better response distribution. No significant associations were found between the factors and sociodemographic variables, supporting the English version's applicability across all Swiss language regions.
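The internal-consistency values reported above are Cronbach's α coefficients, which the study computed in R. As a minimal illustration (the toy Likert responses below are invented for this sketch, not the study's data), α for k items is k/(k-1)·(1 - Σ item variances / variance of the total score):

```python
from statistics import pvariance

def cronbach_alpha(items):
    """Cronbach's alpha for a set of item-score columns.

    items: list of k columns, each a list of n respondent scores.
    alpha = k/(k-1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(items)
    n = len(items[0])
    item_vars = sum(pvariance(col) for col in items)
    totals = [sum(col[i] for col in items) for i in range(n)]
    return k / (k - 1) * (1 - item_vars / pvariance(totals))

# Toy example: 3 items, 5 respondents (hypothetical Likert scores)
items = [
    [4, 5, 3, 4, 2],
    [4, 4, 3, 5, 2],
    [5, 5, 2, 4, 3],
]
print(round(cronbach_alpha(items), 2))  # → 0.89
```

Values around 0.8 to 0.9, like those reported for the three factors, are conventionally read as good internal consistency without the redundancy that very high values can signal.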
CONCLUSIONS: The study evaluated the English-translated, supervisor-adapted version of the SwissSETQ, which demonstrated good psychometric properties and a clear factor structure. The instrument is suitable for use across the different Swiss language regions, enhancing its utility in a multilingual context. The findings support the potential of the SwissSETQ to facilitate cross-cultural and cross-linguistic collaboration in medical training. Future research should explore additional factors influencing teaching quality, such as the work environment and supervisor motivation.