# Evaluation of a BDI-based Virtual Agent for Training Child Helpline Counsellors
Authors: Sharon Afua Grundmann & Mohammed Al Owayyed

contact: m.alowayyed@tudelft.nl

This dataset was created as part of an evaluation study of a BDI-based conversational agent for training counsellors at a child helpline. The design of the study was pre-registered with the Open Science Framework (OSF) registries and is publicly available at https://osf.io/hkxzc. The dataset contains participants' survey responses for two measures: counselling self-efficacy and perceived usefulness of the agent. The Markdown script contains data-specific information on how to use this dataset. The data were collected through an online survey hosted on Qualtrics. The questionnaires for the surveys mentioned below can be found in the appendices of the main thesis report: https://repository.tudelft.nl/islandora/object/uuid%3Af04f8f0b-9ab9-4f1c-a19c-43b164d45cce

All statistical analyses were done in R (version 4.0.5). This work is licensed under CC BY 4.0.


## Files
- final experiment analysis.Rmd: Markdown script for data analysis.
- final-experiment-analysis.html: HTML rendering of the data analysis script.
- self-efficacy_scores.csv: counselling self-efficacy scores measured before and after the training interventions.
- useful_usab_scores.csv: participants' perceived usefulness and usability scores.
- BDI_scores.csv: BDI scores of participants during the first and third interaction sessions.
- inter-reliability.csv: coding data for the thematic analysis of participants' answers to the qualitative questions.
- Double coding training.pdf: training materials used to train the second coder to perform a thematic analysis.
- themes_data.csv: themes from the thematic analysis and their corresponding quotes.
- draft-ChecklistDataRepositoryReview.pdf: Checklist for Review of Dataset.


## The excel/csv files contain these columns:

### self-efficacy_scores.csv
- ResponseID: Randomly generated id of the participants
- Q95_1 - Q102_1: Counselling self-efficacy survey before the training (pre)*
- Q111_1 - Q118_1: Counselling self-efficacy survey after session 1 (post)*
- Q120_1 - Q127_1: Counselling self-efficacy survey after session 2 (post)*
- Group: the order in which the participant received the interventions. Group A started with the chatbot condition followed by the text condition; Group B started with the text condition followed by the chatbot condition.
- experience: each counsellor's number of years of experience.

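*The self-efficacy survey is the same in all three cases (pre, post condition 1 and post condition 2). The scale ran from -5 ‘strongly disagree’ through 0 ‘neutral’ to +5 ‘strongly agree’. The items appear in the same order in each block of columns (e.g., Q95_1 and Q111_1 both correspond to the first question in the list). The questions are listed below in the original Dutch, each followed by an English gloss:

- Ik kan contact maken het kind aan het begin van het gesprek. (I can make contact with the child at the beginning of the conversation.)
- Het lukt me om de last die het kind ervaart te verhelderen. (I am able to clarify the burden the child experiences.)
- Het lukt mij om de situatie van het kind concreet te krijgen. (I am able to make the child's situation concrete.)
- Ik kan de gewenste toestand van het kind naar boven brengen. (I can bring out the child's desired state.)
- Ik kan het kind ondersteunen bij het bedenken van een volgende stap. (I can support the child in coming up with a next step.)
- Ik kan het gesprek afronden. (I can round off the conversation.)
- Ik kan empathie tonen door de meningen en gevoelens van het kind te erkennen. (I can show empathy by acknowledging the child's opinions and feelings.)
- Ik kan de informatie afstemmen aan het niveau van het kind. (I can tailor the information to the child's level.)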
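
As an illustration of how the three column blocks line up, here is a minimal Python sketch (standard library only) that averages the eight items per measurement moment for one participant. Averaging the items is an assumption made for this example, as is the naming of the intermediate columns (Q96_1, Q97_1, and so on); the scoring actually used in the study is in the R Markdown script. The example row contains made-up values.

```python
from statistics import mean

# Column blocks as described above: eight items per measurement moment.
# Assumption: intermediate columns follow the same Q<number>_1 pattern.
PHASES = {
    "pre": range(95, 103),      # Q95_1 .. Q102_1
    "post_1": range(111, 119),  # Q111_1 .. Q118_1
    "post_2": range(120, 128),  # Q120_1 .. Q127_1
}


def phase_means(row):
    """Return the mean item score per phase for one participant row (a dict)."""
    return {
        phase: mean(float(row[f"Q{n}_1"]) for n in numbers)
        for phase, numbers in PHASES.items()
    }


# Illustrative row with made-up values, not taken from the dataset.
example = {f"Q{n}_1": "1" for n in range(95, 103)}
example.update({f"Q{n}_1": "2" for n in range(111, 119)})
example.update({f"Q{n}_1": "3" for n in range(120, 128)})

print(phase_means(example))  # {'pre': 1.0, 'post_1': 2.0, 'post_2': 3.0}
```

Rows read from self-efficacy_scores.csv with `csv.DictReader` could be passed to `phase_means` directly, provided every item cell holds a number.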


### useful_usab_scores.csv
- ResponseID: Randomly generated id of the participants
- Q3_1 - Q10_1: PILO (perceived usefulness) questions**
- Q11_1 - Q11_10: SUS (usability) questions***
- Q12 - Q16: Qualitative questions****

**The PILO survey scale ran from -5 ‘negative’ through 0 ‘neutral’ to +5 ‘positive’. The question text appears in the second row of the file.

***The items were adapted from a translation of the SUS questionnaire to Dutch: Koning, D. (2016). Ontwerp van een online zelf-assessment voor het meten van de fysieke activiteit bij ouderen tussen de 55 en 75 jaar (Master's thesis, University of Twente).
The scale runs from 1 ‘strongly disagree’ to 5 ‘strongly agree’. The question text appears in the second row of the file.

****The question text appears in the second row of the file (in the original language).
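
For readers unfamiliar with SUS scoring, the following Python sketch applies the standard SUS formula to ten responses on the 1 to 5 scale. It assumes that the adapted Dutch items keep the original SUS item order and alternating wording (odd items positive, even items negative) and that Q11_1 to Q11_10 are stored in that order; neither is confirmed here, so check the R Markdown script for the scoring used in the study. The example responses are made up.

```python
def sus_score(responses):
    """Standard SUS score (0-100) from ten responses on a 1-5 scale, in item order."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly ten item responses")
    total = 0
    for index, value in enumerate(responses, start=1):
        # Odd-numbered items contribute (value - 1); even-numbered items (5 - value).
        total += (value - 1) if index % 2 == 1 else (5 - value)
    return total * 2.5


# Made-up responses for illustration only.
print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))  # 75.0
```

When reading useful_usab_scores.csv, the row holding the question text has to be skipped before the responses are converted to numbers.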



### BDI_scores.csv
- ResponseID: Randomly generated id of the participants
- Session 1: BDI score of the bot at the end of the conversation in the first session. The scale runs from 0 to 10, where 0 is the lowest possible belief value and 10 the highest.
- Session 3: BDI score of the bot at the end of the conversation in the third session. The scale runs from 0 to 10, where 0 is the lowest possible belief value and 10 the highest.


### inter-reliability.csv
- Participant: participant number.
- Coder1 question1: the theme name assigned by the first coder for the first question (in text)*
- coder1nq1: the id number of the theme assigned by coder 1 for the first question (in numbers)
- Coder2 question1: the theme name assigned by the second coder for the first question (in text)*
- coder2nq1: the id number of the theme assigned by coder 2 for the first question (in numbers)
- Coder1 question2: the theme name assigned by the first coder for the second question (in text)*
- coder1nq2: the id number of the theme assigned by coder 1 for the second question (in numbers)
- Coder2 question2: the theme name assigned by the second coder for the second question (in text)*
- coder2nq2: the id number of the theme assigned by coder 2 for the second question (in numbers)
- Coder1 question3: the theme name assigned by the first coder for the third question (in text)*
- coder1nq3: the id number of the theme assigned by coder 1 for the third question (in numbers)
- Coder2 question3: the theme name assigned by the second coder for the third question (in text)*
- coder2nq3: the id number of the theme assigned by coder 2 for the third question (in numbers)

\* An "N.A" theme means that the coder indicated there is no theme for the comment; an "empty" theme means that the coder left the theme field blank.
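
The paired coder columns (e.g., coder1nq1 and coder2nq1) lend themselves to an agreement statistic. The Python sketch below computes Cohen's kappa, one common choice for two coders; this README does not state which statistic the study used, so treat it as an illustration and not as the authors' method. The theme ids in the example are made up.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equally long lists of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both coders must rate the same, non-empty set of items")
    n = len(ratings_a)
    # Proportion of items on which the two coders agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each coder's label frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Made-up theme ids for illustration only.
coder1 = [1, 2, 1, 3, 2, 1]
coder2 = [1, 2, 2, 3, 2, 1]
print(round(cohens_kappa(coder1, coder2), 3))  # 0.739
```

How "N.A" and "empty" entries are treated (kept as their own categories or dropped) changes the result, so that choice should be made explicitly before computing agreement.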

The coding scheme for question 1 (coded in Coder1 question1) is:
- Fast response (timing): the agent's timely response
- Realistic: how realistically the agent simulates a real child
- Insight into how the child thinks: how the agent's thinking changes
- Self-directed learning: the counsellor takes initiative in learning
- Reflective: providing reflection or feedback

The coding scheme for question 2 (coded in Coder1 question2) is:
- Not converse naturally: the agent is unrealistic in its interactions
- Repetitive answers: the agent repeats itself in the conversation
- Emoticons: use of emoticons (e.g., emojis)
- Backtracking: the ability to go back in the conversation
- Segmentation: the agent's ability to understand different user intents in one input

The coding scheme for question 3 (coded in Coder1 question3) is:
- reflective: whether the feedback gives a good or bad reflection of one's performance
- insightful: whether the information given in the feedback is insightful
- unclear: the clarity of the given feedback
- no added value: the value added by the agent's feedback
- unrealistic: the realism of the feedback
- No feedback: the participant indicated that they did not get feedback

### themes_data.csv
- Question1_answers: the participants' quotes given as answers to question 1
- theme1q1: the first assigned theme (if any) for the answers to question 1
- theme2q1: the second assigned theme (if any) for the answers to question 1
- Question2_answers: the participants' quotes given as answers to question 2
- theme1q2: the first assigned theme (if any) for the answers to question 2
- theme2q2: the second assigned theme (if any) for the answers to question 2
- Question3_answers: the participants' quotes given as answers to question 3
- theme1q3: the first assigned theme (if any) for the answers to question 3
- theme2q3: the second assigned theme (if any) for the answers to question 3
- RecommendLilo: the group to which participants would recommend Lilobot (1: counselors-in-training, 2: novice counselors, 3: experienced counselors, 4: supervisors of the helpline)
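
Because each answer can carry up to two themes, counting theme frequencies means pooling the two theme columns per question. The Python sketch below shows one way to do that. It assumes that unassigned themes are stored as empty cells and that rows are dicts keyed by the column names above; the example rows are made up and the theme labels are borrowed from the coding scheme for illustration.

```python
from collections import Counter


def theme_counts(rows, question):
    """Count how often each theme was assigned for one question (1, 2 or 3)."""
    columns = (f"theme1q{question}", f"theme2q{question}")
    counts = Counter()
    for row in rows:
        for column in columns:
            # Assumption: a missing or blank cell means no theme was assigned.
            theme = (row.get(column) or "").strip()
            if theme:
                counts[theme] += 1
    return counts


# Made-up rows for illustration only.
rows = [
    {"theme1q1": "Realistic", "theme2q1": "Reflective"},
    {"theme1q1": "Realistic", "theme2q1": ""},
]
print(theme_counts(rows, 1).most_common())  # [('Realistic', 2), ('Reflective', 1)]
```

Rows read from themes_data.csv with `csv.DictReader` could be passed in unchanged.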
