Tobias Hallmen M.Sc.
Phone: | +49 821 598 2328 |
Fax: | +49 821 598 2349 |
Email: | tobias.hallmen@uni-a.de |
Room: | 2022 (N) |
Open hours: | Upon Request |
Address: | Universitätsstraße 6a, 86159 Augsburg |
Research Interests
I carry out multimodal (audio, video, text) conversation analyses using methods from the field of machine learning and artificial intelligence (AI). I am investigating whether and how these methods can be used to evaluate and assess different conversational situations.
The conversations take place in the context of psychotherapy sessions (TherapAI project), human medicine and teacher training (KodiLL project). The aim is to find correlations and use them to make the quality of these conversations measurable and (automatically) assessable, as well as to improve them in the long term. This benefits both sides - the therapists/medical practitioners/teachers and the patients/parents.
For example, it is conceivable that the characteristics found could be used to intervene in therapies, or that AI-supported feedback and recommendations for action could be given to students during training in order to conduct better conversations in the future.
Final Theses
Below are topics that I consider suitable for final theses. Ideally, the results will be implemented as a module for Nova, which makes them easy to reuse and makes it easy to correlate and evaluate the different characteristics on existing data sets. You are also welcome to propose your own thematically appropriate topics:
- Speaker diarisation: Often there are no audio recordings separated by speaker, or, if there are, each speaker's recording also picks up the other speakers (more quietly). This distorts the assignment of audio-based features, e.g. transcription or emotion recognition.
- Existing methods rely purely on audio here. It would be conceivable to supplement this modality with video, text or other derived features (points of view) and thus improve speaker classification.
- Reception signals (backchannels): While someone is speaking, listeners usually give reception signals (yes, mhm, head nod, etc.). These indicate whether and to what extent someone is involved in the conversation. Existing methods need to be implemented, improved and evaluated.
- Remote photoplethysmography: The people filmed usually do not wear sensors, but some physiological values would still be of interest, e.g. a “cuff-free blood pressure measurement” via video to determine heart rate and heart rate variability. These can be signs of excitement in the conversation and useful for evaluations.
- Language models as experts: Can (small) language models relieve people of time-consuming annotation and evaluation work, or at least support them? Due to sensitive data, these models must be executable locally, at best on end-user hardware.
- Language models as training partners: Language models are often used for synthetic data generation. Can they also be used (locally) as a useful training partner for practicing parental conversations at different levels of difficulty?
Publications
2024 |
Tobias Hallmen, Silvan Mertes, Dominik Schiller and Elisabeth André. in press. An efficient multitask learning architecture for affective vocal burst analysis. preprint. DOI: 10.48550/arXiv.2209.13914 |
Dominik Schiller, Tobias Hallmen, Daksitha Withanage Don, Elisabeth André and Tobias Baur. in press. DISCOVER: a Data-driven Interactive System for Comprehensive Observation, Visualization, and ExploRation of human behaviour. preprint. DOI: 10.48550/arXiv.2407.13408 |
Moritz Bauermann, Kathrin Gietl, Tobias Hallmen and Karoline Hillesheim. 2024. Förderung der Beratungskompetenz von Studierenden durch simulierte Lernumgebungen und KI-basiertes Feedback: ein Verbundprojekt im Rahmen des interdisziplinären KodiLL Teilprojekts 4 [Poster]. In Forschungstag der Philosophisch-Soziologischen Fakultät, 17. April 2024, Universität Augsburg. Universität Augsburg, Augsburg |
Brian Schwartz, A. Vehlen, S. T. Eberhardt, Tobias Baur, Dominik Schiller, Tobias Hallmen, Elisabeth André and W. Lutz. in press. Going multimodal and multimethod using different data layers of video recordings to predict outcome in psychological therapy. Clinical Psychological Science (special issue on Multidisciplinary Clinical Psychological Science). |
Moritz Bauermann, Kathrin Gietl, Tobias Hallmen and Karoline Hillesheim. 2024. KI in Beratungsgesprächen: Zukunft der Kommunikation [Abstract]. In Katrin Bauer (Ed.). Campus meets Castle: Vernetzt in die Zukunft durch kompetenzorientierte Lehre in den Fächern, Symposium des Verbundprojektes von PLP, Bayziel und VHB, 18.–20. März 2024, Bayreuth; ein kurzer Rückblick. Universität Augsburg, Augsburg, 13 |
Moritz Bauermann, Kathrin Gietl, Karoline Hillesheim, Tobias Hallmen and Andreas Hartinger. 2024. KI-basiertes Feedback für simulierte Elterngespräche: eine qualitative Analyse studentischer Wahrnehmung und Gestaltungsperspektiven – KI-WaGen [Abstract]. In Krisen und Transformationen: 29. DGfE-Kongress 2024, 10. bis 13. März 2024, Halle (Saale). |
Moritz Bauermann, Ann-Kathrin Schindler, Tobias Hallmen, Miriam Kunz, Elisabeth André and Thomas Rotthoff. 2024. Studienprotokoll: "AI Effect – Untersuchung der lernwirksamen Annahme von KI-generierten und durch Avatare vermittelten Feedback und Feedforward zur ärztlichen Kommunikation bei Medizinstudierenden in einer Simulationsumgebung" [Abstract]. In Raphaël Bonvin (Ed.). "Über Lernen, Lehren und Prüfen hinaus... der Mensch!": Jahrestagung der Gesellschaft für Medizinische Ausbildung (GMA), 05.-09.08.2024, Freiburg, Schweiz; Abstractband. Gesellschaft für Medizinische Ausbildung (GMA), Erlangen, 115-117 |
Daksitha Senel Withanage Don, Dominik Schiller, Tobias Hallmen, Silvan Mertes, Tobias Baur, Florian Lingenfelser, Mitho Müller, Lea Kaubisch, Corinna Reck and Elisabeth André. 2024. Towards automated annotation of infant-caregiver engagement phases with multimodal foundation models. In ICMI '24: International Conference on Multimodal Interaction, San Jose, Costa Rica, November 4-8, 2024. ACM, New York, NY, 428-438 DOI: 10.1145/3678957.3685704 |
Tobias Hallmen, Fabian Deuser, Norbert Oswald and Elisabeth André. 2024. Unimodal multi-task fusion for emotional mimicry intensity prediction. In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 17-18 June 2024, Seattle, WA, USA. IEEE, Piscataway, NJ, 4657-4665 DOI: 10.1109/cvprw63382.2024.00468 |
Antonia Vehlen, Steffen Eberhardt, Brian Schwartz, Tobias Baur, Dominik Schiller, Tobias Hallmen, Elisabeth André and Wolfgang Lutz. 2024. Verstehst du mich? Die Qualität automatischer Transkriptionen von Therapievideos im Kontext von Emotionsanalysen [Abstract]. In Ulrich Ansorge, Daniel Gugerell, Ulrich Pomper, Bence Szaszkó, Lena Werner (Eds.). 53rd DGPs Congress / 15th ÖGP Conference, September 16-19, 2024, Vienna, Austria: abstracts. Universität Wien, Wien, 740-741 |
2023 |
Tobias Hallmen, Silvan Mertes, Dominik Schiller, Florian Lingenfelser and Elisabeth André. 2023. Phoneme-based multi-task assessment of affective vocal bursts. In Donatello Conte, Ana Fred, Oleg Gusikhin, Carlo Sansone (Eds.). Deep Learning Theory and Applications: 4th International Conference, DeLTA 2023, Rome, Italy, July 13–14, 2023, proceedings. Springer Nature, Cham, 209-222 DOI: 10.1007/978-3-031-39059-3_14 |
Pia Schneider, Philipp Reicherts, Giulia Zerbini, Tobias Hallmen, Elisabeth André, Thomas Rotthoff and Miriam Kunz. 2023. Smiling doctor, happy patient: the role of facial expressions in patient-doctor communication [Abstract]. In Jan Born, Max Harkotte, Lisa Bastian, Julia Fechner (Eds.). 48. Annual Conference Psychologie und Gehirn, 08.06.2023-10.06.2023, Tübingen: abstract booklet. Eberhard Karls Universität Tübingen, Tübingen, 273 |