Profile
I am an enthusiastic researcher bridging the realms of language and AI.
With a strong penchant for knowledge mining,
I excel in deriving structured insights from unstructured data.
My expertise extends to DevOps for Large Language Models (LLMs),
showcasing my commitment to harnessing cutting-edge AI solutions.
Work Experience
Research Assistant
Darmstadt University of Technology      08.2021 - Present
BMBF-funded project InsightsNet, focused on knowledge mining from academic publications.
- Developed an annotation pipeline for parsing these publications.
- Established cross-disciplinary connections based on the extracted content.
Machine Learning Intern
Robert Bosch GmbH      04.2020 - 10.2020
Crafted time series models with RNNs, GANs, and VAEs for bicycle sensor predictions.
- Optimized data preprocessing and model tuning, and stayed abreast of ML and time series trends.
- Partnered with cross-functional teams for insights and model refinement.
Publications
LaTeX Rainbow: Universal LaTeX to PDF Document Semantic & Layout Annotation Framework
Changxu Duan, Zhiyin Tan, Sabine Bartsch,
In Proceedings of the Second Workshop on Information Extraction from Scientific Publications at IJCNLP-AACL 2023.
Presenting an Annotation Pipeline for Fine-grained Linguistic Analyses of Multimodal Corpora
Elena Volkanovska, Sherry Tan, Changxu Duan, Debajyoti Paul Chowdhury, Sabine Bartsch,
In Proceedings of the First Workshop on Linguistic Insights from and for Multimodal Language Processing at KONVENS 2023.
The InsightsNet Climate Change Corpus (ICCC)
Sabine Bartsch, Changxu Duan, Sherry Tan, Elena Volkanovska, Wolfgang Stille,
Journal of Datenbank-Spektrum, pp. 1610-1995. 2023.
Projects & Tools
Explaining the idea of language model generation through Jabber and entity linking
Poster presentation at Machine Learning Operations Summer School 2022
- Text generated by language models is often hallucinatory and imprecise.
- Attempted to validate generated output by steering the language model to babble on the same problem and by linking entities in its output to a knowledge graph.
Semi-supervised Event-centered Emotion Analysis and Performance Prediction
Master's Thesis at Robert Bosch GmbH
Investigated the applicability of event-centered emotion analysis in semi-supervised learning (SSL).
- Designed a similarity-based method to select unlabeled data and manage clustering attributes.
- Identified key attributes of SSL Emotion Analysis tasks for performance prediction.
SPARQL Autocompletion
https://github.com/Fireblossom/sparql-auto-completion
A language server tailored for enhancing SPARQL query functionalities.
- Syntax Highlighting: improved the syntax visualization of the VS Code SPARQL extension.
- Advanced Features: auto-completion for prefixes & IntelliSense for classes and properties.
CRNN Speech Emotion Detector
https://github.com/Fireblossom/DeepDarkHomework
A neural network that predicts emotion from MFCC features of time-series speech data.
- Based on the VAD (Valence-Arousal-Dominance) model, which makes this a regression task.
- Model consists of a time-distributed 1D CNN and a GRU in Keras.
- The accuracy of the model is about 71%.
Wiener filter for microphone arrays
Speech enhancement and noise reduction on a 6-microphone array system in MATLAB.
- Determined the speaking direction from the sound delays recorded by the microphone array.
- Filtered background noise and enhanced speech in the speaking direction with Wiener filters.