Master's thesis in Data & AI: Scaling a Machine Learning Data Quality Framework for Generalizability
Veenendaal • Info Support
Required
- Full-time
Offered
- Salaried employment (permanent)
- €1,000 per month (gross)
- Company car
A challenging assignment with compensation of €1000, or €500 plus a lease car, or €600 plus housing, along with professional guidance, training sessions, knowledge events, brainstorming with colleagues and 2 vacation days per month.
We know that the quality of a training dataset is an important indicator of model performance, but how well does a data quality framework developed on a simple machine learning task generalize to real-world ML scenarios? In this thesis, you’ll extend and test an existing framework on diverse datasets, models and tasks, including LLMs. You’ll explore new quality dimensions and benchmark generalizability, building towards a practical tool that helps teams evaluate and improve their data in complex AI pipelines.
Areas of Interest: data quality, machine learning, LLMs, statistics, data science
The impact of data quality on machine learning performance is well established, yet most frameworks are tested only in limited, controlled environments. Previously, we developed an automatic data quality framework (Automatic Assessment of Dataset Quality for ML), which showed promising results by quantifying data quality across three core dimensions: completeness, consistency and accuracy, using synthetic data and a small set of machine learning models.
However, real-world systems operate in far more varied and complex contexts. Today’s AI models range from classical algorithms to advanced LLMs, and datasets span structured tables, text, sensor streams, and more. Without validating how such a framework performs across these environments, its insights remain confined to the lab. The question is not just whether it works, but how well it generalizes.
The Assignment
This thesis explores the generalizability of the developed data quality assessment framework across a wide spectrum of machine learning use cases. You will:
· Extend dataset coverage using both real-world and synthetic data from diverse domains (e.g., healthcare, finance, social media, e-commerce, public benchmarks).
· Diversify task types, including classification, regression and clustering.
· Broaden algorithmic scope by comparing a range of machine learning models.
· Evaluate the role of LLMs in assessing data quality across different data domains.
· Compare model responses to varying data quality degradations across model sizes and architectures.
· Add quality dimensions such as uniqueness, timeliness, accessibility, believability and statistical measures.
· Benchmark generalizability by measuring the framework’s reliability across tasks, models, and datasets.
The final deliverable is an empirically validated, modular extension of the original framework - capable of guiding users towards improving their datasets for machine learning.
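To give a feel for what a quality dimension and a controlled degradation might look like in practice, here is a minimal sketch in plain Python. It is not the existing framework's implementation: the function names (`completeness`, `uniqueness`, `degrade_completeness`), the use of `None` to mark a missing value and the toy rows are all assumptions made for illustration.

```python
import random
from typing import List, Optional, Sequence

# A row is a sequence of cells; None marks a missing value (an assumption).
Row = Sequence[Optional[object]]


def completeness(rows: Sequence[Row]) -> float:
    """Fraction of cells that are not missing (1.0 for an empty dataset)."""
    cells = [cell for row in rows for cell in row]
    if not cells:
        return 1.0
    return sum(cell is not None for cell in cells) / len(cells)


def uniqueness(rows: Sequence[Row]) -> float:
    """Fraction of rows that are distinct (1.0 for an empty dataset)."""
    if not rows:
        return 1.0
    return len({tuple(row) for row in rows}) / len(rows)


def degrade_completeness(rows: Sequence[Row], rate: float, seed: int = 0) -> List[list]:
    """Return a copy in which each cell is blanked with probability `rate`."""
    rng = random.Random(seed)  # seeded so that an experiment is repeatable
    return [[None if rng.random() < rate else cell for cell in row] for row in rows]


# Hypothetical toy dataset: one missing cell and one duplicated row.
rows = [
    [1.0, "a"],
    [2.0, None],
    [1.0, "a"],
    [3.0, "b"],
]

print("completeness:", completeness(rows))
print("uniqueness:", uniqueness(rows))
print("after degradation:", completeness(degrade_completeness(rows, rate=0.5)))
```

In a generalizability study, a degradation step like this could be applied at several rates to each dataset, with the change in the quality score compared against the change in model performance.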
About Info Support
Info Support specializes in custom software, data/AI solutions, management, and training and is active in the Finance, Industry, Agriculture, Food & Retail, Mobility & Public, and Healthcare sectors. We provide solid and innovative solutions for complex and critical software issues. Our headquarters are located in Veenendaal (NL) and Mechelen (BE). Info Support currently employs approximately 500 people.
Info Support's working method is characterized by a number of core values: solidity, integrity, craftsmanship, and passion. These core values are intertwined in our work and the way we interact with each other.
To keep all employees up to date with the latest developments, Info Support has an in-house knowledge center where they can build new or different knowledge and skills.
B2 language proficiency in Dutch is required.