Posted May 18, 2026

AI Quality Evaluator (Polish)

Responsibilities

• Evaluate AI model responses for personalization quality, including grounding, integration, and helpfulness.
• Design and execute multi-turn prompts based on personal context to test AI capabilities.
• Analyze responses for hallucinations, incorrect personalization, and poor inferences.
• Perform side-by-side comparison of model outputs to determine quality and effectiveness.
• Write clear and structured rationales for response evaluations and rankings.
• Extract and verify debug information to ensure proper use of data sources.
• Maintain strict data hygiene and ensure accurate documentation of evaluations.
• Collaborate with cross-functional teams to improve AI model performance.

Requirements

• Strong proficiency in Polish with excellent reading and writing skills.
• Experience in data annotation, AI evaluation, content moderation, or a related role.
• Strong analytical thinking and ability to assess nuanced AI responses.
• Ability to design creative, multi-turn prompts based on personal context.
• Understanding of personalization concepts, including identifying incorrect or forced personalization.
• High attention to detail in evaluating subtle differences in model outputs.
• Excellent written communication and structured reasoning skills.
• Ability to work independently in a remote environment.
• Willingness to use a personal Google account for evaluation purposes.
• Full-time availability with at least 4 hours overlap with PST.
• Bachelor’s degree or equivalent experience in a relevant analytical field.
