Remote Opportunity

AI Evaluation Engineer (Data Analysis & Multi-Agent Systems)

Join C-Serv as a senior professional working remotely, worldwide. Explore the role and its benefits, and apply, all in one place.

Contract
$120,000 - $180,000*
22 hours ago
Worldwide
AI Governance & Programs
Senior
Python
SQL
Docker

Job Description

About Us

Gramian Consultancy is a boutique consultancy specializing in IT professional services and engineering talent solutions. With a strong background in software engineering and leadership, we help companies build high-performing teams by matching them with professionals who truly fit their needs.

Role overview

We are looking for an AI Evaluation Engineer specialized in data analysis to design benchmark tasks that simulate real-world analytical workflows. You will create scenarios where AI systems must analyze large, messy, multi-source datasets, decompose tasks across multiple agents, and produce clear, verifiable conclusions.

Commitments

  • Required: 8 hours per day with an overlap of 4 hours with PST
  • Employment type: Contractor assignment (no medical/paid leave)
  • Duration of contract: 4 weeks+
  • Location: Bangladesh, Brazil, Colombia, Egypt, Ghana, India, Indonesia, Kenya, Nigeria, Turkey, Vietnam
  • Interview: take-home assessment (60 min)

Responsibilities

  • Design and develop multi-agent benchmark tasks focused on complex data analysis workflows
  • Create or curate realistic datasets (CSV, JSON, logs, reports, financial or operational data)
  • Build tasks requiring cross-referencing across multiple data sources, anomaly detection and contradiction identification, and statistical analysis and interpretation
  • Define task decomposition strategies across specialized sub-agents (e.g., financial, technical, operational analysis)
  • Develop verification logic to validate precise analytical outputs (not generic summaries)
  • Implement evaluation pipelines using Python and SQL
  • Create reproducible environments using Docker
  • Analyze task performance and refine for clarity, difficulty, and scoring accuracy

  • 5+ years of experience in data analysis or analytics-heavy roles
  • Strong proficiency in Python (pandas, NumPy) and SQL
  • Experience working with real-world, messy datasets (CSV, JSON, logs, reports)
  • Ability to design analytical problems with clear, verifiable answers
  • Solid understanding of statistics (distributions, correlations, outliers)
  • Familiarity with AI benchmarks or evaluation environments (e.g., SWE-bench or similar)
  • Hands-on experience with Docker (Dockerfiles, image builds, debugging)

Nice to Have

  • Experience in financial analysis, operations analytics, or risk analysis
  • Exposure to data pipelines or ETL workflows
  • Experience with data quality validation or anomaly detection systems
  • Familiarity with AI/ML data workflows or evaluation frameworks

Requirements

  • 8 hours per day with an overlap of 4 hours with PST
  • Design and develop multi-agent benchmark tasks focused on complex data analysis workflows
  • Create or curate realistic datasets (CSV, JSON, logs, reports, financial or operational data)
  • Build tasks requiring cross-referencing across multiple data sources, anomaly detection and contradiction identification, and statistical analysis and interpretation
  • Define task decomposition strategies across specialized sub-agents (e.g., financial, technical, operational analysis)
  • Develop verification logic to validate precise analytical outputs (not generic summaries)

Skills

Python
SQL
Docker
Pandas
NumPy

About AI-Estimated Salary

The salary range shown was not provided by the employer. Our AI has estimated it based on the job title, required experience, location, and industry standards (confidence: 80%). This estimate should be used as a general guide only and may not reflect the actual compensation. Always confirm salary details directly with the employer during the application process.

