MANPOWER STAFFING SERVICES (SINGAPORE) PTE LTD

Site Reliability Engineer, Machine Learning Operations

Location
D02 Anson, Tanjong Pagar
Job Type
Full-time
Experience
Mid
Category
General
Salary
$7,500 - $8,500
Posted
3 weeks ago
Expires
Apr 4, 2026
Views
5

Job Details

Vacancies

1 position

Experience Required

No experience required

Job Description

Purpose of Role:

  • Frontline On-Call Ownership: Serve as the primary responder for the Applied Machine Learning Engine, taking ownership of system availability, health monitoring, and immediate incident response to ensure high reliability.
  • Incident Lifecycle Management: Manage the end-to-end feedback loop for incidents, including rapid triage, effective resolution, and the facilitation of post-incident reviews to ensure closure and prevent recurrence.
  • SOP Execution & Optimization: Execute upgrades and deployments strictly adhering to Standard Operating Procedures (SOPs), while actively leveraging Machine Learning and Infrastructure expertise to refine, automate, and improve these processes for greater efficiency.

Responsibilities:

  • Analyse user needs related to the machine learning systems provided by the AML department, through on-call shifts or other mechanisms, and propose customer-oriented solutions.
  • Work with other software engineers to implement and deploy customer-oriented solutions for machine learning frameworks, whether proposed by yourself or by others.
  • Update software, enhance existing software capabilities, and develop or deploy software testing, deployment, capacity management, and validation procedures.
  • Work with computer hardware engineers to integrate hardware and software systems and to troubleshoot specifications and performance requirements.

Minimum requirements:

  • Bachelor’s degree in Computer Science or equivalent, with 3+ years of relevant experience.
  • Proven experience in analyzing and troubleshooting distributed systems.
  • Prior experience designing or maintaining large-scale systems.
  • Scripting skills in at least one major language (Python, Go, or Shell/Bash) to automate repetitive operational tasks.

Nice to have:

  • Experience defining and managing Service Level Indicators (SLIs), Service Level Objectives (SLOs), error budgets, and practicing Chaos Engineering.
  • Experience operating MLOps platforms and toolkits such as Kubeflow, MLflow, Feast, or Ray.
  • Deep understanding of Linux operating system internals or container technologies (Docker/containerd) and orchestration platforms (Kubernetes) in a production environment.
  • Basic understanding of Machine Learning concepts and familiarity with model-serving frameworks such as TensorFlow Serving, TorchServe, or Triton Inference Server.

About MANPOWER STAFFING SERVICES (SINGAPORE) PTE LTD

Manpower is the global leader in contingent and permanent recruitment workforce solutions. It is part...

Ready to Apply?

This is a direct application to MANPOWER STAFFING SERVICES (SINGAPORE) PTE LTD. No recruitment agencies involved.
