Specialised AI Engineer

Nscale
London, United Kingdom
Job Type: Permanent
Work Pattern: Full-time
Work Location: On-site
Seniority: Senior
Education: Degree
Posted: 27 May 2026 (3 days ago)

About Nscale

Nscale is taking on the hyperscalers by building a vertically integrated GenAI cloud platform. We own the data centres, software, and applications that power today's AI stack using sustainable technology solutions. We thrive on a culture of relentless innovation, ownership, and accountability, where every team member takes pride in their work and drives it with excellence and urgency. As an Nscaler, you’ll build trust through openness and transparency, where everyone is inspired to do their best work. Collaboration is key, and we work together swiftly and respectfully, embracing adaptability and resilience in all we do.

About the Role

Nscale is looking for Senior / Staff AI Engineers to join our core AI team and build the systems that power our GenAI cloud platform.

This role sits at the heart of our AI services platform, designing and optimising distributed systems that power large-scale training, post-training, evaluation, and low-latency, high-throughput inference under strict performance and efficiency constraints.

You may specialise deeply in areas such as inference optimisation, large-scale training, post-training (fine-tuning, alignment), or evaluation systems, or operate across multiple parts of the stack. In all cases, you’ll work on hard systems problems at scale, where performance, efficiency, and developer experience are critical.

This is a hands-on role for engineers who want to push the boundaries of how AI systems are built, optimised, and consumed by other AI engineers.

Responsibilities

  • Design, build, and optimise scalable AI platform systems spanning one or more of the following:
    • Drive inference performance and efficiency, including:
      • KV cache management, continuous batching, speculative decoding
      • Quantisation (INT8/4, FP8), sparsity, pruning, and model compression
    • Build and improve post-training services, including:
      • Fine-tuning (LoRA, QLoRA, adapters, full fine-tuning)
      • Alignment (RLHF, DPO, reward modelling)
      • Agentic RL (tool calling, off-policy training, parallel thinking, decoupled sampling and updating)
      • Dataset curation and data processing workflows
    • Develop evaluation and benchmarking systems to measure:
      • Model quality, safety, and regression
      • System performance (latency, throughput, cost)
      • Real-world behaviour and feedback loops
  • Develop and optimise distributed systems for GPU/accelerator workloads, focusing on scalability, reliability, and efficiency
  • Conduct performance analysis and bottleneck investigations across multiple components and stacks spanning training, post-training, and inference
  • Collaborate with research, infrastructure, and product teams to build the right platform components based on customer demand and industry direction
  • Build developer-facing APIs, SDKs, and tooling that enable other engineers to effectively use Nscale’s AI services

Requirements

  • 5+ years of experience building production systems in machine learning, distributed systems, or high-performance infrastructure
  • 4+ years of hands-on experience within large-scale production AI environments (e.g., AI labs, hyperscalers) in at least one core area, such as:
    • Inference optimisation
    • Large-scale training / pre-training systems
    • Post-training (fine-tuning, alignment, distillation)
    • Evaluation and benchmarking frameworks
  • Strong hands-on expertise in at least one of the above areas, with working knowledge across others
  • Proven ability to design, optimise, and operate systems at scale, with a strong understanding of performance trade-offs across latency, throughput, cost, and model quality
  • Deep understanding of transformer architectures, LLMs, and/or multimodal models, including their behaviour in production systems
  • Strong proficiency in Python and PyTorch, with a track record of building production-grade ML systems
  • Experience with distributed compute and training paradigms (e.g., data/model parallelism, sharding, scheduling)
  • Experience working close to the hardware/software boundary, such as:
    • GPU/accelerator optimisation (CUDA, ROCm, or similar)
    • Memory management and system-level performance tuning
  • Experience building or operating production inference or training systems at scale
  • Ability to design clean abstractions, APIs, and reusable systems for other engineers
  • Strong engineering fundamentals, with a track record of writing maintainable, well-tested, production-quality code

Preferred

  • Experience developing large-scale and high-load production systems
  • Experience working in containerised, distributed environments (e.g., Kubernetes, large-scale clusters)
  • Experience contributing to or working with widely used/open-source AI frameworks or systems is strongly preferred
  • Hands-on experience with advanced inference optimisation techniques, such as KV caching, MoE, adaptive batching, or gradient checkpointing
  • Experience developing APIs using OpenAPI 3.0+ specifications
  • Knowledge of efficient training and inference evaluation strategies, with demonstrated success in improving model efficiency

At Nscale, we are committed to fostering an inclusive, diverse, and equitable workplace. We believe that a variety of perspectives enriches our work environment, and we encourage applications from candidates of all backgrounds, experiences, and abilities. We strongly encourage applications from people of colour, the LGBTQ+ community, people with disabilities, neurodivergent people, parents, carers, and people from lower socio-economic backgrounds.

If there’s anything we can do to accommodate your specific situation, please let us know.

The responsibilities outlined in this job description are not exhaustive and are intended to provide a general overview of the position. The employee may be required to perform additional duties, tasks, and responsibilities as assigned by management, consistent with the skills and qualifications required for the role.

For information on how Nscale handles candidate personal data, please see our Employee & Candidate Privacy Notice.
