Senior Software Programmers
Company: Annapurna Labs (U.S.) Inc.
Location: Federal Way
Posted on: September 24, 2024
Job Description:
Would you enjoy identifying, defining, and building software
solutions that revolutionize how businesses operate?
AWS Utility Computing (UC) provides product innovations that
continue to set AWS's services and features apart in the industry.
As a member of the UC organization, you'll support the development
and management of Compute, Database, Storage, Platform, and
Productivity Apps services in AWS, including support for customers
who require specialized security solutions for their cloud
services. Additionally, this role may involve exposure to and
experience with Amazon's growing suite of generative AI services
and other cutting-edge cloud computing offerings across the AWS
portfolio.
Annapurna Labs (our organization within AWS UC) designs silicon and
software that accelerates innovation. Customers choose us to create
cloud solutions that solve challenges that were unimaginable a
short time ago, even yesterday. Our custom chips, accelerators, and
software stacks enable us to take on technical challenges that have
never been seen before, and deliver results that help our customers
change the world.
The Annapurna Labs team at Amazon Web Services (AWS) is looking for
a Software Development Engineer II to build, deliver, and maintain
complex products that delight our customers and raise our
performance bar. You'll design fault-tolerant systems that run at
massive scale as we continue to innovate best-in-class services and
applications in the AWS Cloud.
Our org covers multiple disciplines including silicon engineering,
hardware design and verification, software, and operations. AWS
Neuron is the complete software stack for the AWS Inferentia and
Trainium cloud-scale machine learning accelerators.
This role is for a senior software engineer in the Machine Learning
Applications (ML Apps) team for AWS Neuron. This role is
responsible for development, enablement and performance tuning of a
wide variety of ML model families, including massive-scale large
language models such as GPT-2, GPT-3 and beyond, as well as Stable
Diffusion, Vision Transformers and many more.
The ML Distributed Training team works side by side with chip
architects, compiler engineers and runtime engineers to create,
build and tune distributed training solutions with Trn1. Experience
training these large models using Python is a must. FSDP, DeepSpeed
and other distributed training libraries are central to this work,
and extending them to Neuron-based systems is key.
This role will help lead the effort to build distributed training
support into PyTorch and TensorFlow using XLA and the Neuron compiler
and runtime stacks. This role will also help tune these models to
ensure the highest performance and maximize their efficiency on
customers' AWS Trainium and Inferentia silicon and the Trn1 and
Inf1 servers. Strong software development and ML knowledge are both
critical to this role.
We have a broad mix of experience levels and tenures, and we're
building an environment that celebrates knowledge-sharing and
mentorship. Our senior members enjoy one-on-one mentoring and
thorough, but kind, code reviews. We care about your career growth
and strive to assign projects that help you develop
your engineering expertise so you feel empowered to take on more
complex tasks in the future.
Amazon Web Services (AWS) is the world's most comprehensive and
broadly adopted cloud platform. We pioneered cloud computing and
never stopped innovating - that's why customers from the most
successful startups to Global 500 companies trust our robust suite
of products and services to power their businesses.
Our employee-led affinity groups foster a culture of inclusion that
empowers us to be proud of our differences. Ongoing events and
learning experiences, including our Conversations on Race and
Ethnicity (CORE) and AmazeCon (gender diversity) conferences,
inspire us to never stop embracing our uniqueness.
When we feel supported in the workplace and at home, there's
nothing we can't achieve in the cloud.
We're continuously raising our performance bar as we strive to
become Earth's Best Employer. That's why you'll find endless
knowledge-sharing, mentorship and other career-advancing resources
here to help you develop into a better-rounded professional.
Our team affords employees options to work in the office every day
or in a flexible, hybrid work model near one of our US Amazon
offices. Our hybrid models allow you the freedom to work from home
whenever in-office collaboration isn't necessary.
- 3+ years of non-internship professional software development
experience
- 3+ years of non-internship design or architecture (design
patterns, reliability and scaling) of new and existing systems
experience
- Experience programming with at least one software programming
language
- Deep Learning industry experience
- 3+ years of full software development life cycle, including coding
standards, code reviews, source control management, build
processes, testing, and operations experience
- Bachelor's degree in computer science or equivalent
- Previous software engineering expertise with
PyTorch/JAX/TensorFlow, distributed libraries and frameworks, and
end-to-end model training is preferred. The group presents many
opportunities for optimizing and scaling large deep learning models
on the Trainium architecture.
Amazon is an equal opportunity employer and does not discriminate
on the basis of race, national origin, gender, gender identity,
sexual orientation, protected veteran status, disability, age, or
other legally protected status. For individuals with disabilities
who would like to request an accommodation, please visit
Dependent on the position offered, equity, sign-on payments, and
other forms of compensation may be provided as part of a total
compensation package, in addition to a full range of medical,
financial, and/or other benefits.
Keywords: Annapurna Labs (U.S.) Inc., Federal Way, Senior Software Programmers, IT / Software / Systems, Federal Way, Washington