- Working directly with development teams on the design, deployment, capacity needs, and operations of microservices, and supporting them as they transition to production
- Monitoring the availability, performance, and health of production systems in support of meeting service level objectives
- Using automation and tooling to continuously improve the reliability, scalability, and velocity of services deployed on AWS
- Being available to support on-call teams in emergency incident response as needed
- Practicing blameless postmortems that lead to improvements in the resiliency of supported products
- Experience in Computer Science, Software Engineering, or a related field
- 2-5 years of relevant experience
- Demonstrable scripting experience, preferably in Python, TypeScript, or Ruby
- Expertise in analyzing and troubleshooting large-scale, multi-region deployments in a public cloud (e.g. AWS)
- Experience with cloud deployment and management tools (e.g. CloudFormation and Ansible)
- Experience with monitoring and alerting tools (e.g. CloudWatch, New Relic, PagerDuty)
- Ability to solve complex problems, optimize code, and automate routine tasks
- Self-driven, with the ability to lead objectives to completion
- Ability to coach junior team members
- Fluency in written and spoken English at CEFR B2 level or above
- A bachelor's degree in Computer Science or related field
- Experience with container technology (e.g. Kubernetes, Docker)
- Experience with cost analysis and optimization techniques
- Experience with network and/or application security
DevOps Engineer II - Bogotá, Colombia - Anthology Inc
Description
Site Reliability Engineer
Bogotá, Colombia
*ONLY CVs SUBMITTED IN ENGLISH WILL BE CONSIDERED*
The Opportunity:
Anthology offers the largest EdTech ecosystem on a global scale, supporting over 150 million users in 80 countries. Our mission is to provide dynamic, data-informed experiences to the global education community so that learners and educators can achieve their goals.
We believe in the power of a truly diverse and inclusive workforce. As we expand globally, we are committed to making diversity, inclusion, and belonging a foundational part of not only our hiring practices but who we are as a company.
For more information about Anthology and our career opportunities, please visit anthology.
As a member of the Site Reliability Engineering team, you will combine software and systems engineering to help build and run large-scale, distributed, and fault-tolerant systems. This is a driven, creative, and energetic team that works in a flexible and agile fashion to deliver world-class products to the education market. You will become a core contributor to the team, delivering eLearning services to over a thousand clients, comprising almost 4 million users worldwide.
Specific responsibilities include:
The Candidate:
Required skills/qualifications:
Preferred skills/qualifications:
This job description is not designed to contain a comprehensive listing of activities, duties, or responsibilities that are required. Nothing in this job description restricts management's right to assign or reassign duties and responsibilities at any time.
Anthology is an equal employment opportunity/affirmative action employer and considers qualified applicants for employment without regard to race, gender, age, color, religion, national origin, marital status, disability, sexual orientation, gender identity/expression, protected military/veteran status, or any other legally protected factor.