Sr. Site Reliability Engineer - Internal Commercial System

Location : Remote
Job Type : Temp/Contract
Reference Code : 13123767
Compensation : 89.66 USD/HOUR
Start Date : 05/20/2022
End Date : 05/19/2023
Hours : Full Time
Required Years of Experience : 5
Required Education : Bachelor's Degree or Equivalent
Travel : No
Relocation : No

Job Description :


Site Reliability Engineering (SRE) combines software and systems engineering disciplines to build and run large-scale, massively distributed, fault-tolerant systems. As a Site Reliability Engineer, you will be creating innovative solutions to ensure new services continue to perform reliably at scale.


  • Build monitoring and automation to quickly triage and discover failures across hardware, software, applications and network

  • Perform in-depth analysis of service trends and implement adjustments to mitigate risk and prevent issue recurrence

  • Provide guidance to software engineers on design patterns that are resistant to failure

  • Collaborate with organizational partners to ensure services are designed to be scalable and operable

  • Participate in a 24/7 on-call rotation to support major incident response

Required Qualifications :
Basic Qualifications: 

  • Bachelor's Degree or Equivalent

  • Technical knowledge of digital environments including: Mobile, Web, APIs, Messaging, Databases and Networks

  • Understanding of AWS Cloud solutions and product offerings, and of container technologies (e.g., Docker, Kubernetes)

  • Strong understanding of proactive monitoring methodologies using APM solutions (e.g., AppDynamics, New Relic) or other monitoring tools

  • Understanding of technical architecture, application systems design and integration in a large heterogeneous enterprise environment, with hands-on experience in SOA, Angular/Node, Java/J2EE, and Oracle or MySQL/MariaDB programming

Preferred Qualifications:

  • 5+ years programming in one or more of: Java, Node, Python, Perl or C

  • 5+ years UNIX systems knowledge and/or systems administration background

  • Passion for designing, analyzing and troubleshooting large-scale distributed systems

  • Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership

  • Experience contributing to open source code using version control

Additional Information:

***TOP 3 MUST HAVE SKILLS: 1. Proactive Monitoring, 2. Proficiency with Python or Golang, 3. Deep understanding of AWS Cloud

***Resource will be remote to start, but must live within commuting distance of Orlando because a potential future hybrid schedule would require reporting to the office 1 or 2 times per week.

***Resource will need to work on call on occasion (rotating schedule, e.g., on call once every 2 months)

Skills :
AWS, Java, Network, Node, Perl, Python